Basic DCGAN and Home PC Limitations

In a previous post, I worked through a tutorial series and built a basic DCGAN, similar to the one in the PyTorch tutorial. Both use 64x64 images. I got some decent results, but a 64x64 image isn't very useful, and with upscaling the detail started to look washed out.

Scaling up the image size to 128x128 or larger

What needs to be changed?

First, change the constants in the trainer. I'll go over why I needed to lower the batch size later. Make sure the images in your training data set are resized to match IMAGE_SIZE (128x128 or larger). The resizing is done in a separate file included in the author's code.

#trainer Constants
BATCH_SIZE = 16
IMAGE_SIZE = 128
FEATURES_DISC = 128
FEATURES_GEN = 128

Next, the Discriminator and Generator each need an extra layer. The discriminator adds one Conv2d layer and the generator adds one ConvTranspose2d layer, which takes the generator output from 64x64 to 128x128. The author uses a "_block" helper to actually create the conv layers (not seen in the snippet).
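
The "_block" helper itself is not shown in this post. As a rough sketch only, assuming it follows the usual DCGAN pattern (a bias-free ConvTranspose2d, then BatchNorm2d, then ReLU), a generator block and the effect of one extra stride-2 block might look like this. The name `gen_block` and the tiny channel count are mine, not the author's.

```python
import torch
import torch.nn as nn


def gen_block(in_channels, out_channels, kernel_size, stride, padding):
    """Hypothetical stand-in for the generator's `_block` helper (an assumption)."""
    return nn.Sequential(
        nn.ConvTranspose2d(
            in_channels, out_channels, kernel_size, stride, padding, bias=False
        ),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(),
    )


features_g = 8  # kept tiny so the check runs quickly on a CPU
extra_block = gen_block(features_g * 2, features_g * 2, 4, 2, 1)

# A fake batch of 2 feature maps at 32x32, standing in for the previous block's output.
x = torch.randn(2, features_g * 2, 32, 32)
y = extra_block(x)
print(tuple(y.shape))  # kernel 4, stride 2, padding 1 doubles the spatial size
```

Each extra block with kernel 4, stride 2 and padding 1 doubles the height and width, which is why one added block is enough to go from 64 to 128.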

#DISCRIMINATOR CODE SNIPPET
self._block(features_d, features_d * 2, 4, 2, 1),
self._block(features_d * 2, features_d * 4, 4, 2, 1),
self._block(features_d * 4, features_d * 8, 4, 2, 1),
# added by me to downsample further, I think this handles 128x128 input
self._block(features_d * 8, features_d * 16, 4, 2, 1),
# added by me to downsample further, I think this handles 256x256 input
self._block(features_d * 16, features_d * 32, 4, 2, 1),
# added by me to downsample further, I think this handles 512x512 input
self._block(features_d * 32, features_d * 64, 4, 2, 1),

#GENERATOR CODE SNIPPET
self._block(channels_noise, features_g * 16, 4, 1, 0),  # img: 4x4
self._block(features_g * 16, features_g * 8, 4, 2, 1),  # img: 8x8
self._block(features_g * 8, features_g * 4, 4, 2, 1),  # img: 16x16
self._block(features_g * 4, features_g * 2, 4, 2, 1),  # img: 32x32
self._block(features_g * 2, features_g * 2, 4, 2, 1),  # img: 64x64
self._block(features_g * 2, features_g * 2, 4, 2, 1),  # img: 128x128 (I think 256x256 after the output layer)
self._block(features_g * 2, features_g * 2, 4, 2, 1),  # img: 256x256 (should go 512x512 after the output layer)
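
To double-check the size comments, the spatial sizes can be traced by hand with the output-size formulas from the PyTorch docs for Conv2d and ConvTranspose2d (dilation 1, no output_padding). This is only arithmetic on the (kernel, stride, padding) values in the snippet above; it assumes the noise enters the generator as a 1x1 feature map, which the "img: 4x4" comment on the first block suggests.

```python
def conv_out(size, kernel, stride, padding):
    """Output size of nn.Conv2d (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose_out(size, kernel, stride, padding):
    """Output size of nn.ConvTranspose2d (dilation 1, no output_padding)."""
    return (size - 1) * stride - 2 * padding + kernel


# (kernel, stride, padding) for the seven generator blocks listed above.
gen_layers = [(4, 1, 0)] + [(4, 2, 1)] * 6

size = 1  # assumption: the noise vector enters as a 1x1 feature map
gen_sizes = []
for k, s, p in gen_layers:
    size = conv_transpose_out(size, k, s, p)
    gen_sizes.append(size)
print(gen_sizes)  # [4, 8, 16, 32, 64, 128, 256]

# Discriminator side: each (4, 2, 1) block halves the feature map.
size = 128
disc_sizes = []
for _ in range(4):
    size = conv_out(size, 4, 2, 1)
    disc_sizes.append(size)
print(disc_sizes)  # [64, 32, 16, 8]
```

The output layer of the generator is not shown in the snippet. If it is one more stride-2 transposed convolution, the final image is twice the last size in the list.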

128 and 256 are working! ... sort of

CUDA and GPU limitations

This is where a 'newb' like me will struggle. Going to 128x128, I already ran into memory limitations.

RuntimeError: CUDA out of memory. Tried to allocate 8.00 GiB (GPU 0; 11.00 GiB total capacity; 5.53 GiB already allocated; 4.45 GiB free; 5.53 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

This is a pretty common beginner error with CUDA. The most commonly accepted fix is to lower your batch size. The larger model and larger images need more memory per batch, and the trainer is trying to allocate more than the graphics card has. Lowering the batch size reduces how much the trainer tries to allocate on the graphics card at once.
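
One thing batch size does not shrink is the weights themselves. As a back-of-the-envelope sketch, here is a count of the convolution weights in the six discriminator blocks from the snippet above. It assumes each block wraps a single bias-free Conv2d with a 4x4 kernel stored as float32, and it ignores BatchNorm parameters and the first and last layers, which are not shown.

```python
KERNEL = 4
BYTES_PER_FLOAT32 = 4
features_d = 128  # FEATURES_DISC from the trainer constants

# Channel multipliers (in, out) of the six discriminator blocks in the snippet.
multipliers = [(1, 2), (2, 4), (4, 8), (8, 16), (16, 32), (32, 64)]

# A Conv2d weight tensor has in_channels * out_channels * kernel * kernel values.
weights_per_block = [
    (features_d * m_in) * (features_d * m_out) * KERNEL * KERNEL
    for m_in, m_out in multipliers
]

for (m_in, m_out), n in zip(multipliers, weights_per_block):
    mib = n * BYTES_PER_FLOAT32 / 2**20
    print(f"x{m_in:>2} -> x{m_out:>2}: {n:>11,} weights, {mib:8.1f} MiB")

total = sum(weights_per_block)
total_gib = total * BYTES_PER_FLOAT32 / 2**30
print(f"total: {total:,} weights, {total_gib:.2f} GiB")
```

Because the channel count doubles on both sides of every block, each added block holds four times as many weights as the one before it. Gradients take one more copy of that, and an optimizer that keeps per-weight state (Adam keeps two buffers) takes more again, all regardless of batch size.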

To get to 128x128 I had to lower my batch size from 64 to 16.
To get to 256x256 I had to lower my batch size to 1.
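
The drop from 64 to 16 lines up with simple arithmetic on activation sizes. A feature map tensor has shape (N, C, H, W), so doubling the image side quadruples the values per sample. The sketch below compares one such tensor at the three resolutions; the channel count of 128 is just an illustrative value, and real memory use also depends on the weights and on everything else PyTorch keeps around for the backward pass.

```python
def activation_elements(batch_size, channels, height, width):
    """Number of values in one feature map tensor of shape (N, C, H, W)."""
    return batch_size * channels * height * width


# Same layer, same channel count; only the batch size and image size change.
baseline = activation_elements(batch_size=64, channels=128, height=64, width=64)
at_128 = activation_elements(batch_size=16, channels=128, height=128, width=128)
at_256 = activation_elements(batch_size=4, channels=128, height=256, width=256)

print(at_128 / baseline)  # 1.0
print(at_256 / baseline)  # 1.0
```

By this estimate alone a batch of 4 would break even at 256x256. Needing a batch size of 1 in practice is presumably down to the extra blocks, which add their own weights and activations on top.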

Workaround using CPU only

Let me start by saying: don't do this. My reasoning was that my desktop has around 48 gigs of system RAM, whereas my 1080 Ti has 11 gigs. For a 512x512 image, this seemed to be my only option.

#device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# forcing the use of "cpu" even though cuda is available
device = torch.device("cpu")

This used up about 47 of my 48 gigs of RAM. My computer didn't seize up, but it was close: mouse input and audio went out for a bit. The Python code kept churning, though, and eventually failed when it asked for more RAM than I had.

Time for some reading

I am probably expecting a tutorial model to scale up to an unreasonable degree. I know StyleGAN and StyleGAN2 get good results on larger images, and I may see what I can do with them. The issue is how much data has to sit in RAM, and I have to assume these other architectures have figured out a workaround.

Two promising papers:
Large Scale GAN Training for High Fidelity Natural Image Synthesis
Progressive Growing of GANs for Improved Quality, Stability, and Variation

This page does a good job of explaining "brittleness" in GAN training:
A Gentle Introduction to BigGAN the Big Generative Adversarial Network

Upscaling with "Bigjpg"

I looked at a couple of tools that use deep learning to upscale images and stumbled on this program. The results were interesting. The outputs of 4 different methods are shown below. The Bigjpg algorithm, which uses "deep learning", does a decent job of unblurring the image at a larger scale. The noise reduction looks like it smooths a bit too much.
1 - Using Pillow library to 'resize' the original 64x64 image
2 - Using Bigjpg to upscale with no noise reduction
3 - Using Bigjpg to upscale with medium settings for noise reduction
4 - Using Bigjpg to upscale with high settings for noise reduction
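
For reference, method 1 is only a few lines with Pillow. This is a minimal sketch and not the exact call used for the image below: the 512x512 target size, the two filters, and the solid-colour stand-in image are all assumptions for illustration.

```python
from PIL import Image

# Stand-in for a 64x64 generator output; a real run would use Image.open(path).
small = Image.new("RGB", (64, 64), color=(200, 120, 40))

# Nearest-neighbour keeps hard pixel edges; bicubic interpolates between pixels.
blocky = small.resize((512, 512), resample=Image.NEAREST)
smooth = small.resize((512, 512), resample=Image.BICUBIC)

print(blocky.size, smooth.size)
```

Neither filter can add detail that is not in the 64x64 source, which is where a learned upscaler like Bigjpg has the advantage.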

Scaled using pillow library resize
Bigjpg - no noise reduction
Bigjpg - medium noise reduction
Bigjpg - high noise reduction