Image Restoration
These are generative techniques.
We start with an image, but instead of creating a segmentation mask we try to create a better image. That could mean going from low-res to high-res, colorizing an image, filling in a region that has been cut out, or taking a photo and making it look like a Monet painting.
These are all image-to-image generation tasks.
We want to turn a bad image into a good one. The first step is to take a good image and pass it into a crappify function that deteriorates it any way you like. If your end goal is to colorize a B&W photo, crappify would turn the good image into B&W. If you want to fix a hole in the image, it would draw a big black box over part of it. It can also reduce the size, add text to it, etc.
Anything you don't include in crappify, your model won't learn to fix.
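To make the idea concrete, here is a minimal sketch of a crappify function. It is not the course's implementation: it works on a toy image stored as rows of (r, g, b) tuples, and the names `to_grayscale`, `cut_hole`, `downscale` and the `crappify` parameters are made up for illustration.

```python
def to_grayscale(image):
    """Replace each pixel by its luminance, repeated on all three channels."""
    gray = []
    for row in image:
        new_row = []
        for r, g, b in row:
            y = round(0.299 * r + 0.587 * g + 0.114 * b)
            new_row.append((y, y, y))
        gray.append(new_row)
    return gray


def cut_hole(image, top, left, height, width):
    """Paint a black box over part of the image."""
    out = [list(row) for row in image]
    for i in range(top, min(top + height, len(out))):
        for j in range(left, min(left + width, len(out[i]))):
            out[i][j] = (0, 0, 0)
    return out


def downscale(image, factor):
    """Keep every `factor`-th pixel in both directions."""
    return [list(row[::factor]) for row in image[::factor]]


def crappify(image, grayscale=False, hole=None, factor=1):
    """Degrade a good image; only the damage applied here can be learned."""
    out = [list(row) for row in image]  # never modify the good image
    if grayscale:
        out = to_grayscale(out)
    if hole is not None:
        out = cut_hole(out, *hole)  # hole = (top, left, height, width)
    if factor > 1:
        out = downscale(out, factor)
    return out


good = [[(200, 100, 50)] * 4 for _ in range(4)]
bad = crappify(good, grayscale=True, hole=(0, 0, 2, 2))
```

Each (bad, good) pair then becomes one training example: the bad image is the input and the good image is the target.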
To do this we're using a U-Net. See the code that creates the model.
We end up with this - it does a good job of removing the numbers, but a bad job of producing a high-res image.
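That is because of the loss function we used. We used MSE, which doesn't work well here: when you think about it, most of the pixels are already about the same color as in the target, so the MSE is already very low and there is little left in the loss to push the model toward sharper detail.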
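A toy illustration of why pixel MSE rewards blur, under made-up numbers: a 100-pixel "image" that is flat except for one fine detail. A prediction that smooths the detail away scores better than a sharp prediction that puts the detail one pixel off.

```python
def mse(pred, target):
    """Mean squared error between two equally sized pixel lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


target = [0.5] * 100
target[40] = 1.0        # one fine detail on a flat background

blurry = [0.5] * 100    # detail smoothed away completely

shifted = [0.5] * 100
shifted[41] = 1.0       # sharp detail, but one pixel off

mse_blurry = mse(blurry, target)
mse_shifted = mse(shifted, target)
print(mse_blurry, mse_shifted)
```

Both losses are tiny, and the blurry one is the smaller of the two, so nothing in the loss encourages the model to commit to crisp detail.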
We want a loss function that does a better job of answering: is this a good-quality picture of this thing?
There's a general way of answering that question - it's called a GAN (generative adversarial network). A GAN solves this problem by using a loss function that calls another model.
We have a crappy image that gets passed into a generator (the U-Net). This gives us a prediction - the generated image. So far, that prediction and the high-res image both get passed into the pixel MSE loss.
We can also train another model, called the discriminator or the critic. It's a binary classification model that takes the generated images and the real high-res images and tries to learn to classify which is which. This is a regular binary cross-entropy classifier.
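As a rough sketch of what the critic is optimizing, here is binary cross-entropy computed by hand on raw critic outputs (logits). The label convention (real = 1, generated = 0) and the function names are assumptions for illustration, not taken from the course code.

```python
import math


def bce_with_logits(logit, label):
    """Numerically stable binary cross-entropy on a raw score (logit)."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def critic_loss(real_logits, fake_logits):
    """Average BCE: real images should score high, generated ones low."""
    losses = [bce_with_logits(x, 1.0) for x in real_logits]
    losses += [bce_with_logits(x, 0.0) for x in fake_logits]
    return sum(losses) / len(losses)


undecided = critic_loss([0.0], [0.0])    # critic has no idea
confident = critic_loss([5.0], [-5.0])   # critic is right and sure
fooled = critic_loss([-5.0], [5.0])      # critic is wrong and sure
```

An undecided critic sits at log(2); the loss drops toward 0 as it separates the two folders correctly and grows when it gets them backwards.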
With one of those, we can fine-tune the generator, and rather than using pixel MSE as the loss, the loss could be: how good are we at fooling the critic? Can we create generated images that the critic thinks are real? If it can do that, the generator is going to learn to create images where the critic can't tell whether they are real or fake.
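"How good are we at fooling the critic" can be turned into a number by scoring the critic's outputs on generated images against the "real" label. A minimal sketch, assuming the critic returns raw logits and that real = 1; `generator_adversarial_loss` is a hypothetical name:

```python
import math


def bce_with_logits(logit, label):
    """Numerically stable binary cross-entropy on a raw score (logit)."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def generator_adversarial_loss(critic_logits_on_generated):
    """Low when the critic scores generated images as if they were real."""
    losses = [bce_with_logits(x, 1.0) for x in critic_logits_on_generated]
    return sum(losses) / len(losses)


fooling = generator_adversarial_loss([4.0, 5.0])    # critic says "real"
caught = generator_adversarial_loss([-4.0, -5.0])   # critic says "fake"
```

The generator is updated to push this value down while the critic's weights stay fixed.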
But then the critic is gonna be bad at telling the difference, so we stop training the generator and train the critic some more on the newly generated images. Now that the generator is better, deciding which is real and which is fake is a tougher task, so we train the critic a bit more.
Once we've done that, we fine-tune the generator some more using the better critic.
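The back-and-forth above can be sketched as a simple loop. This is a hand-rolled illustration with a fixed number of steps per phase, not how fastai decides when to switch; `alternate_training` and the stand-in step functions are hypothetical.

```python
def alternate_training(gen_step, critic_step, rounds, gen_steps=1, critic_steps=1):
    """Alternate between generator and critic updates, one phase at a time."""
    schedule = []
    for _ in range(rounds):
        for _ in range(gen_steps):
            gen_step()       # critic frozen, generator learns to fool it
            schedule.append("generator")
        for _ in range(critic_steps):
            critic_step()    # generator frozen, critic sees the new fakes
            schedule.append("critic")
    return schedule


# Stand-ins for real optimizer steps, just counting calls.
calls = {"generator": 0, "critic": 0}


def fake_gen_step():
    calls["generator"] += 1


def fake_critic_step():
    calls["critic"] += 1


schedule = alternate_training(
    fake_gen_step, fake_critic_step, rounds=2, gen_steps=1, critic_steps=2
)
```

In a real setup each step function would run a forward and backward pass on a batch for the model whose turn it is.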
This is the fastai way - they pretrain the generator and pretrain the critic. That's because the difficulty of training a GAN is at the start, where it can take a very long time. If you don't have a pretrained generator and critic, you're trying to train the generator against a critic that is really bad, etc.
Therefore, if you can find a way to generate images without using a GAN (like pixel MSE loss) and to discriminate without using a GAN (like training a classifier on the predictions of that first generator), you can make a lot of progress.
So let's create the critic. We need 2 folders - 1 with the generated images, the other with the real images. So we have to save our generated images.
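A small sketch of the two-folder layout and how it turns into labeled critic data. The folder names `generated` and `real`, the file names, and `collect_critic_items` are placeholders chosen for this example; the files written here are empty stand-ins for saved images.

```python
import tempfile
from pathlib import Path


def collect_critic_items(root, classes=("generated", "real")):
    """Return (path, label) pairs, using the folder name as the label."""
    items = []
    for label in classes:
        for path in sorted((Path(root) / label).iterdir()):
            if path.is_file():
                items.append((path, label))
    return items


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for label in ("generated", "real"):
        (root / label).mkdir()
        for i in range(2):
            # Placeholder files; in practice these are the saved predictions
            # of the pretrained generator and the original high-res images.
            (root / label / f"img_{i}.png").write_bytes(b"")
    items = collect_critic_items(root)
    labels = [label for _, label in items]
```

With the images on disk like this, the critic is just an image classifier whose two classes are the two folder names.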
To create a GAN critic you have to wrap its loss function with AdaptiveLoss(nn.BCEWithLogitsLoss()). = binary cross entropy with lo..