Documenting the process behind making a GAN based on Haeckel’s lithographs for ‘Kunstformen der Natur’
There are over 100 plates across the 2 volumes, and each plate contains between 3 and 15 organic structures, many of them based on the microscopic mineral skeletons of protozoa. I thought they’d make a great base for a GAN, but only if each organism could be isolated.
Creating the Haeckel Dataset
There’s no real way around this: a dataset like this has to be hand curated. Each organism was cut from its surroundings, cleaned up and placed onto a black background. Because some of the plates are colour against white, I either desaturated them and then hand cut each organism out, or else desaturated and inverted them. I eventually ended up with a set of 1000 images at 1024 x 1024.
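The desaturate-and-invert step can be sketched in a few lines. This is only an illustration in plain Python, not the tooling actually used (the cutting out was done by hand). The function names, the Rec. 601 grey weights and the tiny sample "plate" are all assumptions made for the example.

```python
def desaturate(pixel):
    """Collapse an (R, G, B) pixel to a single grey level in 0-255."""
    r, g, b = pixel
    # Rec. 601 luma weights: one common choice, assumed here for illustration.
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def invert(grey):
    """Flip a grey level so white paper becomes a black background."""
    return 255 - grey


def prepare(image, on_white):
    """Desaturate every pixel; also invert if the plate is drawn on white.

    `image` is a list of rows, each row a list of (R, G, B) tuples.
    Returns a list of rows of grey levels.
    """
    prepared = []
    for row in image:
        grey_row = [desaturate(px) for px in row]
        if on_white:
            grey_row = [invert(g) for g in grey_row]
        prepared.append(grey_row)
    return prepared


# Hypothetical 2 x 2 sample: white paper with one dark green pixel of "organism".
plate = [
    [(255, 255, 255), (40, 60, 20)],
    [(255, 255, 255), (255, 255, 255)],
]
result = prepare(plate, on_white=True)
```

After inversion the paper becomes black and the organism becomes light, which matches the black-background format used for the rest of the set. On real files an imaging library would do the same job far faster.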
The GAN model
I used StyleGAN2-ADA running on 1 x 16GB GPU over at Paperspace. Initially I was over-optimistic and tried to train from scratch on the 1000 image dataset. I’d had some good results from a previous 2000-ish image dataset and just thought, why the hell not? I was wrong, and after 70-ish ticks it went into mode collapse.
What you can do in this situation is find the same model trained on a larger but similar enough set of data, then use the weights from that model to prime your own. So I went hunting and found a set of pre-trained StyleGAN2 models. I went down a few dead ends in there until I eventually stumbled upon my personal nightmare data: Trypophobia. I don’t even know why this model exists or what purpose it serves, but Trypophobia refers to a fear of too many holes close together. A bit of weird info on Trypophobia here.
It was the closest set of data I could find, so I started the model running on this with limited optimism.
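For anyone wanting to try the same trick: in the official StyleGAN2-ADA code, priming from existing weights is done by passing a pickle to the --resume option of train.py. Below is a rough sketch of how that command might be assembled. All the paths are hypothetical placeholders, not the ones I used, and the helper function is made up for the example.

```python
import shlex


def build_train_command(data_path, out_dir, resume_pkl, gpus=1):
    """Assemble a StyleGAN2-ADA train.py call that starts from existing weights."""
    return [
        "python", "train.py",
        f"--outdir={out_dir}",      # where snapshots and sample grids get written
        f"--data={data_path}",      # the prepared training dataset
        f"--gpus={gpus}",           # a single GPU in this setup
        f"--resume={resume_pkl}",   # pre-trained network pickle to prime from
    ]


# Hypothetical paths, for illustration only.
cmd = build_train_command(
    data_path="datasets/haeckel",
    out_dir="training-runs",
    resume_pkl="pretrained/trypophobia.pkl",
)
print(shlex.join(cmd))
```

The printed line can be pasted into a terminal inside a checkout of the StyleGAN2-ADA repo. The pre-trained pickle has to match the resolution of the new dataset, otherwise the weights won't load.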
Results on Haeckel Data Pretrained with Trypophobia
So much better, and fast (comparatively). Obviously there’s a bias towards holes close together, which is present in a fair amount of the Haeckel images as well.
Latent Space Interpolation
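Latent space interpolation means picking two latent vectors, walking from one to the other in small steps, and rendering an image at each step so the forms morph into each other. Here is a minimal sketch of the walking part, assuming plain linear interpolation and a 512-dimensional latent (the StyleGAN2 default). The function names are made up for the example, and the generator call that turns each vector into an image is left out.

```python
import random

LATENT_DIM = 512  # StyleGAN2's default z size; assumed here


def lerp(z_a, z_b, t):
    """Linear blend between two latent vectors; t=0 gives z_a, t=1 gives z_b."""
    return [(1 - t) * a + t * b for a, b in zip(z_a, z_b)]


def interpolation_path(z_a, z_b, steps):
    """Return `steps` evenly spaced latent vectors from z_a to z_b inclusive."""
    if steps < 2:
        raise ValueError("need at least 2 steps to include both endpoints")
    return [lerp(z_a, z_b, i / (steps - 1)) for i in range(steps)]


rng = random.Random(0)  # fixed seed so the walk is repeatable
z_start = [rng.gauss(0, 1) for _ in range(LATENT_DIM)]
z_end = [rng.gauss(0, 1) for _ in range(LATENT_DIM)]

# One latent vector per video frame; each would be fed to the generator.
frames = interpolation_path(z_start, z_end, steps=60)
```

Linear blending is the simplest option. Spherical interpolation is a common alternative when the in-between frames look washed out.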
Pretrained Mk 2
I decided to augment the initial pretrain data. What I’d been looking for initially was electron microscopy databases, and I eventually came across something called the NFFA-EUROPE - 100% SEM Dataset. So I took my Trypophobia pretrained model, trained it for a short period on the electron microscopy images and then threw in the Haeckel dataset. What this did was give the results more of a matt grey effect.
Generate a Haeckel
I’ve hosted a version of the Haeckel generator here (a reduced version, as the large pkl file would not work inside Runway, where I deployed it).