D.B. Morris

Synthesizing Comics via Conditional Generative Adversarial Networks

Bachelor thesis (2021) - D.B. Morris (author), Lydia Chen (mentor), Zilong Zhao (mentor), Arie van Deursen (graduation committee member)

The creation of comic illustrations is a complex artistic process resulting in a wide variety of styles, each unique to the artist. Conditional image synthesis refers to the generation of de novo images based on certain preconditions. Applying machine learning to conditionally generate novel comics is an intriguing yet difficult task. This paper aims to answer whether Generative Adversarial Networks (GANs) can be used for conditional comic synthesis. Recent advancements in GANs have increased the capability of image synthesis to hyper-realistic levels. Despite this, the performance of GAN models is almost always assessed on photo-realistic images. To extend experimental knowledge of unconditional GAN performance into the domain of comics, an empirical analysis was performed on the unconditioned generative performance of three cutting-edge GAN architectures: Deep Convolutional GAN (DCGAN), Wasserstein GAN (WGAN), and Stability GAN (SGAN). This analysis shows that the SGAN implementation far outperforms both the DCGAN and WGAN architectures on a dataset of Dilbert comics, achieving a Fréchet Inception Distance (FID) score of 89.1. Due to their relative simplicity, comics are an intriguing candidate for conditional generation: a comic panel can likely be described using a few specific labels (e.g., background and characters). Two conditional networks were created, using the SGAN architecture as a baseline. Multi Class SGAN (MC-SGAN) uses a traditional multi-class conditional approach, while Multi Label SGAN (ML-SGAN) uses a multi-label auxiliary classification approach. Multiple experiments comparing these two networks were performed, amounting to hundreds of hours of training. While the two networks performed similarly on simple conditional tasks, MC-SGAN outperformed ML-SGAN on more complex tasks. MC-SGAN was able to conditionally generate comics based on character and color, with the desired conditions distinguishable in almost all outputs. 
Issues with traditional methods of auxiliary classifier training in the ML-SGAN implementation are additionally identified and discussed.
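
The difference between the multi-class and multi-label conditioning mentioned above can be illustrated with a minimal sketch of how the condition vectors might be encoded. This is not taken from the thesis: the label vocabularies (`CHARACTERS`, `COLORS`) and both helper functions are hypothetical, and the abstract does not state the encoding actually used by MC-SGAN or ML-SGAN. The sketch assumes that the multi-class approach treats every (character, color) combination as one mutually exclusive class, while the multi-label approach gives each label its own independent 0/1 entry.

```python
from itertools import product

# Hypothetical label vocabularies, chosen only for illustration.
CHARACTERS = ["dilbert", "dogbert", "boss"]
COLORS = ["color", "grayscale"]


def multi_class_one_hot(character, color):
    """Multi-class encoding (assumed): each (character, color) pair is one class.

    Exactly one entry of the returned vector is 1.0.
    """
    classes = list(product(CHARACTERS, COLORS))
    index = classes.index((character, color))
    return [1.0 if i == index else 0.0 for i in range(len(classes))]


def multi_label_vector(characters, color):
    """Multi-label encoding (assumed): one independent 0/1 entry per label.

    Several entries may be 1.0 at once, e.g. two characters in one panel.
    """
    active = set(characters) | {color}
    return [1.0 if name in active else 0.0 for name in CHARACTERS + COLORS]


mc = multi_class_one_hot("dogbert", "color")
ml = multi_label_vector(["dilbert", "boss"], "grayscale")
```

Under these assumptions the multi-class vector grows with the product of the vocabulary sizes (here 3 x 2 = 6 entries), whereas the multi-label vector grows with their sum (3 + 2 = 5 entries) and can describe a panel containing several characters at once.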