stylegan2-pytorch

Python ★ 3.8k updated 1y ago

Simplest working implementation of StyleGAN2, a state-of-the-art generative adversarial network, in PyTorch. Enabling everyone to experience disentanglement.

This repository is a PyTorch implementation of StyleGAN2, a machine learning model that generates realistic images of things that do not exist. StyleGAN2 is well known for producing convincing photographs of imaginary faces, flowers, cities, and hands. The sample images in the README show outputs from models trained on those subjects.

Unlike many deep learning tools that require writing Python code to train, this implementation is designed to work entirely from the command line. You point it at a folder of images with a single command and it trains itself, periodically saving sample images and model checkpoints. No additional code is needed to get started.

Training requires a machine with a GPU and CUDA, NVIDIA's software for running computations on a graphics card. Once training finishes, you can generate new images from the latest checkpoint, or create an interpolation video that smoothly transitions between two randomly chosen points in the model's latent space. A truncation parameter controls the trade-off between image quality and variety: lower values pull outputs toward the "average" image, which tends to improve quality at the cost of variety.
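The interpolation and truncation ideas above can be sketched in a few lines of plain Python. This is a minimal illustration, not the repository's code: the function names (`sample_latent`, `lerp`, `truncate`, `interpolation_path`), the 8-dimensional vectors, and the all-zero average latent are assumptions chosen for the example. In a real model each vector would be fed to the generator to render one video frame.

```python
import random

def sample_latent(dim, rng):
    """Draw one latent vector from a standard normal distribution."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def lerp(a, b, t):
    """Linearly interpolate between vectors a and b (t=0 gives a, t=1 gives b)."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]

def interpolation_path(a, b, steps):
    """Return `steps` evenly spaced latents from a to b, endpoints included."""
    return [lerp(a, b, i / (steps - 1)) for i in range(steps)]

def truncate(w, w_avg, psi):
    """Pull w toward the average latent w_avg.

    psi=1 leaves w unchanged (maximum variety);
    psi=0 collapses every w onto w_avg (no variety).
    """
    return [m + psi * (x - m) for x, m in zip(w, w_avg)]

rng = random.Random(0)          # fixed seed so the example is repeatable
start = sample_latent(8, rng)   # first randomly chosen point
end = sample_latent(8, rng)     # second randomly chosen point

frames = interpolation_path(start, end, steps=5)

# Hypothetical average latent; a trained model would estimate this
# by averaging many mapped latents.
w_avg = [0.0] * 8
truncated_frames = [truncate(f, w_avg, psi=0.5) for f in frames]
```

With `psi=0.5` every latent sits halfway between its original position and the average, which is the sense in which truncation trades variety for more typical outputs.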

The library supports a few additional scenarios. Passing a flag lets training use multiple GPUs on a single machine. If your dataset is small, a differentiable augmentation technique developed in 2020 can improve results with as few as 1,000 to 2,000 images by randomly transforming images during training without those changes leaking into the final outputs. Self-attention can be added at chosen layers of the network to improve generation quality. Transparent PNG images are also supported with a flag.
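To make the augmentation idea concrete, here is a rough sketch of where it sits in training: the same kind of random transform is applied to both real and generated images just before the discriminator sees them, so the discriminator never gets unaugmented images to memorize. Everything here is an assumption for illustration (the function names, the choice of translation as the transform, images as nested lists). In the actual technique the transforms are tensor operations that gradients can flow through back to the generator, which plain Python lists cannot show.

```python
import random

def translate(image, dx, dy):
    """Shift a 2D image (list of rows) by (dx, dy), filling exposed pixels with 0."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx          # source pixel for this output pixel
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = image[sy][sx]
    return out

def augment_batch(batch, prob, max_shift, rng):
    """Translate each image by a random offset with probability `prob`."""
    out = []
    for image in batch:
        if rng.random() < prob:
            dx = rng.randint(-max_shift, max_shift)
            dy = rng.randint(-max_shift, max_shift)
            image = translate(image, dx, dy)
        out.append(image)
    return out

def discriminator_inputs(real, fake, prob, max_shift, rng):
    """Augment BOTH the real and the generated batch before the discriminator."""
    return (augment_batch(real, prob, max_shift, rng),
            augment_batch(fake, prob, max_shift, rng))

rng = random.Random(0)
real_batch = [[[1.0, 2.0], [3.0, 4.0]]]      # one tiny 2x2 "real" image
fake_batch = [[[0.5, 0.5], [0.5, 0.5]]]      # one tiny 2x2 "generated" image
real_in, fake_in = discriminator_inputs(real_batch, fake_batch,
                                        prob=0.25, max_shift=1, rng=rng)
```

Because the generator is only ever judged through the augmented view, and real images are augmented the same way, the transforms do not become something the generator has to reproduce in its outputs.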

GPU memory is the main constraint on image resolution and network size. The README includes guidance on reducing batch size and network capacity to fit training onto smaller GPUs.
