Stable Diffusion on WSL2 with Docker

I’ve been having a lot of fun over the last couple of weeks exploring Stable Diffusion, a deep learning text-to-image AI model. One key difference between it and similar models such as DALL-E and Midjourney is that you can download and use Stable Diffusion on your own machine, offline! In this post, I’ll go over how I set up my computer to work with SD running on WSL2 using Docker.

Please note: this information will likely become outdated quickly, as things are moving very fast in the SD world in terms of development, forks, and requirements. I might make some updates to this post, but I’m not guaranteeing that.

Hardware requirements

Computer hardware, artstation, 4k seed:1867467475 width:512 height:512 steps:50 cfg_scale:7.5 sampler:k_euler_a, upscaled with GoBig

There’s no way around this: you need a good graphics card. A recent Nvidia card with 10GB+ of video RAM is best. I was able to run a low-VRAM-optimized SD fork on a laptop with a GTX 1060, but generating a single image took nearly a minute. Today, I have an RTX 3060 installed in my main desktop, and image generation takes 8-10 seconds with similar settings. (This desktop previously had a GTX 960, so I didn’t even bother trying with that card.)

Some SD forks can work with AMD and Intel GPUs more or less out of the box, but since I’m not using that hardware, I won’t be focusing on those scenarios. Setup for them will be significantly different (likely easier?) because they do not require the installation of Nvidia-specific Docker runtime components. There are also some forks that should work on M1 Macs, but I couldn’t say how well or what it might take to get them running.

You also need plenty of disk space: 30GB+ free is a reasonable minimum. PyTorch, the CUDA runtimes, and the ML models each take up multiple gigabytes, and so does a Linux installation on top of WSL2.
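
If you want to quickly check how much space is available inside WSL2, a standard df invocation does the trick:

# Show free space on the filesystem backing your home directory
df -h ~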

Software choices/requirements

Synthwave illustration of floppy disks and CD-ROMs, artstation, 4k seed:4034571955 width:512 height:512 steps:50 cfg_scale:7.5 sampler:k_euler_a, upscaled with GoBig

I wanted to run SD in Docker because setting it up involves a lot of configuration and software installation, and even with Anaconda isolating parts of the environment, there’s a good chance of messing up the host system. Using Docker limits the necessary host changes to just supporting Docker itself, which is fine by me, since I already like using Docker for a lot of experimentation.

Since I run Windows 11, I’ve been enjoying WSL2 for its ever-increasing ability to run almost any Linux-based workload. It even has support for systemd in preview now. One unusual thing I like to do, though, is use Docker directly inside WSL2 instead of installing Docker Desktop: that way, I have full control over the configuration, and everything stays completely isolated inside the WSL2 environment.

As an aside, the parts of this post that are not WSL2-specific should work just fine in a native Linux environment, assuming the necessary graphics drivers are installed and configured correctly.

Prerequisites

Illustration of a checklist, artstation, 4k seed:501404732 width:512 height:512 steps:50 cfg_scale:7.5 sampler:k_euler_a, upscaled with GoBig

You should have a system configured for WSL2 with a clean install of Ubuntu 22.04. While Docker works with most other distros, this post assumes Ubuntu 22.04 inside WSL2. You should also have a recent Nvidia Game Ready Driver installed in Windows. Obviously, an Nvidia graphics card should be present as well; as mentioned earlier, a modern one with 10GB+ of VRAM will work much better than older or lower-VRAM cards.
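
If you haven’t set up the distro yet, something like the following from an elevated PowerShell or Command Prompt should do it (assuming WSL itself is available on your Windows build):

# In Windows PowerShell: install Ubuntu 22.04 as a WSL2 distro
wsl --install -d Ubuntu-22.04

# Confirm the distro is running under WSL version 2
wsl -l -v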

Step 1: installing Docker inside Ubuntu 22.04

Illustration of a shipping dock, artstation, 4k seed:2392583295 width:512 height:512 steps:50 cfg_scale:7.5 sampler:k_euler_a, upscaled with GoBig

Because of some kernel weirdness in WSL2, modern versions of Ubuntu and Docker can’t quite deal with the default nftables-based iptables backend. Per this helpful blog post, you should run sudo update-alternatives --config iptables and select iptables-legacy before starting, so that Docker won’t have trouble configuring its firewall rules.
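
For reference, here’s the interactive command, plus a non-interactive equivalent (the alternative path shown is the standard one on Ubuntu, but verify it on your system):

# Switch iptables to the legacy backend (interactive prompt)
sudo update-alternatives --config iptables

# Or select it non-interactively
sudo update-alternatives --set iptables /usr/sbin/iptables-legacy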

Now, we can pretty much just follow the official Docker guide for installing Docker CE:

# Get the necessary components for installation
sudo apt-get update
sudo apt-get install ca-certificates curl gnupg lsb-release

# Register Docker's GPG key
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg

# Add Docker's apt repo
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# Install Docker CE and Docker Compose
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-compose-plugin

# Add your current user to the `docker` group
sudo usermod -aG docker $USER

Restart your terminal (or run newgrp docker) so the group change from that last usermod command takes effect, and a basic Docker installation should be up and running. Now, because WSL2 doesn’t use systemd by default (at the time of writing), launching the Docker daemon isn’t quite as straightforward as enabling it for startup via systemctl. Instead, you must launch it manually every time WSL2 starts up:

# Launch Docker
sudo /etc/init.d/docker start

# Verify its status
sudo /etc/init.d/docker status
# Output should be:
# * Docker is running
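
If you’d rather not type that every time, one common workaround (a convenience hack, not part of any official guide) is to launch the daemon from your shell profile:

# Append to ~/.bashrc: start Docker if it isn't already running
if ! sudo /etc/init.d/docker status > /dev/null 2>&1; then
    sudo /etc/init.d/docker start
fi

Keep in mind this will prompt for your sudo password in new terminals unless you add a passwordless sudo rule for the Docker init script.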

Step 2: installing Nvidia Container Toolkit

Illustration of stacked graphics cards, artstation, 4k seed:168331480 width:512 height:512 steps:50 cfg_scale:7.5 sampler:k_euler_a, upscaled with GoBig

To get Docker to work with CUDA, you need to install Nvidia’s special runtime for Docker. Again, we’ll mostly be following the official guide for doing this:

# Add Nvidia's apt repo
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
      && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
      && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
            sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
            sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Install the container toolkit
sudo apt-get update
sudo apt-get install nvidia-docker2

At this point, it’s a good idea to modify the Docker configuration to enable log rotation. Normally, that would be the first post-installation step for Docker, but since the Nvidia Container Toolkit modifies the same file, we can apply both changes with a single daemon restart.

The /etc/docker/daemon.json file should contain the following configuration data:

{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}

To add log rotation, we can modify it like so:

{
    "log-driver": "json-file",
    "log-opts": {
        "max-size": "10m",
        "max-file": "3"
    },
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
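
Since a malformed daemon.json will prevent the Docker daemon from starting at all, it can’t hurt to validate the file after editing (just a generic JSON syntax check):

# Validate the JSON; prints the parsed file on success, an error otherwise
python3 -m json.tool /etc/docker/daemon.json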

Next, it’s a good idea to verify that everything works correctly:

# Restart the Docker daemon for configuration changes to take effect
sudo /etc/init.d/docker restart

# Verify the Nvidia CUDA functionality
docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi

If the last command’s output contains information about the current graphics driver and CUDA versions, then everything is good to go!
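
If that container test fails, it can help to first confirm that the GPU is visible to WSL2 itself, since the Windows driver exposes nvidia-smi inside WSL2:

# Check GPU visibility from inside WSL2, with no container involved
nvidia-smi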

Step 3: choosing a Stable Diffusion fork

Illustration of a road with multiple paths, artstation, 4k seed:1545215604 width:512 height:512 steps:50 cfg_scale:7.5 sampler:k_euler_a, upscaled with GoBig

Here’s where things get complicated. It’s a bit of a wild west right now in terms of choices for how to run SD and what features you might get. The original repository is very basic, with only command-line tools available. A low-VRAM-optimized version with a basic web UI came out shortly after. Today, people seem to use either the AUTOMATIC1111 fork, which has neat scripting abilities, or the hlky fork, which streamlines both initial setup and day-to-day usage of SD.

We’ll be using the hlky fork, since it has a very nice feature of automatically downloading the model files if you don’t already have them.

EDIT: After encountering unexpected breakage in the hlky fork and seeing some difficulties in getting other forks working, I decided to create a meta-fork, which allows for somewhat more controlled ways to run different forks. The text below, however, does not reflect this update…

Step 4: running Stable Diffusion

Illustration of running on a track, artstation, 4k seed:425084975 width:512 height:512 steps:50 cfg_scale:7.5 sampler:k_euler_a, upscaled with GoBig

The basic steps for running the hlky fork (at time of writing) are:

  1. Clone the repo
  2. Create and edit (if necessary) the .env_docker file
  3. Run Docker Compose

Pretty simple, right? Let’s do this!

# Go to the home directory
cd

# Clone the repo
git clone https://github.com/sd-webui/stable-diffusion-webui.git

# Note: if the above command fails, you might need to `sudo apt install git`

# Create the env file
cd stable-diffusion-webui
cp .env_docker_sample .env_docker

# Edit the `.env_docker` file if needed with vim, nano, or your preferred editor
nano .env_docker

# IMPORTANT: if you have a low VRAM GPU, you MUST edit this file and change
# `WEBUI_ARGS=` to `WEBUI_ARGS=--optimized`

# It's a good idea to make Python output text without buffering:
echo PYTHONUNBUFFERED=1 >> .env_docker

# Launch it!
docker compose up --build

The first time you run docker compose up, it will take a long time. Not only will Docker be downloading Nvidia’s multi-gigabyte CUDA runtime images, but the hlky SD fork will then be downloading all of the ML models that it needs.
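
If you’d prefer not to keep a terminal tied up during that initial run, Compose can run detached while you follow the progress separately (standard Compose flags, nothing fork-specific):

# Build and start in the background
docker compose up --build -d

# Follow the container logs to watch the downloads
docker compose logs -f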

Once the models have been downloaded, set up, and loaded, you should see something like this in your terminal output:

sd-webui  | Running on local URL:  http://0.0.0.0:7860
sd-webui  |
sd-webui  | To create a public link, set `share=True` in `launch()`.

At this point, you should be able to navigate to http://localhost:7860 in your favorite web browser (WSL2 automatically forwards localhost ports to Windows) and see something like this:

Screenshot of Gradio using the hlky fork of Stable Diffusion

Step 5: having fun!

Illustration of triumphantly finishing a marathon, artstation, 4k seed:91719444 width:512 height:512 steps:50 cfg_scale:7.5 sampler:k_euler_a, upscaled with GoBig

Now that you have SD running, try some inputs! For inspiration, you can check out Lexica, which has lots of images together with their prompts. When you’ve gotten the hang of it, you can also play around with img2img, which can be used to modify existing images based on text prompts, as well as “the lab”, which allows you to upscale images.

Trying out other forks of Stable Diffusion is also interesting. While setting them up can be a challenge at times, they often have unique and fascinating features. Go ahead and explore!

P.S. The featured image, like the other images in this post, was generated using Stable Diffusion. The following parameters were used before upscaling with GoBig:

Exploring the universe, artstation, 4k seed:4027549823 width:512 height:512 steps:50 cfg_scale:7.5 sampler:k_euler_a