stable-diffusion-training-tutorial/AWS.md at main · jehna/stable-diffusion-training-tutorial

This is a tutorial to get teach you how to teach a custom face to Stable Diffusion using AWS P3 instance.

Note that AWS has long wait times for P3 instances, and then instance types are not that cost effective for large models as with GCP. Because of this, we need to use a different repo for training on AWS that is compatible with running the model with 16gb of memory per GPU.

Prerequisites

AWS account with a GPU instance limit >= 1
HuggingFace account's API key (create a free account if you don't have one)

Setup the VM

(These steps you'll only need to do once)

Create a EC2 (AWS) instance.

Choose an OS with pre-installed PyTorch like "Deep Learning AMI GPU PyTorch 1.12.1"
Choose an instance with at least 1 GPU with 16gb memory (like p3)
Choose a storage size of at least 100gb

SSH into the instance
Clone ShivamShrirao/diffusers repo

git clone https://github.com/ShivamShrirao/diffusers.git
cd diffusers/examples/dreambooth

Install required packages

pip3 install git+https://github.com/ShivamShrirao/diffusers.git
pip3 install -U -r requirements.txt
pip3 install bitsandbytes
accelerate config # No distributed training, no deepspeed, no fp16/bf16

Create an image for training images and output images

mkdir training-images
mkdir output-model
mkdir class-images
mkdir output-images

Create a script for inference and converting to .ckpt

cat << EOF > inference.py
from diffusers import StableDiffusionPipeline
import torch
import time
import sys
import re

model_id = "output-model"
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

# load prompt from arguments
prompt = sys.argv[1]
image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]

image.save("output-images/{}_{}.png".format(re.sub(r'\W+', '_', prompt).lower(), int(time.time())))
EOF
wget -q https://github.com/ShivamShrirao/diffusers/raw/main/scripts/convert_diffusers_to_original_stable_diffusion.py

You're done! Log out and now you can start training the model with your own images

Train the model

Crop your training images (5-8 pictures) to squares with 512x512 pixels dimensions, with the face in the center
Move your training images into the VM

scp training-images/* ec2-user@<VM_IP>:~/diffusers/examples/dreambooth/training-images/

Make sure you don't show teeth in the training images — Stable Diffusion is not that good in drawing teeth and having teeth in the training images will make Stable Diffusion to draw teeth in the final image.

SSH into the instance
Train the model

cd diffusers/examples/dreambooth
accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path='CompVis/stable-diffusion-v1-4' \
  --instance_data_dir=training-images \
  --class_data_dir=class-images \
  --output_dir=output-model \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="sks <subject>" \
  --class_prompt="a photo of <subject>" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=2 --gradient_checkpointing \
  --use_8bit_adam \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800

Run inference

python3 inference.py "a photo of sks <subject>"

Load images back to your own machine

scp ec2-user@<VM_IP>:~/diffusers/examples/dreambooth/output-images/* ./

Convert to .ckpt

Run the conversion script

python3 convert_diffusers_to_original_stable_diffusion.py --model_path output-model  --checkpoint_path output-model/latest.ckpt

Download the model from the VM

scp ec2-user@<VM_IP>:~/diffusers/examples/dreambooth/output-model/latest.ckpt ./

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Prerequisites

Setup the VM

Train the model

Convert to .ckpt

FilesExpand file tree

AWS.md

Latest commit

History

AWS.md

File metadata and controls

Prerequisites

Setup the VM

Train the model

Convert to .ckpt