# DreamBooth fine-tuning with HRA

This guide demonstrates how to use the Householder reflection adaptation (HRA) method to fine-tune DreamBooth with the `stabilityai/stable-diffusion-2-1` model.
HRA provides a new perspective connecting LoRA to OFT and achieves encouraging performance in various downstream tasks. HRA adapts a pre-trained model by multiplying each frozen weight matrix with a chain of r learnable Householder reflections (HRs). HRA can be interpreted as either an OFT adapter or an adaptive LoRA. Consequently, it harnesses the advantages of both strategies, reducing parameters and computation costs while penalizing the loss of pre-training knowledge. For further details on HRA, please consult the original HRA paper.
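The core idea can be illustrated with a small, self-contained sketch (not PEFT's actual implementation): each Householder reflection H(u) = I − 2uuᵀ/(uᵀu) is an orthogonal matrix, so right-multiplying a frozen weight matrix by a chain of r reflections adapts it without changing the norms of its rows:

```python
# Illustrative sketch, not PEFT's implementation: adapt a frozen weight
# matrix W by right-multiplying it with a chain of r Householder
# reflections H(u) = I - 2 u u^T / (u^T u).
import random

def matmul(A, B):
    """Naive dense matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def householder(u):
    """Return H = I - 2 u u^T / (u^T u) as a dense matrix."""
    s = sum(x * x for x in u)
    n = len(u)
    return [[(1.0 if i == j else 0.0) - 2.0 * u[i] * u[j] / s
             for j in range(n)] for i in range(n)]

random.seed(0)
d, r = 4, 2                      # hidden size and number of reflections
W = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]  # "frozen" weight

# A chain of r reflection vectors (random stand-ins for learnable parameters).
us = [[random.gauss(0, 1) for _ in range(d)] for _ in range(r)]
adapted = W
for u in us:
    adapted = matmul(adapted, householder(u))

# Each reflection is orthogonal, so the chain preserves every row's norm:
# the adapted weight stays close to the pre-trained one in this sense.
norm = lambda row: sum(x * x for x in row) ** 0.5
for before, after in zip(W, adapted):
    assert abs(norm(before) - norm(after)) < 1e-9
```

The norm-preservation check at the end is the OFT-style property that helps HRA retain pre-training knowledge, while the chain length r plays the role of the rank-like capacity knob.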
In this guide we provide a DreamBooth fine-tuning script that is available in PEFT's GitHub repo examples. This implementation is adapted from PEFT's `boft_dreambooth` example. You can try it out and fine-tune it on your own custom images.
## Set up your environment
Start by cloning the PEFT repository:
```bash
git clone --recursive https://github.com/huggingface/peft
```
Navigate to the directory containing the training scripts for fine-tuning Dreambooth with HRA:
```bash
cd peft/examples/hra_dreambooth
```
Next, install PEFT and all the required libraries. At the time of writing this guide, we recommend installing PEFT from source. The following environment setup should work on A100 and H100 GPUs:
```bash
conda create --name peft python=3.10
conda activate peft
conda install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 pytorch-cuda=11.8 -c pytorch -c nvidia
conda install xformers -c xformers
pip install -r requirements.txt
pip install git+https://github.com/huggingface/peft
```
## Download the data
The `dreambooth` dataset is automatically cloned into the following structure when you run the training script:
```text
hra_dreambooth
├── data
│   └── dreambooth
│       └── dataset
│           ├── backpack
│           └── backpack_dog
│           ...
```
You can also put your own custom images into `hra_dreambooth/data/dreambooth/dataset`.
## Fine-tune DreamBooth with HRA
```bash
class_idx=0
bash ./train_dreambooth.sh $class_idx
```

where `$class_idx` corresponds to different subjects, ranging from 0 to 29.
Launch the training script with `accelerate` and pass hyperparameters, as well as HRA-specific arguments to it, such as:
- `use_hra`: enables HRA in the training script.
- `hra_r`: the number of HRs (i.e., r) across different layers, expressed as an `int`. As r increases, the number of trainable parameters increases, which generally leads to improved performance, but also to higher memory consumption and longer computation times. Therefore, r is usually set to 8. Note: set r to an even number to avoid potential issues during initialization.
- `hra_apply_GS`: applies Gram-Schmidt orthogonalization. Default is `false`.
- `hra_bias`: specifies whether the `bias` parameters should be trained. Can be `none`, `all` or `hra_only`.
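If you prefer PEFT's Python API over the shell script, the training-script flags above map onto the fields of PEFT's `HRAConfig`. The sketch below shows this correspondence; the `target_modules` list is an illustrative example, and field names should be verified against your installed PEFT version:

```python
from peft import HRAConfig, get_peft_model

# Sketch: the shell flags above expressed as an HRAConfig.
# target_modules below is an example choice of attention projections.
config = HRAConfig(
    r=8,             # number of Householder reflections (hra_r); keep it even
    apply_GS=False,  # Gram-Schmidt orthogonalization (hra_apply_GS)
    target_modules=["to_q", "to_k", "to_v"],
    bias="none",     # "none", "all" or "hra_only" (hra_bias)
)
# model = get_peft_model(base_model, config)
```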
If you are running this script on Windows, you may need to set `--num_dataloader_workers` to 0.
To learn more about DreamBooth fine-tuning with prior-preserving loss, check out the Diffusers documentation.
## Generate images with the fine-tuned model
To generate images with the fine-tuned model, run the Jupyter notebook `dreambooth_inference.ipynb` under `./examples/hra_dreambooth` for visualization.
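The notebook's core steps can be sketched roughly as follows. This is an assumption-laden outline, not the notebook verbatim: the checkpoint path and the prompt token are placeholders, and the notebook itself is the authoritative reference:

```python
import torch
from diffusers import StableDiffusionPipeline
from peft import PeftModel

# Sketch: load the base pipeline, then wrap its UNet with the trained
# HRA weights. "path/to/checkpoint" is a placeholder for the directory
# written by train_dreambooth.sh.
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
).to("cuda")
pipe.unet = PeftModel.from_pretrained(pipe.unet, "path/to/checkpoint")

# Example prompt; the rare-token identifier depends on your training run.
image = pipe("a photo of sks backpack").images[0]
image.save("hra_dreambooth_sample.png")
```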