Style Transfer with Dreambooth
Neural networks offer powerful ways to modify, enhance, and generate images. One of these is Dreambooth, which lets users fine-tune a model for a particular style. This article walks through training your own model using artifactory.ai.
The Drow Ranger character from Dota 2 was the basis for this tuning. First, we will show what kind of images the original model produces; then you will see the output of the trained model, which demonstrates an enormous improvement over the original.
At this stage, we create a new copy of the DreamShaper neural network (a set of weight files totaling ~4 GB, based on Stable Diffusion 1.5) and train it for a particular style. Follow these steps:
Proceed to the Dreambooth tab
In the Model section, go to the Create tab
Create an arbitrary Name for your model
Select the DreamShaper inside of the Source Checkpoint dropdown
Press the Create Model button
You can check the processing progress in the Logs tab. While the copy is being created, we can upload the training images to the server.
It is worth starting with a small dataset; in our case, 1 picture and 150 training steps. If the result is a good-quality image, you can add more images, going from 1 to 4, from 4 to 10, and so on. The uploaded images should look as different from each other as possible while staying in the desired style. One last thing to mention: the dataset must contain square (1:1) images, e.g. 1024x1024.
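Since the dataset must consist of square images, non-square source pictures need a center crop before resizing. Below is a minimal stdlib-only sketch that computes the largest centered square crop box; the actual cropping and resizing to 1024x1024 would be done with an image library such as Pillow (assumed, not part of the tutorial).

```python
def square_crop_box(width, height):
    """Compute the largest centered square crop for a width x height image.

    Returns (left, upper, right, lower), the box format accepted by
    Pillow's Image.crop (Pillow itself is an assumed dependency for the
    actual crop and resize to 1024x1024).
    """
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)

# Example: a 1920x1080 screenshot crops to a centered 1080x1080 square.
print(square_crop_box(1920, 1080))  # (420, 0, 1500, 1080)
```

A square input is returned unchanged, so the helper can be applied to every file in the dataset folder without special-casing.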
Go to the Files tab
Create New Folder
Set Folder name
Press Create button
Now the folder has been successfully created, and the last thing to do is to upload the selected files:
Click the Upload button
Select the desired File
Upload files
The most important part of the process is setting the learning parameters. Here we set the token used to identify the style and the number of steps.
Now we return to the WebUI section. As you can see, our model has already been successfully generated.
In the Model section, we can now go to the Select tab and check that the correct one has been selected. The next step is to set the prompt responsible for the style it will learn:
Select the Concepts tab
Go to Directories
Type “mydataset” into Dataset Directory
Go to Training Prompts
Type “mskl” into Instance Prompt Note: you can use any rare combination of characters. In our case it is "mskl". It indicates your style for the model.
Go to Sample Prompts
Type “mskl character, monotonic background” into Sample Image Prompt
Note: this prompt will generate test images during the training phase so that we can do a quality check.
This is the hardest part, so go through the steps carefully. The Training Steps Per Image (Epochs) parameter determines the total number of training steps. If the dataset contains one image, a value of 150 results in 150 training steps; with 10 images, 1,500 training steps are performed. This value affects how thoroughly the model learns from the given dataset.
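The relationship between the Epochs setting and the total step count is a simple multiplication, which the following illustrative snippet makes explicit (the function name is ours, not part of the Dreambooth UI):

```python
def total_training_steps(epochs_per_image, num_images):
    """Total Dreambooth training steps: the per-image step count
    multiplied by the number of images in the dataset."""
    return epochs_per_image * num_images

print(total_training_steps(150, 1))   # 150
print(total_training_steps(150, 10))  # 1500
```

This is why growing the dataset without lowering the Epochs value makes training proportionally longer.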
Proceed to Settings
Set Training Steps Per Image (Epochs) to 150
Remove the checkbox from Gradient Checkpointing
Type the Sanity Sample Prompt in the Image Processing section
Type mskl, pencil art
Note: this step is critical. Without it, the model will simply memorize the images from the training dataset.
Scroll down, and select the Advanced dropdown to perform the following actions:
Check the Use Ema checkbox
Select bf16 in Mixed Precision
Set xformers in Memory Attention inside the Tuning section
The trained model is compiled into a .ckpt / .safetensors file located in the Models/Stable-diffusion folder. The file contains all the weights of the neural network used to generate the images.
Proceed to Saving
Activate the Half model checkbox
Check the Generate a .ckpt file when training is canceled checkbox
Check the Generate a .ckpt when saving checkbox
This section describes how to configure the generation of pictures during training. The current step allows us to control the quality of the training.
Proceed to Generate
Choose the Image Generation Library inside Class Generation Schedulers.
Select A1111 txt2img (Euler a)
Save settings
Press the Train button and wait for training to complete.
After the training, you can select the file with the trained model in the upper left corner and use it to generate images.
To download the generated model, do the following actions:
Go to the Models section
Locate and proceed to the Stable-diffusion folder
Find mskl_150.safetensors file
Select it, then press Download (in the upper right corner of the screen)
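If you train several variants, the file you want is usually the most recently written checkpoint in the Models/Stable-diffusion folder. As a convenience, here is a small stdlib-only sketch (our own helper, not part of the WebUI) that finds it:

```python
from pathlib import Path

def latest_checkpoint(model_dir):
    """Return the most recently modified .ckpt / .safetensors file
    in model_dir (assumed to be your Models/Stable-diffusion folder),
    or None if no checkpoint files are present."""
    folder = Path(model_dir)
    files = [p for ext in ("*.ckpt", "*.safetensors") for p in folder.glob(ext)]
    return max(files, key=lambda p: p.stat().st_mtime, default=None)
```

For example, after the run above, `latest_checkpoint("Models/Stable-diffusion")` would point at the freshly written mskl_150.safetensors.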
Let's compare the results produced by the original model and the trained version on the same prompt.
We first tested the prompt with the standard DreamShaper model.
Prompt: hero character by world of warcraft, in the style of light black and dark azure, realistic chiaroscuro lighting, frances macdonald, dynamic angles, uhd image, light silver and dark purple, contoured shading
If you use the tuned model, you will get a result like this: a mix of WoW and Dota 2 (Drow Ranger) styles. The (mskl:0.1) parameter applies the trained style only lightly.
Prompt: (mskl:0.1), hero character by world of warcraft, in the style of light black and dark azure, realistic chiaroscuro lighting, frances macdonald, dynamic angles, uhd image, light silver and dark purple, contoured shading
Setting the processing parameter (mskl:1) will further adjust the image to the desired style.
Prompt: (mskl:1), hero character by world of warcraft, in the style of light black and dark azure, realistic chiaroscuro lighting, frances macdonald, dynamic angles, uhd image, light silver and dark purple, contoured shading
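The (token:weight) notation in the prompts above is the attention-weight syntax used by A1111-style WebUIs. As a simplified sketch (the real parser also handles nesting and bare (token) emphasis, which we ignore here), extracting the explicit weights could look like this:

```python
import re

def parse_weighted_tokens(prompt):
    """Extract (token, weight) pairs written as (token:weight).

    A simplified illustration of A1111-style prompt weighting: only
    explicit numeric weights are matched; nested or bare parentheses
    are not handled.
    """
    return [(t.strip(), float(w))
            for t, w in re.findall(r"\(([^:()]+):([\d.]+)\)", prompt)]

print(parse_weighted_tokens("(mskl:0.1), hero character"))  # [('mskl', 0.1)]
```

Raising the weight from 0.1 to 1 tells the sampler to emphasize the "mskl" token more strongly, which is why the last image matches the trained style most closely.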