Z Image Turbo is the distilled version of Z Image and light enough to be trained on average consumer-grade GPUs. After training multiple LoRAs ourselves, we are here to share the sweet spot.
We just need to make sure a few parameters are selected correctly; the rest works the same as any normal LoRA training. Here, we will be using AI-Toolkit, a well-maintained web UI for training a wide range of diffusion-based models.
Requirements
1. NVIDIA RTX GPU with at least 16 GB VRAM (without memory offloading), or 10-12 GB VRAM (with offloading)
2. Operating System- Windows/Linux
3. Python 3.10 or newer, Git, PyTorch
4. Node.js installed
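Optionally, you can verify these prerequisites before installing with a short Python snippet (standard library only; the tool names are just the ones listed above):

# pre-flight check for the tools the installer expects on PATH
import shutil
import sys

print("Python 3.10+:", sys.version_info >= (3, 10))
for tool in ("git", "node", "npm"):
    print(f"{tool}:", shutil.which(tool) or "NOT FOUND")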
Installing AI Toolkit
Before we start training, we need to set up the environment on our Windows or Linux system. Install on any drive you like. If you already have AI-Toolkit installed, you can skip this step.
If AI-Toolkit is already installed, just update it: move into the root folder (ai-toolkit), open a command prompt, and run git pull.
1. New users need to install the AI-Toolkit UI. Open a terminal and use the following commands.
(a) For Windows:
Use the AI-Toolkit automatic installation .bat setup file from the GitHub repository. It handles updates and downloads all the required components (Python, CUDA, Git, Node.js, etc.) automatically.
Just download the AI-Toolkit-Easy-Install.bat file and double-click it to start the installation. Afterwards it will open AI-Toolkit in your browser at http://localhost:8675.
(b) For Linux:
-Clone AI toolkit repo:
git clone https://github.com/ostris/ai-toolkit.git
-Move into folder:
cd ai-toolkit
-Create virtual environment:
python3 -m venv venv
-Activate virtual env:
source venv/bin/activate
-Install torch:
pip3 install --no-cache-dir torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cu126
-Install the remaining dependencies:
pip3 install -r requirements.txt
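Optionally, run a quick sanity check to confirm the CUDA build of PyTorch can see your GPU before launching the UI (plain torch calls, nothing AI-Toolkit specific):

# confirm the CUDA build of PyTorch detects the GPU
import torch

print(torch.__version__)            # expect 2.7.0+cu126
print(torch.cuda.is_available())    # should print True
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB VRAM")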
2. After installation, move into the ui folder and execute the following commands:
cd ui
npm run build_and_start
3. Open the AI-Toolkit UI in your browser at the following address:
http://localhost:8675
Training process
You can't just train this model directly, as it is the distilled variant and not the base one. Normal diffusion models need CFG and roughly 20-50 steps to get good results, and training directly on a distilled model tends to break that distillation: the model loses its original capabilities and does not properly learn the new data. In simple words, the model starts producing artifacts in the end results.
So Ostris built a de-distillation training adapter that assists the training process without breaking the model's distillation. You do not need to do anything new; it is embedded inside AI-Toolkit, and you just have to select the related parameters. With 5,000-20,000 steps you can train a style LoRA, character LoRA, etc. We are using an NVIDIA RTX 4090 with 24 GB VRAM.
1. Preparing Dataset
You can skip this step if you already know AI-Toolkit data preparation. Here, we want to train a character LoRA, so create three new folders: before, after, and test.
(a) Before - Save your training images here, each with a matching .txt caption file. Around 15-25 images is great for a LoRA. Write a caption for each image and store it in a .txt file with the same filename as the image (whether .png or .jpg). For example:
image1.png >> image1.txt
image2.png >> image2.txt
image3.png >> image3.txt
and so on.
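For larger folders, a small script can catch missing or empty captions before you upload anything. A minimal Python sketch, assuming the before folder layout described above:

# check every image in "before" has a matching, non-empty .txt caption
from pathlib import Path

dataset = Path("before")
for img in sorted(dataset.iterdir()):
    if img.suffix.lower() not in (".png", ".jpg", ".jpeg"):
        continue
    caption = img.with_suffix(".txt")
    if not caption.exists():
        print(f"missing caption: {caption.name}")
    elif not caption.read_text(encoding="utf-8").strip():
        print(f"empty caption: {caption.name}")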
You can do auto-captioning using LLMs (GPT, Gemini) to save time on bulk images. Keep the captions simple and relevant. Do not go for overly detailed, long captions, as these can disturb your end result.
Make sure you add the trigger word inside each caption, wrapped in [ ] brackets. We are using Kariiina as the trigger word. For example: [Kariiina] dancing at a Halloween party gathering.
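If your auto-captioner left the trigger word out, you can prepend it in bulk. A minimal sketch, assuming the same before folder and the [Kariiina] trigger from above:

# prepend the bracketed trigger word to any caption that lacks it
from pathlib import Path

trigger = "[Kariiina]"
for caption in Path("before").glob("*.txt"):
    text = caption.read_text(encoding="utf-8").strip()
    if trigger not in text:
        caption.write_text(f"{trigger} {text}", encoding="utf-8")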
(b) After - Leave this empty. This is where your LoRA's sample results are generated every 500 steps.
(c) Test - 5-6 input images for testing your LoRA's output during training. This will help you identify where you want to stop training.
Now head over to the AI-Toolkit dashboard. Select the "Datasets" option, click New Dataset, and create three new datasets with the same names as above. Drag and drop the files from your folders into the respective datasets.
2. Setting Parameters
JOB
Training Name- Z_Image_Turbo_kariiina (choose anything relevant)
Trigger Word- kariiina (choose your own trigger word)
MODEL
Here, add the Hugging Face model repo path/ID. This will download the model along with the training adapter.
Model Architecture- Select Z Image Turbo w/ Training Adapter (for normal training), or Z Image De-Turbo De-distilled (for a faster process).
Name or Path- Tongyi-MAI/Z-Image-Turbo (the path from the official repo)
Training Adapter Path- ostris/zimage_turbo_training_adapter or ostris/zimage_turbo_training_adapterV2 (choose adapter V2 for more refined results; the path is from the repo)
Options- Enable Low VRAM if you are using a low-VRAM GPU (10-12 GB)
Layer Offloading- Disabled by default; enable if using low VRAM
QUANTIZATION
Transformer- Float 8 (default); None (for 24 GB or more VRAM)
Text Encoder- Float 8 (default)
TARGET
Target Type- LoRA
Linear rank-32
SAVE
Data Type- BF16; use FP8 for low VRAM. Higher precision reduces hallucinations and improves quality.
Save Every- 250
Max Step Saves to Keep- 4
TRAINING
Learning rate- 0.0001
Steps- 3000 (normal); 5000 (works great with good captioning and a good dataset)
Cache Text Embeddings- Enable if you have low VRAM; this loads/unloads the text encoder from memory.
Timestep Bias- Balanced (for character LoRAs); High Noise (for style LoRAs)
Leave rest as default.
ADVANCED
Differential Guidance- Enable/disable; it's experimental. Disable it if you want to use the traditional training method. This is a newer approach that moves training closer to the actual result, faster.
Diff Guidance Scale- 3 (the default if enabled)
DATASETS
Target Dataset- Choose the relevant dataset created above.
Resolutions- Enable 512 only (for low VRAM); enable 512, 768, and 1024 (for high VRAM)
Leave rest as default.
SAMPLE
Sample Prompts- Add prompts covering different perspectives; you can add any type of prompt. To speed things up, we simply replaced the word "woman" with our trigger word in every prompt.
For example, our trigger word is "kariiina", so the prompt becomes "kariiina with red hair playing chess at the park...". Do this for the rest of the prompts; see the sketch below for a scripted version.
During training, AI-Toolkit generates images from these prompts using your in-progress LoRA. This helps you judge how well the LoRA has trained and where to stop the training process.
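If you have many default prompts, that word swap is easy to script. A minimal sketch (the second prompt is just an illustrative placeholder):

# swap the default subject word for the trigger word across sample prompts
trigger = "kariiina"
default_prompts = [
    "a woman with red hair playing chess at the park",
    "a woman holding a coffee mug in a cozy cafe",
]
sample_prompts = [p.replace("woman", trigger) for p in default_prompts]
print("\n".join(sample_prompts))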
Leave rest as default. Finally select Create Job option available on top right.
3. Start the Training Process
After setting the parameters, a new job is created in the job list. Hit the play button (at the top right) to start execution. This downloads the models (Z Image Turbo, adapters, etc.) from their official Hugging Face repositories. You can watch the real-time status on the dashboard.
Time taken to train a LoRA:
RTX 3090- 1 Hour (approx)
RTX 4090 - 45 minutes (approx)
RTX 5090- 25 minutes (approx)
After training, you can push the trained LoRA to your Hugging Face repository and share it with the community.
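One way to do this is with the huggingface_hub Python library; the repo ID and file path below are placeholders, and you need a Hugging Face account plus an access token (via huggingface-cli login or the HF_TOKEN environment variable):

# upload the trained LoRA weights to a Hugging Face model repo
from huggingface_hub import HfApi

api = HfApi()  # reads your saved token automatically
repo_id = "your-username/z-image-turbo-kariiina-lora"  # placeholder repo ID

api.create_repo(repo_id, repo_type="model", exist_ok=True)
api.upload_file(
    path_or_fileobj="Z_Image_Turbo_kariiina.safetensors",  # placeholder local path
    path_in_repo="Z_Image_Turbo_kariiina.safetensors",
    repo_id=repo_id,
)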
Use and Test the LoRA in ComfyUI
1. After training, put the LoRA inside the ComfyUI/models/loras folder. Download the Z Image Turbo LoRA workflow from our Hugging Face repository.
2. Drag and drop the workflow into ComfyUI.
3. Enter your text prompt with the trigger word.
4. Set the KSampler values:
Sampler- Euler
Steps- 20
CFG- 2 or 3
5. Hit Run to start the generation.





