Z Image Turbo ControlNetUnion (Canny, Depth, Hed, Pose) in ComfyUI


Getting consistent, detailed, and stable results with ControlNet is not always smooth. Sometimes the edges get messy, sometimes details disappear, and sometimes the model simply doesn't follow your control input the way you want. When you are aiming for high-quality image generation, that inconsistency can be frustrating. Alibaba PAI has released Z-Image-Turbo-Fun-ControlNet-Union, a powerful new ControlNet trained to deliver stronger control and higher-quality outputs.

The model was trained for 10,000 steps on a massive dataset of 1 million high-quality images covering general scenes and human-centric content.

pose example

canny example

depth example

HED example

pose example

This model adds 6 control blocks, supports multiple control conditions such as Canny, HED, Depth, Pose, and MLSD, and works just like a standard ControlNet. The team trained it at 1328 image resolution with BFloat16 precision, a batch size of 64, a learning rate of 2e-5, and 0.10 text dropout for robustness.

The real magic here comes from how well the model responds to control_context_scale. By adjusting this value, you can tip the balance between strict control and creative freedom. Based on Alibaba's tests, the sweet spot ranges from 0.65 to 0.80. This range gives you sharper structure, better detail preservation, and noticeably more stable results.
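As a rough mental model (not the actual Z-Image-Turbo or DiffSynth implementation), you can think of control_context_scale as a blend weight on the control branch's contribution to the backbone activations. The function name and shapes below are purely illustrative:

```python
# Conceptual sketch only -- not the real Z-Image-Turbo / DiffSynth code.
# It illustrates how a single "context scale" trades off between strict
# control adherence and the base model's own prediction.
import numpy as np

def apply_control(hidden, control_residual, control_context_scale=0.75):
    """Blend the ControlNet residual into the backbone activations.

    hidden:                activations from the base diffusion blocks
    control_residual:      features produced from the control image (Canny, Depth, ...)
    control_context_scale: 0.0 = ignore the control, 1.0 = full-strength control.
                           Alibaba's tests suggest 0.65-0.80 as the sweet spot.
    """
    return hidden + control_context_scale * control_residual

# Toy usage: a stronger scale pulls the result closer to the control signal.
hidden = np.zeros(4)
control = np.ones(4)
for scale in (0.5, 0.65, 0.8, 1.0):
    print(scale, apply_control(hidden, control, scale))
```

In practice you only change the value on the node in ComfyUI; the sketch just shows why lower values loosen the structure and higher values lock it in.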

Plus, the model benefits from more detailed prompts. The more clarity you give, the cleaner and more accurate your output becomes, especially for human-centric images or scenes with complex geometry.

 

Installation

Make sure you have ComfyUI installed and updated to the latest version. If not done yet, just update it from the Manager section by selecting the Update All option.


1. Download Z Image Turbo Fun Controlnet Union BF16 (Z-Image-Turbo-Fun-Controlnet-Union.safetensors) and save it into the ComfyUI/models/model_patches folder. This supports Canny, HED, Depth, Pose and MLSD.

2. Now, these three models (Z Image Turbo, text encoder, and VAE) are the same ones used for the basic Z Image Turbo text-to-image workflow. If you have already downloaded them, you can skip the steps listed below:


(a) Download Z Image Turbo FP8 (z-image-turbo-fp8-e4m3fn.safetensors or z-image-turbo-fp8-e5m2.safetensors), pick either one, and save it into the ComfyUI/models/diffusion_models folder.

(b) Next, you need the text encoder. Download the Qwen text encoder (qwen_3_4b.safetensors) and put it into the ComfyUI/models/text_encoders folder.

(c) Download the VAE (ae.safetensors) and place it into the ComfyUI/models/vae folder.

3. Restart ComfyUI and refresh the browser for the changes to take effect. If you want to confirm that every file landed in the right folder first, a quick check is sketched below.
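This is a minimal, optional sanity check, assuming a default ComfyUI install location; adjust comfy_root (and the FP8 variant you picked) to match your own setup.

```python
# Minimal check that the files from steps 1-2 are where ComfyUI expects them.
# Assumes a default install location -- change comfy_root to your own path.
from pathlib import Path

comfy_root = Path.home() / "ComfyUI"  # adjust if ComfyUI lives elsewhere

expected = {
    "model_patches": ["Z-Image-Turbo-Fun-Controlnet-Union.safetensors"],
    "diffusion_models": ["z-image-turbo-fp8-e4m3fn.safetensors"],  # or the e5m2 variant
    "text_encoders": ["qwen_3_4b.safetensors"],
    "vae": ["ae.safetensors"],
}

for folder, files in expected.items():
    for name in files:
        path = comfy_root / "models" / folder / name
        status = "OK     " if path.exists() else "MISSING"
        print(f"{status} {path}")
```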



Workflow 


1. Download the Z-Image Turbo ControlnetUnion workflow (Z_Image_Turbo_ControlNet_Union.json) from our Hugging Face repository.


2. Drag and drop it onto the ComfyUI canvas. If you get any red missing-node errors, that means you are on an older ComfyUI version; just update it from the Manager by selecting the Update ComfyUI option.

3. Follow the Workflow setup-

(a) Load your target image into the Load Image node.

(b) Select and load the Z Image Turbo model (FP8 / BF16) in the Load Diffusion Model node.

(c) Load the Z Image Turbo ControlNet Union model into the ModelPatchLoader node. If the node shows an error, just update ComfyUI to the latest version from the Manager.

(d) Select the VAE (ae.safetensors) in the Load VAE node.

(e) Load the text encoder (qwen_3_4b.safetensors) into the CLIP loader node.

(f) Now, the most important part. The ControlNet preprocessors group contains an AIO Aux Preprocessor along with individual nodes for the different control features (MLSD, Canny, Depth, Pose, HED). For a quick estimation, use the AIO Aux Preprocessor node, select any of the available preprocessors from the drop-down, and disable the rest of the nodes. Then connect the AIO Aux Preprocessor to the QwenImage DiffSynth Controlnet node. You can also add an extra Preview Image node here to preview the edges. (A standalone edge-map sketch appears after these setup steps.)

(g) KSampler settings-

Steps - 9

CFG - 1.0

Shift - 3.0

Sampler - res_multi_step

(h) Put a relevant, detailed positive prompt into the prompt box.

(i) Hit the Run button to execute the workflow.
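If you want to preview or prepare the edge map outside ComfyUI, the sketch below is a rough standalone equivalent of selecting Canny in the AIO Aux Preprocessor. It uses OpenCV, and the file names and thresholds are placeholders rather than values taken from the workflow:

```python
# Standalone Canny edge map, roughly equivalent to what the AIO Aux Preprocessor
# produces when "Canny" is selected. Requires: pip install opencv-python
import cv2

src = cv2.imread("input.png")  # placeholder path to your target image
if src is None:
    raise SystemExit("Could not read input.png -- check the path")

gray = cv2.cvtColor(src, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 100, 200)            # low/high thresholds -- tune per image
cv2.imwrite("canny_control.png", edges)      # load this image into the workflow instead
```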

This release feels like a practical upgrade for anyone struggling with inconsistent ControlNet behavior. The fact that it was trained from scratch on high-quality data at high resolution really shows in its responsiveness. If you rely heavily on control conditions for professional or creative work, Z-Image-Turbo-Fun-ControlNet-Union is absolutely worth trying.