Use DreamO with Flux to Instantly Customize your Image


Image customization with large generative models has progressed quickly, covering changes to the subject, background, style, and even identity. However, most of these methods are built for just one specific task. Creating a single system that handles all kinds of image customization is still a big challenge.

DreamO, released by ByteDance, is a unified diffusion transformer model that handles all of these customization tasks in one place, which can make your work easier.

[Image: DreamO showcase, consistent generation]
[Image: DreamO showcase, e-commerce industry]
[Image: DreamO showcase, product advertising]

The model has since been updated to reduce the over-saturation and plastic-face issues in its output. You can find more in-depth information in their research paper.


Installation

1. If you are a new user, first install and set up ComfyUI. Existing users should update ComfyUI from the Manager section.

2. Get the basic Flux installation if you have not done so yet. This is important because the workflow uses the basic Flux CLIP text encoders and VAE models.

3. Move into the ComfyUI/custom_nodes folder and clone the repository using the command prompt:

git clone https://github.com/ToTheBeginning/ComfyUI-DreamO.git

4. Install the dependencies using the commands below.

For normal ComfyUI users, run this from inside the ComfyUI-DreamO folder:

pip install -r requirements.txt

For ComfyUI portable users, move inside the ComfyUI_windows_portable folder and use this command:

python_embeded\python.exe -m pip install -r ComfyUI\custom_nodes\ComfyUI-DreamO\requirements.txt


5. Next, download the DreamO models (all the safetensors files) from the Hugging Face repository and save them into your "ComfyUI/models/loras" folder.

6. Now, download the DreamO embedding and the BEN2 model (for image processing) from their Hugging Face repositories and save them into the "ComfyUI/models/dreamo" folder.

Both of these models are downloaded automatically when you run the workflow for the first time, so you can skip this step if you prefer.

7. One more model to download is Flux Turbo Alpha by alimama. After downloading, rename the model file to "flux-turbo.safetensors" and save it into your "ComfyUI/models/loras" directory.

8. Restart ComfyUI and refresh the page.
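Before opening the workflow, it can help to confirm that the files from the steps above landed in the right folders. The following is a minimal Python sketch, not part of DreamO or ComfyUI: the function name check_install is made up for illustration, and the LoRA file names are taken from the LoRA Loader nodes of the workflow described in this guide, so adjust them if your downloads are named differently.

```python
from pathlib import Path

# File names as they appear in this guide's workflow nodes (assumption: your
# downloaded files use the same names). Adjust the list if they differ.
EXPECTED_LORAS = [
    "dreamo_comfyui.safetensors",
    "dreamo_cfg_distill_comfyui.safetensors",
    "dreamo_quality_lora_pos_comfyui.safetensors",
    "dreamo_quality_lora_neg_comfyui.safetensors",
    "flux-turbo.safetensors",  # Flux Turbo Alpha, renamed in step 7
]


def check_install(comfy_root):
    """Return a list of readable problems; an empty list means nothing is missing."""
    root = Path(comfy_root)
    problems = []

    # Step 3: the cloned custom node should contain its requirements.txt.
    node_dir = root / "custom_nodes" / "ComfyUI-DreamO"
    if not (node_dir / "requirements.txt").is_file():
        problems.append(f"custom node not found: {node_dir}")

    # Steps 5 and 7: LoRA files live in models/loras.
    lora_dir = root / "models" / "loras"
    for name in EXPECTED_LORAS:
        if not (lora_dir / name).is_file():
            problems.append(f"missing LoRA: {lora_dir / name}")

    # Step 6: this folder may be filled by auto-download on the first run.
    if not (root / "models" / "dreamo").is_dir():
        problems.append("models/dreamo not present yet (may auto-download on first run)")

    return problems


if __name__ == "__main__":
    for line in check_install("ComfyUI") or ["everything is in place"]:
        print(line)
```

Run it from the folder that contains your ComfyUI directory, or pass the full path to check_install. It only checks that the files exist; it does not verify that they downloaded completely.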


Workflow

1. Get the workflow from the "ComfyUI/custom_nodes/ComfyUI-DreamO/workflows" directory:

(a) dreamo_comfyui.json (for a single reference condition)

(b) dreamo_comfyui_2cond.json (for multiple reference conditions)

2. Drag and drop it into ComfyUI.

3. Set up and configure the nodes as follows:

Load Image: Loads the reference image, here a smiling man in a tuxedo (used for DreamO reference guidance).

LoRA Loader (flux-turbo.safetensors): Loads the Flux Turbo Alpha LoRA from step 7, a step-distilled LoRA that lets Flux generate in fewer sampling steps.

LoRA Loader (dreamo_comfyui.safetensors): Loads the main DreamO LoRA downloaded in step 5.

LoRA Loader (dreamo_cfg_distill_comfyui.safetensors): Loads DreamO's CFG-distillation LoRA, intended for sampling at a low CFG value such as the 1.0 used in the KSampler.

LoRA Loader (dreamo_quality_lora_pos_comfyui.safetensors): LoRA that pushes the output toward higher image quality. Weight: 0.15

LoRA Loader (dreamo_quality_lora_neg_comfyui.safetensors): LoRA representing negative quality elements, applied with a negative weight (-0.8) so that those elements are suppressed.

UNet Loader (flux1-dev-fp8.safetensors): Loads the Flux diffusion model used during sampling. The weight type is fp8_e4m3fn, an 8-bit floating-point format that reduces memory use.

CLIP Text Encode (Positive Prompt): Encodes the positive prompt with the text encoders to guide generation.

CLIP Loader (clip_l.safetensors): Loads clip_l.safetensors together with the T5 text encoder (t5xxl_fp8_e4m3fn.safetensors); these encode both the positive and negative prompts.

VAE Loader (ae.safetensors): Loads the VAE, which decodes the final latent into image space.

Empty Latent Image (SD3): Creates an empty latent canvas of size 1024 x 1024.

DreamO Processor Loader: Loads the DreamO processor, which prepares reference images before they are encoded.

DreamO Ref Image Encode: Takes the reference image (the smiling man in a tuxedo) and encodes it so DreamO can use it to guide generation.

Apply DreamO: Combines the positive prompt, negative prompt, reference features, and the LoRA-patched model, and passes the result on to the KSampler.

KSampler: Takes the empty latent image and the DreamO-processed prompt information and generates the final latent image, using these settings:

Steps: 12

CFG: 1.0

Sampler: euler

Scheduler: simple

Denoise: 1.00
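If you want to compare the shipped workflow with the settings listed above without opening the ComfyUI interface, the workflow file can be read as plain JSON. The following is a minimal Python sketch under a few assumptions: it relies on the layout of workflows saved from the ComfyUI interface (a top-level "nodes" list whose entries carry "type" and "widgets_values"), the helper names are made up for illustration, and the dictionary keys follow the KSampler input names, with "denoise" holding the 1.00 value listed last.

```python
import json
from pathlib import Path

# Sampler values quoted in this guide, keyed by KSampler input names.
# "denoise" is assumed to be the setting that the 1.00 value refers to.
GUIDE_KSAMPLER = {
    "steps": 12,
    "cfg": 1.0,
    "sampler_name": "euler",
    "scheduler": "simple",
    "denoise": 1.0,
}


def load_workflow(path):
    """Read a workflow .json file saved from the ComfyUI interface."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def summarize_workflow(workflow):
    """Return (node type, widget values) pairs for every node in the workflow."""
    return [
        (node.get("type", "?"), node.get("widgets_values", []))
        for node in workflow.get("nodes", [])
    ]


if __name__ == "__main__":
    wf_path = Path("ComfyUI/custom_nodes/ComfyUI-DreamO/workflows/dreamo_comfyui.json")
    if wf_path.is_file():
        # Print each node with its widget values for comparison by eye.
        for node_type, values in summarize_workflow(load_workflow(wf_path)):
            print(f"{node_type}: {values}")
    print("Settings from this guide:", GUIDE_KSAMPLER)
```

The script only prints what it finds. Widget values are stored as a plain list, so match them against the settings above by eye rather than by key.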