Flux.2 Dev (BF16/FP8/GGUF) Setup & Install in ComfyUI

Flux 2 BF16/GGUF/FP8 installation in ComfyUI

If you have been following the rapid evolution of image generation models, you are going to love what comes next. FLUX.2 is the latest release from Black Forest Labs: a powerful 32B-parameter rectified flow transformer built from the ground up. You can generate images in any aspect ratio, all the way up to 4MP resolution. It also lets you guide the image with JSON controls and pose inputs, or expand and shrink visuals with precision.

 

Flux2 model variants-

(a) FLUX.2 [pro] (via API)- The version made for people who want both high image quality and fast speed; it doesn't force you to choose between the two. You get great results quickly, making it ideal when you want professional output without waiting long.

(b) FLUX.2 [flex] (via API)- This version focuses on accuracy and control. If you want perfect details, especially clean, readable text inside images, this is the best option. It lets you control style, layout, and fine elements more precisely than the others.

(c) FLUX.2 [dev] (Open weight)- It's the developer-friendly version. The model weights are open, meaning anyone can download them and use them in their own software or services.

The model is released under the FLUX [dev] Non-Commercial License, which means generated outputs can be used for personal, scientific, and commercial purposes, but before deploying the model you need to accept the licensing terms by consulting the BFL team. So we will be focusing on this variant.


Flux2 Dev showcase

It simplifies the whole prompt-to-image pipeline by switching to a single Mistral Small 3.1 text encoder, and brings serious upgrades in flexibility, architecture, and multi-image control. Whether you want to generate, edit, or blend up to 10 reference images at once, FLUX.2 gives you a cleaner, smarter, and far more capable playground to create in. 


 


Installation

 

Update ComfyUI from the Manager

1. Install ComfyUI if you have not done so yet. If you are already using it, update ComfyUI from the Manager by selecting Update All.

2. Download the Flux.2 Dev model from the community (choose one of the following based on your system resources):

 accept license agreement

(a) Flux2 Dev BF16 (flux2-dev.safetensors) from Black Forest Labs. Before downloading, you need to accept their conditions and share your profile details. Use this only if you do not want to compromise on output quality and have powerful hardware.

 

Flux2 Dev FP8 

(b) Flux2 Dev FP8 (flux2_dev_fp8mixed.safetensors) optimized by ComfyUI. People using NVIDIA GPUs with 12GB VRAM and plenty of system RAM (around 70GB or more) can run it.

(c) Flux2 Dev GGUF by city96 (Q2 for fast inference up to Q8 for better quality). Users with lower VRAM and system RAM should pick a GGUF variant whose file size fits their hardware.

 (d) Flux2 Dev GGUF by GGUF-org (Q2 for fast inference to Q8 for better quality). 

(e) Flux2 Dev GGUF by orabazes (Q2 for fast inference to Q8 for better quality). 

Save the downloaded file into the ComfyUI/models/diffusion_models folder (a scripted download sketch follows below).
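
If you prefer scripting the download instead of using the browser, here is a minimal sketch using the huggingface_hub Python package. The repo id black-forest-labs/FLUX.2-dev is an assumption based on BFL's naming, so adjust the repo id and filename to whichever variant you actually picked; the BF16 repo is gated, so accept the license on Hugging Face and log in with your token first.

=============COPY=================
from huggingface_hub import hf_hub_download

# Assumed repo id and filename for the BF16 checkpoint; change these for the
# FP8 or GGUF variants. The gated repo needs a logged-in Hugging Face account.
hf_hub_download(
    repo_id="black-forest-labs/FLUX.2-dev",
    filename="flux2-dev.safetensors",
    local_dir="ComfyUI/models/diffusion_models",
)
=============COPY=================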

 

ComfyUI-GGUF custom node by City 96

For GGUF models, make sure you have the ComfyUI-GGUF custom node by city96. If you have not installed it yet, install it from the Manager by selecting the Custom Nodes Manager option, or update it if you are already using it.

 

Flux2 Dev GGUF 

If you do not know what the FP8/BF16/GGUF model variants are, follow our quantization guide for a detailed overview.

 

Flux2 VAE

3. Download the Flux 2 VAE (flux2-vae.safetensors) and save it into the ComfyUI/models/vae folder.

 Download text encoder

4. Download the text encoder (mistral_3_small_flux2_bf16.safetensors or mistral_3_small_flux2_fp8.safetensors). Choose one according to your system resources and save it into the ComfyUI/models/text_encoders folder.

5. Restart and refresh ComfyUI for the changes to take effect. A quick sanity check of the file layout is sketched below.
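
Before loading any workflow, it can help to confirm that every file landed in the right folder. This small check only looks at the paths from the steps above; the file names assume the BF16 checkpoint and BF16 text encoder, so swap them for the variants you actually downloaded.

=============COPY=================
import os

# Paths from the installation steps above (run from the folder that contains
# ComfyUI). Adjust the file names if you chose the FP8 or a GGUF variant.
expected_files = [
    "ComfyUI/models/diffusion_models/flux2-dev.safetensors",
    "ComfyUI/models/vae/flux2-vae.safetensors",
    "ComfyUI/models/text_encoders/mistral_3_small_flux2_bf16.safetensors",
]

for path in expected_files:
    status = "OK" if os.path.isfile(path) else "MISSING"
    print(f"{status:7s} {path}")
=============COPY=================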

 

Workflow

1. Download the Flux2 Dev workflow from our Hugging Face repository.

 flux 2 dev workflow

(a) Flux2_Dev_FP8_workflow.json (FP8 workflow)

(b) Flux2_Dev_workflow.json (BF16 workflow)

(c) Flux2_Dev_GGUF.json (for GGUF variant)

2. Drag and drop it into ComfyUI. The workflow combines both functions:

- Text to Image

- Image to Image (up to 10 reference images supported)

 

ReferenceLatent node bypass

Now, let's say you want to work with reference images. Simply un-bypass (shortcut: CTRL+B) the ReferenceLatent nodes to enable them. If you would like more reference images, you can add them (up to 10 images) by following the same pattern and extending the workflow.

But if you do not want any reference images, just select all the ReferenceLatent nodes (Step 3) and press CTRL+B to bypass them; the workflow then turns into the basic Text to Image workflow.

 

(a) Load the Flux2 Dev (BF16/FP8) model in the Load Diffusion Model node. Use the Unet Loader (GGUF) node if you are using a GGUF variant.

(b) Load the CLIP/text encoder and VAE into their respective loader nodes.

(c) Put your positive prompt into the prompt box. Negative prompts are not required.

(d) Set the KSampler settings (if you prefer to queue runs from a script, see the sketch after this list):

Steps: 50 (28 is also good)

CFG: 4.0

Sampler: euler
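
If you would rather queue the workflow from a script than click through the UI, ComfyUI exposes a small HTTP API on a locally running instance (port 8188 by default). The sketch below assumes you exported the workflow in API format (Export (API) / Save (API Format), depending on your ComfyUI version); the file name is only an example.

=============COPY=================
import json
import urllib.request

# Load a workflow exported from ComfyUI in API format.
# "Flux2_Dev_workflow_api.json" is a placeholder for your own export.
with open("Flux2_Dev_workflow_api.json", "r", encoding="utf-8") as f:
    workflow = json.load(f)

# Queue it on a locally running ComfyUI instance (default port 8188).
payload = json.dumps({"prompt": workflow}).encode("utf-8")
req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode("utf-8"))  # returns a prompt_id on success
=============COPY=================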

The results shown below are not cherry-picked.

 

Testing-

 Human Photography

flux2 dev Output

Prompt used (for text To Image)- an American teenage girl taking a selfie, clear camera-facing pose, professional glam makeup, soft natural lighting, smooth skin texture, youthful modern styling, trendy outfit, high-quality portrait photography style

The image looks very clear, with fine detail in her eyes and clothing.

 

kirby in the party (flux2 dev testing)

 Prompt used- Authentic be real image of kirby and yoshi in a nostalgic 90s house party wearing sunglasses, fun vibes

Here, the prompt has been followed very well, but somehow the blurry human fingers come out deformed, which is not supposed to happen.


 Photo-realism

 

girl sitting on the wall -flux2dev testing

 Prompt used- A low-angle, cinematic portrait of a porcelain-skinned young woman perched on a sun-warmed concrete ledge, her jet-black bob with choppy bangs framing a defiant gaze. She wears a rib-knit black turtleneck under a daisy-embroidered pinafore, bold yellow cable-knit knee-highs slouched over chunky Mary Jane boots. Render in 8K ultra-realism with film-grain texture and soft pastel lens flares, color-graded like vintage Ektachrome teal highlights, buttery gold midtones. Shallow depth of field keeps her face and floral details razor-sharp. Mood is retro-futuristic grindhouse meets dreamy pop art dreamy yet razor-edged, whimsical yet fearless.

Here, the generation handled the prompt quite well.

 

Text Rendering 

 

girl having text written on her cheek (flux2 dev text rendering)

Prompt used- An E-girl with black hair and soft bangs, pale skin, and the handwritten phrase 'I LOVE FLUX.2' on her left cheek. The camera focuses tightly on her cheek and the text, capturing crisp detail and a slightly aesthetic, modern E-girl style. 

Here, everything has been handled well, but one thing to mention: the prompt asked for the left cheek from the character's perspective, yet the text is written on the right.

 

Prompt style (for Image To Image):  Apply the design from Reference Image 1 onto objects in Reference Image 2.

 

 

JSON PROMPT STYLE 

The model also supports an advanced JSON (key-value pair) prompt format; the schema is provided below. You can write detailed prompts in this format directly into the prompt box, and a filled-in Python example follows after the schema.

=============COPY=================  

 {
  "scene": "overall scene description",
  "subjects": [
    {
      "description": "detailed subject description",
      "position": "where in frame",
      "action": "what they're doing"
    }
  ],
  "style": "artistic style",
  "color_palette": ["#hex1", "#hex2", "#hex3"],
  "lighting": "lighting description",
  "mood": "emotional tone",
  "background": "background details",
  "composition": "framing and layout",
  "camera": {
    "angle": "camera angle",
    "lens": "lens type",
    "depth_of_field": "focus behavior"
  }
}

=============COPY================= 
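
As a concrete illustration, here is a small sketch that fills in the schema above from Python and prints a string you can paste into the prompt box. All the field values are made-up example content, not anything produced by the model.

=============COPY=================
import json

# Example values for each key in the schema above; replace with your own scene.
prompt = {
    "scene": "a nostalgic 90s house party in a cozy living room",
    "subjects": [
        {
            "description": "a cheerful partygoer wearing sunglasses",
            "position": "center of the frame",
            "action": "dancing with a drink in hand",
        }
    ],
    "style": "authentic candid photography",
    "color_palette": ["#f4a261", "#2a9d8f", "#e76f51"],
    "lighting": "warm tungsten lamps with soft shadows",
    "mood": "fun, relaxed, nostalgic",
    "background": "CRT television, cassette shelves, string lights",
    "composition": "slightly tilted candid framing",
    "camera": {
        "angle": "eye level",
        "lens": "35mm",
        "depth_of_field": "shallow, background softly blurred",
    },
}

# Paste the printed JSON into the ComfyUI prompt box.
print(json.dumps(prompt, indent=2))
=============COPY=================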

After generating multiple images, we can say the model works better with long, detailed prompts. Shorter prompts often introduce artifacts and lose detail as well. You can still get a good, realistic result from a short prompt, but it usually takes 2-3 attempts.

Another thing you need to take care of is VRAM and system RAM. Because of its high parameter count, the model demands hefty resources. In our testing, even the FP8-optimized variant needs an RTX 4090/5090-class GPU to run without memory block swapping, which otherwise slows inference. Lower-VRAM users need to use the GGUF variants. A quick resource check is sketched below.
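
If you are unsure which variant your machine can handle, checking GPU VRAM and system RAM before downloading saves time. The sketch below uses torch and psutil, which a typical ComfyUI install already ships with (if psutil is missing, install it separately); the numbers in the comments just mirror the rough guidance above and are not hard requirements.

=============COPY=================
import psutil
import torch

# Total system RAM (the FP8 guidance above assumes roughly 70 GB or more).
ram_gb = psutil.virtual_memory().total / 1024**3
print(f"System RAM: {ram_gb:.0f} GB")

# VRAM per visible CUDA GPU, if any (roughly 12 GB+ for FP8 per this guide).
if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        vram_gb = props.total_memory / 1024**3
        print(f"GPU {i}: {props.name}, {vram_gb:.0f} GB VRAM")
else:
    print("No CUDA GPU detected; consider the smaller GGUF quants.")
=============COPY=================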

It can handle long prompts of up to 32k tokens, produce results even better than its earlier variant, and maintain consistent output even at an enterprise level. You can reference up to ten images at once and still get some of the best character consistency available today. The detail is so sharp and realistic that it pushes close to real photography, and its text rendering finally makes complex layouts like infographics and UI mockups production-ready.

Flux2 also follows prompts more accurately and understands structured instructions better than the earlier version. Overall, it is designed to give you flexibility, control, detail, and speed: everything you need to create exactly what you imagine.