You have probably run into a frustrating limitation: speed drops dramatically when you use multiple reference images. FLUX.2 [klein] 9B-KV, a successor to the FLUX.2 [klein] 9B model released by Black Forest Labs, is built to address exactly that.
Most diffusion-based editing pipelines reprocess every reference image at every denoising step. That means if your model runs 4 denoising steps and you use several reference images, the model recomputes the same visual tokens at each step. The model is released under the FLUX Non-Commercial License.
Figure: Flux2 Klein 9B KV showcase
This becomes especially noticeable in interactive tools, where users expect near-instant feedback when adjusting prompts or generating variations. So even though modern models are powerful, redundant computation becomes the bottleneck.
This optimized model variant introduces KV-cache support, allowing the model to compute reference image representations once and reuse them during the rest of the generation process. In practice, that means faster multi-reference editing, less redundant computation, and up to 2.5x faster inference.
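To make the idea concrete, here is a toy sketch in Python. It is not the model's actual implementation: `ToyReferenceEncoder` and both `denoise_*` functions are hypothetical stand-ins that only count how often the reference key/value projections are computed, with and without a cache.

```python
from dataclasses import dataclass


@dataclass
class ToyReferenceEncoder:
    """Hypothetical stand-in for the expensive key/value projection of a reference image."""
    calls: int = 0

    def project_kv(self, reference: str) -> tuple[str, str]:
        # Count every projection so the two strategies can be compared.
        self.calls += 1
        return (f"K({reference})", f"V({reference})")


def denoise_without_cache(encoder: ToyReferenceEncoder, references: list[str], steps: int) -> int:
    """Recompute the reference keys/values at every denoising step."""
    for _ in range(steps):
        kv = [encoder.project_kv(ref) for ref in references]
        # A real model would attend over `kv` here.
    return encoder.calls


def denoise_with_cache(encoder: ToyReferenceEncoder, references: list[str], steps: int) -> int:
    """Compute the reference keys/values once, then reuse them at every step."""
    cached_kv = [encoder.project_kv(ref) for ref in references]
    for _ in range(steps):
        kv = cached_kv  # reused, no new projection
    return encoder.calls


if __name__ == "__main__":
    refs = ["ref_1", "ref_2", "ref_3"]
    print(denoise_without_cache(ToyReferenceEncoder(), refs, steps=4))  # 12
    print(denoise_with_cache(ToyReferenceEncoder(), refs, steps=4))     # 3
```

With 3 reference images and 4 steps, the uncached loop performs 12 projections while the cached one performs 3. The real speedup depends on how much of each step is spent on reference tokens, so treat the counts as an illustration rather than a benchmark.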
Installation
1. Make sure ComfyUI is installed. If it already is, update it from the Manager.
2. Download one of the Flux2 Klein 9B KV model variants. Choose the one that suits your VRAM and system RAM:
(a) Flux2 Klein 9B KV BF16 (flux-2-klein-9b-kv.safetensors) by Black Forest Labs.
(b) Flux2 Klein 9B KV FP8 (flux-2-klein-9b-kv-fp8.safetensors) by Black Forest Labs.
Save either of these inside the ComfyUI/models/diffusion_models folder.
(c) Flux2 Klein 9B KV GGUF (Q2 to Q8) by QuantStack.
Save this inside the ComfyUI/models/unet folder. The GGUF variant requires the ComfyUI-GGUF custom node by city96. Install it from the Manager, or update it there if it is already installed.
3. Because the model is built on the base Flux2 Klein 9B, the remaining models (text encoders, CLIP, VAE, etc.) are the same. If you have not installed them yet, follow our Flux2 Klein 9B tutorial.
4. Restart and refresh ComfyUI.
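If a model does not show up in ComfyUI after the restart, a quick check of the folders can help. The sketch below is a minimal helper, not part of ComfyUI: `find_kv_models` is a hypothetical name, the ComfyUI root path is an assumption you need to adjust, and because the GGUF file names are not listed above, it simply reports any .gguf file found in the unet folder.

```python
from pathlib import Path

# File names taken from the download list above.
SAFETENSORS_NAMES = (
    "flux-2-klein-9b-kv.safetensors",
    "flux-2-klein-9b-kv-fp8.safetensors",
)


def find_kv_models(comfy_root: str) -> list[Path]:
    """Return the model files found in the expected ComfyUI folders."""
    models_dir = Path(comfy_root) / "models"
    found: list[Path] = []

    # BF16 and FP8 variants live in models/diffusion_models.
    for name in SAFETENSORS_NAMES:
        candidate = models_dir / "diffusion_models" / name
        if candidate.is_file():
            found.append(candidate)

    # GGUF variants live in models/unet; exact names are unknown, so match by extension.
    unet_dir = models_dir / "unet"
    if unet_dir.is_dir():
        found.extend(sorted(unet_dir.glob("*.gguf")))

    return found


if __name__ == "__main__":
    # "ComfyUI" is an assumed install location; change it to your own path.
    for path in find_kv_models("ComfyUI"):
        print(path)
```

An empty result usually means the file was saved in the wrong folder or under a different name.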
Workflow
1. Download the Flux2 Klein 9B KV workflows from our Hugging Face repository:
(a) Flux2_klein_kv_Txt2Img.json (Text to Image workflow)
(b) Flux2_klein_image_edit_9b_kv.json (Multi Image edit workflow)
2. Drag and drop the workflow file into the ComfyUI application.
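Before loading a workflow, it can be useful to see which node types it expects, for example to spot a missing custom node. The sketch below is a rough helper under stated assumptions: `node_types` and `summarize` are hypothetical names, and the code assumes the JSON is either a ComfyUI UI export (a top-level "nodes" list whose entries have a "type" key) or an API export (a mapping of node id to an object with a "class_type" key).

```python
import json
from collections import Counter
from pathlib import Path


def node_types(workflow: dict) -> Counter:
    """Count node types in a workflow dict (UI export or API export layout)."""
    nodes = workflow.get("nodes")
    if isinstance(nodes, list):
        # UI export: each node entry carries its class name under "type".
        return Counter(node["type"] for node in nodes if "type" in node)
    # API export: node id -> {"class_type": ..., "inputs": ...}.
    return Counter(
        node["class_type"]
        for node in workflow.values()
        if isinstance(node, dict) and "class_type" in node
    )


def summarize(path: str) -> Counter:
    """Load a workflow JSON file and count its node types."""
    return node_types(json.loads(Path(path).read_text(encoding="utf-8")))


if __name__ == "__main__":
    # Assumes the workflow file sits in the current directory.
    for name, count in sorted(summarize("Flux2_klein_kv_Txt2Img.json").items()):
        print(f"{count:3d}  {name}")
```

Any node type listed here that ComfyUI reports as missing points to a custom node that still needs to be installed or updated from the Manager.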
Images generated using Flux 2 Klein 9B KV
Human Realism
Prompt:
Grungy analog photo of 10 year old Finn and Jake (golden retriever) from adventure time in 2012 watching adventure time on CRT TV in a dimly lit bedroom. Finn sitting on the floor in front of the TV holding the petting Jake with one hand and both looking back at the camera taking the photo while the game is on in the background visible to us. Flash photography, unedited.
Steps: 4
Resolution: 1024 by 1024
Prompt:
A 23-year-old Instagram model real world version of Asuka LAngley from Evangelion with characteristic hairstyle, blue eyes, and a confident smile is posing in front of the gym mirror. She’s wearing form-fitting light red leggings and a matching crop top, highlighting her toned physique. Her left hand is holding the phone up to capture the shot, while her right arm rests casually by her side. The gym is clean and spacious with dumbbells, machines, and a water bottle on the floor beside her. Behind her one can see her discarded eva unit 2 plugsuit.
Steps: 4
Resolution: 1024 by 1024
Anime
Prompt:
Grand Theft Auto San Andreas, CJ Selfie, CJ standing on Grove Street, taking a selfie at sunset, PS2 graphics, in game graphics, homies talking in background
Conclusion
FLUX.2 [klein] 9B-KV highlights an important trend in generative AI: optimization is becoming just as important as model size. Instead of simply scaling parameters, researchers are focusing on smarter computation strategies. KV caching may sound like a small architectural tweak, but its impact is large, especially for interactive creative tools where latency matters.
If this direction continues, we will likely see more models designed not just for quality, but also for real-time usability. And that’s a big step toward making AI image generation feel instant, responsive, and truly interactive.





