Most image generation models are excellent at creating artwork, but they often struggle with design-focused tasks. Generating accurate text, maintaining structured layouts, placing objects precisely, and creating editable design elements remain common challenges. For designers, marketers, and content creators, this lack of control can turn a simple design task into multiple rounds of regeneration and manual editing.
Ideogram 4.0 is designed specifically to bridge the gap between image generation and professional design workflows. The model is built around a structured prompting system that provides greater control over composition, styling, colors, and text rendering.
Instead of relying solely on natural language prompts, the model is trained to understand organized JSON-based descriptions that can define scene details, visual style, backgrounds, object placement, and color palettes with high precision.
One of its standout features is support for layout-aware generation, allowing creators to position elements using bounding-box coordinates and guide visual design through specific color values.
The model also excels at generating readable text within images, making it particularly useful for posters, advertisements, social media graphics, UI mockups, and other text-heavy designs. It supports a wide range of output resolutions, offering flexibility for different creative and production needs.
Installation
1. First, install the latest version of ComfyUI. If it is already installed, update it to the latest version from the Manager.
2. Download the Ideogram4 models from its Hugging Face repository. Choose one matching pair of models:
(a) ideogram4_fp8_scaled.safetensors
(b) ideogram4_unconditional_fp8_scaled.safetensors
(c) ideogram4_nvfp4_mixed.safetensors
(d) ideogram4_unconditional_nvfp4_mixed.safetensors
Note: The models must be used as a matching pair. Use either the FP8 variant (fp8_scaled and unconditional_fp8_scaled) or the NVFP4 variant (nvfp4_mixed and unconditional_nvfp4_mixed). We will update this guide if a GGUF variant becomes available.
Save them inside the ComfyUI/models/diffusion_models folder.
3. Download the VAE (flux2-vae.safetensors).
Save it inside the ComfyUI/models/vae directory.
4. Download the text encoders:
(a) qwen3vl 8b fp8 (qwen3vl_8b_fp8_scaled.safetensors)
(b) gemma4_e4b_it_fp8 (gemma4_e4b_it_fp8_scaled.safetensors)
Save both inside the ComfyUI/models/text_encoders directory.
If you frequently encounter the "Image blocked by safety filter" message, one possible approach is to experiment with different uncensored text encoders, because the prompt interpretation layer can influence how requests are processed. Some users report that alternative vision-language models or text encoders produce different results depending on the workflow and prompt structure.
5. Restart ComfyUI and refresh the page for the changes to take effect.
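To confirm that everything landed in the right folder before loading the workflow, a small check like the one below can help. It is a minimal sketch: the file names and subfolders are taken from the steps above, while the helper names (expected_files, missing_files) and the ./ComfyUI install path are assumptions made for illustration.

```python
from pathlib import Path

# File names and subfolders come from the installation steps above.
# The helper names and the default install path are illustrative assumptions.
MODEL_PAIRS = {
    "fp8": [
        "ideogram4_fp8_scaled.safetensors",
        "ideogram4_unconditional_fp8_scaled.safetensors",
    ],
    "nvfp4": [
        "ideogram4_nvfp4_mixed.safetensors",
        "ideogram4_unconditional_nvfp4_mixed.safetensors",
    ],
}
VAE_FILES = ["flux2-vae.safetensors"]
TEXT_ENCODERS = [
    "qwen3vl_8b_fp8_scaled.safetensors",
    "gemma4_e4b_it_fp8_scaled.safetensors",
]


def expected_files(variant: str) -> dict[str, list[str]]:
    """Map each models subfolder to the files expected for one variant."""
    if variant not in MODEL_PAIRS:
        raise ValueError(f"unknown variant: {variant!r}")
    return {
        "diffusion_models": MODEL_PAIRS[variant],
        "vae": VAE_FILES,
        "text_encoders": TEXT_ENCODERS,
    }


def missing_files(comfy_root: Path, variant: str) -> list[Path]:
    """Return the expected paths that do not exist under comfy_root/models."""
    models_dir = Path(comfy_root) / "models"
    missing = []
    for subfolder, names in expected_files(variant).items():
        for name in names:
            path = models_dir / subfolder / name
            if not path.is_file():
                missing.append(path)
    return missing


if __name__ == "__main__":
    # Assumption: ComfyUI lives in ./ComfyUI; change this to your install path.
    for path in missing_files(Path("ComfyUI"), "fp8"):
        print("missing:", path)
```

If the script prints nothing, all five files for the chosen variant are in place. Pass "nvfp4" instead of "fp8" if you downloaded the NVFP4 pair.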
Workflow
1. Download the workflow (ideogram4_Txt2Img.json) from our Hugging Face repository.
2. Drag and drop it into ComfyUI. If any nodes are missing, install them from the Manager.
3. Load the models, VAE, and text encoders into their respective nodes.
4. Set the configuration:
Mode:
Turbo: 12 steps, Default: 20 steps, Quality: 48 steps
Sampler: euler
5. Enter the prompt in the prompt box. The included workflow supports multiple prompting methods. You can enter a simple natural language description for quick generation, or use a structured JSON prompt for more predictable results and finer control.
If you want to use a structured prompt, follow this format:
-------COPY THIS BELOW---------
{
  "high_level_description": "...",
  "style_description": {
    "aesthetics": "...",
    "lighting": "...",
    "photo": "...",
    "medium": "...",
    "color_palette": []
  },
  "compositional_deconstruction": {
    "background": "...",
    "elements": [
      {
        "type": "obj",
        "bbox": [],
        "desc": "...",
        "color_palette": []
      }
    ]
  }
}
---------COPY THIS ABOVE----------
An optional LLM Prompt Builder can also convert basic ideas into schema-compliant JSON prompts automatically, making it easier to take advantage of the model's advanced capabilities without manually writing the full prompt structure.
Supported aspect ratios:
1:1 (Square)
3:2 (Photo)
4:3 (Standard)
16:9 (Widescreen)
21:9 (Ultrawide)
2:3 (Portrait photo)
3:4 (Portrait standard)
9:16 (Portrait widescreen)
6. Hit Run to start generation.
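If you script your runs or keep notes on your settings, the mode presets and aspect ratios listed above can be captured in a few lines of Python. This is only a sketch: the step counts, sampler, and ratio list come from this guide, but the function names, the roughly one-megapixel pixel budget, and the rounding to multiples of 16 are assumptions for illustration, not documented Ideogram 4.0 values. Use whatever sizes your workflow's resolution node actually offers.

```python
import math

# Step counts and sampler as listed in the workflow settings above.
MODE_STEPS = {"turbo": 12, "default": 20, "quality": 48}
SAMPLER = "euler"

# Aspect ratios listed above, stored as (width, height) ratio terms.
ASPECT_RATIOS = {
    "1:1": (1, 1),
    "3:2": (3, 2),
    "4:3": (4, 3),
    "16:9": (16, 9),
    "21:9": (21, 9),
    "2:3": (2, 3),
    "3:4": (3, 4),
    "9:16": (9, 16),
}


def sampler_settings(mode: str) -> dict:
    """Return the step count and sampler for a mode name (case-insensitive)."""
    key = mode.lower()
    if key not in MODE_STEPS:
        raise ValueError(f"unknown mode: {mode!r}")
    # The dictionary keys here are illustrative, not ComfyUI field names.
    return {"steps": MODE_STEPS[key], "sampler": SAMPLER}


def dimensions(ratio: str, pixels: int = 1024 * 1024, multiple: int = 16) -> tuple[int, int]:
    """Approximate (width, height) for a ratio at a given pixel budget.

    Assumption: the default pixel budget and the rounding to a multiple
    of 16 are illustrative choices, not values documented for the model.
    """
    if ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {ratio!r}")
    w_term, h_term = ASPECT_RATIOS[ratio]
    height = math.sqrt(pixels * h_term / w_term)
    width = height * w_term / h_term

    def snap(value: float) -> int:
        # Round to the nearest multiple, never going below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    print(sampler_settings("Quality"))
    for name in ASPECT_RATIOS:
        print(name, dimensions(name))
```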
Note: If you encounter an "Image blocked by safety filter" message, it originates from the safety mechanisms built into the Ideogram model itself and is not caused by ComfyUI. As mentioned above, try different text encoders.
Another important factor is prompt formatting. Ideogram 4.0 is designed to work best with structured JSON prompts rather than simple natural-language instructions. Detailed JSON descriptions that clearly define the scene, style, objects, layout, and other attributes often provide more reliable results and improve prompt adherence.
In contrast, short or overly simplistic prompts may lead to less predictable outputs and may not take full advantage of the model's structure-aware design. For the best experience, consider using comprehensive, schema-based JSON prompts that fully describe the desired image. This not only gives you greater control over composition, styling, and text placement but also aligns more closely with the way the model was trained to interpret and generate images.
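Writing the full JSON by hand is error-prone, so it can help to assemble it in code and paste the result into the prompt box. The sketch below follows the key names from the prompt format shown earlier. Everything else is an assumption made for illustration: the helper names (make_element, build_prompt), the hex color strings, and especially the bbox numbers, since this guide does not specify the coordinate convention. Check the model documentation before relying on any particular bbox layout.

```python
import json

# Key names follow the prompt format shown earlier in this guide.
STYLE_KEYS = ("aesthetics", "lighting", "photo", "medium", "color_palette")
ELEMENT_KEYS = ("type", "bbox", "desc", "color_palette")


def make_element(desc, bbox, color_palette=(), type_="obj"):
    """Build one entry for the elements list (hypothetical helper)."""
    return {
        "type": type_,
        "bbox": list(bbox),
        "desc": desc,
        "color_palette": list(color_palette),
    }


def build_prompt(high_level_description, style, background, elements):
    """Validate the pieces and return the prompt as a JSON string."""
    missing = [key for key in STYLE_KEYS if key not in style]
    if missing:
        raise ValueError(f"style_description is missing keys: {missing}")
    for index, element in enumerate(elements):
        missing = [key for key in ELEMENT_KEYS if key not in element]
        if missing:
            raise ValueError(f"element {index} is missing keys: {missing}")
    prompt = {
        "high_level_description": high_level_description,
        "style_description": dict(style),
        "compositional_deconstruction": {
            "background": background,
            "elements": list(elements),
        },
    }
    return json.dumps(prompt, indent=2)


if __name__ == "__main__":
    style = {
        "aesthetics": "minimal, modern poster",
        "lighting": "soft studio light",
        "photo": "product shot",
        "medium": "digital illustration",
        # Assumption: colors given as hex strings.
        "color_palette": ["#1A1A2E", "#E94560"],
    }
    elements = [
        # Assumption: placeholder bbox values; the convention is not given here.
        make_element("a ceramic coffee cup", [0.3, 0.4, 0.7, 0.9], ["#FFFFFF"]),
    ]
    print(build_prompt("Coffee shop launch poster", style, "plain dark backdrop", elements))
```

Validating the keys before serializing catches a misspelled or missing field early, which is cheaper than discovering it after a full generation run.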







