Flux, the new text-to-image diffusion model released by Black Forest Labs, outperforms both open-source and closed-source models in the benchmark results shown below. It has 12 billion parameters and is based on a multimodal and parallel diffusion transformer block architecture.
Flux Schnell is released under the Apache 2.0 license, whereas Flux Dev is under a non-commercial license. Want to use Flux in your commercial projects? Then switch to their API docs.
Currently, there are three text-to-image models, but only two of them can be downloaded:
- Flux.1 Pro can only be accessed through the API. This model is a raw version designed for testing purposes.
- Flux.1 Dev, for high-end GPU users with more than 12GB VRAM and 32GB system RAM. It is a guidance-distilled version of Flux Pro and the more powerful of the two downloadable models. Its outputs can be used for personal, scientific, and non-commercial purposes as described in the flux-1-dev-license.
- Flux.1 Schnell, for GPU users with 12GB VRAM or lower. "Schnell" means "fast" in German. This model (for local development, personal, and commercial use) can generate decent-quality images in only 1 to 4 steps.
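As a rough illustration of the hardware guidance above, the choice between the two downloadable models can be written as a small helper. The function name is hypothetical, and treating exactly 32GB of system RAM as sufficient for Flux.1 Dev is an assumption; the thresholds themselves are the ones quoted in the list.

```python
def pick_flux_model(vram_gb: float, ram_gb: float) -> str:
    """Suggest a downloadable Flux.1 model for the given hardware."""
    # Flux.1 Dev: more than 12GB VRAM and 32GB system RAM.
    # Assumption: 32GB of RAM counts as enough.
    if vram_gb > 12 and ram_gb >= 32:
        return "Flux.1 Dev"
    # Flux.1 Schnell: 12GB VRAM or lower (also used as the fallback).
    return "Flux.1 Schnell"


print(pick_flux_model(vram_gb=16, ram_gb=32))
print(pick_flux_model(vram_gb=8, ram_gb=16))
```

This is only a starting point: the compressed variants described later in this guide can change what fits on a given card.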
Source: Black Forest Labs
Test results published by Black Forest Labs show the model outperforming other well-known models such as Stable Diffusion 3 Ultra, Midjourney V6.0, and DALL-E 3 (HD).
Here, the Flux.1 Pro model, with an ELO score of about 1060, surpasses all the other text-to-image models, followed closely by Flux Dev (about 1050). SDXL Lightning scores lowest, with an ELO score of about 930.
After a lot of confusion in the community, it is now clear that LoRAs can be trained for the Flux model locally. You can also follow the walk-through tutorial.
Currently, there is no support for Automatic1111. Let's see how to install locally in ComfyUI and ForgeUI.
Installation in ComfyUI:
There are multiple variants of the official version released by the community. We have listed all types of Flux variants. You should choose the one that suits your machine.
Whatever type of Flux model you want to use, first install ComfyUI on your machine.
TYPE A: Official Release by Black Forest Labs
Use this variant if you are a professional artist who does not want to compromise on quality and owns a powerful machine.
1. Move to ComfyUI Manager and click "Update ComfyUI" to avoid errors. Then just restart and refresh it.
2. Download the respective models as per your machine requirements. But for illustration purposes, we have downloaded both.
(a) Download Flux.1 Dev, if you have a high-end GPU with more than 12GB VRAM.
To download this model, log in to Hugging Face and agree to their terms and conditions. After downloading, save it inside the "ComfyUI/models/unet" folder.
You also need the VAE (ae.safetensors file), which can be downloaded from there. Save it into the "ComfyUI/models/vae" folder.
We tested this model on the Colab free tier with 12GB VRAM. With the recommended settings, the render time was 7 minutes 40 seconds.
(b) Download Flux.1 Schnell if you have a lower-end GPU; it can run on 12GB VRAM. Save it inside the "ComfyUI/models/unet" folder.
You need the VAE (ae.safetensors file), which can be downloaded from there as illustrated in the above image. Save it into the "ComfyUI/models/vae" folder.
3. Next, you will need the CLIP models: clip_l.safetensors, plus t5xxl_fp16 if you have more than 32GB of system RAM or t5xxl_fp8_e4m3fn for lower-memory systems. Use the FP8 model if you are getting out-of-memory errors.
Download them from the Hugging Face repository and save them inside the "ComfyUI/models/clip" folder.
Users who have already installed Stable Diffusion 3 don't need to download these again. Go to the "ComfyUI/models/clip" folder and verify whether the models are present.
4. Now, just restart and refresh ComfyUI. Move into the workflow section mentioned below.
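Before restarting, it can help to confirm that every file landed in the folder named in the steps above. The following is a minimal sketch, not part of ComfyUI: the helper name is hypothetical, and the diffusion model and T5 file names passed in the example ("flux1-dev.safetensors", "t5xxl_fp8_e4m3fn.safetensors") are assumed names, so substitute the names of the files you actually downloaded.

```python
from pathlib import Path


def missing_flux_files(comfy_root, unet_file, t5_file):
    """Return the expected TYPE A files that are not present under comfy_root."""
    root = Path(comfy_root)
    expected = [
        root / "models" / "unet" / unet_file,             # Flux Dev or Schnell weights
        root / "models" / "vae" / "ae.safetensors",       # VAE
        root / "models" / "clip" / "clip_l.safetensors",  # CLIP text encoder
        root / "models" / "clip" / t5_file,               # fp16 or fp8 T5 encoder
    ]
    return [p.relative_to(root).as_posix() for p in expected if not p.is_file()]


if __name__ == "__main__":
    # File names below are placeholders for illustration only.
    for path in missing_flux_files(
        "ComfyUI", "flux1-dev.safetensors", "t5xxl_fp8_e4m3fn.safetensors"
    ):
        print("missing:", path)
```

An empty result means all four files are where the TYPE A steps expect them.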
TYPE B: Flux GGUF Quantized version
This is the quantized version of Flux, now supported by ComfyUI, which produces good-quality images without sacrificing much image detail.
It uses less VRAM and renders faster. In particular, the GGUF loader runs on the GPU to improve overall performance. A quantized T5 text encoder is also available to lower VRAM usage.
1. Update ComfyUI from ComfyUI Manager by clicking on "Update All".
2. Move to the "ComfyUI/custom_nodes" folder. In the folder's address bar, type "cmd" to open a command prompt.
3. Then clone the repository by copying the command below and pasting it into the command prompt:
git clone https://github.com/city96/ComfyUI-GGUF.git
4. Move back to the root "ComfyUI_windows_portable" folder. Again, type "cmd" in the folder's address bar to open a command prompt.
Use this command to install the dependencies:
.\python_embeded\python.exe -s -m pip install -r .\ComfyUI\custom_nodes\ComfyUI-GGUF\requirements.txt
5. There are multiple models listed in the respective repository. Download any one of the pre-quantized models, or both:
(a) Flux1-Dev GGUF
Save them into the "ComfyUI/models/unet" directory. The CLIP models are already handled by the CLIP loader, so downloading them again is not required.
However, if you want, you can download the GGUF text encoder (t5_v1.1-xxl GGUF) models from Hugging Face and save them into the "ComfyUI/models/clip" folder.
6. Then restart and refresh ComfyUI for the changes to take effect.
7. Move to the workflow section provided below to download it.
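Steps 3 and 4 above can also be scripted. The sketch below simply assembles the same two commands for a Windows portable install and runs them in order; the function names are hypothetical, and it assumes `git` is on the PATH and that the folder layout matches the one described in the steps.

```python
import subprocess
from pathlib import Path

GGUF_REPO = "https://github.com/city96/ComfyUI-GGUF.git"


def gguf_install_commands(portable_root):
    """Build the clone and pip commands from steps 3 and 4 as (command, cwd) pairs."""
    root = Path(portable_root)
    custom_nodes = root / "ComfyUI" / "custom_nodes"
    python_exe = root / "python_embeded" / "python.exe"
    requirements = custom_nodes / "ComfyUI-GGUF" / "requirements.txt"
    return [
        # Step 3: clone the node pack inside ComfyUI/custom_nodes.
        (["git", "clone", GGUF_REPO], custom_nodes),
        # Step 4: install its dependencies with the embedded Python.
        ([str(python_exe), "-s", "-m", "pip", "install", "-r", str(requirements)], root),
    ]


def install_gguf(portable_root):
    """Run both commands, stopping at the first failure."""
    for command, cwd in gguf_install_commands(portable_root):
        subprocess.run(command, cwd=cwd, check=True)
```

Calling `install_gguf("ComfyUI_windows_portable")` from the folder that contains the portable install would run both steps; `check=True` makes a failed clone stop the script before pip runs.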
TYPE C: Flux By Comfyanonymous
These Flux versions have been compressed by Comfyanonymous, the developer behind ComfyUI. You can use these if you have a GPU with lower VRAM (12GB or less).
1. Move to ComfyUI Manager and click "Update ComfyUI" to avoid errors. Then just restart and refresh it.
2. Download Flux Dev (FP8) from Comfy-org's Hugging Face. Save the downloaded model inside the "ComfyUI/models/checkpoints/" folder.
3. Download Flux Schnell (FP8) from Comfy-org's Hugging Face. Then, save the model inside the "ComfyUI/models/checkpoints/" folder.
Here, the CLIP models and VAE are included, so you don't need to download them separately.
4. Now, move to the workflow section to download it.
TYPE D: FLUX By Kijai
This is an option for lower-end GPU users from another developer, "Kijai", who released compressed FP8 versions of the Flux Dev and Flux Schnell models.
1. Users who cannot run the official models can download these from his Hugging Face repository, at the cost of some reduction in image quality.
2. You also need the respective VAE (ae.safetensors file), which can be downloaded as described in the TYPE A section, (a) for Flux Dev and (b) for Flux Schnell. Then put it into the "ComfyUI/models/vae" folder.
3. You will need the CLIP models: clip_l.safetensors, plus t5xxl_fp16 if you have more than 32GB of system RAM or t5xxl_fp8_e4m3fn for lower-memory systems. Use the FP8 model if you are getting out-of-memory errors.
Download these from the Hugging Face repository, then save them into the "ComfyUI/models/clip" folder.
4. Now, move to the workflow section to download it.
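The text encoder rule repeated in TYPE A and TYPE D can be summarized in a few lines. This is only a sketch of that rule: the function name and the `out_of_memory` flag are hypothetical, and the returned strings are the model names used in this guide rather than exact file names.

```python
def pick_text_encoders(ram_gb: float, out_of_memory: bool = False) -> list[str]:
    """Return the text encoder models to place in ComfyUI/models/clip."""
    # clip_l is needed in every case.
    encoders = ["clip_l"]
    # t5xxl_fp16 is suggested for more than 32GB of system RAM; otherwise,
    # or after out-of-memory errors, fall back to the FP8 encoder.
    if ram_gb > 32 and not out_of_memory:
        encoders.append("t5xxl_fp16")
    else:
        encoders.append("t5xxl_fp8_e4m3fn")
    return encoders


print(pick_text_encoders(ram_gb=64))
print(pick_text_encoders(ram_gb=64, out_of_memory=True))
```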
Comfy Workflow Explanation:
1. Text-To-image:
1. There are different workflows for different types of Flux variants that can be downloaded from our Hugging Face Repository.
- For TYPE A - Flux_Dev_workflow or Flux_Schnell_workflow
- For TYPE B- Flux_GGUF_workflow
- For TYPE C - Flux_Dev_workflow_Comfy-org or Flux_Schnell_workflow-comfy-org
- For TYPE D - Use same as for TYPE A
2. Now, directly drag and drop the workflow into ComfyUI.
Many users are facing errors like "unable to find load diffusion model nodes". This is caused by running an older version of ComfyUI. Just switch to ComfyUI Manager and click "Update ComfyUI".
3. In the Load Diffusion Model node, load the Flux model. The default option is the "fp16" version for high-end GPUs; select "fp8_e5m2" or "fp8_e4m3fn" if you are getting out-of-memory errors.
4. Select the downloaded CLIP models in the "Dual Clip loader" node. Lower VRAM users should choose the fp8 version (this will impact image quality), while higher VRAM users can choose either the fp8 or fp16 version.
Enter your positive prompt. Remember, there is no negative prompt box, so anything you would normally exclude has to be handled by writing a detailed positive prompt.
5. Recommended settings you need to choose:
For Flux Schnell-
- Sampling Method: Euler
- Steps: 1-4
- Dimensions: 1024 by 1024
- CFG: 1-2
For Flux Dev-
- Sampling Method: Euler
- Steps: 20-50
- Dimensions: 1024 by 1024
- CFG: 3.5
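For quick reference, the recommended values above can be kept in a small lookup table and checked before a run. This is a sketch outside ComfyUI itself; the dictionary layout and the `check_settings` helper are hypothetical, while the values are the ones listed above.

```python
# Recommended settings from the lists above; ranges are (min, max).
RECOMMENDED = {
    "schnell": {"sampler": "euler", "steps": (1, 4), "cfg": (1.0, 2.0), "size": (1024, 1024)},
    "dev": {"sampler": "euler", "steps": (20, 50), "cfg": (3.5, 3.5), "size": (1024, 1024)},
}


def check_settings(model: str, steps: int, cfg: float) -> list[str]:
    """Return warnings for values outside the recommended ranges."""
    rec = RECOMMENDED[model]
    warnings = []
    lo, hi = rec["steps"]
    if not lo <= steps <= hi:
        warnings.append(f"steps={steps} is outside {lo}-{hi}")
    lo, hi = rec["cfg"]
    if not lo <= cfg <= hi:
        warnings.append(f"cfg={cfg} is outside {lo}-{hi}")
    return warnings


print(check_settings("dev", steps=20, cfg=3.5))
print(check_settings("schnell", steps=20, cfg=1.0))
```

A common slip this catches is reusing the 20-step Dev setting with Schnell, which is designed for 1 to 4 steps.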
Generated using Flux Dev model
Generated using Flux Schnell model
Prompt used for both models: black forest toast spelling out the words 'TASTY', tasty, food photography, dynamic shot
The output is richly detailed, and the sprinkles make it look more realistic. The shine on the sauce makes it look like a professional shot. With Flux Dev we got what we were expecting on the first attempt, but with Flux Schnell we had to run it three times to get the perfect result.
To get random prompt ideas, you can try our prompt generator to generate Stable Diffusion prompts.
For this test, we ran the Flux Dev and Flux Schnell models on an NVIDIA RTX 4060 Ti with 16GB VRAM. The generation time was 1 minute 31 seconds (fp16) and 50 seconds (fp8).
Now, let's try something with human faces.
Generated using Flux Schnell model
Prompt used: a beautiful asian model, teal colored short hair, wearing transparent glasses, red lipstick, professional makeup, night life, professional photoshoot, 32k
Amazing: the face looks very realistic, with professional lighting and a high level of detail. The hair is well rendered, down to the fine strands. Look at her eyes, which have a natural shine.
The Flux model has understood the context and added a busy nightlife setting. The blurred, out-of-focus lights in the background give a simple touch of an urban cityscape.
Again, let's test prompt adherence using the Flux Dev model.
Generated using Flux Dev model
Prompt used: a tiny astronaut hatching from an egg on the moon
Let's try something more complicated, like typography.
Generated using Flux Dev model
Prompt used: a robotic machine with text label "FLUX" holding a sign board painted with text "I don't like Negatives"
Interestingly, as confirmed on the official page, the results were more refined with the Flux Dev model than with Flux Schnell.
The prompt understanding, prompt adherence, typography, and scene complexity with detailing are great compared with older diffusion-based models.
2. Image-To-Image:
First of all, to work with the respective workflow you must update your ComfyUI from the ComfyUI Manager by clicking on "Update ComfyUI". This will avoid any errors.
The image-to-image workflow for official FLUX models can be downloaded from the Hugging Face Repository.
Installation in ForgeUI:
First, install ForgeUI if you have not done so yet.
Type A: By Black Forest Labs
1. Download the Flux Dev or Flux Schnell model released by Black Forest Labs. After downloading, save it inside the "Forge/webui/model/Stable-diffusion" folder.
2. Download the VAE (ae.safetensors) from the Hugging Face repository and save it inside the "models/VAE" folder.
3. Now, download the text encoder models, Clip-l and t5-xxl_fp16 (for higher VRAM) or t5xxl_fp8 (for lower VRAM), from lllyasviel's Hugging Face repository. Save them inside the "models/text_encoder" folder.
4. Finally, move to the root installation folder and run the "update" bat file to update it. Then restart ForgeUI using the "run" bat file.
Type B: GGUF By Forge Developer
1. These GGUF variants were released by the Forge developer (lllyasviel). Download the Flux Dev GGUF / Flux Schnell GGUF models from his Hugging Face repository.
2. Download the same VAE and text encoder models as mentioned in Type A and save them to the relevant directories.
3. Finally, move to the Forge root installation folder and run the "update" bat file to update it. Then restart ForgeUI using the "run" bat file.
Type C: Optimized By Forge Developer
1. If you want to use a compressed model, there are two optimized variants of Flux Dev released officially by the Forge developer (lllyasviel), plus a Flux Schnell variant from a third-party developer (SilverOxides). Choose one according to your system:
- Download Flux Dev (flux1-dev-bnb-nf4 / flux1-dev-bnb-nf4-v2) from the Hugging Face repository if you have an NVIDIA RTX 3000/4000 series GPU. Users with a CUDA version greater than 11.7 can use this model. As we have a 4000 series GPU, we downloaded this model.
- Download the Flux Dev FP8 version from the Hugging Face repository. This is mostly for users with NVIDIA GTX 1000/2000 series cards.
- Download Flux Schnell bnb NF4 from the Hugging Face repository. This is again supported on RTX 3000/4000 series GPUs.
2. After downloading, save it inside the "Forge/webui/model/Stable-diffusion" folder.
3. Then, move to the Forge installation folder and run the "update" bat file to update it. Then restart ForgeUI using the "run" bat file.
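The Type C guidance above amounts to a lookup from GPU series to checkpoint. The sketch below encodes only what the list states; the function name and the use of a plain series number (1000, 2000, 3000, 4000) as input are assumptions, and the returned strings are the checkpoint labels used in this guide. The CUDA version note is not checked here.

```python
def pick_forge_variant(gpu_series: int, model: str = "dev") -> str:
    """Suggest a compressed Flux checkpoint for Forge from the GPU series number."""
    if gpu_series in (3000, 4000):
        # NF4 builds are listed for RTX 3000/4000 series cards
        # (flux1-dev-bnb-nf4-v2 is also listed for Dev).
        return "flux1-dev-bnb-nf4" if model == "dev" else "flux-schnell-bnb-nf4"
    if gpu_series in (1000, 2000):
        if model == "dev":
            # The FP8 build is listed mostly for 1000/2000 series cards.
            return "flux-dev-fp8"
        raise ValueError("no Schnell variant is listed for 1000/2000 series cards")
    raise ValueError(f"GPU series {gpu_series} is not covered by the list above")


print(pick_forge_variant(4000))
print(pick_forge_variant(2000))
```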
Forge Workflow Explanation:
1. Open your Forge WebUI, then select the "Flux" option.
2. Select your downloaded model ("flux-dev-bnb-nf4", "flux-dev-fp8" or "flux-schnell-bnb-nf4") from the checkpoint drop-down option.
3. Enter your positive prompt. Flux doesn't have a negative prompt feature, so you do not need to enter one.
Recommended settings:
Flux Dev
- Sampling Method: Euler
- Steps: 20-50
- CFG: 3.5 (default)
- Dimensions: 1024 by 1024
Flux Schnell
- Sampling Method: Euler
- Steps: 4
- CFG: 1
- Dimensions: 1024 by 1024
4. Hit the "Generate" button to start your image generation.
Generated using Flux Dev
Prompt used: A female model in a dramatic pose, wearing an elegant Renaissance gown with intricate embroidery, set against a dark, richly textured background. The lighting should mimic the soft, moody ambiance of Renaissance paintings, with a focus on capturing the fine details of the fabric and the model's expression
Here, we are using an NVIDIA RTX 4060 Ti with 16GB VRAM. The render took 53 seconds with the Flux Dev NF4 variant, generating a 1024 by 1024 image with 20 sampling steps.
Using API:
1. As noted above, Flux Pro can't be downloaded. To test this model, first create an API key from FalAI (a partner of Black Forest Labs) and enter it into the Gradio-based Flux Pro WebUI hosted on Hugging Face Spaces.
2. Finally, click "Generate Image".
Conclusion:
Flux is the most powerful model compared to older diffusion models. After our tests, we can conclude that Flux Dev is more effective than the Flux Schnell model. It is now available in ComfyUI, Forge and SwarmUI.