When it comes to video creation, gaming, memes, or AI agents, there have not been many free text-to-speech models. Most are paid, and a few providers dominate the market. Now you can try Chatterbox, a free and open-source text-to-speech model developed by Resemble AI and released under the MIT license.
Chatterbox can generate speech from text and features emotion exaggeration control. It also allows you to clone your voice using just a 5-second audio sample. Currently, it only supports English, but support for other languages is likely in the near future.
Resemble AI claims that Chatterbox outperforms premium closed-source TTS models like ElevenLabs. You can explore their research and performance benchmarks in their analysis project.
Another option is Bark from SunoAI, which supports multiple languages and emotions. However, it does not auto-detect emotional tone, so you have to specify the emotion style manually. Unlike Chatterbox, it is under a non-commercial license, and it can generate only up to 14 seconds of audio at a time.
Installation
1. Install ComfyUI if you are a new user. Existing users need to update it from the Manager tab.
2. Move into your ComfyUI/custom_nodes folder and clone the repository using the following command:
git clone https://github.com/filliptm/ComfyUI_Fill-ChatterBox.git
3. Install the required dependencies from the command prompt.
For normal ComfyUI users, run this from inside the cloned ComfyUI_Fill-ChatterBox folder:
pip install -r requirements.txt
For ComfyUI portable users, move into the ComfyUI_windows_portable folder and use this command:
python_embeded\python.exe -m pip install -r ComfyUI\custom_nodes\ComfyUI_Fill-ChatterBox\requirements.txt
4. The Chatterbox TTS model is downloaded automatically from Resemble AI's Hugging Face repository the first time you run the workflow, so no manual download is required. You can track the download progress in real time from ComfyUI's terminal.
5. Restart and refresh ComfyUI for the changes to take effect.
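The two dependency commands from step 3 differ only in which Python interpreter and which requirements path they use. The short Python sketch below makes that difference explicit by building the command for each case. The function name build_pip_command and the REPO_NAME constant are illustrative and not part of ComfyUI or the node pack. The script only prints the command and does not run it.

```python
# Illustrative helper: builds the dependency install command from step 3.
# build_pip_command and REPO_NAME are hypothetical names used for this sketch.

REPO_NAME = "ComfyUI_Fill-ChatterBox"


def build_pip_command(portable: bool) -> list:
    """Return the pip command as a list of arguments.

    portable=True  -> run from inside the ComfyUI_windows_portable folder
    portable=False -> run from inside the cloned custom node folder
    """
    if portable:
        # The portable build ships its own interpreter in python_embeded.
        python = "python_embeded\\python.exe"
        requirements = "\\".join(
            ["ComfyUI", "custom_nodes", REPO_NAME, "requirements.txt"]
        )
        return [python, "-m", "pip", "install", "-r", requirements]
    # A normal install uses whichever pip is active in the current environment.
    return ["pip", "install", "-r", "requirements.txt"]


if __name__ == "__main__":
    print("Normal install:  ", " ".join(build_pip_command(portable=False)))
    print("Portable install:", " ".join(build_pip_command(portable=True)))
```

The portable command must go through python_embeded\python.exe so that the packages land in the interpreter ComfyUI portable actually uses, rather than in a system-wide Python.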
Workflow
1. After installation, get the workflow files from the ComfyUI/custom_nodes/ComfyUI_Fill-ChatterBox/web folder.
2. Drag and drop a workflow into ComfyUI. You can also build these workflows yourself by searching for the "FL Chatterbox" nodes. You will get two workflows:
(a) Text to Speech workflow
LoadAudio: Uploads or loads an existing audio file.
FL Chatterbox TTS node:
exaggeration: 0.50 (Controls how expressive the voice is. Higher means more dramatic.)
cfg_weight: 0.50 (Balances between creativity and prompt adherence.)
temperature: 0.80 (Affects randomness. Higher means more diverse, creative outputs.)
use_cpu: false (The model runs on the GPU for faster inference.)
keep_model_loaded: true (Keeps the model in memory to reduce reloading times.)
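If you want to keep track of the settings you try, the node values above can be written down as a small Python structure. This is only a sketch for note-keeping: the class ChatterboxTTSSettings and the helper with_overrides are hypothetical names, not part of the node pack, and the "cuda" device string is an assumption standing in for "GPU". The defaults are the values listed above.

```python
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class ChatterboxTTSSettings:
    """Hypothetical record of the FL Chatterbox TTS node values listed above."""

    exaggeration: float = 0.50      # higher = more dramatic delivery
    cfg_weight: float = 0.50        # balance of creativity vs. prompt adherence
    temperature: float = 0.80       # higher = more random, more varied output
    use_cpu: bool = False           # False = run on the GPU
    keep_model_loaded: bool = True  # keep the model in memory between runs

    def device(self) -> str:
        # "cuda" is an assumed label for the GPU; adjust for your hardware.
        return "cpu" if self.use_cpu else "cuda"


def with_overrides(base: ChatterboxTTSSettings, **changes) -> ChatterboxTTSSettings:
    """Return a copy of base with some values changed, rejecting unknown names."""
    unknown = set(changes) - set(asdict(base))
    if unknown:
        raise KeyError(f"Unknown setting(s): {sorted(unknown)}")
    return replace(base, **changes)


if __name__ == "__main__":
    defaults = ChatterboxTTSSettings()
    dramatic = with_overrides(defaults, exaggeration=0.9)
    print(asdict(defaults))
    print(asdict(dramatic), "device:", dramatic.device())
```

Rejecting unknown names catches typos such as cfg_wieght early, which is handy when comparing several runs side by side.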