Video generation has always been a resource-intensive task, often requiring powerful GPUs and significant processing time. But what if you could generate high-quality videos on an average GPU? Enter FramePack, a creative approach that is changing how we think about next-frame prediction models for video generation.
It is an image-to-video diffusion framework developed by researchers Lvmin Zhang and Maneesh Agrawala (the same Lvmin Zhang who created ControlNet and IC-Light).
Unlike traditional video AI models, which consume excessive memory and often crash on consumer hardware, FramePack is designed to run smoothly on everyday computers: even a laptop with just 6GB of VRAM can generate full 30fps videos, though you will want at least 30GB of system RAM. You can find more detailed insights in their research paper.
Installation
1. Install ComfyUI if you are a new user. Existing users should update it from the Manager by clicking "Update ComfyUI".
2. Move into the "ComfyUI/custom_nodes" folder and clone Kijai's wrapper by typing the following command into the command prompt:
git clone https://github.com/kijai/ComfyUI-FramePackWrapper.git
3. Then install the required dependencies with the following commands in the command prompt.
For normal ComfyUI users:
pip install -r requirements.txt
For ComfyUI portable users, move inside the ComfyUI_windows_portable folder and use this command:
python_embeded\python.exe -m pip install -r ComfyUI\custom_nodes\ComfyUI-FramePackWrapper\requirements.txt
4. Now, download the required base model, text encoders, VAE, and CLIP Vision.
Note that if you are already using ComfyUI's native support for HunyuanVideo (also described in our tutorial), you do not need to download these models (text encoders, VAE, and CLIP Vision), as the project uses the same files. You only need to download the FramePack I2V diffusion model.
If you have not used it before, download and place these models into their respective folders:
- Download the CLIP Vision file (SigClip) and put it in your ComfyUI/models/clip_vision folder.
- Download the text encoder files and save them into your ComfyUI/models/text_encoders directory.
- Download the VAE model and put it into your ComfyUI/models/vae folder.
5. Now, download the FramePack diffusion model. Create a folder named "lllyasviel" inside the "ComfyUI/models/diffusers/" folder, move inside it, create a new "FramePackI2V_HY" folder, and save the model there.
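If you prefer to script the folder setup from step 5, a minimal Python sketch (run from the directory that contains your ComfyUI install; the path simply mirrors the step above):

```python
from pathlib import Path

# Nested target folder for the FramePack I2V model, as described in step 5.
model_dir = Path("ComfyUI/models/diffusers/lllyasviel/FramePackI2V_HY")
model_dir.mkdir(parents=True, exist_ok=True)  # creates all missing parents
print(model_dir.is_dir())
```

After this, download the model file and drop it into the newly created FramePackI2V_HY folder.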
FramePack F1 has also been officially released; it uses single-direction (forward-only) prediction, whereas the older model uses bi-directional prediction. You can download it and store it in the same folder described above.
Alternatively, you can use the converted model by developer Kijai.
Download the FramePack I2V FP8 or FramePack I2V FP16 model and save it into your "ComfyUI/models/diffusion_models" folder.
We will post an update as soon as a quantized FramePack F1 variant becomes available.
Select the one that suits your hardware and use case: FP8 requires less VRAM at the cost of some quality, whereas FP16 uses more VRAM for higher-quality output.
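To see why precision matters for VRAM, here is a rough back-of-envelope sketch of the size of the weights alone. The ~13B parameter count is an assumption (FramePack builds on the HunyuanVideo backbone); real usage also needs memory for activations, text encoders, and VAE, so treat these numbers as lower bounds:

```python
def weight_gib(params_billion: float, bytes_per_param: int) -> float:
    """Approximate size of the model weights alone, in GiB."""
    return params_billion * 1e9 * bytes_per_param / 2**30

PARAMS_B = 13.0  # assumed parameter count, not an official figure

print(f"FP8 : {weight_gib(PARAMS_B, 1):.1f} GiB")  # 1 byte per weight
print(f"FP16: {weight_gib(PARAMS_B, 2):.1f} GiB")  # 2 bytes per weight
```

This is why the FP16 file roughly doubles the VRAM footprint of the FP8 one, and why low-VRAM cards should start with FP8.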
6. Restart ComfyUI and refresh it.
Workflow
1. The workflow can be found inside your "ComfyUI/custom_nodes/ComfyUI-FramePackWrapper/example_workflows" folder.
2. Drag and drop it into ComfyUI.
3. Provide a relevant positive prompt in the conditioning. Negative conditioning is not required here.
4. Set the configuration in the FramePack node:
- Set the Video length in seconds
- GPU memory preservation (minimum 6GB)
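Since FramePack outputs 30fps video (as noted earlier), the video-length setting maps directly to a frame count; a quick sanity check of what the node will actually have to generate:

```python
FPS = 30  # FramePack generates full 30fps video

def total_frames(seconds: float, fps: int = FPS) -> int:
    """Number of frames the model must predict for a clip of the given length."""
    return int(seconds * fps)

print(total_frames(5))   # a short 5-second clip
print(total_frames(60))  # a full minute
```

Longer settings mean proportionally longer generation times, so start with a few seconds while testing your prompt.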
FramePack represents a significant breakthrough in making high-quality video generation accessible to everyday users. Its innovative approach to memory management and bi-directional sampling solves key challenges that have limited video generation on consumer hardware.
While it's particularly well-suited for certain types of videos and has some limitations with complex scene changes, the ability to generate minutes-long videos in a single pass on a laptop GPU is truly revolutionary. For content creators, researchers, and AI enthusiasts, FramePack opens up new possibilities without requiring enterprise-grade hardware.