Video generation has always been a resource-intensive task, often requiring powerful GPUs and significant processing time. But what if you could generate high-quality videos on an average GPU? Enter FramePack, a creative approach that's changing how we think about next-frame prediction models for video generation.
It is an image-to-video diffusion framework developed by researchers Lvmin Zhang and Maneesh Agrawala (the same Lvmin Zhang who also created ControlNet and IC-Light).
Unlike traditional video AI models, which consume excessive memory and often crash on consumer hardware, FramePack is designed to run smoothly on everyday computers: even a laptop with just 6GB of VRAM can generate full 30fps videos. You can find more detailed insights in their research paper.
Installation
1. Get ComfyUI installed if you are a new user. Existing users should update it from the Manager by clicking "Update ComfyUI".
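If you originally installed ComfyUI by cloning it with git, you can also update it from the command line instead of the Manager (a standard git update, nothing FramePack-specific):
cd ComfyUI
git pull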
2. Move into "ComfyUI/custom_nodes" folder. Clone the Kijai's y using the following command by typing into command prompt:
git clone https://github.com/kijai/ComfyUI-FramePackWrapper.git
3. Then install the required dependencies with the following commands in the command prompt.
For normal ComfyUI users, run this from inside the cloned folder:
cd ComfyUI-FramePackWrapper
pip install -r requirements.txt
For ComfyUI portable users: move inside the ComfyUI_windows_portable folder and use this command:
python_embeded\python.exe -m pip install -r ComfyUI\custom_nodes\ComfyUI-FramePackWrapper\requirements.txt
4. Now, download the required base models: text encoders, VAE, and CLIP Vision.
One thing worth mentioning: if you are already using ComfyUI's native support for HunyuanVideo (also described in our tutorial), you do not need to download these models (text encoders, VAE, and CLIP Vision), as the project uses the same ones. You only need to download the FramePack I2V diffusion model.
If you have not used it, download and place these models into their respective folders:
- Download the CLIP Vision file (SigClip) and put it in your ComfyUI/models/clip_vision folder.
- Download the text encoder files and save them into your ComfyUI/models/text_encoders directory.
- Download the VAE model and put it into your ComfyUI/models/vae folder.
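As a quick sanity check, your models folder should look roughly like this after the downloads. The file names below follow ComfyUI's repackaged HunyuanVideo release and are only examples; yours may differ:
ComfyUI/models/clip_vision/sigclip_vision_patch14_384.safetensors
ComfyUI/models/text_encoders/clip_l.safetensors
ComfyUI/models/text_encoders/llava_llama3_fp8_scaled.safetensors
ComfyUI/models/vae/hunyuan_video_vae_bf16.safetensors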
5. Now, download the FramePack diffusion model. Create a folder named "lllyasviel" inside the "ComfyUI/models/diffusers/" folder, move inside it, create a new "FramePackI2V_HY" folder, and save the model inside it.
or
Download the FramePack I2V model in FP8 or FP16 and save it into your "ComfyUI/models/diffusion_models" folder.
Select the one that suits your hardware and use case: FP8 needs less VRAM at the cost of some quality, whereas FP16 uses more VRAM for higher-quality output.
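For example, from inside your ComfyUI root folder you can create the nested diffusers path in one go (Unix-style shell shown; Windows Command Prompt creates nested folders with a plain mkdir):
mkdir -p models/diffusers/lllyasviel/FramePackI2V_HY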
6. Restart ComfyUI and refresh it.
Workflow
1. The example workflow can be found inside your "ComfyUI/custom_nodes/ComfyUI-FramePackWrapper/example_workflows" folder.
2. Drag and drop it into ComfyUI.
3. In ComfyUI, you can configure the workflow:
- Upload your image
- Put a relevant prompt into the prompt box
- Choose the model precision (BF16, FP16, FP32)
- Set the attention mode (SDPA, Flash Attention, or Sage Attention)
- Set the video length in seconds
- Set the GPU memory preservation (minimum 6GB)
FramePack represents a significant breakthrough in making high-quality video generation accessible to everyday users. Its innovative approach to memory management and bi-directional sampling solves key challenges that have limited video generation on consumer hardware.
While it's particularly well-suited for certain types of videos and has some limitations with complex scene changes, the ability to generate minutes-long videos in a single pass on a laptop GPU is truly revolutionary. For content creators, researchers, and AI enthusiasts, FramePack opens up new possibilities without requiring enterprise-grade hardware.