Despite major progress in generative video models, several issues still hold them back: long videos often fall apart, with color drift, flickering, and quality loss over time; different tasks require different models (one for text-to-video, another for image-to-video, another for continuation); and high-resolution video generation is slow, making real workflows inefficient. LongCat-Video is a single, unified model that handles all major video-generation tasks while producing longer, higher-quality videos, faster. This removes one of the biggest limitations in current video-generation systems.
| LongCat Working Principle |
LongCat-Video stands on several strong research contributions:
- Unified video-generation architecture: one model natively supports text-to-video, image-to-video, and video continuation.
- Pretraining on video continuation: this is what makes long, stable video generation possible.
- Coarse-to-fine generation strategy: frames are refined across both time and space, making inference far more efficient.
- Block sparse attention: a major performance boost when generating high-resolution videos.
- Multi-reward RLHF (GRPO): multiple reward signals improve coherence, realism, and alignment across tasks.
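To make the block sparse attention idea concrete, here is a minimal toy sketch: each query block attends only to the few key blocks estimated to be most relevant, instead of the full sequence. This is an illustration of the general technique only, not LongCat's actual implementation; the block size, the pooling-based block selection, and the `keep` parameter are all assumptions made for the sketch.

```python
import numpy as np

def block_sparse_attention(q, k, v, block, keep):
    """Toy block-sparse attention: each query block attends only to the
    `keep` key blocks with the highest mean-pooled similarity.
    q, k, v have shape (seq_len, dim); seq_len must be divisible by block."""
    n, d = q.shape
    nb = n // block
    # Mean-pool each block to cheaply estimate block-to-block relevance.
    qb = q.reshape(nb, block, d).mean(axis=1)          # (nb, d)
    kb = k.reshape(nb, block, d).mean(axis=1)          # (nb, d)
    scores = qb @ kb.T                                 # (nb, nb)
    out = np.zeros_like(q)
    for i in range(nb):
        # Key blocks this query block is allowed to attend to.
        top = np.argsort(scores[i])[-keep:]
        cols = np.concatenate([np.arange(j * block, (j + 1) * block) for j in top])
        qi = q[i * block:(i + 1) * block]              # (block, d)
        att = qi @ k[cols].T / np.sqrt(d)              # (block, keep*block)
        att = np.exp(att - att.max(axis=1, keepdims=True))
        att /= att.sum(axis=1, keepdims=True)          # softmax over kept keys
        out[i * block:(i + 1) * block] = att @ v[cols]
    return out

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((64, 16)) for _ in range(3))
out = block_sparse_attention(q, k, v, block=8, keep=3)
print(out.shape)  # (64, 16)
```

Because each query block touches only `keep` of the `nb` key blocks, the attention cost drops roughly by a factor of `keep / nb`, which is why this family of techniques pays off at high resolutions where the token count is large.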
| LongCat Model Showcase |
Installation
1. Set up ComfyUI if you haven't already. If it is already installed, update it from the Manager by selecting Update All.
2. Make sure you have Kijai's custom node ComfyUI-WanVideoWrapper installed. If you already have it, update the custom nodes from the Manager.
3. Download a LongCat model from Kijai's Hugging Face repository:
(a) LongCat fp8 (LongCat_TI2V_comfy_fp8_e4m3fn_scaled_KJ.safetensors), for 12-16 GB VRAM.
(b) LongCat bf16 (LongCat_TI2V_comfy_bf16.safetensors), for higher VRAM (24 GB or more), with better output.
Download whichever suits your system resources and save it inside the ComfyUI/models/diffusion_models folder.
4. Download a LoRA model (either of them):
(a) LongCat refinement LoRA (LongCat_refinement_lora_rank128_bf16.safetensors); save it inside the ComfyUI/models/loras folder. This needs only 12 steps.
(b) LongCat distill LoRA (LongCat_distill_lora_alpha64_bf16.safetensors); save it inside the ComfyUI/models/loras folder.
5. Restart ComfyUI and refresh the interface.
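The file placement from the steps above can be sketched as a small helper that maps each downloaded file to its ComfyUI subfolder. The filenames and subfolders come from the steps above; `COMFYUI_ROOT` is a placeholder you would point at your own installation.

```python
import os

# Placeholder: adjust to your actual ComfyUI installation path.
COMFYUI_ROOT = "ComfyUI"

# Subfolder each downloaded file belongs in (filenames from the steps above).
DESTINATIONS = {
    "LongCat_TI2V_comfy_fp8_e4m3fn_scaled_KJ.safetensors": os.path.join("models", "diffusion_models"),
    "LongCat_TI2V_comfy_bf16.safetensors": os.path.join("models", "diffusion_models"),
    "LongCat_refinement_lora_rank128_bf16.safetensors": os.path.join("models", "loras"),
    "LongCat_distill_lora_alpha64_bf16.safetensors": os.path.join("models", "loras"),
}

def destination(filename: str) -> str:
    """Return the full path a downloaded file should be saved to."""
    return os.path.join(COMFYUI_ROOT, DESTINATIONS[filename], filename)

for name in DESTINATIONS:
    print(destination(name))
```

If a model doesn't appear in a loader node's dropdown after you copy it in, the usual causes are a typo in the folder name or a missed refresh.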
Workflow
1. Find the workflow file (LongCat_TI2V_example_01.json) inside your ComfyUI/custom_nodes/ComfyUI-WanVideoWrapper/example_workflows folder.
2. Drag and drop it into ComfyUI. If any nodes show up red as missing, install them from the Manager. The workflow is based on the Wan 2.2 5B TI2V framework, so all the basic models (Wan 5B TI2V model, Wan 2.1 VAE, umt5-xxl, etc.) stay the same. Load each of them into its corresponding node.
3. Set up and execute the workflow:
(a) Load your image into the Load Image node.
(b) Load the LoRA model into the WanVideo Lora Select node.
(c) Load the Wan 2.1 VAE into the WanVideo VAE Loader node.
(d) Load the LongCat model into the WanVideo Model Loader node. Select your attention implementation (flash_attn, sdpa, sage_attn, etc.).
(e) Add your detailed positive and negative prompts into the prompt boxes.
(f) Hit Run to execute the workflow. If you have low VRAM, you can use the WanVideo BlockSwap node to control generation with block-swapping techniques.
Use these settings:
CFG: 1
Shift: 12
Scheduler (inside the WanVideo Scheduler node): LongCat distill Euler
4. (a) Use the infinite workflow to generate longer videos. You can extend it further for even longer videos, but inference time increases accordingly.
(b) There are five groups of the same bunch of nodes in the workflow. To do the extension, move to the last group, then detach the ImageBatchExtendWithOverlap and GetImageSizeAndCount nodes.
(c) Select and copy any one of the groups (they are all the same) and paste it there. Your new extension group is ready.
(d) Add a new Reroute node; connect the ImageBatchExtendWithOverlap node of the older group to the Reroute node, and then on to the ImageBatchExtendWithOverlap node of the new group.
(e) Finally, connect the ImageBatchExtendWithOverlap node of the new last group to the GetImageSizeAndCount node. Repeat the same steps if you want to generate even longer videos.









