Most recent video restoration models (like SeedVR1) are accurate but slow: they rely on many diffusion steps, which costs time and GPU power. SeedVR2, released by ByteDance under the Apache 2.0 license, fixes this problem.
Some image models have learned to do restoration in a single step (via distillation), but doing the same for video is much harder: you lose temporal smoothness between frames (visible flicker), and quality drops for high-resolution videos.
*SeedVR2 working architecture (Ref: official SeedVR2 page)*
Most existing models use fixed-size attention windows, which break down on high-resolution videos because the windows do not align well with the frame. SeedVR2 (also referred to as AnonymousVR) improves on this: it is a fast, one-step video restoration model built on diffusion transformers and adversarial training (GAN-style).
It restores high-resolution (even 1080p) real-world videos in a single step, with good quality and temporal smoothness, and without requiring huge computational power. You can find more details in their research paper.
Installation
1. Install ComfyUI if you are new to it. If you are on an older version, update it from the Manager tab (by selecting Update All).
2. From the Manager section, open the Custom Nodes Manager, then search for and install these nodes (a manual git-clone alternative is sketched after this installation section):
(a) ComfyUI-SeedVR2_VideoUpscaler by Numz
(b) ComfyUI-VideoHelperSuite by Kosinkadink
(c) CoCoTools_IO by coco
You do not need to download any models manually. All the models and related files are auto-downloaded from the official Hugging Face repository the first time you run the workflow. You can track the download progress in real time in the ComfyUI terminal.
3. Restart and Refresh ComfyUI.
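If you prefer not to use the Manager, custom nodes can also be installed by cloning each repository into ComfyUI's custom_nodes folder and restarting. Below is a minimal Python sketch of that manual route; the install path and the repository URLs (inferred from the node and author names above) are assumptions you should verify before running it.

```python
import subprocess
from pathlib import Path

# Manual alternative to the Manager install, assuming a default ComfyUI layout.
# The repository URLs below are inferred from the node/author names listed above;
# double-check them against the Manager entries before cloning.
custom_nodes = Path("ComfyUI/custom_nodes")  # adjust to your ComfyUI install path

repos = [
    "https://github.com/numz/ComfyUI-SeedVR2_VideoUpscaler",
    "https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite",
    # The CoCo Tools IO node is installed the same way from its own repository.
]

for url in repos:
    subprocess.run(["git", "clone", url], cwd=custom_nodes, check=True)
```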
Workflow
1. The workflow is simple. Search for and add the Load Video node, the SeedVR2 Video Upscaler node, and the Video Combine node.
2. Connect them as shown above (Load Video → SeedVR2 Video Upscaler → Video Combine).
3. Load your low-quality video into the Load Video node.
4. Click the Run button to start the workflow.
Configure the upscaling settings in the SeedVR2 Video Upscaler node.
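If you want to run the same graph headlessly (for example, to batch-process many clips), ComfyUI can queue a workflow exported with "Save (API Format)" through its local HTTP endpoint. A minimal sketch, assuming ComfyUI is running on its default port and that seedvr2_upscale_api.json is your exported Load Video → SeedVR2 → Video Combine graph (the filename is just an example):

```python
import json
import urllib.request

# Load a workflow exported from ComfyUI via "Save (API Format)".
# The filename here is an example; use whatever you exported.
with open("seedvr2_upscale_api.json", "r") as f:
    workflow = json.load(f)

# POST it to the local ComfyUI server (default address and port shown below).
payload = json.dumps({"prompt": workflow}).encode("utf-8")
req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode())  # response includes the queued prompt_id
```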
Tips to Consider
Temporal Consistency (for smooth video)
If you want smooth results across frames, you need to process at least 5 frames at a time (batch_size = 5). More frames (a larger batch_size) give better temporal consistency but require a lot of VRAM (more than 24 GB).
Model- For speed on low VRAM, use the SeedVR2 3B FP8 variant; for the highest quality output, use the SeedVR2 7B FP16 variant.
Seed- Usually left at its default. If you observe artifacts, you can experiment with different seed values.
New_resolution- Sets the target size for the shortest edge of your video; the other edge is scaled to preserve the aspect ratio.
CFG Scale- 1 (default)
Batch_size- The number of frames processed at a time. Fewer frames can introduce flicker; more frames improve consistency but slow inference and use more VRAM. It should normally be a multiple of 4 plus 1 (1, 5, 9, 13, ...); see the sketch after this list.
Preserve_vram- If you get an out-of-memory (OOM) error, enable preserve_vram; it offloads work to CPU RAM via the block-swap mechanism. For best results, set preserve_vram to True and enable the BlockSwap option together.
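To make the "multiple of 4 plus 1" rule concrete, here is a small illustrative helper (not part of the node itself) that rounds a requested batch size to the expected pattern and shows how many batches a clip would need:

```python
def valid_batch_size(requested: int) -> int:
    """Round a requested batch size down to the 4n + 1 pattern (1, 5, 9, 13, ...)."""
    if requested <= 1:
        return 1
    return ((requested - 1) // 4) * 4 + 1

# Example: a 97-frame clip with a requested batch size of 6
frames = 97
batch_size = valid_batch_size(6)          # -> 5
num_batches = -(-frames // batch_size)    # ceiling division -> 20 batches
print(batch_size, num_batches)
```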
VRAM Usage Tips
Higher input video resolution needs more VRAM. If you get an OOM (out-of-memory) error, first try lowering the input resolution of the video. Also consider lowering the output resolution or reducing batch_size (for example, set it to 1).
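Because new_resolution targets the shortest edge (see above), the output dimensions scale with the aspect ratio. The illustrative helper below shows that relationship; the node's exact rounding (for example, to multiples of 8 or 16) may differ.

```python
def target_dims(width: int, height: int, new_resolution: int) -> tuple[int, int]:
    """Scale a frame so its shorter edge equals new_resolution, keeping aspect ratio."""
    scale = new_resolution / min(width, height)
    return round(width * scale), round(height * scale)

# A 1280x720 input with new_resolution = 1080 becomes roughly 1920x1080.
print(target_dims(1280, 720, 1080))  # (1920, 1080)
```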
BlockSwap Configuration (for low VRAM GPUs)
BlockSwap helps you run big models on low-VRAM GPUs by moving parts of the model to the CPU during processing. To use it, add the SeedVR2 BlockSwap Config node and connect its output to the block_swap_config input of the SeedVR2 node.
BlockSwap Settings
(a) blocks_to_swap- How many transformer blocks to move to the CPU. 0 = fastest but needs the most VRAM, 16 = balanced, 32 = maximum savings (very slow).
(b) offload_io_components- Saves even more VRAM by also offloading extra parts such as the input/output layers.
(c) enable_debug- Shows detailed memory usage; useful when you are testing settings.
(d) cache_model- Keeps the model cached in RAM between runs, so runs after the first are faster. The first run still has to load the model and takes longer; enabling this avoids reloading it on every subsequent run.
(e) use_non_blocking- Keep it True for better performance.
Start with blocks_to_swap = 16. If you still get memory errors, increase the number (up to 32 or 36). If you have extra VRAM left, lower the number for faster performance.
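Conceptually, block swapping keeps the first blocks_to_swap transformer blocks in CPU RAM and moves each one onto the GPU only while it runs. The PyTorch sketch below illustrates the idea only; it is not the node's actual implementation, and the class and parameter names are made up.

```python
import torch.nn as nn

class BlockSwapSketch(nn.Module):
    """Illustration of block swapping: the first `blocks_to_swap` blocks live on
    the CPU and are moved to the GPU only for their own forward pass."""

    def __init__(self, blocks: nn.ModuleList, blocks_to_swap: int, device: str = "cuda"):
        super().__init__()
        self.blocks = blocks
        self.blocks_to_swap = blocks_to_swap
        self.device = device
        for i, block in enumerate(self.blocks):
            block.to("cpu" if i < blocks_to_swap else device)

    def forward(self, x):
        for i, block in enumerate(self.blocks):
            if i < self.blocks_to_swap:
                block.to(self.device, non_blocking=True)  # swap block onto the GPU
                x = block(x)
                block.to("cpu", non_blocking=True)        # swap it back out to free VRAM
            else:
                x = block(x)                              # resident blocks stay on the GPU
        return x
```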
Problems to Consider
1. Many users have reported failure cases with this workflow; further updates are needed.
2. Training a large video model is more challenging than training image models.
3. Flickering in videos with motion when using the 3B model.
5. The 7B model reduces flickering but does not fully eliminate it.
6. Over-sharpening occurs, especially in videos with resolution below 480p.
6. The current model relies on heavy computation. Inference time is unsatisfactory for personal users.