Wan 2.2 Animate: Consistent Video to Video Pose Transfer


Using Wan 2.2 Animate, you can easily transfer facial expressions and body gestures to your output video. We already covered an alternative way to transfer pose in our Wan 2.2 VACE Fun Control tutorial, but that approach needs plenty of VRAM for its high-noise and low-noise models. Here, we will see how to do pose transfer with the Wan 2.2 Animate model.


Installation


1. Make sure you have updated your ComfyUI to the latest version. If not, update it from the Manager by clicking Update All.

2. Download the Wan 2.2 Animate model (FP16/FP8/GGUF), the relight LoRA, the LightX2V I2V LoRA, the text encoders, the CLIP model, and the VAE. Choose the variants that suit your system resources. All the details can be found in our Wan 2.2 installation tutorial.

3. Then, download the Wan 2.2 Animate workflow from our Hugging Face repository and load it into ComfyUI. Alternatively, you can get the workflow in ComfyUI from the Workflow > Templates section.

Keep in mind that we are showcasing the native Wan 2.2 Animate workflow officially released by ComfyUI, not Kijai's Wan 2.2 Animate workflow.


Running the workflow


1. Load the image and reference video to transfer the pose.

2. Load the models (Wan 2.2 Animate, LoRAs, text encoders, VAE, CLIP, etc.) into their respective nodes.

3. Add your prompts into the prompt boxes.

4. Place red and green points to help the model separate the character from the background.

5. Hit run to execute the workflow.



Workflow Node Explanation

1. After downloading, drag and drop the workflow into ComfyUI, then restart and refresh ComfyUI.

If you get a missing-nodes error message, install the missing nodes from the Manager by selecting the Install Missing Custom Nodes option. Then select all of them in the list to install them at once, and restart and refresh ComfyUI for the changes to take effect.

missing models error


If you get a missing-models error message, you can download the models by clicking the links to their respective repositories.

2. Now, after opening the workflow, follow the step-by-step process:

Video Resolution


(a) Video Resolution- Set the video resolution as width and height. Higher values give better, more consistent output but also require more VRAM and inference time, so choose wisely. You can start with the default value of 640 pixels (use a multiple of 16). If you get an OOM (out of memory) error, decrease the value; users with more VRAM can increase it.

For example, say we use 1120 pixels (16x70) and get an OOM error. We then decrease the value and try 960 pixels (16x60), and so on.
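To take the guesswork out of this, you can precompute the candidates. A minimal Python sketch (the 16-pixel grid and the 640 default come from this tutorial; the helper names are our own):

    # Snap a target size down to the nearest multiple of 16, then
    # step down through smaller multiples after each OOM error.
    def snap16(value: int) -> int:
        return (value // 16) * 16

    def backoff_candidates(start: int, floor: int = 640):
        size = snap16(start)
        while size >= floor:
            yield size
            size -= 16

    # Example: starting at 1120 (16x70) and stepping down toward 640.
    print(list(backoff_candidates(1120))[:3])  # [1120, 1104, 1088]

In practice you can step down in bigger jumps (1120 to 960, as in the example above) instead of trying every multiple.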

Load Models


(b) Load Models- Select the downloaded models (Wan 2.2 Animate, LoRAs, text encoders, VAE, CLIP, etc.) one by one. By default, all the respective models come preselected for you. If you get a missing-models error message, follow the installation section explained above.

Add your detailed positive and negative prompts


(c) Prompting- Add your detailed positive and negative prompts.

(d) Input image- Upload your reference image into the Load Image node.

Load your image/video


(e) Input Video- Load your original video as the reference whose motion will be replicated. Use a maximum of 5 seconds for faster generation. The node can also auto-manage the audio if you upload a video with audio embedded, so you do not need to take care of this: the same audio will be embedded in your generated video.

(f) Upscale image- This node lets you set the image resolution, which keeps oversized inputs manageable. If you do not set it, it will scale your image automatically.

(g) DW Pose Estimator- It uses the DW Pose model to estimate and track the body and hand movement and facial expressions in the driving video. You can leave these settings at their defaults.

(h) Character Mask & Background Video Preprocessing- Disable the Sampling+ video output group (right-click and select the Set Group to Never option) so that the preprocessing can run before video generation, which saves time. Once you have confirmed the preprocessing, you can re-enable the group for video generation.

Enable the Mask Preview and Preview Image nodes by selecting them and choosing the unbypass option. This gives you a clear, frame-by-frame picture of what is happening in the video generation and how the model is detecting the character's movement.

Character Mask & Background Video Preprocessing


Now, the Points Editor node helps you mask the character and do the video preprocessing. Use it to control the character selection. You work with two colored point types: red (negative) points to select the background and green (positive) points to select the character. Use Shift + left click to add green points and Shift + right click to add red points. Precise selection isn't required.

Use multiple points (5-10 of each color), as this helps the model clearly separate the character's movement from the background. To return to the default state, click the New Canvas option and all the points will be removed. Alternatively, you can right-click the Points Editor node and use the Fix node (recreate) option if you want to reset it.

At last, enable the Sampling+ video output group (right-click and select the Set Group to Always option) for video generation.

(i) Wan Animate To Video- Inside the Sampling+ video output group, you will get the WanAnimateToVideo node. Its the main part that takes prompts, vae, clip, input image with videos etc. Set the background video, character mask option. USe the length value to set video frames length (default-77). If your inputted video have less frames than 77, the rest will be the still image.
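As a rule of thumb, Wan-family models run at 16 fps and use frame counts of the form 4n + 1 (the default 77 is 4x19 + 1, about 4.8 seconds). A small Python sketch, assuming those two conventions, to pick a length that covers a clip:

    # Pick a length (in frames) covering a clip of the given duration,
    # assuming 16 fps and the Wan-style "4n + 1" frame convention.
    FPS = 16

    def length_for_seconds(seconds: float) -> int:
        frames = int(seconds * FPS)
        n = (frames + 2) // 4  # smallest n with 4n + 1 >= frames
        return 4 * n + 1

    print(length_for_seconds(4.8))  # 77, roughly the default
    print(length_for_seconds(5.0))  # 81, a full 5-second clip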

Enable/disable nodes for mix/pose mode


There are two modes:
- Mix mode (character replace)- replaces the character in the input video with the character from the input image while keeping the same background.
- Pose mode (pose transfer)- animates the character together with the background from the input image.

Connect or disconnect the background video and character mask inputs to switch between the two modes: if they are connected, you are in Mix mode; if they are disconnected, you are in Pose mode.
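In ComfyUI's API-format JSON, a connected input appears as a [node_id, output_index] pair and a disconnected input is simply absent. A hedged sketch of the two modes, shown as Python dicts (the node IDs, output indexes, and the exact snake_case input names background_video and character_mask are illustrative; check your exported workflow for the real ones):

    # Mix mode: both inputs connected, so the character from the image
    # is placed into the original video's background.
    mix_mode = {
        "55": {
            "class_type": "WanAnimateToVideo",
            "inputs": {
                "background_video": ["40", 0],  # frames from the input video
                "character_mask": ["41", 0],    # mask from the Points Editor
                # ... prompts, vae, clip, image, length, etc.
            },
        },
    }

    # Pose mode: the same node with those two inputs left out, so only
    # the pose transfers and the image keeps its own background.
    pose_mode = {
        "55": {
            "class_type": "WanAnimateToVideo",
            "inputs": {
                # no background_video / character_mask entries here
                # ... prompts, vae, clip, image, length, etc.
            },
        },
    }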

(j) KSampler settings- Configure your KSampler settings:
Steps- 6 (default); use 1-2 for faster generation/testing at lower quality, or 7-10 for better video at slower inference
CFG- 1.0
Sampler- Euler
Scheduler- Simple
Denoise- 1.0
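In API-format JSON, these map onto the standard KSampler inputs. A sketch with the values above, shown as a Python dict (the node IDs and output indexes are placeholders; the latent is assumed to come from the WanAnimateToVideo node):

    ksampler = {
        "3": {
            "class_type": "KSampler",
            "inputs": {
                "model": ["1", 0],          # Wan 2.2 Animate model + LoRAs
                "positive": ["6", 0],       # encoded positive prompt
                "negative": ["7", 0],       # encoded negative prompt
                "latent_image": ["55", 0],  # latent from WanAnimateToVideo
                "seed": 0,
                "steps": 6,                 # 1-2 for tests, 7-10 for quality
                "cfg": 1.0,
                "sampler_name": "euler",
                "scheduler": "simple",
                "denoise": 1.0,
            },
        },
    }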


Video extend example + Video Output


(k) Video extend example + Video Output- This group section helps you extend the video generation beyond 4 seconds. Enable/disable it by right-clicking on the group and selecting the Set Group to Never/Always option.

Make sure your input video is longer than 8 seconds to get a clean extension; otherwise the remaining frames will be static.

Tip- If you want to push toward longer, effectively unlimited generation, you just need a few changes. Add a new WanAnimateToVideo node (copy the existing one), then connect the video frame offset output of the existing WanAnimateToVideo node (in the Sampling+ video output group) to the video frame offset input of the new WanAnimateToVideo node. Then connect the VAE Decode output to the continue motion input of the new WanAnimateToVideo node.
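A hedged sketch of that wiring in API-format JSON, shown as a Python dict (the node IDs and output indexes are placeholders, and video_frame_offset / continue_motion are our snake_case reading of the node's labels; verify them against your own workflow):

    extended = {
        # First pass: the original WanAnimateToVideo node (inputs omitted).
        "55": {"class_type": "WanAnimateToVideo", "inputs": {
            # ... prompts, vae, clip, image, video, length, etc.
        }},
        # Decode the first pass so its frames can seed the second one.
        "8": {"class_type": "VAEDecode", "inputs": {
            "samples": ["3", 0],  # latent from the first KSampler
            "vae": ["2", 0],
        }},
        # Second pass: a copy that continues where the first one stopped.
        "60": {"class_type": "WanAnimateToVideo", "inputs": {
            "video_frame_offset": ["55", 1],  # offset output of node 55
            "continue_motion": ["8", 0],      # decoded first-pass frames
            # ... same prompts, models, and video as the first node
        }},
    }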