Installation Guide
This guide provides step-by-step instructions for installing and running Wan-Move locally.
Important Note
Wan-Move is implemented as a minimal extension on top of the Wan2.1 codebase. If you have tried Wan2.1, you can reuse most of your existing setup with very low migration cost.
Prerequisites
Before installing Wan-Move, ensure you have:
- Python 3.8 or later
- PyTorch 2.4.0 or later
- CUDA-capable GPU (recommended for practical inference times)
- Git for cloning the repository
- Sufficient disk space for model weights (approximately 30GB)
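To confirm these prerequisites before proceeding, a quick check along the following lines can help (a minimal sketch; adjust as needed for your environment):

```python
import sys
import torch

# Verify the interpreter and PyTorch versions against the prerequisites above.
print("Python:", sys.version.split()[0])            # expect 3.8+
print("PyTorch:", torch.__version__)                # expect 2.4.0+
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```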
Step 1: Clone the Repository
Clone the Wan-Move repository from GitHub:
git clone https://github.com/ali-vilab/Wan-Move.git
cd Wan-Move
Step 2: Install Dependencies
Install the required Python packages. Ensure PyTorch 2.4.0 or later is installed:
# Ensure torch >= 2.4.0
pip install -r requirements.txt
Step 3: Download Model Weights
The Wan-Move-14B-480P model is available through both Hugging Face and ModelScope platforms. Choose the download method that works best for your location and setup.
Available Models
| Model | Platform | Notes |
|---|---|---|
| Wan-Move-14B-480P | Hugging Face / ModelScope | 5s 480P video generation |
Option A: Download via Hugging Face CLI
First, install the Hugging Face CLI tool:
pip install "huggingface_hub[cli]"
Then download the model:
huggingface-cli download Ruihang/Wan-Move-14B-480P --local-dir ./Wan-Move-14B-480P
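If you prefer to stay in Python rather than use the CLI, huggingface_hub's snapshot_download fetches the same snapshot (a sketch equivalent to the command above):

```python
from huggingface_hub import snapshot_download

# Download the full Wan-Move-14B-480P weights into ./Wan-Move-14B-480P.
snapshot_download(
    repo_id="Ruihang/Wan-Move-14B-480P",
    local_dir="./Wan-Move-14B-480P",
)
```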
Option B: Download via ModelScope CLI
First, install the ModelScope package:
pip install modelscope
Then download the model:
modelscope download churuihang/Wan-Move-14B-480P --local_dir ./Wan-Move-14B-480P
Step 4: Run the Default Example
The repository includes a sample case in the examples folder. Run this to verify your installation:
python generate.py \
    --task wan-move-i2v \
    --size 480*832 \
    --ckpt_dir ./Wan-Move-14B-480P \
    --image examples/example.jpg \
    --track examples/example_tracks.npy \
    --track_visibility examples/example_visibility.npy \
    --prompt "A laptop is placed on a wooden table. The silver laptop is connected to a small grey external hard drive and transfers data through a white USB-C cable. The video is shot with a downward close-up lens." \
    --save_file example.mp4
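Before launching generation, it can be worth sanity-checking the example inputs against the array shapes documented under Input Data Format below (a minimal sketch):

```python
import numpy as np

# Quick sanity check on the bundled example inputs.
tracks = np.load("examples/example_tracks.npy")   # expected: (num_frames, num_points, 2)
vis = np.load("examples/example_visibility.npy")  # expected: (num_frames, num_points)

assert tracks.ndim == 3 and tracks.shape[2] == 2, "tracks should be (num_frames, num_points, 2)"
assert vis.shape == tracks.shape[:2], "visibility should match tracks in frames and points"
print(f"{tracks.shape[1]} points tracked over {tracks.shape[0]} frames")
```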
Evaluation on MoveBench
MoveBench is a benchmark dataset for evaluating motion-controllable video generation. To use it, first download the dataset:
huggingface-cli download Ruihang/MoveBench --local-dir ./MoveBench --repo-type dataset
Important Notes for MoveBench
- MoveBench provides its own video captions. For a fair evaluation, disable the prompt extension feature inherited from Wan2.1.
- MoveBench supports both English and Chinese. Use the --language flag: en for English, zh for Chinese.
Single-GPU Inference on MoveBench
For single-object motion test:
python generate.py --task wan-move-i2v --size 480*832 --ckpt_dir ./Wan-Move-14B-480P --mode single --language en --save_path results/en --eval_bench
For multi-object motion test:
python generate.py --task wan-move-i2v --size 480*832 --ckpt_dir ./Wan-Move-14B-480P --mode multi --language en --save_path results/en --eval_bench
Multi-GPU Inference on MoveBench
Wan-Move supports FSDP and xDiT USP for accelerated inference. When running multi-GPU batch evaluation, disable the Ulysses strategy by setting --ulysses_size 1, as in the commands below; Ulysses is only supported when generating a single video with multi-GPU inference.
For single-object motion test with 8 GPUs:
torchrun --nproc_per_node=8 generate.py --task wan-move-i2v --size 480*832 --ckpt_dir ./Wan-Move-14B-480P --mode single --language en --save_path results/en --eval_bench --dit_fsdp --t5_fsdp --ulysses_size 1
For multi-object motion test with 8 GPUs:
torchrun --nproc_per_node=8 generate.py --task wan-move-i2v --size 480*832 --ckpt_dir ./Wan-Move-14B-480P --mode multi --language en --save_path results/en --eval_bench --dit_fsdp --t5_fsdp --ulysses_size 1
Running Evaluation
After generating all results, update the results storage path in MoveBench/bench.py, then run:
python MoveBench/bench.py
Additional Options
Visualizing Trajectories
To visualize trajectory motion effects in your videos, add the --vis_track flag:
python generate.py --task wan-move-i2v --size 480*832 --ckpt_dir ./Wan-Move-14B-480P --mode single --language en --save_path results/en --eval_bench --vis_track
A separate visualization script is also available at scripts/visualize.py for different visualization settings.
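For intuition, overlaying tracks on frames yourself looks roughly like the sketch below. This is illustrative only, not the repository's scripts/visualize.py; it assumes OpenCV is installed, and the file names and 16 fps output rate are guesses:

```python
import cv2
import numpy as np

# Overlay tracked points on a generated video. Illustrative sketch only;
# the file names and 16 fps output rate are assumptions.
tracks = np.load("examples/example_tracks.npy")          # (num_frames, num_points, 2)
visibility = np.load("examples/example_visibility.npy")  # (num_frames, num_points)

cap = cv2.VideoCapture("example.mp4")
writer = None
frame_idx = 0
while frame_idx < len(tracks):
    ok, frame = cap.read()
    if not ok:
        break
    if writer is None:
        h, w = frame.shape[:2]
        writer = cv2.VideoWriter("example_vis.mp4",
                                 cv2.VideoWriter_fourcc(*"mp4v"), 16, (w, h))
    for p, (x, y) in enumerate(tracks[frame_idx]):
        if visibility[frame_idx, p]:  # draw only points visible in this frame
            cv2.circle(frame, (int(x), int(y)), 4, (0, 0, 255), -1)
    writer.write(frame)
    frame_idx += 1

cap.release()
if writer is not None:
    writer.release()
```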
Reducing Memory Usage
If you encounter OOM (Out-of-Memory) errors, reduce GPU memory usage by offloading model weights to CPU when they are not in use (--offload_model True) and keeping the T5 text encoder on CPU (--t5_cpu):
python generate.py --task wan-move-i2v --size 480*832 --ckpt_dir ./Wan-Move-14B-480P --offload_model True --t5_cpu
Input Data Format
Wan-Move requires specific input formats for trajectories and visibility masks:
Trajectory Files (.npy)
NumPy arrays containing x,y coordinates for each tracked point across all frames. Shape should be (num_frames, num_points, 2).
Visibility Files (.npy)
NumPy arrays indicating when points are visible or occluded. Shape should be (num_frames, num_points), with 1 for visible and 0 for occluded.
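As a concrete illustration, the snippet below builds a compatible pair of files with NumPy: one point sweeping left to right and briefly occluded, one point staying fixed. The frame and point counts are placeholders; choose them to match your clip:

```python
import numpy as np

num_frames, num_points = 81, 2  # placeholders; match these to your video

# Trajectories: (num_frames, num_points, 2) of (x, y) pixel coordinates.
tracks = np.zeros((num_frames, num_points, 2), dtype=np.float32)
tracks[:, 0, 0] = np.linspace(100, 400, num_frames)  # point 0 sweeps rightward
tracks[:, 0, 1] = 240                                # at a fixed height
tracks[:, 1] = (320, 416)                            # point 1 stays put

# Visibility: (num_frames, num_points), 1 = visible, 0 = occluded.
visibility = np.ones((num_frames, num_points), dtype=np.uint8)
visibility[40:60, 0] = 0  # point 0 is occluded for frames 40-59

np.save("my_tracks.npy", tracks)
np.save("my_visibility.npy", visibility)
```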
Troubleshooting
Problem: CUDA Out of Memory
Solution: Use the --offload_model True and --t5_cpu flags to reduce GPU memory usage.
Problem: PyTorch Version Too Old
Solution: Upgrade PyTorch to version 2.4.0 or later: pip install --upgrade torch
Problem: Model Download Fails
Solution: Try the alternative download method (Hugging Face or ModelScope). Ensure you have a stable internet connection.
Next Steps
After installation, you can:
- Experiment with the provided examples in the examples folder
- Create your own trajectory data using motion tracking tools
- Run evaluation on MoveBench to compare results
- Explore different motion control applications
- Contribute to the project on GitHub
Gradio Demo Coming Soon
The research team has indicated plans to release a Gradio demo interface that will provide an easier way to interact with Wan-Move through a web interface. Stay tuned for updates on the GitHub repository.
Note: This installation guide is based on the official Wan-Move repository. For the most up-to-date information and troubleshooting, refer to the GitHub repository and official documentation.