Wan2.1-T2V-14B: Video Generation with HF

Disclaimer: This post has been translated into English using a machine translation model. Please let me know if you find any mistakes.

The largest hub of AI models is clearly Hugging Face, and they now offer the possibility of running inference on some of their models through serverless GPU providers.

One of those models is Wan-AI/Wan2.1-T2V-14B which, as of writing this post, is the best open-source video generation model, as can be seen in the Artificial Analysis Video Generation Arena Leaderboard.

video generation arena leaderboard

If we look at its model card, we can see on the right a button that says Replicate.

Wan2.1-T2V-14B model card

Inference providers

If we go to the Inference providers settings page, we will see something like this

Inference Providers

There we can press the button with the key to enter the API KEY of the provider we want to use, or leave the option with the two dots selected. If we choose the first option, the provider charges us for the inference, while with the second option, Hugging Face charges us for the inference. So choose whichever suits you best.

Inference with Replicate

In my case, I obtained an API KEY from Replicate and added it to a file called .env, which is where I will store the API KEYS. You should not upload this file to GitHub, GitLab, or your project repository.

The .env file must have this format:

HUGGINGFACE_TOKEN_INFERENCE_PROVIDERS="hf_aL...AY"
REPLICATE_API_KEY="r8_Sh...UD"

Where HUGGINGFACE_TOKEN_INFERENCE_PROVIDERS is a token you need to obtain from Hugging Face, and REPLICATE_API_KEY is the API KEY you can obtain from Replicate.
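Libraries like python-dotenv can load this file for us, but the format is simple enough to parse with the standard library alone. Here is a minimal sketch (the load_env helper and the file contents are illustrative, not part of the original post; it is not a full .env parser):

```python
import os
import tempfile

def load_env(path):
    """Parse simple KEY="value" lines from a .env-style file into os.environ."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blanks and comments
            key, _, value = line.partition("=")
            # Strip whitespace and optional surrounding double quotes
            os.environ[key.strip()] = value.strip().strip('"')

# Hypothetical .env contents written to a temporary file for the example
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as tmp:
    tmp.write('REPLICATE_API_KEY="r8_example"\n')
    env_path = tmp.name

load_env(env_path)
print(os.environ["REPLICATE_API_KEY"])  # r8_example
os.unlink(env_path)
```

In practice python-dotenv handles edge cases (multiline values, export prefixes, escaping) that this sketch does not, which is why the post uses it below.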

Reading the API Keys

The first thing we need to do is read the API KEYS from the .env file:

import os
import dotenv

dotenv.load_dotenv()

REPLICATE_API_KEY = os.getenv("REPLICATE_API_KEY")
HUGGINGFACE_TOKEN_INFERENCE_PROVIDERS = os.getenv("HUGGINGFACE_TOKEN_INFERENCE_PROVIDERS")

Logging in to the Hugging Face Hub

To be able to use the Wan-AI/Wan2.1-T2V-14B model, as it is on the Hugging Face Hub, we need to log in.

from huggingface_hub import login

login(HUGGINGFACE_TOKEN_INFERENCE_PROVIDERS)

Inference Client

Now we create an inference client. We have to specify the provider and the API KEY, and in this case we also set a timeout of 1000 seconds, because the default is 60 seconds and the model takes quite a while to generate the video.

from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="replicate",
    api_key=REPLICATE_API_KEY,
    timeout=1000,
)

Video Generation

We now have everything we need to generate our video. We use the client's text_to_video method, pass it the prompt, and tell it which model from the Hub we want to use; otherwise, it will use the default one.

video = client.text_to_video(
    "Funky dancer, dancing in a rehearsal room. She wears long hair that moves to the rhythm of her dance.",
    model="Wan-AI/Wan2.1-T2V-14B",
)

Saving the video

Finally, we save the video, which is returned as bytes, to a file on our disk.

output_path = "output_video.mp4"
with open(output_path, "wb") as f:
    f.write(video)
print(f"Video saved to: {output_path}")

Output:

Video saved to: output_video.mp4
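Since the client returns the video as raw bytes, saving it is just a binary write. The same pattern can be wrapped in a small helper with a size sanity check; this is an illustrative sketch (the save_video helper and the dummy bytes are mine, standing in for the real model output):

```python
from pathlib import Path

def save_video(video: bytes, output_path: str) -> int:
    """Write raw video bytes to disk and return the number of bytes written."""
    path = Path(output_path)
    path.write_bytes(video)
    return path.stat().st_size

# Dummy bytes standing in for the real model output
fake_video = b"\x00\x00\x00\x18ftypmp42"  # typical start of an MP4 file
size = save_video(fake_video, "output_video.mp4")
print(f"Video saved to: output_video.mp4 ({size} bytes)")
```

Checking the written size against len(video) is a cheap way to catch a truncated or empty response before trying to play the file.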

Generated video

This is the video generated by the model.
