Whisper

Whisper Whisper

Disclaimer: This post has been translated to English using a machine translation model. Please, let me know if you find any mistakes.

Introductionlink image 5

Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. Using such a large and diverse dataset leads to greater robustness against accents, background noise, and technical language. Additionally, it allows for transcription in multiple languages as well as translation of those languages into English.

Website

Paper

GitHub

Model card

Installationlink image 6

To install this tool, it's best to create a new Anaconda environment.

	
!conda create -n whisper
Copy

We enter the environment

	
!conda activate whisper
Copy

We install all the necessary packages

	
!conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia
Copy

Finally, we install whisper

	
!pip install git+https://github.com/openai/whisper.git
Copy

And we update ffmpeg

	
!sudo apt update && sudo apt install ffmpeg
Copy

Usagelink image 7

We import whisper

	
import whisper
Copy

We select the model, the larger it is, the better it will perform.

	
# model = "tiny"
# model = "base"
# model = "small"
# model = "medium"
model = "large"
model = whisper.load_model(model)
Copy

We load the audio from this old ad (from 1987) for Micro Machines

	
audio_path = "MicroMachines.mp3"
audio = whisper.load_audio(audio_path)
audio = whisper.pad_or_trim(audio)
Copy
	
mel = whisper.log_mel_spectrogram(audio).to(model.device)
Copy
	
_, probs = model.detect_language(mel)
print(f"Detected language: {max(probs, key=probs.get)}")
Copy
	
Detected language: en
	
options = whisper.DecodingOptions()
result = whisper.decode(model, mel, options)
Copy
	
result.text
Copy
	
"This is the Micro Machine Man presenting the most midget miniature motorcade of micro machines. Each one has dramatic details, terrific trim, precision paint jobs, plus incredible micro machine pocket play sets. There's a police station, fire station, restaurant, service station, and more. Perfect pocket portables to take any place. And there are many miniature play sets to play with and each one comes with its own special edition micro machine vehicle and fun fantastic features that miraculously move. Raise the boat lift at the airport, marina, man the gun turret at the army base, clean your car at the car wash, raise the toll bridge. And these play sets fit together to form a micro machine world. Micro machine pocket play sets so tremendously tiny, so perfectly precise, so dazzlingly detailed, you'll want to pocket them all. Micro machines and micro machine pocket play sets sold separately from Galoob. The smaller they are, the better they are."

Continue reading

Stream Information in MCP: Complete Guide to Real-time Progress Updates with FastMCP

Stream Information in MCP: Complete Guide to Real-time Progress Updates with FastMCP

Learn how to implement real-time streaming in MCP (Model Context Protocol) applications using FastMCP. This comprehensive guide shows you how to create MCP servers and clients that support progress updates and streaming information for long-running tasks. You'll build streaming-enabled tools that provide real-time feedback during data processing, file uploads, monitoring tasks, and other time-intensive operations. Discover how to use StreamableHttpTransport, implement progress handlers with Context, and create visual progress bars that enhance user experience when working with MCP applications that require continuous feedback.

Last posts -->

Have you seen these projects?

Horeca chatbot

Horeca chatbot Horeca chatbot
Python
LangChain
PostgreSQL
PGVector
React
Kubernetes
Docker
GitHub Actions

Chatbot conversational for cooks of hotels and restaurants. A cook, kitchen manager or room service of a hotel or restaurant can talk to the chatbot to get information about recipes and menus. But it also implements agents, with which it can edit or create new recipes or menus

Subtify

Subtify Subtify
Python
Whisper
Spaces

Subtitle generator for videos in the language you want. Also, it puts a different color subtitle to each person

View all projects -->

Do you want to apply AI in your project? Contact me!

Do you want to improve with these tips?

Last tips -->

Use this locally

Hugging Face spaces allow us to run models with very simple demos, but what if the demo breaks? Or if the user deletes it? That's why I've created docker containers with some interesting spaces, to be able to use them locally, whatever happens. In fact, if you click on any project view button, it may take you to a space that doesn't work.

Flow edit

Flow edit Flow edit

FLUX.1-RealismLora

FLUX.1-RealismLora FLUX.1-RealismLora
View all containers -->

Do you want to apply AI in your project? Contact me!

Do you want to train your model with these datasets?

short-jokes-dataset

Dataset with jokes in English

opus100

Dataset with translations from English to Spanish

netflix_titles

Dataset with Netflix movies and series

View more datasets -->