Disclaimer: This post has been translated to English using a machine translation model. Please, let me know if you find any mistakes.
Introduction
Whisper
is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. Using such a large and diverse dataset leads to greater robustness against accents, background noise, and technical language. Additionally, it allows for transcription in multiple languages as well as translation of those languages into English.
Installation
To install this tool, it's best to create a new Anaconda environment.
!conda create -n whisper
We enter the environment
!conda activate whisper
We install all the necessary packages
!conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia
Finally, we install whisper
!pip install git+https://github.com/openai/whisper.git
And we update ffmpeg
!sudo apt update && sudo apt install ffmpeg
Usage
We import whisper
import whisper
We select the model, the larger it is, the better it will perform.
# model = "tiny"# model = "base"# model = "small"# model = "medium"model = "large"model = whisper.load_model(model)
We load the audio from this old ad (from 1987) for Micro Machines
audio_path = "MicroMachines.mp3"audio = whisper.load_audio(audio_path)audio = whisper.pad_or_trim(audio)
mel = whisper.log_mel_spectrogram(audio).to(model.device)
_, probs = model.detect_language(mel)print(f"Detected language: {max(probs, key=probs.get)}")
Detected language: en
options = whisper.DecodingOptions()result = whisper.decode(model, mel, options)
result.text
"This is the Micro Machine Man presenting the most midget miniature motorcade of micro machines. Each one has dramatic details, terrific trim, precision paint jobs, plus incredible micro machine pocket play sets. There's a police station, fire station, restaurant, service station, and more. Perfect pocket portables to take any place. And there are many miniature play sets to play with and each one comes with its own special edition micro machine vehicle and fun fantastic features that miraculously move. Raise the boat lift at the airport, marina, man the gun turret at the army base, clean your car at the car wash, raise the toll bridge. And these play sets fit together to form a micro machine world. Micro machine pocket play sets so tremendously tiny, so perfectly precise, so dazzlingly detailed, you'll want to pocket them all. Micro machines and micro machine pocket play sets sold separately from Galoob. The smaller they are, the better they are."