This project provides a Dockerized environment for transcribing audio files into subtitle files (.srt format) using OpenAI Whisper. The container bundles all necessary dependencies, so Whisper itself does not need to be installed on the host.
- Transcribes audio files (.mp3, .wav, etc.) into .srt subtitle files.
- Uses OpenAI's Whisper with the base model for transcription.
- Automatically saves subtitles in a dedicated subtitle folder.
Before using this project, ensure Docker is installed on your system. Then set up the project:
- Clone the Repository (if applicable):
git clone [email protected]:thomaskanzig/whisper-transcriber.git
cd whisper-transcriber-docker
- Build the Docker Image:
docker build -t whisper-transcriber .
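Since the setup steps rely on git and docker being on the PATH, a quick check before cloning and building may save a confusing error later. The sketch below is only an illustration: `require_cmd` and `check_prerequisites` are hypothetical helper names, not part of this project.

```shell
#!/bin/sh
# Report whether a single command is available on the PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Check the two tools the setup steps use: git (clone) and docker (build).
# Returns non-zero if at least one of them is missing.
check_prerequisites() {
    status=0
    for tool in git docker; do
        require_cmd "$tool" || status=1
    done
    return "$status"
}
```

Calling `check_prerequisites` prints one line per tool and returns a non-zero status if either is missing.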
To transcribe an audio file and generate an .srt subtitle file:
- Place your audio file in the audio directory within the project folder.
- Run the following command:
docker run -v $(pwd):/app whisper-transcriber audio/<PATH-AUDIO-FILE>
- docker run: Runs the Docker container.
- -v $(pwd):/app: Mounts the current directory into the container’s /app directory.
- whisper-transcriber: Name of the Docker image.
- audio/<PATH-AUDIO-FILE>: Path to the audio file you want to transcribe.
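The parts listed above can be wrapped in a small shell function so the full command does not have to be retyped. This is a sketch under assumptions: the `transcribe` function and the `DOCKER` override variable are hypothetical, and `--rm` (a standard `docker run` flag that removes the container once it exits) is an addition the original command does not use.

```shell
#!/bin/sh
# Hypothetical override: set DOCKER=echo to print the command instead of running it.
DOCKER=${DOCKER:-docker}

# Run the transcriber for exactly one audio file.
transcribe() {
    if [ "$#" -ne 1 ]; then
        echo "usage: transcribe audio/<file>" >&2
        return 2
    fi
    # Quoting "$(pwd)" keeps the mount working when the project path contains spaces.
    "$DOCKER" run --rm -v "$(pwd)":/app whisper-transcriber "$1"
}
```

With the function loaded in the current shell, `transcribe audio/example.mp3` runs the same image with the same mount as the plain command.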
If you have an audio file named example.mp3 in your audio/ directory:
docker run -v $(pwd):/app whisper-transcriber audio/example.mp3
The subtitle file will be saved as:
subtitle/example.srt
The generated .srt file will be saved in the subtitle folder within your current working directory.
Example structure after running the command:
.
├── audio
│   └── example.mp3
├── subtitle
│   └── example.srt
├── Dockerfile
├── entrypoint.sh
└── README.md
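Given this layout, several files could be processed in one go by looping over the audio directory and skipping files that already have a subtitle. The following is a minimal sketch, assuming it is run from the project folder; `srt_path_for`, `transcribe_missing`, and the `DOCKER` override are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Hypothetical override: set DOCKER=echo to preview the commands.
DOCKER=${DOCKER:-docker}

# Map audio/NAME.ext to subtitle/NAME.srt, the naming shown in the example above.
srt_path_for() {
    name=$(basename "$1")                    # drop the directory part
    printf 'subtitle/%s.srt\n' "${name%.*}"  # drop the last extension
}

# Transcribe every file in audio/ that has no subtitle yet.
transcribe_missing() {
    for f in audio/*; do
        # Skips directories, and the unexpanded pattern when audio/ is empty.
        [ -f "$f" ] || continue
        out=$(srt_path_for "$f")
        if [ -e "$out" ]; then
            echo "skip: $out already exists"
        else
            "$DOCKER" run -v "$(pwd)":/app whisper-transcriber "$f"
        fi
    done
}
```

Running `DOCKER=echo` before `transcribe_missing` prints the commands that would be executed, which is a cheap way to check the loop before starting long transcriptions.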
Change the Whisper Model
You can edit the entrypoint.sh file to use a different Whisper model, such as tiny, medium, or large. Modify the line:
whisper "$1" --model base --output_format srt --output_dir /app/subtitle
to:
whisper "$1" --model medium --output_format srt --output_dir /app/subtitle
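Instead of editing the file for every model change, the model name could be read from an environment variable with base as the fallback. The sketch below is one possible variant of entrypoint.sh, not the project's actual script: `WHISPER_MODEL`, `WHISPER`, and `run_whisper` are hypothetical names, and only the whisper flags already shown above are used.

```shell
#!/bin/sh
# Hypothetical override: set WHISPER=echo to print the arguments instead of running whisper.
WHISPER=${WHISPER:-whisper}

# Run whisper with the model from WHISPER_MODEL (hypothetical variable),
# falling back to "base" as in the original line.
run_whisper() {
    "$WHISPER" "$1" --model "${WHISPER_MODEL:-base}" --output_format srt --output_dir /app/subtitle
}

# Only run when an audio path was passed to the script.
if [ "$#" -gt 0 ]; then
    run_whisper "$1"
fi
```

With such an entrypoint, the model could be chosen per run through the standard `-e` flag of `docker run`, for example `docker run -e WHISPER_MODEL=medium -v $(pwd):/app whisper-transcriber audio/example.mp3`.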
Common Errors
- Permission Denied for entrypoint.sh: Ensure the script has executable permissions:
chmod +x entrypoint.sh
- No Output in subtitle: Verify the audio file path and format. Supported formats include .mp3, .wav, .m4a, etc.
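For the missing-output case, a small pre-flight check can confirm that the path exists and that the extension is one of those named above before the container is started. This is a sketch only: `check_audio` is a hypothetical helper, and its extension list covers just the three formats mentioned here, so other formats are reported as unverified rather than rejected.

```shell
#!/bin/sh
# Verify that an audio path exists and report whether its extension is listed.
check_audio() {
    if [ ! -f "$1" ]; then
        echo "not found: $1" >&2
        return 1
    fi
    name=$(basename "$1")
    # Lower-case the extension so .MP3 and .mp3 are treated alike.
    ext=$(printf '%s' "${name##*.}" | tr '[:upper:]' '[:lower:]')
    case "$ext" in
        mp3|wav|m4a) echo "ok: $1" ;;
        *)           echo "unverified format: $1" ;;
    esac
}
```

Running `check_audio audio/example.mp3` from the project folder before `docker run` separates a wrong path from a problem inside the container.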