Version 2.1.0 - Privacy-First Voice Transcription Bot
InnerVoice is a Telegram bot that transcribes and translates voice messages using OpenAI's Whisper model. Built with aiogram and other Python libraries, the bot processes incoming voice messages, converts them to WAV format via ffmpeg, and then uses Whisper to generate both a transcription and a translation.
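For illustration, here is a minimal sketch of the core transcription/translation step, assuming the openai-whisper package and a WAV file that has already been extracted from the voice message (the function name and file path are placeholders, not the repository's actual code):

```python
import whisper

# Load a Whisper model once at startup (the project defaults to "medium").
model = whisper.load_model("medium")

def transcribe_and_translate(wav_path: str, language: str = "es"):
    # Transcribe the audio in the selected language.
    transcription = model.transcribe(wav_path, language=language)
    # Whisper's built-in "translate" task produces an English translation.
    translation = model.transcribe(wav_path, task="translate")
    return transcription["text"], translation["text"]
```

Note that Whisper's built-in translate task targets English; see bot.py for how the bot actually combines and presents the results.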
- DOCUMENTATION.md - Complete user & technical guide
- CHANGELOG.md - Version history and changes
```bash
# Clone and setup
cd /home/as/InnerVoice

# Configure your bot token in .env
echo "BOT_TOKEN=your_token_here" > .env

# Start with Docker
docker compose up -d --build

# Test it
# Send /start to your bot in Telegram
```

InnerVoice enables Telegram users to simply send a voice message and receive:
- A transcription of the audio.
- A translation of the spoken content.
The bot is designed for reproducibility and ease of deployment on various servers. You can clone or download the repository and follow the instructions below to set up your environment.
- Voice-to-Text Transcription: Converts audio messages to text.
- Translation: Provides a translation of the transcribed text.
- Multi-language Support: Supports Spanish, English, French, Dutch, Portuguese, and German.
- Privacy-First Design: Runs locally on your machine, keeping your data private.
- Docker Support: Easy deployment with containerization.
- Resource-Efficient: Optimized for basic hardware; no GPU required.
- Logging: Detailed logs aid in debugging and performance monitoring.
The bot requires:
- Python 3.8+
- ffmpeg for audio conversion
- A Telegram Bot API Token (generated via BotFather)
- Linux/macOS/Windows environment
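If you want to confirm the prerequisites before installing, a quick check like the following can help (a hypothetical helper, not part of the repository):

```python
import shutil
import sys

# Check the Python version and that ffmpeg is available on the PATH.
assert sys.version_info >= (3, 8), "Python 3.8+ is required"
assert shutil.which("ffmpeg"), "ffmpeg not found; install it with your package manager"
print("Prerequisites look OK")
```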
Ensure the following packages are installed before running the bot:
```bash
sudo apt update
sudo apt install python3-venv
sudo apt install ffmpeg
```

Note: ffmpeg is required to convert OGG voice messages to WAV format.
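For reference, that conversion is a single ffmpeg call; the snippet below is a sketch of the step using Python's subprocess module (the exact flags used in bot.py may differ):

```python
import subprocess

def ogg_to_wav(ogg_path: str, wav_path: str) -> None:
    # Convert the Telegram OGG/Opus voice file to 16 kHz mono WAV,
    # the sample rate Whisper works with internally.
    subprocess.run(
        ["ffmpeg", "-y", "-i", ogg_path, "-ar", "16000", "-ac", "1", wav_path],
        check=True,
    )
```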
Clone the repository and create a virtual environment:

```bash
git clone https://github.com/arkano1dev/InnerVoice.git
cd InnerVoice
python3 -m venv venv
source venv/bin/activate     # For Linux/macOS
venv\Scripts\activate        # For Windows (Command Prompt)
```

Install the dependencies (CPU-only setup):

```bash
pip install aiogram python-dotenv psutil
pip install tiktoken
pip install openai-whisper --no-cache-dir
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
```

Alternatively, install everything from the requirements file:

```bash
pip install -r requirements.txt
```

Note: The requirements.txt includes the standard installation with CUDA support. If you are running the bot on a system without a GPU, follow the CPU installation steps above instead.
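To verify which PyTorch build ended up in your environment (the CPU wheels from the index URL above should report no CUDA), you can run a quick check:

```python
import torch

# False is expected (and fine) for the CPU-only installation.
print("CUDA available:", torch.cuda.is_available())
print("Torch version:", torch.__version__)
```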
Create a .env file inside the InnerVoice folder:
```bash
echo 'BOT_TOKEN=your_telegram_token_here' > .env
```

Replace `your_telegram_token_here` with your actual Telegram Bot API key.
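Since python-dotenv is among the installed dependencies, the token is presumably loaded along these lines at startup (a sketch, not necessarily the exact code in bot.py):

```python
import os
from dotenv import load_dotenv

# Load variables from the .env file into the process environment.
load_dotenv()
BOT_TOKEN = os.getenv("BOT_TOKEN")
if not BOT_TOKEN:
    raise RuntimeError("BOT_TOKEN is not set; check your .env file")
```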
You can run the bot either directly or using Docker.
Running directly:

- Activate the Virtual Environment:

  ```bash
  source venv/bin/activate     # Linux/macOS
  venv\Scripts\activate        # Windows
  ```

- Start the Bot:

  ```bash
  python3 bot.py
  ```

Running with Docker:

- Build and Start:

  ```bash
  docker-compose up -d
  ```

- View Logs:

  ```bash
  docker-compose logs -f
  ```

- Stop the Bot:

  ```bash
  docker-compose down
  ```

Benefits of the Docker setup:

- No need to install Python or dependencies locally
- Automatic restart on failure
- Easy deployment across different systems
- Isolated environment
- Resource management
- Clean temporary file handling
- Start the Bot:
  - Run `python3 bot.py` after setting up your environment.
- Interacting via Telegram:
  - Send the `/start` command to receive a welcome message in Spanish and English.
  - Available commands:
    - `/lang` - Change the language for transcription/translation
    - `/help` - View usage instructions and hardware requirements
    - `/about` - Learn about the bot's privacy features and technology
  - Send a voice message. The bot will (see the sketch after this section):
    - Download and convert the audio.
    - Process the audio using the Whisper model.
    - Return both transcription and translation.
    - Provide processing statistics (time, segments, tokens).
- Language Selection:
  - Use `/lang` to view and change the language.
  - Supported languages:
    - Spanish (default)
    - English
    - French
    - Dutch
    - Portuguese
    - German
- Logging:
  - Refer to `bot.log` for detailed logs and troubleshooting information.
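To make the flow above concrete, here is an end-to-end sketch of a voice-message handler. It assumes aiogram 3.x and the openai-whisper package; the handler name, file paths, and reply format are illustrative and are not the repository's actual bot.py:

```python
import asyncio
import os
import subprocess
import tempfile

import whisper
from aiogram import Bot, Dispatcher, F
from aiogram.types import Message

bot = Bot(token=os.environ["BOT_TOKEN"])
dp = Dispatcher()
model = whisper.load_model("medium")  # default model variant

@dp.message(F.voice)
async def handle_voice(message: Message) -> None:
    with tempfile.TemporaryDirectory() as tmp:
        ogg_path = os.path.join(tmp, "voice.ogg")
        wav_path = os.path.join(tmp, "voice.wav")
        # 1. Download the OGG voice file from Telegram.
        await bot.download(message.voice, destination=ogg_path)
        # 2. Convert it to WAV with ffmpeg.
        subprocess.run(["ffmpeg", "-y", "-i", ogg_path, wav_path], check=True)
        # 3. Transcribe in the selected language (Spanish here), then
        #    produce an English translation via Whisper's translate task.
        transcription = model.transcribe(wav_path, language="es")["text"]
        translation = model.transcribe(wav_path, task="translate")["text"]
        # 4. Reply with both results.
        await message.reply(
            f"Transcription:\n{transcription}\n\nTranslation:\n{translation}"
        )

async def main() -> None:
    await dp.start_polling(bot)

if __name__ == "__main__":
    asyncio.run(main())
```

The blocking ffmpeg and Whisper calls are kept inline for brevity; a production bot would offload them to a thread or executor so the event loop stays responsive.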
The performance of InnerVoice depends on the Whisper model variant selected. See the guide below:
| Model Variant | CPU Requirements | Memory (RAM) | GPU (Optional) | Notes |
|---|---|---|---|---|
| Tiny | 2+ cores | ≥ 2 GB | Not required | Fastest response; lower accuracy. Ideal for low-resource devices. |
| Base | 2+ cores | ≥ 2–3 GB | Not required | Improved accuracy over Tiny; minimal resource use. |
| Small | 4+ cores | ≥ 4 GB | Beneficial: ~2–3 GB VRAM if using GPU | Balances speed and accuracy well. |
| Medium | 4–8 cores | ≥ 8 GB | Recommended: at least 4 GB VRAM for GPU use | Better accuracy; default model used in InnerVoice. |
| Large | 8+ cores | ≥ 16 GB | Strongly recommended: high-end GPU (≥ 8 GB VRAM) | Highest accuracy; most resource intensive. |
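Since psutil is already among the bot's dependencies, a small helper like the following (hypothetical, not part of the repository) could suggest a variant that fits the table above based on available RAM:

```python
import psutil

def suggest_model_variant() -> str:
    # Map total system RAM (GB) to a Whisper model variant,
    # following the rough guidance in the table above.
    ram_gb = psutil.virtual_memory().total / 1024**3
    if ram_gb >= 16:
        return "large"
    if ram_gb >= 8:
        return "medium"
    if ram_gb >= 4:
        return "small"
    if ram_gb >= 3:
        return "base"
    return "tiny"

print("Suggested Whisper model:", suggest_model_variant())
```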
- Change Model Variant: Modify the following line in `bot.py` to use a different Whisper model:

  ```python
  model = whisper.load_model("medium")
  ```

  Replace `"medium"` with `"tiny"`, `"base"`, `"small"`, or `"large"`.

- Contributions: Contributions are welcome! Please open an issue or submit a pull request for improvements.
This project is licensed under the MIT License. Feel free to use and modify it as needed.