Effortless dictation powered by Gemini. Turn rambling voice recordings into perfectly transcribed and polished notes.
AudioWrite is a client-side web application that leverages Google's Gemini AI to transform your voice recordings into accurate transcriptions and then refines them into well-structured, polished notes.
- 🗣️ Voice Recording & Dictation: Record audio directly in your browser.
- 🧠 AI-Powered Transcription: Fast and accurate speech-to-text using Gemini.
- 📝 AI-Powered Note Polishing: Gemini refines raw transcriptions into clean, Markdown-formatted notes.
- 🌐 Multi-language Output: Select the output language for polished notes.
- ✨ Customizable Polishing Prompts: Guide the AI with specific instructions.
- 📋 Copy to Clipboard: Easily copy raw or polished notes.
- 👁️ Live Audio Waveform: Visual feedback during recording.
- 🎯 Focus Mode: Minimalist UI to help you concentrate.
- 💾 Local Storage: Notes and your API key are saved persistently in your browser.
- 🎨 Dark & Light Themes: Switch to your preferred mode.
- 🗂️ Note Archive: Manage, load, re-polish, and delete notes.
- 📱 Responsive Design: Works on desktop and mobile.
- Progressive Web App (PWA): Installable for an app-like experience, with offline asset caching.
- Frontend: HTML5, CSS3, TypeScript
- AI: Google Gemini API (`@google/genai`)
- Markdown Rendering: `marked`
- PWA: `vite-plugin-pwa` for Service Worker generation and manifest handling.
- Storage: Browser Local Storage
- Build Tool: Vite
- A modern web browser (e.g., Chrome, Firefox, Safari, Edge).
- Your own Google Gemini API Key. You can obtain one from Google AI Studio. The app's settings menu explains how to get one.
To run a local copy:
- Clone the repository: run `git clone https://github.com/hoomanick/AudioWrite.git`, then `cd AudioWrite`.
- Install dependencies: run `npm install`.
- Run the development server: run `npm run dev`. This will usually open the app in your browser at `http://localhost:5173` (or a similar port).
- Set Your API Key:
  - Click the Settings icon (🔑) in the app.
  - Enter your Gemini API Key and click "Save & Apply Key".
  - This key is required for all AI features. It's stored in your browser's `localStorage`, so it persists between sessions.
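The key persistence described above can be sketched roughly as follows. This is a minimal illustration, not AudioWrite's actual code: the storage key name `audiowrite.geminiApiKey`, the `KeyValueStore` interface, and the helper functions are all hypothetical. The helpers take the store as a parameter so they work with `window.localStorage` in the browser or with an in-memory stand-in elsewhere.

```typescript
// Minimal structural type: window.localStorage satisfies it, and so does
// any simple stand-in used outside the browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// Hypothetical storage key; the real app may use a different name.
const API_KEY_STORAGE_KEY = "audiowrite.geminiApiKey";

// Trims and stores the key. An empty key clears any stored value.
// Returns true when a key was saved.
function saveApiKey(store: KeyValueStore, rawKey: string): boolean {
  const key = rawKey.trim();
  if (key === "") {
    store.removeItem(API_KEY_STORAGE_KEY);
    return false;
  }
  store.setItem(API_KEY_STORAGE_KEY, key);
  return true;
}

// Returns the stored key, or null when none is available.
function loadApiKey(store: KeyValueStore): string | null {
  const key = store.getItem(API_KEY_STORAGE_KEY);
  return key !== null && key.trim() !== "" ? key : null;
}

// In-memory stand-in, useful for trying the helpers outside a browser.
function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (k) => data.get(k) ?? null,
    setItem: (k, v) => { data.set(k, v); },
    removeItem: (k) => { data.delete(k); },
  };
}
```

In a browser, a call such as `saveApiKey(window.localStorage, inputValue)` would persist the key across sessions, since `localStorage` is kept until the site data is cleared.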
- Set API Key: Click the Settings icon (🔑). Enter your Gemini API Key and click "Save & Apply". The app will guide you if you don't have one.
- Record: Click the microphone button (🎙️). Grant microphone permission if prompted.
- Speak: Dictate your content.
- Stop: Click the stop button (⏹️).
- Review & Edit:
  - The Polished Note is shown by default. Use the "Copy Polished" button to copy its content.
  - Switch to Raw Transcription using the tabs. Use the "Copy Raw" button for its content.
  - Edit the note title, raw transcription, or polished content directly. Changes save automatically.
- Customize (Optional):
  - Output Language: Select your desired language for the polished note.
  - Custom Prompt (✨): Provide specific instructions to the AI for note polishing (e.g., "Summarize in 3 bullet points," "Adopt a formal tone").
- Manage Notes:
  - New Note (📄): Creates a blank note.
  - Archive (🗄️): View, load, re-polish, or delete saved notes.
- Theme (☀️/🌙): Toggle light/dark mode.
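To make the Output Language and Custom Prompt options more concrete, here is one way the two settings could be combined into a single polishing prompt. This is only a sketch under assumptions: the `PolishOptions` type, the `buildPolishPrompt` function, and the prompt wording are hypothetical and are not taken from AudioWrite's source.

```typescript
// Hypothetical options mirroring the two customization settings.
interface PolishOptions {
  outputLanguage: string;       // e.g. "English", "German"
  customInstructions?: string;  // e.g. "Summarize in 3 bullet points"
}

// Builds the text prompt sent along with the raw transcription.
// The wording is illustrative; the real app's prompt may differ.
function buildPolishPrompt(rawTranscription: string, options: PolishOptions): string {
  const lines: string[] = [
    "Rewrite the raw transcription below as a clean, well-structured note.",
    "Format the result as Markdown.",
    `Write the note in ${options.outputLanguage}.`,
  ];

  // Only add the custom section when the user actually typed something.
  const extra = options.customInstructions?.trim();
  if (extra) {
    lines.push(`Additional instructions: ${extra}`);
  }

  lines.push("", "Raw transcription:", rawTranscription.trim());
  return lines.join("\n");
}
```

For example, `buildPolishPrompt(text, { outputLanguage: "English", customInstructions: "Adopt a formal tone" })` yields a prompt that carries both the language requirement and the extra instruction, while leaving the custom field blank simply omits that line.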
- Installable: On supported devices, install AudioWrite for an app-like experience.
- Offline Access: The app shell and previously saved notes/API key (from local storage) are accessible offline. Core app assets are cached by the service worker.
- Note: AI features require an active internet connection.
AudioWrite is built using Vite and deployed as a static website to GitHub Pages.
- Push your code to your GitHub repository (`main` branch for source).
- Ensure `vite.config.ts` has the correct `base` path (e.g., `/AudioWrite/`).
- Ensure `package.json` has the correct `homepage` URL.
- Run `npm run deploy`. This script builds the project and pushes the `dist` folder contents to the `gh-pages` branch.
- Configure GitHub Pages (Settings > Pages) to deploy from the `gh-pages` branch.
- Your site will be available at `https://hoomanick.github.io/AudioWrite/`.
- Users of the deployed version will need to provide their own Gemini API Key.
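A mismatch between the Vite `base` path and the `homepage` URL is a common cause of a blank page on GitHub Pages project sites, because assets get requested from the wrong path. The helper below is a small, hypothetical sanity check (the function names are not part of the project) that derives both values from the owner and repository name and verifies that they agree.

```typescript
interface PagesTarget {
  base: string;      // value for `base` in vite.config.ts
  homepage: string;  // value for `homepage` in package.json
}

// Derives the expected values for a GitHub Pages project site.
function githubPagesTarget(owner: string, repo: string): PagesTarget {
  const base = `/${repo}/`;
  return { base, homepage: `https://${owner.toLowerCase()}.github.io${base}` };
}

// Returns true when the path part of `homepage` equals `base`.
function baseMatchesHomepage(base: string, homepage: string): boolean {
  // Strip scheme and host, then make sure the path ends with a slash.
  let path = homepage.replace(/^https?:\/\/[^/]+/, "");
  if (!path.endsWith("/")) {
    path += "/";
  }
  return path === base;
}
```

With `githubPagesTarget("hoomanick", "AudioWrite")`, the derived values match the `/AudioWrite/` base path and the site URL given in the steps above.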
Contributions are welcome! Please feel free to fork the project, create a feature branch, commit your changes, and open a Pull Request.
This project is licensed under the Apache License 2.0. See the `SPDX-License-Identifier` in `index.tsx` or the full text of the Apache License 2.0.
- Created by Hooman Nick.
- Powered by the Google Gemini API.
- Uses Marked.js for Markdown rendering and Font Awesome for icons.
- Built with Vite and `vite-plugin-pwa`.