This is a personal repo for my work at a startup. The architecture leverages AWS services, Deepgram, OpenAI, Twilio, and Plivo to handle outbound calls, speech-to-text, text-to-speech, and real-time interactions.
Below is the architecture diagram for the project:
Create a .env file based on the following template:
# Twilio Main Account
TWILIO_ACCOUNT_SID=
TWILIO_AUTH_TOKEN=
TWILIO_PHONE_NUMBER=
# Deepgram Keys
DEEPGRAM_API_KEY1=
DEEPGRAM_API_KEY2=
DEEPGRAM_API_KEY3=
DEEPGRAM_API_KEY4=
DEEPGRAM_API_KEY5=
DEEPGRAM_API_KEY6=
# OpenAI
OPENAI_API_KEY=
# AWS Credentials
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
AWS_REGION=
PORT=8080
- Only one Deepgram API key is required. The extra keys are there to handle more calls, since the free version of Deepgram has a 100-request limit.
- Install dependencies:
  npm i
- Build the project:
  npm run build
- Start the project:
  npm run start
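
The `.env` template above could be read at startup along the following lines. This is only a sketch, not the code this repo actually uses: `loadConfig`, `collectDeepgramKeys`, and `makeKeyRotator` are hypothetical helper names, and round-robin rotation is just one assumed way of spreading calls across several Deepgram keys. The variable names themselves come from the template.

```typescript
// Sketch only: helper names are hypothetical, not taken from this repo.
type Env = Record<string, string | undefined>;

// Variables from the .env template that have no default.
const REQUIRED_VARS = [
  "TWILIO_ACCOUNT_SID",
  "TWILIO_AUTH_TOKEN",
  "TWILIO_PHONE_NUMBER",
  "OPENAI_API_KEY",
  "AWS_ACCESS_KEY_ID",
  "AWS_SECRET_ACCESS_KEY",
  "AWS_REGION",
];

// Gather DEEPGRAM_API_KEY1..DEEPGRAM_API_KEY6, skipping any left blank.
function collectDeepgramKeys(env: Env, maxKeys: number = 6): string[] {
  const keys: string[] = [];
  for (let i = 1; i <= maxKeys; i++) {
    const value = env[`DEEPGRAM_API_KEY${i}`]?.trim();
    if (value) keys.push(value);
  }
  if (keys.length === 0) {
    throw new Error("Set at least one DEEPGRAM_API_KEY<n> variable");
  }
  return keys;
}

// Assumed strategy: hand out keys in round-robin order, one per call.
function makeKeyRotator(keys: string[]): () => string {
  if (keys.length === 0) throw new Error("No keys to rotate");
  let next = 0;
  return () => {
    const key = keys[next];
    next = (next + 1) % keys.length;
    return key;
  };
}

// Fail fast on missing variables and return the parsed pieces.
function loadConfig(env: Env): { port: number; deepgramKeys: string[] } {
  const missing = REQUIRED_VARS.filter((name) => !env[name]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const port = Number(env["PORT"] ?? "8080"); // 8080 matches the template
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error(`Invalid PORT: ${env["PORT"]}`);
  }
  return { port, deepgramKeys: collectDeepgramKeys(env) };
}
```

Once the `.env` values are present in the environment, something like `const config = loadConfig(process.env)` followed by `const nextDeepgramKey = makeKeyRotator(config.deepgramKeys)` would give each new call the next key in turn. With a single key set, the rotator simply returns that key every time.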
