
Aurora

Open source AI agent for automated incident investigation & root cause analysis


Features · How It Works · Integrations · Quick Start · Docs · Website


What's New

  • AI-Suggested Code Fixes — Aurora can now generate pull requests with remediation code
  • Infrastructure Knowledge Graph — Memgraph-powered dependency mapping across all cloud providers
  • Postmortem Export — One-click export to Confluence with full timeline and root cause
  • OVH & Scaleway Support — New cloud connectors for European providers

See the full CHANGELOG for all releases.


Aurora is an open-source (Apache 2.0) AI-powered incident management platform for SRE teams. When a monitoring tool fires an alert, Aurora's LangGraph-orchestrated AI agents autonomously investigate the incident — querying infrastructure across AWS, Azure, GCP, OVH, Scaleway, and Kubernetes, correlating data from 22+ tools, and delivering a structured root cause analysis with remediation recommendations.

Unlike traditional tools that automate workflows (Slack channels, paging, runbooks), Aurora automates the investigation itself.

Aurora Demo — AI agent investigating a cloud incident
Watch the full demo video

Aurora Incident Investigation — AI-generated root cause analysis with timeline and remediation steps

Features

Agentic AI Investigation

Aurora's AI agents dynamically select from 30+ tools to investigate incidents. They run kubectl, aws, az, and gcloud commands in sandboxed Kubernetes pods, query logs, check recent deployments, and correlate data across systems — all autonomously.

Aurora AI agent investigating an incident — running kubectl commands and analyzing pod status
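The shape of that agentic loop can be sketched in a few lines. This is an illustrative toy only: the tool names, the incident fields, and the fixed "plan" below are hypothetical stand-ins, not Aurora's actual LangGraph agents or tool registry.

```python
# Toy sketch of an agentic investigation loop. Tool names and outputs
# are hypothetical; real agents choose tools dynamically from 30+.

def run_kubectl(incident):
    # Stand-in for running kubectl in a sandboxed pod.
    return f"kubectl get pods for {incident['service']}: 1 pod CrashLoopBackOff"

def check_deployments(incident):
    # Stand-in for checking recent deploys via a cloud or CI API.
    return f"last deploy of {incident['service']}: 12 minutes before the alert"

TOOLS = {"kubectl": run_kubectl, "deployments": check_deployments}

def investigate(incident, plan):
    """Run each planned tool and collect evidence for the RCA."""
    return [TOOLS[name](incident) for name in plan]

incident = {"service": "checkout-api", "alert": "HighErrorRate"}
findings = investigate(incident, plan=["kubectl", "deployments"])
for finding in findings:
    print(finding)
```

The real system replaces the fixed `plan` with an LLM-driven selection step that picks the next tool based on the evidence gathered so far.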

Incident Dashboard

Track all incidents in a single dashboard. Aurora ingests alerts from PagerDuty, Datadog, Grafana, and other monitoring tools via webhooks, automatically triggering background investigations.

Aurora incidents dashboard showing active incidents with severity levels
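A monitoring webhook is just a small JSON payload; the sketch below shows the kind of normalization an ingest endpoint might do before triggering an investigation. The field names and severity mapping are hypothetical, not Aurora's actual alert schema.

```python
import json

# Hypothetical monitoring-tool payload; real field names vary per provider.
raw = json.dumps({
    "title": "High 5xx rate on checkout-api",
    "priority": "P1",
    "source": "datadog",
})

def normalize_alert(body: str) -> dict:
    """Map a provider-specific webhook body to a common incident record."""
    alert = json.loads(body)
    severity = {"P1": "critical", "P2": "high"}.get(alert.get("priority"), "low")
    return {
        "summary": alert["title"],
        "severity": severity,
        "source": alert["source"],
    }

incident = normalize_alert(raw)
print(incident["severity"])  # → critical
```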

Automated Postmortem Generation

Aurora generates detailed postmortem reports with timeline, root cause, impact assessment, and remediation steps. Export directly to Confluence.

Aurora auto-generated postmortem with timeline, root cause, and remediation

Infrastructure Knowledge Graph

Visualize your entire infrastructure as a dependency graph powered by Memgraph. When an incident occurs, Aurora traces the blast radius across services and cloud providers.

Aurora infrastructure dependency graph showing service relationships
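Blast-radius tracing over a dependency graph is, at its core, a graph traversal. The toy below shows the idea with a plain BFS over an in-memory dict; Aurora does this against Memgraph, and the service names here are illustrative.

```python
from collections import deque

# Toy dependency graph: edges point from a service to the services
# that depend on it. Names are illustrative, not a real topology.
DEPENDENTS = {
    "postgres": ["checkout-api", "inventory-api"],
    "checkout-api": ["web-frontend"],
    "inventory-api": ["web-frontend"],
    "web-frontend": [],
}

def blast_radius(root: str) -> set:
    """BFS from the failing node to everything transitively affected."""
    seen, queue = {root}, deque([root])
    while queue:
        node = queue.popleft()
        for dep in DEPENDENTS.get(node, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen - {root}

print(sorted(blast_radius("postgres")))
# → ['checkout-api', 'inventory-api', 'web-frontend']
```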

AI-Suggested Code Fixes

Aurora doesn't just find the root cause — it suggests code fixes and can generate pull requests with the remediation.

Aurora AI-suggested code fix with pull request preview

Additional Capabilities

  • Knowledge Base RAG — Weaviate-powered vector search over your runbooks, past postmortems, and documentation
  • Multi-Cloud Native — AWS (STS AssumeRole), Azure (Service Principal), GCP (OAuth), OVH, Scaleway, Kubernetes
  • Any LLM Provider — OpenAI, Anthropic, Google, or local models via Ollama for air-gapped deployments
  • Terraform/IaC Analysis — Understands your infrastructure-as-code state
  • Self-Hosted — Docker Compose or Helm chart. HashiCorp Vault for secrets management
  • Free Forever — No per-seat or per-incident pricing. Apache 2.0.
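To picture the knowledge-base retrieval step: given an incident description, the agent pulls the most relevant runbook snippets. Aurora uses Weaviate vector search for this; the sketch below substitutes naive keyword overlap purely to show the shape of the operation, with made-up runbook entries.

```python
# Toy retrieval over runbook snippets. Aurora uses Weaviate vector
# similarity; this sketch uses keyword overlap just to show the shape.
RUNBOOKS = {
    "db-failover": "postgres primary failover: promote replica, update dns",
    "oom-restart": "pod oomkilled: raise memory limits, check for leaks",
    "cert-renew": "tls certificate expiry: renew via cert-manager",
}

def retrieve(query: str, k: int = 1) -> list:
    """Return the k runbook ids whose text shares the most words with the query."""
    q = set(query.lower().split())
    scored = sorted(
        RUNBOOKS,
        key=lambda rid: len(q & set(RUNBOOKS[rid].split())),
        reverse=True,
    )
    return scored[:k]

print(retrieve("pod OOMKilled memory"))  # → ['oom-restart']
```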

How It Works

Alert fires (PagerDuty, Datadog, Grafana, etc.)
        │
        ▼
   Aurora receives webhook
        │
        ▼
   AI agent selects tools (from 30+)
        │
        ├── Queries cloud APIs (AWS, Azure, GCP)
        ├── Runs CLI commands in sandboxed pods
        ├── Checks Kubernetes cluster status
        ├── Searches knowledge base (RAG)
        └── Traverses infrastructure dependency graph
                │
                ▼
   Root Cause Analysis generated
        │
        ├── Structured RCA with timeline
        ├── Impact assessment & blast radius
        ├── Remediation recommendations
        ├── Code fix suggestions (with PRs)
        └── Postmortem exported to Confluence
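The flow above can be strung together as a plain pipeline. Every function body here is a stub standing in for the corresponding stage, not Aurora's code:

```python
# Stand-in pipeline mirroring the diagram above; each stage is a stub.

def receive_webhook(payload: dict) -> dict:
    # Stage 1-2: alert arrives, an investigation state is opened.
    return {"incident": payload["alert"], "evidence": []}

def investigate(state: dict) -> dict:
    # Stage 3: agents would pick from 30+ tools; we record a stub finding.
    state["evidence"].append("deploy of checkout-api 12m before alert")
    return state

def generate_rca(state: dict) -> dict:
    # Stage 4: structured RCA with remediation, built from the evidence.
    state["rca"] = {
        "root_cause": state["evidence"][0],
        "remediation": "roll back the deploy",
    }
    return state

state = generate_rca(investigate(receive_webhook({"alert": "HighErrorRate"})))
print(state["rca"]["root_cause"])
```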

Integrations

Aurora integrates with 22+ tools across your stack:

Category Tools
Monitoring PagerDuty, Datadog, Grafana, Netdata, Dynatrace, Coroot, ThousandEyes, BigPanda
Cloud Providers AWS, Azure, GCP, OVH, Scaleway
Infrastructure Kubernetes, Terraform, Docker
Communication Slack
Code & Docs GitHub, Bitbucket, Confluence
Search Self-hosted SearXNG
Data Stores Memgraph (graph), Weaviate (vector), PostgreSQL
Secrets HashiCorp Vault

Supported LLM Providers

Provider Models
OpenAI GPT-5.4, GPT-5.2, o3, o4-mini, o3-mini, GPT-4.1, GPT-4.1-mini, GPT-4o, GPT-4o-mini
Anthropic Claude Opus 4.6, Claude Sonnet 4.6, Claude Opus 4.5, Claude Sonnet 4.5, Claude Haiku 4.5, Claude 3.5 Sonnet, Claude 3 Haiku
Google Gemini Gemini 3.1 Pro Preview, Gemini 3 Flash Preview, Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 2.5 Flash Lite
Vertex AI Same Gemini models via Google Cloud with IAM auth
OpenRouter Any model via OpenRouter API (single key for all providers)
Ollama Llama 3.1, Qwen 2.5, and any local model (air-gapped)

Quick Start

Get Aurora running locally for testing and evaluation:

# 1. Clone the repository
git clone https://github.com/arvo-ai/aurora.git
cd aurora

# 2. Initialize configuration (generates secure secrets automatically)
make init

# 3. Edit .env and add your LLM API key
#    Get one from: https://openrouter.ai/keys or https://platform.openai.com/api-keys
nano .env  # Add OPENROUTER_API_KEY=sk-or-v1-...

# 4. Start Aurora (prebuilt from GHCR, or build from source)
make prod-prebuilt   # or: make prod-local to build images locally

# 5. Get Vault root token and add to .env
#    Check the vault-init container logs for the root token:
docker logs vault-init 2>&1 | grep "Root Token:"
#    You'll see output like:
#    ===================================================
#    Vault initialization complete!
#    Root Token: hvs.xxxxxxxxxxxxxxxxxxxxxxxxxxxx
#    IMPORTANT: Set VAULT_TOKEN=hvs.xxxxxxxxxxxxxxxxxxxxxxxxxxxx in your .env file
#               to connect Aurora services to Vault.
#    ===================================================
#    Copy the root token value and add it to your .env file:
nano .env  # Add VAULT_TOKEN=hvs.xxxxxxxxxxxxxxxxxxxxxxxxxxxx

# 6. Restart Aurora to load the Vault token
make down
make prod-prebuilt   # or: make prod-local to build from source

That's it! Open http://localhost:3000 in your browser.

The first user to register becomes the admin. After that, registration is closed — admins invite new users from the Organization page.

Note: Aurora works without any cloud provider accounts! The LLM API key is the only external requirement. Connectors are optional and can be enabled later via the .env file.

Endpoints:

  • Frontend UI: http://localhost:3000
  • Flask API: http://localhost:5080

To stop: make down | Logs: make logs

If you want cloud connectors, add provider credentials referenced in .env.example.

Pin a specific version

make prod-prebuilt VERSION=v1.2.3

Build from source

make prod-local

Deploy on Kubernetes

helm install aurora ./helm/aurora

For detailed deployment guides, see the Documentation.

Architecture

Component Technology
Backend Python, Flask, Celery, LangGraph
Frontend Next.js
Graph Database Memgraph
Vector Store Weaviate
Secrets HashiCorp Vault
Storage PostgreSQL, Redis, SeaweedFS

Repository layout:

aurora/
├── server/      # Python API, chatbot, Celery workers
├── client/      # Next.js frontend
├── config/      # Configuration files
├── deploy/      # Deployment scripts
├── scripts/     # Utility scripts
└── website/     # Documentation (Docusaurus)

Security & Roles

Aurora uses Casbin RBAC with three roles enforced at both the API and UI layers:

Role Capabilities
Admin Full access — manage users, org settings, LLM config, connectors, incidents, chat
Editor Write access — connectors, SSH keys, VMs, knowledge base, incidents, chat
Viewer Read-only — view incidents, postmortems, dashboards, chat
  • Registration is closed after the first (admin) user. New accounts are created by admins only.
  • Backend RBAC via @require_permission decorators on all write endpoints (Casbin policy enforcement).
  • Frontend guards via ConnectorAuthGuard on sensitive pages (SSH keys, VM config, connector auth).
  • CORS is restricted to FRONTEND_URL — no wildcard origins on any endpoint.
  • The Flask API (port 5080) is exposed to the host because the frontend makes direct browser calls to it for some features (OVH/Scaleway VMs, cloud graph). CORS and RBAC protect it.
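The three-role model in the table above boils down to a permission lookup before every write. The sketch below is a hand-rolled illustration of that check, not Casbin's API; the permission names are hypothetical.

```python
# Hand-rolled role check mirroring the role table above. Aurora enforces
# this with Casbin policies; permission names here are illustrative.
ROLE_PERMISSIONS = {
    "admin":  {"read", "write", "manage_users", "manage_connectors"},
    "editor": {"read", "write", "manage_connectors"},
    "viewer": {"read"},
}

def require_permission(role: str, action: str) -> None:
    """Raise if the role lacks the permission, as a decorator's check would."""
    if action not in ROLE_PERMISSIONS.get(role, set()):
        raise PermissionError(f"{role} may not {action}")

require_permission("editor", "write")       # allowed, returns silently
try:
    require_permission("viewer", "write")   # denied
except PermissionError as err:
    print(err)  # → viewer may not write
```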

For more details, see SECURITY.md.

Data Privacy

Aurora is fully self-hosted — your incident data never leaves your environment.

  • All data stays on your infrastructure (Docker Compose or Kubernetes)
  • No telemetry or usage data sent to Arvo AI
  • Secrets stored in HashiCorp Vault with encryption at rest
  • LLM API calls go directly from your infrastructure to your chosen provider
  • Use Ollama for local LLM inference to avoid LLM provider API calls (note: web search, cloud integrations, and Terraform registry still require network access)
  • RBAC enforced at both API and UI layers

Community

We'd love your help making Aurora better.

  • Discord — Ask questions, share feedback, get help
  • GitHub Issues — Report bugs or request features
  • GitHub Discussions — General discussion and ideas
  • Book a Demo — See Aurora in action with our team
  • Website — Learn more about Arvo AI
  • Documentation — Full deployment and configuration guides
  • Blog — Guides on incident management, RCA, and SRE best practices

Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

Please read our Code of Conduct before participating.

License

Apache License 2.0. See LICENSE.


If Aurora helps your team, give us a star on GitHub!
