Survey Studio 📚

A multi-agent literature review assistant powered by AutoGen. Survey Studio uses AI agents to automatically search arXiv, analyze research papers, and generate comprehensive literature reviews through a clean REST API.

🌟 Features

Multi-Agent System: Two specialized AI agents work together:
- Search Agent 🔍: Crafts optimized arXiv queries and retrieves relevant papers
- Summarizer Agent 📝: Generates structured literature reviews with key insights
Multi-Provider AI Support: Intelligent fallback across 4 AI providers:
- Together AI - Cost-effective with generous free tier
- Google Gemini - Fast and capable for complex analysis
- Perplexity - Research-focused with web access
- OpenAI - Reliable fallback with consistent performance
Cost Optimization: Automatic provider selection based on cost efficiency and availability
Usage Monitoring: Track API usage, costs, and performance across all providers
REST API: Clean, well-documented API functions for building custom interfaces
arXiv Integration: Direct access to arXiv, a large open-access repository of research preprints
Configurable: Adjustable number of papers, AI models, and search parameters
Export Support: Generate Markdown and HTML reports
Professional Development Setup: Full CI/CD pipeline with linting and type checking

🧭 Architecture

flowchart TD
  subgraph API["REST API Layer"]
    Client["Client Application"] -->|"HTTP requests"| APIFunctions["API Functions"]
    APIFunctions -->|"JSON responses"| Client
  end

  subgraph Backend["Survey Studio Core"]
    Orchestrator["Orchestrator"] --> SearchAgent
    Orchestrator --> SummarizerAgent
    LLMFactory["LLM Factory"] -->|"intelligent selection"| AIProviders
  end

  subgraph AIProviders["AI Providers"]
    TogetherAI["Together AI<br/>(Primary)"]
    Gemini["Google Gemini<br/>(Secondary)"]
    Perplexity["Perplexity<br/>(Research)"]
    OpenAI["OpenAI<br/>(Fallback)"]
  end

  SearchAgent["Search Agent"] -->|"queries"| ArXiv[("arXiv API")]
  SearchAgent -->|"returns papers"| Orchestrator
  SummarizerAgent["Summarizer Agent"] -->|"LLM calls"| LLMFactory
  LLMFactory -->|"fallback on failure"| AIProviders
  Orchestrator -->|"results"| APIFunctions

REST API Layer: Provides clean functions for building custom interfaces
Search Agent: Generates and executes arXiv queries
Summarizer Agent: Produces structured review using AI models
LLM Factory: Intelligently selects and manages AI providers with fallback
Orchestrator: Manages the multi-agent loop and data flow
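The fallback behaviour described for the LLM Factory can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual implementation: the `Provider` record, `AllProvidersFailedError`, and `complete_with_fallback` are hypothetical names, and the two stub callables stand in for real provider clients.

```python
from collections.abc import Callable
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    """Hypothetical provider record; not a class from Survey Studio."""

    name: str
    priority: int  # lower number is tried first
    call: Callable[[str], str]  # takes a prompt, returns a completion


class AllProvidersFailedError(RuntimeError):
    """Raised when every configured provider raised an exception."""


def complete_with_fallback(providers: list[Provider], prompt: str) -> tuple[str, str]:
    """Try providers in priority order and return (provider_name, completion)."""
    errors: list[str] = []
    for provider in sorted(providers, key=lambda p: p.priority):
        try:
            return provider.name, provider.call(prompt)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{provider.name}: {exc}")
    raise AllProvidersFailedError("; ".join(errors) or "no providers configured")


def rate_limited_stub(prompt: str) -> str:
    """Stand-in for a provider that is currently failing."""
    raise RuntimeError("rate limit exceeded")


def echo_stub(prompt: str) -> str:
    """Stand-in for a provider that answers successfully."""
    return f"summary of: {prompt}"
```

With `Provider("together-ai", 1, rate_limited_stub)` and `Provider("openai", 4, echo_stub)` in the list, the first call fails and the result comes from `openai`, which mirrors the "fallback on failure" edge in the diagram.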

🚀 Quick Start

Prerequisites

Python 3.12.11+
Poetry (for dependency management)
At least one AI provider API key

Installation

Clone the repository:

git clone https://github.com/Aditya-gam/survey-studio.git
cd survey-studio

Install dependencies with Poetry:
```
poetry install
```

Set up environment variables:

Option A: Using a .env file (recommended for local development):

# Copy the template and add your API key
cp .env.example .env
# Edit .env and add your actual API key

Option B: Using an environment variable (any one provider key is enough; OpenAI is shown here):

export OPENAI_API_KEY="your-openai-api-key-here"

🌐 REST API

Survey Studio exposes a REST API so developers can integrate literature reviews into their own applications. Interactive OpenAPI/Swagger documentation is served at /docs while the server is running.

Quick Start Workflow

Follow this complete workflow to get started with the Survey Studio API:

1. Start the Development Server

poetry run dev

The server will start on localhost:8000 with hot reload enabled for development.

2. Verify Server Health

Check that everything is working correctly:

curl -X GET "localhost:8000/health"

Expected Response:

{
  "status": "healthy",
  "providers": {
    "available_count": 1,
    "best_provider": "together-ai",
    "providers": [
      {
        "name": "together-ai",
        "model": "meta-llama/Llama-3.1-70B-Instruct-Turbo",
        "priority": 1,
        "free_tier_rpm": 30,
        "free_tier_tpm": 200000
      }
    ]
  },
  "timestamp": "2025-01-20T10:30:45.123456",
  "version": "0.1.0"
}
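The same health check can be scripted from Python. The sketch below uses only the standard library and assumes the response shape shown above; `fetch_health`, `summarize_health`, and `is_ready` are illustrative helper names, not part of Survey Studio.

```python
import json
from typing import Any
from urllib.request import urlopen


def fetch_health(base_url: str = "http://localhost:8000", timeout: float = 5.0) -> dict[str, Any]:
    """GET /health and decode the JSON body (requires a running server)."""
    with urlopen(f"{base_url}/health", timeout=timeout) as response:
        return json.load(response)


def summarize_health(payload: dict[str, Any]) -> str:
    """Return a one-line summary of a /health response body."""
    providers = payload.get("providers", {})
    count = providers.get("available_count", 0)
    best = providers.get("best_provider") or "none"
    return f"{payload.get('status', 'unknown')}: {count} provider(s), best={best}"


def is_ready(payload: dict[str, Any]) -> bool:
    """Treat the service as ready only if it is healthy and has a provider."""
    providers = payload.get("providers", {})
    return payload.get("status") == "healthy" and providers.get("available_count", 0) > 0


# Trimmed copy of the expected response above, used to exercise the helpers offline.
SAMPLE = {
    "status": "healthy",
    "providers": {"available_count": 1, "best_provider": "together-ai"},
    "version": "0.1.0",
}
```

Calling `is_ready(fetch_health())` before submitting a review is one way to fail fast when no provider key is configured.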

3. (Optional) Validate Review Parameters

Before running a full review, you can validate your parameters:

curl -X POST "localhost:8000/api/v1/validate" \
  -H "Content-Type: application/json" \
  -d '{
    "topic": "retrieval augmented generation",
    "num_papers": 5,
    "model": "auto"
  }'

Expected Response:

{
  "status": "completed",
  "results": {
    "valid": true,
    "message": "Parameters are valid",
    "topic": "retrieval augmented generation",
    "num_papers": 5,
    "model": "auto"
  }
}

4. Run Literature Review

Execute a complete literature review:

curl -X POST "localhost:8000/api/v1/reviews" \
  -H "Content-Type: application/json" \
  -d '{
    "topic": "retrieval augmented generation",
    "num_papers": 5,
    "model": "auto"
  }'

Expected Response:

{
  "status": "completed",
  "results": [
    "# Literature Review: Retrieval Augmented Generation\n\n## Overview\nRetrieval-Augmented Generation (RAG) represents a significant advancement...",
    "## Key Findings\n1. RAG systems demonstrate superior performance in knowledge-intensive tasks...",
    "## Paper Analysis\n### Paper 1: 'Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks'\n- **Authors**: Lewis et al.\n- **Summary**: This foundational work introduces..."
  ]
}
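For callers who prefer Python over curl, the request above could be issued with the standard library roughly as follows. The helper names are hypothetical, and the 300-second timeout is an assumption; the source does not state how long a review takes.

```python
import json
from typing import Any
from urllib.request import Request, urlopen


def build_review_request(
    topic: str,
    num_papers: int,
    model: str = "auto",
    base_url: str = "http://localhost:8000",
) -> Request:
    """Build (but do not send) the POST request for /api/v1/reviews."""
    body = json.dumps({"topic": topic, "num_papers": num_papers, "model": model})
    return Request(
        f"{base_url}/api/v1/reviews",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_review(request: Request, timeout: float = 300.0) -> dict[str, Any]:
    """Send the request and decode the JSON body (requires a running server)."""
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


def frames_to_markdown(response_body: dict[str, Any]) -> str:
    """Join the Markdown fragments in `results` into a single document."""
    if response_body.get("status") != "completed":
        raise ValueError(f"review not completed: {response_body.get('status')!r}")
    return "\n\n".join(frame.strip() for frame in response_body["results"])
```

Joining the `results` entries with blank lines treats each entry as a Markdown section, which matches the shape of the sample response.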

5. Export Results

Generate formatted exports of your review results:

curl -X POST "localhost:8000/api/v1/export" \
  -H "Content-Type: application/json" \
  -d '{
    "topic": "retrieval augmented generation",
    "results_frames": [
      "# Literature Review: Retrieval Augmented Generation...",
      "## Key Findings...",
      "## Paper Analysis..."
    ],
    "num_papers": 5,
    "model": "auto",
    "session_id": "session_12345",
    "format_type": "markdown"
  }'

Expected Response:

{
  "content": "# Survey Studio Literature Review\n\n**Topic**: Retrieval Augmented Generation...",
  "filename": "literature_review_retrieval_augmented_generation_20250120_103045.md",
  "mime_type": "text/markdown",
  "format": "markdown",
  "metadata": {
    "topic": "retrieval augmented generation",
    "num_papers": 5,
    "model": "auto",
    "session_id": "session_12345",
    "generated_at": "2025-01-20T10:30:45.123456"
  }
}
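An export response carries both the file content and a suggested filename, so saving it to disk takes only a few lines. The sketch below assumes the response shape shown above; `save_export` is an illustrative helper, not a Survey Studio function.

```python
from pathlib import Path
from typing import Any


def save_export(export_body: dict[str, Any], directory: Path) -> Path:
    """Write the `content` of an export response to `directory` and return the path."""
    # Keep only the final path component so a response cannot write outside `directory`.
    filename = Path(export_body["filename"]).name
    if not filename:
        raise ValueError("export response has no usable filename")
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / filename
    target.write_text(export_body["content"], encoding="utf-8")
    return target
```

For example, `save_export(response_body, Path("exports"))` would write the Markdown report into an `exports` folder under the current directory.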

API Endpoints Reference

| Method | Endpoint | Description |
| --- | --- | --- |
| `GET` | `/health` | Service health check with provider status |
| `GET` | `/` | Basic service information |
| `GET` | `/providers` | Detailed AI provider configuration |
| `GET` | `/models` | Available AI models by provider |
| `POST` | `/api/v1/validate` | Validate review parameters (optional) |
| `POST` | `/api/v1/reviews` | Execute literature review |
| `POST` | `/api/v1/export` | Export review results to various formats |

Request/Response Schemas

Review Request

{
  "topic": "string (1-500 characters)",
  "num_papers": "integer (1-50)",
  "model": "string ('auto' for automatic selection)"
}

Export Request

{
  "topic": "string",
  "results_frames": ["array of strings"],
  "num_papers": "integer",
  "model": "string",
  "session_id": "string",
  "format_type": "string ('markdown' or 'html', default: 'markdown')"
}

Complete API Documentation

For detailed API specifications, request/response schemas, and interactive testing:

Visit: http://localhost:8000/docs (when server is running)

This provides the full OpenAPI/Swagger documentation with:

Complete endpoint specifications
Request/response examples
Interactive API testing interface
Model schemas and validation rules

Quick Verification

To confirm everything is working after setup:

Server Started Successfully: Look for INFO: Uvicorn running on http://0.0.0.0:8000
Health Check Passes: curl localhost:8000/health returns "status": "healthy"
API Documentation Accessible: Visit http://localhost:8000/docs in your browser

🔧 Configuration

AI Provider Configuration

Survey Studio supports multiple AI providers with intelligent fallback and cost optimization:

Supported Providers (in priority order):

Together AI - Best free tier, cost-effective for general tasks
Google Gemini - Fast and capable for complex analysis
Perplexity - Best for research with web access capabilities
OpenAI - Reliable fallback with consistent performance

Required API Keys (at least one):

TOGETHER_AI_API_KEY - Get from Together AI
GEMINI_API_KEY - Get from Google AI Studio
PERPLEXITY_API_KEY - Get from Perplexity
OPENAI_API_KEY - Get from OpenAI Platform

Optional Model Overrides:

TOGETHER_AI_MODEL - e.g., meta-llama/Llama-3.1-70B-Instruct-Turbo
GEMINI_MODEL - e.g., gemini-2.5-flash or gemini-1.5-pro
PERPLEXITY_MODEL - e.g., llama-3.1-sonar-huge-128k-online
OPENAI_MODEL - e.g., gpt-4o

🛠 Development Setup

Initial Setup

Install Poetry (if not already installed):

curl -sSL https://install.python-poetry.org | python3 -

Clone and install dependencies:

git clone https://github.com/Aditya-gam/survey-studio.git
cd survey-studio
poetry install

Install pre-commit hooks:
```
poetry run pre-commit install
```

Development Workflow

No venv activation required (use Poetry runner):

# Prefer prefixing commands with 'poetry run'
poetry run <command>

Run linting and formatting:

poetry run ruff check .
poetry run ruff format .

Type checking:
```
poetry run pyright
```
Pre-commit: run the full code quality pipeline locally:
```
poetry run pre-commit run --all-files
```

Code Quality Pipeline

The project expects every change to pass Ruff, Pyright, detect-secrets, and commit message validation with no outstanding violations.

Ruff formatting: opinionated code formatting, with imports sorted by Ruff's isort-compatible rules (the I rule set).
Ruff linting: rule sets enabled: E,W,F,I,B,C4,UP,N,SIM,TCH,ARG,PIE,PT,RET,SLF,TID,ERA,PL.
Type checking (Pyright): strict configuration; comprehensive type checking with Microsoft's Pyright.
Secrets scanning: detect-secrets with a committed baseline.
Commit messages: Conventional Commits validated by Commitizen.
Poetry checks: validates project metadata and lock consistency.

Project Structure

survey-studio/
├── src/
│   └── survey_studio/
│       ├── __init__.py          # Package initialization
│       ├── app.py              # Main module (redirects to API)
│       ├── api.py              # REST API functions
│       ├── backend.py          # AutoGen multi-agent backend
│       ├── config.py           # Configuration management
│       ├── export.py           # Export functionality
│       ├── orchestrator.py     # Main orchestrator
│       └── validation.py       # Input validation
├── .github/
│   └── workflows/             # CI/CD workflows
├── pyproject.toml             # Poetry configuration
├── .pre-commit-config.yaml    # Pre-commit hooks
├── .gitignore                 # Git ignore rules
├── CHANGELOG.md               # Project changelog
└── README.md                  # This file

📊 Technology Stack

Backend: AutoGen (multi-agent framework)
API: FastAPI, Pydantic, Uvicorn
Data Source: arXiv API
AI Models: Multiple providers with intelligent fallback
Development: Poetry, Ruff, Pyright
CI/CD: Pre-commit hooks, GitHub Actions

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

AutoGen for the multi-agent framework
FastAPI for the API framework
arXiv for providing access to academic papers
OpenAI and other AI providers for the language models

📞 Support

If you have questions or need help:

Check the documentation
Search existing issues
Create a new issue

Survey Studio - Accelerating research through AI-powered literature reviews ✨
