This project demonstrates automated prompt optimization for a database view selector agent using DSPy (Declarative Self-improving Python). The goal is to replace a handwritten, prompt-based system with an optimized, data-driven one that learns to select the most relevant Snowflake database views for a user query.
Domain: Financial data analysis for Private Equity, Real Estate, Infrastructure, Credit, and other investment platforms.
Problem: Given a natural language question about financial data (e.g., "What is MIC's current exposure in the USA in PE?"), the system must:
- Understand the question context and domain terminology
- Select the most relevant Snowflake database views from 20+ available views
- Handle financial classification rules (Asset Classes, Investment Classes, Platforms)
- Maintain conversation context for follow-up questions
Challenge: Handwritten prompts are difficult to maintain, don't learn from mistakes, and struggle with edge cases.
Optimize prompt engineering through automated learning instead of manual prompt crafting.
- Baseline Performance: Establish handwritten prompt performance metrics
- DSPy Implementation: Convert prompt logic into DSPy modules with Chain-of-Thought reasoning
- Prompt Optimization: Apply multiple DSPy optimizers to improve accuracy
- Performance Comparison: Quantify improvements using precision, recall, F1, and accuracy metrics
- Production Readiness: Create deployable optimized modules
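The comparison metrics listed above can be sketched for a single example as follows. This assumes view selection is scored as a set comparison between predicted and expected view names, and that "accuracy" means an exact match of the two sets; both are assumptions, as are the function name and the view names in the usage example.

```python
def view_selection_scores(predicted, expected):
    """Score one prediction against the expected views using set overlap."""
    pred, gold = set(predicted), set(expected)
    true_positives = len(pred & gold)

    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall) > 0
        else 0.0
    )
    # Assumption: accuracy is 1.0 only when the selected set matches exactly.
    exact_match = 1.0 if pred == gold else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "exact_match": exact_match,
    }


# Hypothetical view names: one extra view is selected beyond the two expected.
scores = view_selection_scores(
    predicted=["VW_EXPOSURE", "VW_PLATFORM", "VW_CASHFLOW"],
    expected=["VW_EXPOSURE", "VW_PLATFORM"],
)
print(scores)
```

In this example, recall is 1.0 because both expected views were found, while precision is 2/3 because of the extra view, giving an F1 of 0.8. Averaging these per-example scores over a validation set gives one number per optimizer to compare against the handwritten baseline.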
- Python 3.10+
- uv - Fast Python package installer and environment manager
uv sync

prompt-optimization-lab/
├── README.md
├── data/
│   ├── snowflake_metadata.yaml       # View metadata and descriptions
│   └── snowflake_view.json           # Structured view definitions
├── notebooks/
│   └── 01_data_preparation.ipynb     # Data loading and exploration
└── .venv/
uv run python src/optimizer/optimize_view_selector.py --optimizer labeledfewshot --k 10
uv run python src/optimizer/optimize_view_selector.py --optimizer gepa --train-size 20 --val-size 10
uv run python src/optimizer/optimize_view_selector.py --optimizer bootstrap --max-demos 10
uv run python src/optimizer/optimize_view_selector.py --optimizer all
uv run python src/optimizer/optimize_view_selector.py --optimizer all --output-dir results/exp1
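
For orientation, the flags used in the commands above could be parsed along these lines. This is a hypothetical sketch built only from the flags shown here, not the actual contents of `optimize_view_selector.py`; the help texts are guesses, and defaults are left as `None` because the real defaults are not documented in this section.

```python
import argparse


def build_parser():
    """Build a parser covering the flags that appear in the example commands."""
    parser = argparse.ArgumentParser(description="Optimize the view selector (sketch)")
    parser.add_argument(
        "--optimizer",
        choices=["labeledfewshot", "gepa", "bootstrap", "all"],
        required=True,
        help="Which optimizer to run; 'all' runs every optimizer",
    )
    # Optimizer-specific options; real defaults are unknown, so None is used here.
    parser.add_argument("--k", type=int, default=None, help="Used with labeledfewshot")
    parser.add_argument("--train-size", type=int, default=None, help="Used with gepa")
    parser.add_argument("--val-size", type=int, default=None, help="Used with gepa")
    parser.add_argument("--max-demos", type=int, default=None, help="Used with bootstrap")
    parser.add_argument("--output-dir", default=None, help="Where results are written")
    return parser


args = build_parser().parse_args(
    ["--optimizer", "gepa", "--train-size", "20", "--val-size", "10"]
)
print(args.optimizer, args.train_size, args.val_size)
```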