147 changes: 147 additions & 0 deletions INTEGRATION_SUMMARY.md
@@ -0,0 +1,147 @@
# Index Advisor Testing Framework Integration - Summary

## Overview
This document summarizes the changes that integrate the testing framework into the current Index Advisor implementation and add support for the new CSV format specification.

## Changes Made

### 1. Updated CSV Format Specification
The CSV format now includes the following columns as per requirements:
- **Category**: Test category (e.g., "Missing Index", "Unused Index")
- **Test Case**: Test case name/identifier
- **Tags**: Semicolon-separated tags for categorization
- **Collection**: Name of the collection to test
- **Positive / Negative**: Test type (Positive/Negative)
- **Query**: MongoDB query to test
- **Expected Index Advisor Suggestion**: Expected index creation/drop command
- **Explanation**: Description of the test scenario
- **Current Index**: Existing indexes on the collection
- **Comment**: Additional comments or notes

### 2. Updated Type Definitions (`test/indexAdvisor/types.ts`)
- Added new fields to `TestCase` interface:
  - `tags?: string`
  - `testType?: string` (for Positive/Negative)
  - `explanation?: string`
  - `currentIndex?: string`
  - `comment?: string`

- Added corresponding fields to `TestResult` interface to preserve input data in output
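
For orientation, the extended `TestCase` shape might look roughly like the sketch below. The five new optional fields come from the list above; the pre-existing fields and their types are inferred from the error-handling hunk in `runIndexAdvisorTests.ts` and are assumptions, and `splitTags` is a hypothetical helper that is not part of the change.

```typescript
// Sketch only: pre-existing fields and their types are assumed, not copied from types.ts.
interface TestCase {
    collectionName: string;
    category: string;
    scenarioDescription: string;
    expectedResult: string;
    query?: string;
    notes?: string;
    // Fields added for the new CSV format
    tags?: string; // semicolon-separated, e.g. "basic;single-field"
    testType?: string; // "Positive" or "Negative"
    explanation?: string;
    currentIndex?: string;
    comment?: string;
}

// Hypothetical helper: split the raw Tags cell into individual, trimmed tags.
function splitTags(tags?: string): string[] {
    return (tags ?? "")
        .split(";")
        .map((t) => t.trim())
        .filter((t) => t.length > 0);
}
```

Keeping `tags` as a single string mirrors the CSV cell; splitting on demand avoids changing the stored shape.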

### 3. Updated CSV Parser (`test/indexAdvisor/utils.ts`)
- Modified `loadTestCases()` function to:
  - Support new CSV column names (case-insensitive)
  - Handle both "Positive / Negative" and "Positive/Negative" column names
  - Maintain backward compatibility with old CSV format
  - Use `findIndex()` for flexible column matching
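
A minimal sketch of how `findIndex()` can tolerate the spacing and case variants mentioned above. `findColumn` is a hypothetical helper, not the actual code in `utils.ts`; stripping all whitespace before comparing is an assumption chosen for illustration.

```typescript
// Hypothetical helper: return the index of the first header matching any candidate name.
// Comparison ignores case and all whitespace, so "Positive / Negative" and
// "positive/negative" resolve to the same column.
function findColumn(headers: string[], ...candidates: string[]): number {
    const normalize = (s: string): string => s.replace(/\s+/g, "").toLowerCase();
    const wanted = candidates.map(normalize);
    return headers.findIndex((h) => wanted.indexOf(normalize(h)) !== -1);
}

const headers = ["Category", "Test Case", "positive/negative", "Query"];
const testTypeIdx = findColumn(headers, "Positive / Negative"); // 2
const missingIdx = findColumn(headers, "Tags"); // -1
```

A return value of `-1` signals that the column is absent, which lets the caller treat optional columns as missing rather than failing.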

### 4. Updated Output Format (`test/indexAdvisor/utils.ts`)
- Modified `saveResultsAsCSV()` to output all input columns plus result columns:
  - All original CSV columns preserved
  - Additional result columns:
    - Suggested Indexes
    - If Matches Expected
    - Analysis
    - Execution Plan (Sanitized)
    - Updated Execution Plan
    - Query Performance (ms)
    - Updated Performance (ms)
    - Performance Improvement (%)
    - Collection Stats
    - Index Stats
    - Model Used
    - Errors
    - Timestamp
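
Result columns such as Analysis and Execution Plan usually contain commas, quotes, or line breaks, so each output cell needs quoting. The sketch below shows one common way to do that; `escapeCsvCell` and `toCsvRow` are hypothetical names, and this is not the actual body of `saveResultsAsCSV()`.

```typescript
// Quote a cell when it contains a comma, quote, or line break (RFC 4180 style).
function escapeCsvCell(value: unknown): string {
    const text = value === undefined || value === null ? "" : String(value);
    return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Join already-ordered cells into one CSV line.
function toCsvRow(cells: unknown[]): string {
    return cells.map(escapeCsvCell).join(",");
}

const row = toCsvRow(["users", "db.users.find({a: 1, b: 2})", true, undefined]);
// row === 'users,"db.users.find({a: 1, b: 2})",true,'
```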

### 5. Updated Test Runner (`test/indexAdvisor/testRunner.ts`)
- Modified `executeTestCase()` to propagate new fields from TestCase to TestResult

### 6. Updated Command Integration (`src/commands/llmEnhancedCommands/runIndexAdvisorTests.ts`)
- Updated error handling to include all new fields in failed test results

### 7. Updated Example Files
- **test/indexAdvisor/test-cases.example.csv**: Updated with new CSV format containing 6 example test cases
- **test/indexAdvisor/test-config.example.json**: Changed from `clusterId` to `connectionString` as primary connection method

### 8. Added Documentation
- Created comprehensive README (`test/indexAdvisor/README.md`) documenting:
  - CSV mode vs Directory mode
  - Configuration file format
  - CSV test cases format
  - Output format
  - Running tests
  - Performance measurement
  - Best practices
  - Troubleshooting

## Key Features

### Backward Compatibility
The CSV parser maintains backward compatibility with the old format by:
- Checking for both new and old column names
- Falling back to old format parsing if new columns are not found
- Using case-insensitive header matching
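
One way to picture the fallback is an alias table that lists the new column name first and the legacy name second. This is a sketch under assumptions: the legacy names are taken from the old example CSV header (`collectionName,query,expectedResult,notes`), while `COLUMN_ALIASES` and `resolveColumns` are hypothetical and may not match the real parser.

```typescript
// Assumed alias table: new column name first, then the legacy name.
const COLUMN_ALIASES: Record<string, string[]> = {
    collectionName: ["Collection", "collectionName"],
    query: ["Query"],
    expectedResult: ["Expected Index Advisor Suggestion", "expectedResult"],
};

// Resolve each logical field to a column index, or -1 if no alias is present.
function resolveColumns(headerRow: string[]): Record<string, number> {
    const lower = headerRow.map((h) => h.trim().toLowerCase());
    const resolved: Record<string, number> = {};
    for (const field of Object.keys(COLUMN_ALIASES)) {
        const aliases = COLUMN_ALIASES[field].map((a) => a.toLowerCase());
        resolved[field] = lower.findIndex((h) => aliases.indexOf(h) !== -1);
    }
    return resolved;
}

const oldFormat = resolveColumns(["collectionName", "query", "expectedResult", "notes"]);
const newFormat = resolveColumns(["Category", "Test Case", "Tags", "Collection", "Positive / Negative", "Query"]);
```

With the old header, `collectionName` resolves to column 0; with the new header, the same field resolves to column 3 via the `Collection` alias.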

### Flexible Configuration
- Supports both `connectionString` and `clusterId` in configuration
- `connectionString` is now the recommended approach for CSV mode
- Parses credentials automatically from connection string
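
As an illustration of credential parsing, a single-host connection string can be split with the standard WHATWG `URL` class. This is a sketch, not the framework's parser: `parseConnectionString` and `ParsedConnection` are hypothetical names, and multi-host seed lists are deliberately out of scope.

```typescript
// Sketch: parse a single-host mongodb:// URI with the WHATWG URL class.
// Multi-host seed lists (host1:27017,host2:27017) are not handled here.
interface ParsedConnection {
    username: string;
    password: string;
    host: string;
    database: string;
}

function parseConnectionString(uri: string): ParsedConnection {
    const url = new URL(uri);
    return {
        // URL keeps credentials percent-encoded, so decode them for use.
        username: decodeURIComponent(url.username),
        password: decodeURIComponent(url.password),
        host: url.host,
        database: url.pathname.replace(/^\//, ""),
    };
}

const parsed = parseConnectionString(
    "mongodb://alice:p%40ss@localhost:27017/testDatabase?authSource=admin",
);
```

Note that special characters in the password must be percent-encoded in the connection string (here `%40` for `@`), otherwise the URI is ambiguous.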

### Comprehensive Testing
The framework now:
1. Runs queries from CSV on specified collections
2. Records execution plans
3. Gets Index Advisor suggestions
4. Compares suggestions with expected results
5. Optionally measures performance with and without suggested indexes
6. Outputs all data to CSV and JSON formats

## Testing Performed

Manual validation was performed to ensure:
- ✅ CSV parser correctly handles new format
- ✅ Case-insensitive header matching works
- ✅ Multiple test cases are parsed correctly
- ✅ Empty lines are handled
- ✅ Quoted fields with commas are parsed correctly
- ✅ All new fields are preserved in output
- ✅ Build succeeds without errors
- ✅ No linting errors
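
The quoted-field case can be reproduced with a small line splitter like the one below. It is a minimal sketch for checking expectations by hand, not the parser used in `utils.ts`; `parseCsvLine` is a hypothetical name, and line breaks inside quoted fields are not supported.

```typescript
// Minimal CSV line splitter: handles quoted fields, embedded commas, and "" escapes.
// It does not handle line breaks inside quoted fields.
function parseCsvLine(line: string): string[] {
    const fields: string[] = [];
    let current = "";
    let inQuotes = false;
    for (let i = 0; i < line.length; i++) {
        const ch = line[i];
        if (inQuotes) {
            if (ch === '"' && line[i + 1] === '"') {
                current += '"'; // doubled quote inside a quoted field
                i++;
            } else if (ch === '"') {
                inQuotes = false; // closing quote
            } else {
                current += ch;
            }
        } else if (ch === '"') {
            inQuotes = true; // opening quote
        } else if (ch === ",") {
            fields.push(current);
            current = "";
        } else {
            current += ch;
        }
    }
    fields.push(current);
    return fields;
}

const cells = parseCsvLine('users,"db.users.find({a: 1, b: 2})",None');
// cells has 3 entries; the query keeps its embedded comma
```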

## Usage Example

1. Create config file:
```json
{
  "connectionString": "mongodb://user:pass@host:port/db",
  "databaseName": "testDatabase"
}
```

2. Create CSV test file:
```csv
Category,Test Case,Tags,Collection,Positive / Negative,Query,Expected Index Advisor Suggestion,Explanation,Current Index,Comment
Missing Index,Test 1,basic,users,Positive,db.users.find({user_id: 1}),"db.getCollection('users').createIndex({'user_id':1},{})","Single field index test",None,Comment
```

3. Run via VS Code Command: "DocumentDB: Run Index Advisor Tests"

4. Review output CSV and JSON files with all test results

## Benefits

1. **Structured Testing**: New CSV format provides clear test organization
2. **Comprehensive Results**: Output includes all input data plus detailed results
3. **Easy Analysis**: Tags and categories help organize and filter tests
4. **Performance Insights**: Optional performance measurement shows impact of suggestions
5. **Documentation**: Clear documentation helps users create and run tests
6. **Backward Compatible**: Existing directory-based tests still work

## Next Steps

Users can now:
1. Create test collections in their database
2. Populate with representative data
3. Create CSV test cases using the new format
4. Run batch tests to validate Index Advisor suggestions
5. Analyze results to improve index recommendations
20 changes: 2 additions & 18 deletions package-lock.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

7 changes: 7 additions & 0 deletions src/commands/llmEnhancedCommands/runIndexAdvisorTests.ts
@@ -247,6 +247,13 @@ export async function runIndexAdvisorTests(context: IActionContext): Promise<voi
                 category: testCase.category,
                 scenarioDescription: testCase.scenarioDescription,
                 expectedResult: testCase.expectedResult,
+                query: testCase.query,
+                tags: testCase.tags,
+                testType: testCase.testType,
+                explanation: testCase.explanation,
+                currentIndex: testCase.currentIndex,
+                comment: testCase.comment,
+                notes: testCase.notes,
                 errors: error instanceof Error ? error.message : String(error),
                 timestamp: new Date().toISOString(),
             });
162 changes: 162 additions & 0 deletions test/indexAdvisor/README.md
@@ -0,0 +1,162 @@
# Index Advisor Testing Framework

This testing framework allows you to run batch tests for the Index Advisor feature using either CSV-based test cases or directory-based test cases.

## Overview

The testing framework supports two modes:
1. **CSV Mode**: Run tests from a CSV file with query execution and performance measurement
2. **Directory Mode**: Run tests from pre-loaded execution plans (no database connection required)

## CSV Mode

### Configuration File

Create a JSON configuration file with the following structure:

```json
{
  "connectionString": "mongodb://username:password@host:port/database?authSource=admin",
  "databaseName": "testDatabase",
  "preferredModel": "gpt-4o",
  "fallbackModels": ["gpt-4o-mini"],
  "shouldWarmup": true,
  "connectionTimeout": 30000,
  "queryTimeout": 60000
}
```

**Required Fields:**
- `connectionString`: MongoDB connection string to the test cluster
- `databaseName`: Name of the database containing test collections

**Optional Fields:**
- `preferredModel`: Preferred AI model for index recommendations (default: "gpt-4o")
- `fallbackModels`: Array of fallback models if preferred model is unavailable
- `shouldWarmup`: Whether to warm up the connection before tests (default: true)
- `connectionTimeout`: Connection timeout in milliseconds (default: 30000)
- `queryTimeout`: Query timeout in milliseconds (default: 60000)
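
The documented defaults can be expressed as a small typed helper. This is a sketch, assuming the field list above is complete: `TestConfig` and `withDefaults` are hypothetical names, and since no default is documented for `fallbackModels`, an empty array is assumed.

```typescript
interface TestConfig {
    connectionString: string;
    databaseName: string;
    preferredModel?: string;
    fallbackModels?: string[];
    shouldWarmup?: boolean;
    connectionTimeout?: number;
    queryTimeout?: number;
}

// Defaults as documented above; fallbackModels has no documented default, so [] is assumed.
function withDefaults(config: TestConfig): Required<TestConfig> {
    if (!config.connectionString || !config.databaseName) {
        throw new Error("connectionString and databaseName are required");
    }
    return {
        connectionString: config.connectionString,
        databaseName: config.databaseName,
        preferredModel: config.preferredModel ?? "gpt-4o",
        fallbackModels: config.fallbackModels ?? [],
        shouldWarmup: config.shouldWarmup ?? true,
        connectionTimeout: config.connectionTimeout ?? 30000,
        queryTimeout: config.queryTimeout ?? 60000,
    };
}
```

Using `??` rather than `||` keeps an explicit `shouldWarmup: false` or a timeout of `0` from being overwritten by the default.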

### CSV Test Cases File

Create a CSV file with the following columns:

| Column | Description | Required |
|--------|-------------|----------|
| Category | Test category (e.g., "Missing Index", "Unused Index") | Yes |
| Test Case | Test case name/identifier | Yes |
| Tags | Semicolon-separated tags for categorization | No |
| Collection | Name of the collection to test | Yes |
| Positive / Negative | Test type (Positive/Negative) | No |
| Query | MongoDB query to test (e.g., "db.users.find({user_id: 1234})") | Yes |
| Expected Index Advisor Suggestion | Expected index creation/drop command | Yes |
| Explanation | Description of the test scenario | No |
| Current Index | Existing indexes on the collection | No |
| Comment | Additional comments or notes | No |

**Example CSV:**

```csv
Category,Test Case,Tags,Collection,Positive / Negative,Query,Expected Index Advisor Suggestion,Explanation,Current Index,Comment
Missing Index,Test Case 1,basic;single-field,users,Positive,db.users.find({user_id: 1234}),"db.getCollection('users').createIndex({'user_id':1},{})","No single index exists for the query field user_id",None,Basic single-field index creation test
```

### Output Format

The testing framework generates two output files:

1. **CSV File**: Contains all input columns plus result columns:
   - Suggested Indexes
   - If Matches Expected (true/false)
   - Analysis
   - Execution Plan (Sanitized)
   - Updated Execution Plan (if performance measurement enabled)
   - Query Performance (ms)
   - Updated Performance (ms)
   - Performance Improvement (%)
   - Collection Stats
   - Index Stats
   - Model Used
   - Errors
   - Timestamp

2. **JSON File**: Structured JSON output with metadata and detailed results

### Running Tests

1. Open the Command Palette (Ctrl+Shift+P, or Cmd+Shift+P on macOS)
2. Run the command: "DocumentDB: Run Index Advisor Tests"
3. Select the configuration file (test-config.json)
4. Select the test cases file (CSV file)
5. Choose whether to measure performance (CSV mode only)
6. Select the output location for results

### Performance Measurement

When performance measurement is enabled (CSV mode only), the framework will:
1. Execute the query and measure initial performance
2. Apply the suggested index changes
3. Re-execute the query and measure updated performance
4. Calculate performance improvement percentage
5. Restore the original index state

**Note**: Performance measurement is slower and modifies indexes temporarily. Choose "Skip Performance Measurement" for faster testing without database modifications.
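
The improvement percentage in step 4 can be read as relative time saved. The formula below is an assumption, since the exact calculation is not spelled out here, and `improvementPercent` is a hypothetical helper.

```typescript
// Assumed formula: positive values mean the query got faster after the index change.
function improvementPercent(beforeMs: number, afterMs: number): number | undefined {
    if (!(beforeMs > 0)) {
        return undefined; // no meaningful baseline, so avoid dividing by zero
    }
    return ((beforeMs - afterMs) / beforeMs) * 100;
}

improvementPercent(200, 50); // 75
improvementPercent(100, 150); // -50 (the query got slower)
```

Under this definition a negative value flags a regression, which is worth surfacing for Negative test cases where an index is dropped.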

## Directory Mode

Directory mode allows testing with pre-loaded execution plans without requiring a database connection.

### Directory Structure

```
testcases/
  test-case-1/
    description.json
    executionPlan.json
    collectionStats.json (optional)
    indexStats.json (optional)
  test-case-2/
    ...
```

**description.json format:**
```json
{
  "collectionName": "users",
  "category": "Missing Index",
  "description": "Test single field index",
  "expectedResults": "db.getCollection('users').createIndex({'user_id':1},{})"
}
```
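
A `description.json` file can be checked before a run with a small validator like the sketch below. It assumes all four keys shown above are required strings, which the format description does not state explicitly; `TestDescription` and `parseDescription` are hypothetical names.

```typescript
interface TestDescription {
    collectionName: string;
    category: string;
    description: string;
    expectedResults: string;
}

// Hypothetical validator: parse description.json text and check the four documented keys.
function parseDescription(jsonText: string): TestDescription {
    const raw = JSON.parse(jsonText) as Record<string, unknown>;
    for (const key of ["collectionName", "category", "description", "expectedResults"]) {
        if (typeof raw[key] !== "string") {
            throw new Error(`description.json: "${key}" must be a string`);
        }
    }
    return raw as unknown as TestDescription;
}
```

Failing early with the offending key named tends to be easier to debug than a later error inside the test runner.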

## Best Practices

1. **Test Organization**: Use meaningful categories and tags to organize tests
2. **Test Data**: Ensure test collections have representative data for accurate results
3. **Expected Results**: Write expected index suggestions in MongoDB shell format
4. **Performance Testing**: Use performance measurement only when needed, as it's slower
5. **Version Control**: Keep test cases in version control for reproducibility

## Troubleshooting

- **Connection Issues**: Verify connection string format and credentials
- **Query Parsing Errors**: Ensure queries are in valid MongoDB shell format
- **Missing Collections**: Verify all collections exist in the test database
- **Performance Measurement Failures**: Check that user has permissions to create/drop indexes

## Example Workflow

1. Create test collections in your test database
2. Populate collections with representative data
3. Create configuration file with connection details
4. Create CSV file with test cases
5. Run tests using the VS Code command
6. Review results in the output CSV and JSON files
7. Analyze performance improvements and match rates

## Notes

- All test collections should exist in the database specified in the configuration
- The framework uses AI models to generate index suggestions, so results may vary
- CSV mode requires database connection; directory mode does not
- Performance measurement temporarily modifies indexes but restores original state
14 changes: 7 additions & 7 deletions test/indexAdvisor/test-cases.example.csv
@@ -1,7 +1,7 @@
-collectionName,query,expectedResult,notes
-a,"Missing Index","No single index exists for the query","db.a.find({user_id: 1234})","db.getCollection('a').createIndex({'user_id':1},{})"
-b,"Missing Index","No single index exists for the query","db.b.find({age: {$gt: 30, $lt: 40}})","db.getCollection('b').createIndex({'age':1},{})"
-c,"Missing Index","Composite index for find","db.c.find({country: 'US', city: 'NY'})","db.getCollection('c').createIndex({'country':1,'city':1},{'storageEngine':{'enableOrderedIndex':true}})"
-d,"Missing Index","Composite index for find and sort","db.d.find({score: {$gt: 42, $lt: 10340}}).sort({timestamp: -1})","db.getCollection('d').createIndex({'score':1,'timestamp':-1},{'name':'score_1_timestamp_-1','storageEngine':{'enableOrderedIndex':true}})"
-f,"Unused Index","Index is not needed on low selectivity field","db.f.find({gender: 'F'})","db.getCollection('f').dropIndex({'gender':1})"
-g,"No Index","Index is not needed on small collection","db.g.find({flag: true})","db.getCollection('g').dropIndex({'flag':1})"
+Category,Test Case,Tags,Collection,Positive / Negative,Query,Expected Index Advisor Suggestion,Explanation,Current Index,Comment
+Missing Index,Test Case 1,basic;single-field,users,Positive,db.users.find({user_id: 1234}),"db.getCollection('users').createIndex({'user_id':1},{})","No single index exists for the query field user_id",None,Basic single-field index creation test
+Missing Index,Test Case 2,basic;range-query,orders,Positive,"db.orders.find({age: {$gt: 30, $lt: 40}})","db.getCollection('orders').createIndex({'age':1},{})","No index exists for range query on age field",None,Range query requires index for better performance
+Missing Index,Test Case 3,composite;multiple-fields,locations,Positive,"db.locations.find({country: 'US', city: 'NY'})","db.getCollection('locations').createIndex({'country':1,'city':1},{'storageEngine':{'enableOrderedIndex':true}})","Composite index needed for multi-field query",None,Multi-field query optimization
+Missing Index,Test Case 4,composite;sort,events,Positive,db.events.find({score: {$gt: 42}}).sort({timestamp: -1}),"db.getCollection('events').createIndex({'score':1,'timestamp':-1},{'name':'score_1_timestamp_-1','storageEngine':{'enableOrderedIndex':true}})","Composite index for filter and sort operations",None,Filter with sort requires compound index
+Unused Index,Test Case 5,optimization;low-selectivity,profiles,Negative,db.profiles.find({gender: 'F'}),"db.getCollection('profiles').dropIndex({'gender':1})","Index not needed on low selectivity field","{'gender':1}",Low cardinality field doesn't benefit from index
+No Index Needed,Test Case 6,optimization;small-collection,flags,Negative,db.flags.find({flag: true}),"db.getCollection('flags').dropIndex({'flag':1})","Index not needed on small collection","{'flag':1}",Small collections are better scanned
2 changes: 1 addition & 1 deletion test/indexAdvisor/test-config.example.json
@@ -1,5 +1,5 @@
 {
-    "clusterId": "",
+    "connectionString": "mongodb://username:password@host:port/database?authSource=admin",
     "databaseName": "testDatabase",
     "preferredModel": "gpt-4o",
     "fallbackModels": ["gpt-4o-mini"],