Testing Guide

Overview

This document describes the comprehensive test suite for the metrics-processor project, including unit tests, integration tests, E2E tests, coverage measurement, and regression protection.

Test Organization

Test Structure

tests/
├── fixtures/                      # Shared test fixtures and utilities
│   ├── mod.rs                     # Module declaration
│   ├── configs.rs                 # YAML configuration fixtures
│   ├── graphite_responses.rs      # Mock Graphite response data
│   └── helpers.rs                 # Test helper functions and assertions
├── docker/                        # Docker setup for E2E tests
│   ├── docker-compose.yml         # go-carbon + carbonapi containers
│   ├── go-carbon.conf             # Carbon storage configuration
│   ├── carbonapi.yml              # CarbonAPI configuration
│   └── README.md                  # Docker setup documentation
├── integration_e2e_reporter.rs    # E2E tests with real Graphite
├── integration_health.rs          # Service health integration tests
├── integration_api.rs             # API integration tests
└── documentation_validation.rs    # Documentation validation tests

src/
├── common.rs                      # + #[cfg(test)] mod tests { 11 tests }
├── types.rs                       # + #[cfg(test)] mod tests { 6 tests }
├── config.rs                      # + #[cfg(test)] mod tests { 7 tests }
├── graphite.rs                    # + #[cfg(test)] mod tests { 6 tests }
└── api/v1.rs                      # + #[cfg(test)] mod tests { 4 tests }

Test Categories

  1. Unit Tests: Located inline with source code using #[cfg(test)] modules (see the sketch after this list)
  2. Integration Tests: Located in tests/ directory for cross-module scenarios
  3. E2E Tests: Full pipeline tests with real Docker containers
  4. Fixtures: Shared test data and utilities in tests/fixtures/
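
As a shape reference, here is a minimal sketch of the inline style used in src/common.rs and friends; is_below and the assertion message are illustrative, not the project's actual code:

fn is_below(value: f64, threshold: f64) -> bool {
    value < threshold
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_lt_operator_below_threshold() {
        assert!(is_below(0.5, 1.0), "0.5 should flag as below threshold 1.0");
    }
}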

Running Tests

Run All Unit Tests

cargo test

Run Specific Test Module

# Run only common module tests
cargo test common::tests

# Run only config tests
cargo test config::tests

# Run only integration tests (excluding E2E); quote the glob so the
# shell passes it to cargo unexpanded
cargo test --test 'integration_*'

Run Tests with Output

cargo test -- --nocapture

Run Tests with a Fixed Thread Count

Tests run in parallel by default (one thread per logical CPU). To cap the count:

cargo test -- --test-threads=4

Run Tests Sequentially

cargo test -- --test-threads=1

E2E Integration Tests

The E2E tests (integration_e2e_reporter.rs) validate the complete metrics-processor pipeline using real Docker containers.

What They Test

  • Metrics ingestion via Carbon protocol
  • Query processing via CarbonAPI
  • Health expression evaluation
  • Incident creation and severity assignment
  • Reporter log output format and content

Prerequisites

  1. Docker: Must be installed and running
  2. Available Ports:
    • 2003 - Carbon plaintext protocol
    • 8080 - CarbonAPI query endpoint
    • 3005 - Convertor API
    • 9999 - Mock Status Dashboard

Running E2E Tests

# Run E2E tests (Docker containers managed automatically)
cargo test --test integration_e2e_reporter -- --ignored --nocapture

The test automatically:

  1. Restarts Docker containers to ensure clean state
  2. Builds convertor and reporter binaries
  3. Runs all 4 test scenarios
  4. Validates log output for each scenario
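
For reference, a datapoint reaches go-carbon as a single line of the Carbon plaintext protocol: "<metric.path> <value> <unix_timestamp>\n" sent to port 2003. A minimal sketch of such a write (the metric name is illustrative):

use std::io::Write;
use std::net::TcpStream;
use std::time::{SystemTime, UNIX_EPOCH};

fn send_datapoint(metric: &str, value: f64) -> std::io::Result<()> {
    // One line per datapoint: "<path> <value> <timestamp>\n"
    let ts = SystemTime::now().duration_since(UNIX_EPOCH).unwrap().as_secs();
    let mut stream = TcpStream::connect("127.0.0.1:2003")?;
    writeln!(stream, "{metric} {value} {ts}")?;
    Ok(())
}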

Test Scenarios

Scenario         Weight  Condition                 Expected Log
healthy          0       All metrics OK            No incident log
degraded_slow    1       Response time > 1200ms    matched_expression="...api_slow..."
degraded_errors  1       Success rate < 65%        matched_expression="...api_success_rate_low..."
outage           2       100% failures             matched_expression="...api_down"

E2E Test Architecture

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Test Code  │────▶│  go-carbon  │────▶│  carbonapi  │
│(write data) │     │  (storage)  │     │   (query)   │
└─────────────┘     └─────────────┘     └─────────────┘
                                               │
      ┌────────────────────────────────────────┘
      ▼
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Convertor  │────▶│  Reporter   │────▶│ Mock Status │
│  (process)  │     │   (alert)   │     │  Dashboard  │
└─────────────┘     └─────────────┘     └─────────────┘
      │                   │
      │                   ▼
      │             ┌─────────────┐
      └────────────▶│  Log Output │◀── Test validates
                    │  (stdout)   │
                    └─────────────┘

Data Isolation

Each scenario uses a unique service name (e.g., rms_healthy, rms_outage) to prevent data pollution between scenarios. This allows sequential execution without clearing Graphite data.
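
A minimal sketch of the convention, assuming a helper that derives the metric prefix from the scenario name (the helper and the success_rate metric are hypothetical):

// Each scenario writes under its own namespace, so queries for one
// scenario never match datapoints written by another.
fn metric_path(scenario: &str, metric: &str) -> String {
    // e.g. metric_path("outage", "success_rate") -> "stats.rms_outage.success_rate"
    format!("stats.rms_{scenario}.{metric}")
}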

E2E Troubleshooting

"Docker containers failed to start"

# Check Docker is running
docker ps

# Check port availability
lsof -i :2003 -i :8080 -i :3005 -i :9999

# Manually start containers
cd tests/docker && docker compose up -d

"Convertor not ready after timeout"

  • The convertor may need more time to start
  • Check for port conflicts on 3005
  • Review convertor stderr output

"No incident log found"

# Verify Graphite received data
curl 'http://localhost:8080/metrics/find?query=stats.*&format=json'

# Check go-carbon logs
docker logs metrics-processor-go-carbon

"Log validation failed"

  • Verify expected expression matches the config
  • Check for ANSI escape codes (the test strips them automatically; see the sketch below)
  • Review the full log output in test output
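
The stripping step can be reproduced outside the test; a minimal sketch using the regex crate (the test's actual implementation is not shown here):

use regex::Regex;

// Remove ANSI SGR sequences such as "\x1b[31m" before matching log lines.
fn strip_ansi(input: &str) -> String {
    let re = Regex::new(r"\x1b\[[0-9;]*m").unwrap();
    re.replace_all(input, "").into_owned()
}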

Test Coverage

Measuring Coverage

This project uses cargo-tarpaulin for code coverage measurement.

Install cargo-tarpaulin

cargo install cargo-tarpaulin

Generate Coverage Report

# Generate XML report (for CI/CD)
cargo tarpaulin --out Xml --output-dir ./coverage

# Generate HTML report (for local viewing)
cargo tarpaulin --out Html --output-dir ./coverage

# Generate both
cargo tarpaulin --out Xml --out Html --output-dir ./coverage

Coverage Thresholds

The project enforces a 95% coverage threshold for core business functions:

cargo tarpaulin --fail-under 95

CI/CD Integration

Coverage is automatically measured in CI via GitHub Actions (.github/workflows/coverage.yml):

  • Runs on every push and pull request
  • Generates XML report for Codecov
  • Generates HTML report as artifact
  • Fails build if coverage drops below 95%

Test Execution Time

Target: Under 2 Minutes

The test suite is designed to execute quickly to enable rapid development feedback.

Current Performance:

  • Unit tests: 44 tests in ~0.05 seconds
  • Integration tests: 8 tests in ~0.05 seconds
  • Doc tests: 7 tests
  • Total: 59 tests in < 0.2 seconds

Measuring Test Time

time cargo test

Regression Protection

Validation Approach

The test suite is designed to catch breaking changes immediately:

  1. Comprehensive Coverage: All core business logic is tested
  2. Clear Assertions: Tests use descriptive error messages
  3. Edge Cases: Boundary conditions, null values, negative numbers
  4. Operator Testing: All comparison operators (Lt, Gt, Eq) validated (see the sketch after this list)
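
A hedged sketch of what table-driven operator coverage can look like; the comparisons are modeled as closures here because the project's CmpType enum is not reproduced in this guide:

#[test]
fn test_all_operators_at_boundaries() {
    // (value, threshold, comparison, expected, label)
    let cases: &[(f64, f64, fn(f64, f64) -> bool, bool, &str)] = &[
        (0.999, 1.0, |x, t| x < t,  true,  "Lt just below threshold"),
        (1.001, 1.0, |x, t| x < t,  false, "Lt just above threshold"),
        (1.001, 1.0, |x, t| x > t,  true,  "Gt just above threshold"),
        (1.0,   1.0, |x, t| x == t, true,  "Eq at threshold"),
    ];
    for (value, threshold, cmp, expected, label) in cases {
        assert_eq!(cmp(*value, *threshold), *expected, "case failed: {label}");
    }
}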

Manual Regression Validation

To verify the test suite catches breaking changes:

  1. Backup the source file:

    cp src/common.rs src/common.rs.backup
  2. Introduce a breaking change (e.g., swap Lt and Gt operators):

    // Edit src/common.rs and swap the operators
    CmpType::Lt => x > metric.threshold,  // Wrong!
    CmpType::Gt => x < metric.threshold,  // Wrong!
  3. Run tests (should fail):

    cargo test common::tests
  4. Verify failures with clear error messages

  5. Restore original:

    mv src/common.rs.backup src/common.rs

Expected Behavior

When breaking changes are introduced:

  • ✓ Tests fail immediately
  • ✓ Error messages clearly indicate the problem
  • ✓ Multiple tests catch the same logical error (redundancy)
  • ✓ Zero false positives

Test Coverage by Feature

Phase 1: Core Metric Flag Evaluation (US1)

  • Tests: 11 unit tests
  • Coverage: Lt, Gt, Eq operators
  • Edge Cases: None values, boundaries, negative numbers, zero threshold
  • Status: ✅ Complete (100% coverage)

Phase 2: Service Health Aggregation (US2)

  • Tests: 11 tests (8 unit + 3 integration)
  • Coverage: Expression evaluation, weight calculation, OR/AND operators
  • Edge Cases: Unknown service/environment, empty datapoints, partial data
  • Status: ✅ Complete

Phase 3: Configuration Processing (US4)

  • Tests: 11 tests (6 in types.rs + 5 in config.rs)
  • Coverage: Template variables, validation, defaults, multi-source config
  • Status: ✅ Complete (100% coverage on config.rs)

Phase 4: API Endpoints (US3)

  • Tests: 10 tests (4 unit + 6 integration)
  • Coverage: REST API handlers, error responses, Graphite compatibility
  • Status: ✅ Complete

Phase 5: Graphite Integration (US5)

  • Tests: 6 tests
  • Coverage: Query building, metrics discovery, utility endpoints
  • Status: ✅ Complete

Current Coverage

Overall Library Coverage: 71.56% (307/429 lines)

Module Coverage Status

Module             Coverage
src/config.rs      100.0%
src/common.rs       89.3%
src/types.rs        82.6%
src/api/v1.rs       74.4%
src/graphite.rs     56.8%

Core Business Functions (config + common + types): 89.9%

Best Practices

Writing Tests

  1. Use Descriptive Names: Test names should clearly indicate what they test

    #[test]
    fn test_lt_operator_below_threshold() { ... }
  2. Use Custom Assertions: Leverage helpers for better error messages

    use crate::fixtures::helpers::assert_metric_flag;
    assert_metric_flag(result, expected, "Lt operator with negative values");
  3. Test Edge Cases: Always include boundary conditions

    • None/null values
    • Zero values
    • Negative values
    • Boundary values (threshold ± 0.001)
  4. Isolate Tests: Each test should be independent (see the mockito sketch after this list)

    • Use mockito::Server::new() for per-test isolation
    • Avoid shared mutable state
    • Clean up resources in test teardown
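
A minimal sketch of per-test isolation, assuming mockito 1.x and reqwest's blocking client; the /render endpoint and empty JSON body are illustrative:

#[test]
fn test_graphite_query_against_mock() {
    // Each test gets its own server, so tests can run in parallel
    // without sharing state.
    let mut server = mockito::Server::new();
    let mock = server
        .mock("GET", "/render")
        .with_status(200)
        .with_header("content-type", "application/json")
        .with_body("[]")
        .create();

    let body = reqwest::blocking::get(format!("{}/render", server.url()))
        .unwrap()
        .text()
        .unwrap();

    assert_eq!(body, "[]");
    mock.assert(); // fails the test if the endpoint was never hit
}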

Test Data Management

  1. Use Fixtures: Centralize test data in tests/fixtures/ (example below)
  2. Reuse Helpers: Use helper functions for common setup
  3. Mock External Services: Never call real external APIs in tests
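
A hypothetical fixture in tests/fixtures/configs.rs; the YAML shape is illustrative, not the project's actual schema:

// Centralizing config strings keeps individual tests short and consistent.
pub fn minimal_service_config() -> &'static str {
    r#"
services:
  - name: rms_healthy
    metrics:
      - path: stats.rms_healthy.response_time
        cmp: Gt
        threshold: 1200
"#
}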

Troubleshooting

Tests Failing Unexpectedly

  1. Check for state pollution: Ensure tests are isolated
  2. Rebuild from scratch: cargo clean && cargo test
  3. Check for race conditions: Run with --test-threads=1

Coverage Too Low

  1. Identify uncovered code: cargo tarpaulin --out Html
  2. Add tests for uncovered branches
  3. Focus on core business logic first

Tests Running Too Slow

  1. Find slow tests: run cargo test -- --nocapture and watch where output stalls
  2. Check for unnecessary sleeps or timeouts
  3. Use mocks instead of real HTTP calls

Contributing

When adding new features:

  1. Write tests first (TDD approach)
  2. Ensure tests fail before implementation
  3. Implement feature until tests pass
  4. Verify coverage meets threshold
  5. Update this documentation if adding new test categories

References