Skip to content

hildensia/bayesian_changepoint_detection

Repository files navigation

Bayesian Changepoint Detection

Python 3.8+ PyTorch License

A modern, PyTorch-based library for Bayesian changepoint detection in time series data. This library implements both online and offline methods with GPU acceleration support for high-performance computation.

Features

  • PyTorch Backend: Leverages PyTorch for efficient computation and automatic differentiation
  • GPU Acceleration: Automatic device detection with support for CUDA and Apple Silicon (MPS)
  • Online & Offline Methods: Sequential and batch changepoint detection algorithms
  • Multiple Distributions: Support for univariate and multivariate Student's t-distributions
  • Flexible Priors: Constant, geometric, and negative binomial prior distributions
  • Type Safety: Full type annotations for better development experience
  • Comprehensive Testing: Extensive test suite with GPU testing support

Installation

This package supports multiple installation methods with modern Python package managers. Choose the method that best fits your workflow.

Method 1: Using UV (Recommended)

UV is a fast Python package installer and resolver. It's the recommended approach for new projects.

Install UV

# macOS and Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

# Or with pip
pip install uv

Install the package with UV

# Create a new virtual environment and install
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
uv pip install bayesian-changepoint-detection

# Or install directly with auto-managed environment
uv run python -c "import bayesian_changepoint_detection; print('Success!')"

Development installation with UV

git clone https://github.com/estcarisimo/bayesian_changepoint_detection.git
cd bayesian_changepoint_detection

# Create virtual environment
uv venv

# Activate virtual environment
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install in development mode with all dependencies
uv pip install -e ".[dev]"

# Or install specific dependency groups
uv pip install -e ".[dev,docs,gpu]"

Method 2: Using pip with Virtual Environments

Create and activate a virtual environment

# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Linux/macOS:
source venv/bin/activate
# On Windows:
venv\Scripts\activate

# Upgrade pip
pip install --upgrade pip

Install the package

# Install from PyPI (when available)
pip install bayesian-changepoint-detection

# Or install from source
git clone https://github.com/estcarisimo/bayesian_changepoint_detection.git
cd bayesian_changepoint_detection
pip install -e .

# Install with development dependencies
pip install -e ".[dev]"

Method 3: Using conda/mamba

# Create conda environment
conda create -n bayesian-cp python=3.9
conda activate bayesian-cp

# Install PyTorch first (recommended for better compatibility)
conda install pytorch torchvision torchaudio -c pytorch

# Install the package
pip install bayesian-changepoint-detection

# Or from source
git clone https://github.com/estcarisimo/bayesian_changepoint_detection.git
cd bayesian_changepoint_detection
pip install -e ".[dev]"

Dependency Groups

The package defines several optional dependency groups:

  • dev: Development tools (pytest, black, mypy, etc.)
  • docs: Documentation generation (sphinx, numpydoc)
  • gpu: GPU support (CUDA-enabled PyTorch)

Install specific groups

# With UV
uv pip install "bayesian-changepoint-detection[dev,gpu]"

# With pip
pip install "bayesian-changepoint-detection[dev,gpu]"

GPU Support

For CUDA support, ensure you have CUDA-compatible hardware and drivers, then:

Option 1: Install PyTorch with CUDA manually (Recommended)

# Visit https://pytorch.org/get-started/locally/ for the latest commands
# Example for CUDA 11.8:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# Example for CUDA 12.1:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

# Then install the package
pip install bayesian-changepoint-detection
# or from source:
pip install -e .

Option 2: Install with GPU extras (May install CPU-only PyTorch)

# Note: The [gpu] extra attempts to install torch[cuda], but this may not always
# install the GPU version correctly. Option 1 is more reliable.

# UV
uv pip install "bayesian-changepoint-detection[gpu]"

# pip
pip install "bayesian-changepoint-detection[gpu]"

Verify GPU Support

# Check if PyTorch can see your GPU
python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"
python -c "import torch; print(f'GPU count: {torch.cuda.device_count()}')"
python -c "import torch; print(f'GPU name: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else \"No GPU\"}')"

GPU/CUDA Acceleration

The library provides GPU acceleration for significant performance improvements. Here's a quick example:

import torch
from functools import partial
from bayesian_changepoint_detection import online_changepoint_detection, constant_hazard
from bayesian_changepoint_detection.online_likelihoods import StudentT

# Automatic device selection (chooses GPU if available)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Generate sample data and move to GPU
data = torch.cat([torch.randn(100), torch.randn(100) + 3]).to(device)

# Set up GPU-enabled model
hazard_func = partial(constant_hazard, 250)
likelihood = StudentT(alpha=0.1, beta=0.01, device=device)

# Run detection on GPU
run_length_probs, changepoint_probs = online_changepoint_detection(
    data, hazard_func, likelihood
)

print("Detected changepoints:", torch.where(changepoint_probs > 0.5)[0])

Performance Benefits:

  • 10-100x speedup on compatible hardware
  • Especially beneficial for large datasets (>1000 points) and multivariate data
  • Automatic memory management and device detection

đź“– For a complete GPU guide with benchmarks, multivariate examples, and memory management tips, see

Verify Installation

Test your installation:

import torch
from bayesian_changepoint_detection import get_device_info

# Check device availability
print(get_device_info())

# Quick test
from bayesian_changepoint_detection.generate_data import generate_mean_shift_example
partition, data = generate_mean_shift_example(3, 50)
print(f"Generated test data: {data.shape}")

Or run the comprehensive test script:

# Run quick test (after installation)
python quick_test.py

# Run example (from project root, without installation)
PYTHONPATH=. python examples/simple_example.py

Development Setup

For contributors and developers:

# Clone the repository
git clone https://github.com/estcarisimo/bayesian_changepoint_detection.git
cd bayesian_changepoint_detection

# Option 1: Using UV (recommended)
uv venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
uv pip install -e ".[dev,docs]"

# Option 2: Using pip
python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate
pip install -e ".[dev,docs]"

# Run tests
pytest
# or if pytest is not in PATH:
python -m pytest

# Run code formatting
black bayesian_changepoint_detection tests examples
isort bayesian_changepoint_detection tests examples

# Run type checking
mypy bayesian_changepoint_detection

Troubleshooting

Common Issues

  1. pytest command not found

    # Option 1: Use python -m pytest
    python -m pytest
    
    # Option 2: Ensure pytest is installed
    pip install pytest
    
    # Option 3: Run the basic test directly
    python test.py
  2. PyTorch installation conflicts

    # Uninstall and reinstall PyTorch
    pip uninstall torch torchvision torchaudio
    pip install torch torchvision torchaudio
  3. CUDA version mismatch

    # Check CUDA version
    nvidia-smi
    
    # Install matching PyTorch version from https://pytorch.org/
  4. Virtual environment issues

    # Recreate virtual environment
    rm -rf venv  # or .venv
    python -m venv venv
    source venv/bin/activate
    pip install --upgrade pip
  5. Permission errors

    # Use --user flag if you can't create virtual environments
    pip install --user bayesian-changepoint-detection

Quick Start

Online Changepoint Detection

import torch
from functools import partial
from bayesian_changepoint_detection import (
    online_changepoint_detection,
    constant_hazard,
    StudentT
)

# Generate sample data
torch.manual_seed(42)
data = torch.cat([
    torch.randn(50) + 0,      # First segment: mean=0
    torch.randn(50) + 3,      # Second segment: mean=3
    torch.randn(50) + 0,      # Third segment: mean=0
])

# Set up the model
hazard_func = partial(constant_hazard, 250)  # Expected run length of 250
likelihood = StudentT(alpha=0.1, beta=0.01, kappa=1, mu=0)

# Run online changepoint detection
run_length_probs, changepoint_probs = online_changepoint_detection(
    data, hazard_func, likelihood
)

print("Detected changepoints at:", torch.where(changepoint_probs > 0.5)[0])

Offline Changepoint Detection

from bayesian_changepoint_detection import (
    offline_changepoint_detection,
    const_prior
)
from bayesian_changepoint_detection.offline_likelihoods import StudentT as OfflineStudentT

# Generate sample data (same as above)
data = torch.cat([
    torch.randn(50) + 0,      # First segment: mean=0
    torch.randn(50) + 3,      # Second segment: mean=3
    torch.randn(50) + 0,      # Third segment: mean=0
])

# Use offline method for batch processing
prior_func = partial(const_prior, p=1/(len(data)+1))
likelihood = OfflineStudentT()

Q, P, changepoint_log_probs = offline_changepoint_detection(
    data, prior_func, likelihood
)

# Get changepoint probabilities
changepoint_probs = torch.exp(changepoint_log_probs).sum(0)

GPU Acceleration

# Automatic GPU detection
device = get_device()  # Selects best available device
print(f"Using device: {device}")

# Force specific device
likelihood = StudentT(device='cuda')  # Use GPU
data_gpu = data.to('cuda')

# All computations will run on GPU
run_length_probs, changepoint_probs = online_changepoint_detection(
    data_gpu, hazard_func, likelihood
)

Multivariate Data

from bayesian_changepoint_detection.online_likelihoods import MultivariateT

# Generate multivariate data
dims = 3
data = torch.cat([
    torch.randn(50, dims) + torch.tensor([0, 0, 0]),
    torch.randn(50, dims) + torch.tensor([2, -1, 1]),
    torch.randn(50, dims) + torch.tensor([0, 0, 0]),
])

# Multivariate likelihood
likelihood = MultivariateT(dims=dims)

# Run detection
run_length_probs, changepoint_probs = online_changepoint_detection(
    data, hazard_func, likelihood
)

Mathematical Background

This library implements Bayesian changepoint detection as described in:

  1. Paul Fearnhead (2006). "Exact and Efficient Bayesian Inference for Multiple Changepoint Problems." Statistics and Computing, 16(2), 203-213.

  2. Ryan P. Adams and David J.C. MacKay (2007). "Bayesian Online Changepoint Detection." arXiv preprint arXiv:0710.3742.

  3. Xuan Xiang and Kevin Murphy (2007). "Modeling Changing Dependency Structure in Multivariate Time Series." ICML, 1055-1062.

Key Concepts

  • Run Length: Time since the last changepoint
  • Hazard Function: Prior probability of a changepoint at each time step
  • Likelihood Model: Distribution of observations within segments
  • Posterior: Probability distribution over run lengths given data

API Reference

Core Functions

  • online_changepoint_detection(): Sequential changepoint detection
  • offline_changepoint_detection(): Batch changepoint detection

Likelihood Models

  • StudentT: Univariate Student's t-distribution (unknown mean and variance)
  • MultivariateT: Multivariate Student's t-distribution

Prior Distributions

  • const_prior(): Uniform prior over changepoint locations
  • geometric_prior(): Geometric distribution for inter-arrival times
  • negative_binomial_prior(): Generalized geometric distribution

Hazard Functions

  • constant_hazard(): Constant probability of changepoint occurrence

Device Management

  • get_device(): Automatic device selection
  • to_tensor(): Convert data to PyTorch tensors
  • get_device_info(): Get information about available devices

Performance

The PyTorch implementation provides significant performance improvements:

  • Vectorized Operations: Efficient batch computations
  • GPU Acceleration: 10-100x speedup on compatible hardware
  • Memory Efficiency: Optimized memory usage for large datasets
  • Parallel Processing: Multi-threaded CPU operations

Benchmarks

Theoretical estimates, must be benchmarked

On a typical dataset (1000 time points, univariate):

Method Device Time Speedup
Original (NumPy) CPU 2.3s 1x
PyTorch CPU 0.8s 2.9x
PyTorch GPU (RTX 3080) 0.05s 46x

Examples

See the examples/ directory for complete examples:

  • examples/basic_usage.py: Simple univariate example
  • examples/multivariate_example.py: Multivariate time series
  • examples/gpu_acceleration.py: GPU usage examples
  • examples/Example_Code.ipynb: Jupyter notebook tutorial

Development

Running Tests

Basic Tests (No pytest required)

# Run the basic test suite directly
python test.py

# This runs simple univariate and multivariate changepoint detection tests

Full Test Suite (Requires pytest)

# First, install development dependencies
pip install -e ".[dev]"

# Run all tests in the tests/ directory
pytest tests/
# or if pytest is not in PATH:
python -m pytest tests/

# Run with verbose output
pytest tests/ -v

# Run with coverage report
pytest tests/ --cov=bayesian_changepoint_detection
# or:
python -m pytest tests/ --cov=bayesian_changepoint_detection

# Run specific test files
pytest tests/test_device.py
pytest tests/test_online_likelihoods.py

# Run GPU tests only (requires CUDA)
pytest tests/ -m gpu

# Run non-GPU tests only
pytest tests/ -m "not gpu"

The full test suite includes:

  • Device management tests
  • Online and offline likelihood tests
  • Prior distribution tests
  • Integration tests with regression testing
  • GPU computation tests (when CUDA available)

Code Quality

# Format code
black bayesian_changepoint_detection tests

# Sort imports
isort bayesian_changepoint_detection tests

# Type checking
mypy bayesian_changepoint_detection

# Linting
flake8 bayesian_changepoint_detection tests

Migration from v0.4

The new PyTorch-based API maintains compatibility while offering performance improvements:

# Old API (still works)
import bayesian_changepoint_detection.offline_changepoint_detection as offcd
Q, P, Pcp = offcd.offline_changepoint_detection(data, prior_func, likelihood_func)

# New PyTorch API (recommended)
from bayesian_changepoint_detection import offline_changepoint_detection
Q, P, Pcp = offline_changepoint_detection(data, prior_func, likelihood)

Contributing

Contributions are welcome! Please see our Contributing Guide for details.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you use this library in your research, please cite:

@software{bayesian_changepoint_detection,
  title={Bayesian Changepoint Detection: A PyTorch Implementation},
  author={Kulick, Johannes and Carisimo, Esteban},
  url={https://github.com/estcarisimo/bayesian_changepoint_detection},
  year={2025},
  version={1.0.0}
}

Acknowledgments

  • Original implementation by Johannes Kulick
  • PyTorch migration and modernization by Esteban Carisimo
  • Inspired by the work of Fearnhead, Adams, MacKay, Xiang, and Murphy

About

Methods to get the probability of a changepoint in a time series.

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Contributors 10

Languages