fastcpd¶
Fast change point detection in Python using PELT and SeGD algorithms.
Overview¶
fastcpd is a Python library for detecting structural breaks in time series and sequential data. It implements efficient algorithms for identifying points where statistical properties change.
Key capabilities:
Multiple detection algorithms (PELT, SeGD)
Parametric and nonparametric models
Comprehensive evaluation metrics
Built-in dataset generation
Installation¶
From Test PyPI:
# Install Armadillo (required for C++ extension)
brew install armadillo # macOS
# sudo apt-get install libarmadillo-dev # Linux
# Install package
pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ pyfastcpd
From source:
git clone https://github.com/zhangxiany-tamu/fastcpd_Python.git
cd fastcpd_Python
pip install -e .
Quick Example¶
import numpy as np
from fastcpd.segmentation import mean
# Generate data with mean change at position 300
data = np.concatenate([
np.random.normal(0, 1, 300),
np.random.normal(5, 1, 400)
])
# Detect change points
result = mean(data, beta="MBIC")
print(result.cp_set) # [300]
Supported Models¶
Parametric:
Mean, variance, mean+variance
Binomial and Poisson regression
Linear regression and LASSO
AR, VAR, ARMA, GARCH time series
Nonparametric:
Rank-based detection
RBF kernel methods
See Available Models for details.
Algorithm Features¶
PELT Algorithm¶
Exact optimization with pruning for linear time complexity (average case).
SeGD Algorithm¶
Fast gradient-based approximation for large datasets. Configurable via vanilla_percentage parameter:
# Pure PELT (exact)
result = fastcpd(data, family="binomial", vanilla_percentage=1.0)
# Pure SeGD (fast)
result = fastcpd(data, family="binomial", vanilla_percentage=0.0)
# Adaptive
result = fastcpd(data, family="binomial", vanilla_percentage='auto')
Implementation¶
Core models (mean, variance): C++ for speed
GLM models: Python with optional Numba acceleration
Time series: Python with statsmodels/arch
Evaluation¶
Six evaluation metrics included:
Precision, Recall, F1-Score
Hausdorff distance
Covering metric (multi-annotator)
Annotation error
Example:
from fastcpd.metrics import evaluate_all
metrics = evaluate_all(
true_cps=[100, 200, 300],
pred_cps=[98, 205, 310],
n_samples=500,
margin=10
)
Links¶
Citation¶
If you use this software in your research, please cite:
@article{zhang2023sequential,
title={Sequential Gradient Descent and Quasi-Newton's Method for Change-Point Analysis},
author={Zhang, Xianyang and Dawn, Trisha},
journal={Proceedings of AISTATS},
year={2023}
}
License¶
MIT License. See LICENSE file for details.