The problem
Aircraft telemetry is high-dimensional, unlabeled, and safety-critical. Supervised models struggle without labeled anomalies. Aeroguard treats this as an unsupervised structure-learning problem: it surfaces statistically unusual flight regimes and points engineers to the specific sensors that drove each anomaly.
How it's built
Unsupervised Anomaly Detection Pipeline
What powers it
What was hard
- Scaling and windowing high-frequency multivariate sensor data
- Choosing PCA components to preserve variance without over-compressing rare regimes
- Tuning DBSCAN eps / min_samples for a noisy density landscape
- Attributing cluster anomalies back to the responsible raw sensors
- Making the pipeline interpretable enough for non-ML domain experts
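To make the first item concrete, here is a minimal sketch of windowing and scaling, not Aeroguard's actual code. It assumes telemetry arrives as a (samples, sensors) array and that each window is summarized by per-sensor mean and standard deviation. The function names, window length, step, and synthetic data are all illustrative assumptions.

```python
import numpy as np

def window_features(signal, window, step):
    """Slice a (n_samples, n_sensors) array into overlapping windows and
    summarize each window by per-sensor mean and standard deviation."""
    signal = np.asarray(signal, dtype=float)
    rows = []
    for start in range(0, signal.shape[0] - window + 1, step):
        chunk = signal[start:start + window]
        rows.append(np.concatenate([chunk.mean(axis=0), chunk.std(axis=0)]))
    return np.vstack(rows)

def standardize(features, floor=1e-12):
    """Z-score each feature column so no sensor dominates by raw scale."""
    mu = features.mean(axis=0)
    sigma = np.maximum(features.std(axis=0), floor)  # guard constant columns
    return (features - mu) / sigma

# Synthetic stand-in for telemetry: 4 sensors with very different scales.
rng = np.random.default_rng(0)
telemetry = rng.normal(size=(1000, 4)) * np.array([1.0, 50.0, 0.01, 300.0])

# 100-sample windows with 50% overlap (both values are assumptions).
X = standardize(window_features(telemetry, window=100, step=50))
```

Each row of X describes one window rather than one raw sample, which is what makes the later density estimate less sensitive to sample-level noise.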
Why it's built this way
Unsupervised over supervised
Labels are expensive and biased. Density-based clustering surfaces genuine structure without assuming what an anomaly looks like.
PCA before DBSCAN
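DBSCAN struggles in high-dimensional spaces, where pairwise distances concentrate and a single eps stops separating dense regions from sparse ones. Projecting onto the leading principal components first keeps distances meaningful while retaining most of the variance.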
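A rough sketch of this step using scikit-learn, under assumptions of my own: the 95% variance target, min_samples=10, and the percentile-based eps heuristic (taken from the k-nearest-neighbor distance distribution) are illustrative choices, not the project's tuned values. The data is synthetic, and real features would normally be standardized before PCA.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Synthetic data: 300 nominal points near a 3-D subspace of a 20-D space,
# plus 5 far-away points standing in for an unusual regime.
latent = rng.normal(size=(300, 3))
mixing = rng.normal(size=(3, 20))
nominal = latent @ mixing + 0.05 * rng.normal(size=(300, 20))
outliers = rng.normal(loc=8.0, scale=1.0, size=(5, 20))
X = np.vstack([nominal, outliers])

# A float n_components keeps the fewest components whose cumulative
# explained variance reaches that fraction.
pca = PCA(n_components=0.95, svd_solver="full")
Z = pca.fit_transform(X)

# Heuristic eps: a high percentile of each point's distance to its
# min_samples-th nearest neighbor (the query point counts as the first).
min_samples = 10
knn = NearestNeighbors(n_neighbors=min_samples).fit(Z)
distances, _ = knn.kneighbors(Z)
eps = np.percentile(distances[:, -1], 95)

# DBSCAN labels points outside every dense region as -1 (noise).
labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(Z)
anomalies = np.flatnonzero(labels == -1)
```

The percentile is the knob to watch: raising it merges sparse regimes into clusters, lowering it flags more windows as noise.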
Streamlit for the operator UI
Fast iteration, native Python, and interactive plots without a full frontend stack.
Per-sensor attribution
A flagged anomaly is only useful if you can answer "which sensor?" — attribution is a first-class output, not an afterthought.
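One simple way to produce that answer, sketched here as an assumption rather than as Aeroguard's actual attribution method: score each sensor of a flagged point by its absolute z-score against the points DBSCAN placed in clusters, then report the top few. The function name and sensor names are hypothetical.

```python
import numpy as np

def attribute_sensors(X, labels, sensor_names, top_k=3):
    """For each noise point (label -1), rank sensors by absolute z-score
    relative to the clustered (non-noise) points."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    nominal = X[labels != -1]
    mu = nominal.mean(axis=0)
    sigma = np.maximum(nominal.std(axis=0), 1e-12)  # guard constant sensors
    report = {}
    for i in np.flatnonzero(labels == -1):
        z = np.abs((X[i] - mu) / sigma)
        order = np.argsort(z)[::-1][:top_k]  # largest deviation first
        report[int(i)] = [(sensor_names[j], float(z[j])) for j in order]
    return report

# Hypothetical sensor names and synthetic readings for illustration.
sensors = ["egt", "oil_pressure", "n1_rpm", "fuel_flow"]
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
X[-1, 1] += 12.0                  # inject a fault on oil_pressure
labels = np.zeros(200, dtype=int)
labels[-1] = -1                   # pretend DBSCAN flagged the last row

report = attribute_sensors(X, labels, sensors, top_k=2)
```

Here report maps the flagged row index to its most deviant sensors, with the injected oil_pressure fault ranked first. Scoring in raw sensor space rather than PCA space keeps the output in units a domain expert already recognizes.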
What I'd tell my past self
- Feature engineering and windowing matter more than the choice of clustering algorithm.
- Interpretability drives adoption in safety-critical domains far more than raw accuracy.
- Unsupervised methods surface unknown unknowns — but only if you invest in visualization.