Shipped

Aeroguard

Flight anomaly detection with unsupervised ML

Detects aircraft anomalies by running PCA and DBSCAN on high-dimensional NASA telemetry data, with visualization and sensor-level diagnostics.

Problem

The problem

Aircraft telemetry is high-dimensional, unlabeled, and safety-critical. Supervised models struggle without labeled anomalies. Aeroguard treats this as an unsupervised structure-learning problem: it surfaces statistically unusual flight regimes and points engineers to the specific sensors that drove the anomaly.

System Design

How it's built

Architecture

Unsupervised Anomaly Detection Pipeline

01  NASA Telemetry: Multivariate sensors
02  Preprocessing: Scaling + windowing
03  PCA: Dimensionality reduction
04  DBSCAN: Density clustering
05  Anomaly Scoring: Distance + noise labels
06  Sensor Attribution: Per-feature diagnostics
07  Streamlit Dashboard: Interactive visualization
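
A rough sketch of how stages 02 through 05 could fit together is shown below. It is not Aeroguard's actual code. The window-mean features, the 95% variance target, the `eps` and `min_samples` values, the score definition (distance to the nearest DBSCAN core sample), and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler


def window_means(readings, window):
    """Collapse each non-overlapping window of samples into its per-sensor mean."""
    n_windows = readings.shape[0] // window
    trimmed = readings[: n_windows * window]
    return trimmed.reshape(n_windows, window, -1).mean(axis=1)


def detect(readings, window=20, variance=0.95, eps=1.5, min_samples=5):
    """Return DBSCAN labels (-1 = noise) and a distance-based score per window."""
    features = StandardScaler().fit_transform(window_means(readings, window))
    embedded = PCA(n_components=variance).fit_transform(features)
    model = DBSCAN(eps=eps, min_samples=min_samples).fit(embedded)
    if len(model.components_) == 0:
        # No dense region was found, so no window has a meaningful reference point.
        return model.labels_, np.full(len(embedded), np.inf)
    # Score = distance to the nearest core sample (0 for core samples themselves).
    nearest = NearestNeighbors(n_neighbors=1).fit(model.components_)
    scores = nearest.kneighbors(embedded)[0].ravel()
    return model.labels_, scores


# Synthetic stand-in for telemetry: 8 sensors driven by 2 hidden flight-state factors.
rng = np.random.default_rng(0)
state = rng.normal(size=(2000, 2))
mixing = rng.normal(size=(2, 8))
readings = state @ mixing + 0.1 * rng.normal(size=(2000, 8))
readings[1000:1020, 3] += 8.0  # inject a fault on sensor 3 (window index 50)

labels, scores = detect(readings)
print("noise windows:", np.flatnonzero(labels == -1))
print("highest-scoring window:", int(scores.argmax()))
```

Combining the noise label with a continuous score lets an operator rank flagged windows instead of only seeing a yes/no flag.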
Tech Stack

What powers it

Python, Scikit-learn, PCA, DBSCAN, Streamlit, NumPy, Pandas, Plotly
Challenges

What was hard

  • Scaling and windowing high-frequency multivariate sensor data
  • Choosing PCA components to preserve variance without over-compressing rare regimes
  • Tuning DBSCAN eps / min_samples for a noisy density landscape
  • Attributing cluster anomalies back to the responsible raw sensors
  • Making the pipeline interpretable enough for non-ML domain experts
Design Decisions

Why it's built this way

Unsupervised over supervised

Labeled anomalies are expensive to obtain and biased toward failure modes that are already known. Density-based clustering surfaces genuine structure without assuming what an anomaly looks like.

PCA before DBSCAN

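DBSCAN struggles in high-dimensional spaces, where pairwise distances concentrate and density differences wash out. PCA projects the data into fewer dimensions, which keeps distances meaningful while retaining most of the variance.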
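
As an illustration of the "retaining most variance" step, the sketch below picks the smallest number of components whose cumulative explained variance reaches a target. The 0.95 target, the helper name `components_for_variance`, and the synthetic 30-sensor data are assumptions, not values from the project.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def components_for_variance(scaled, threshold=0.95):
    """Smallest number of components whose cumulative variance ratio reaches threshold."""
    cumulative = np.cumsum(PCA().fit(scaled).explained_variance_ratio_)
    return int(np.argmax(cumulative >= threshold)) + 1


# Synthetic data: 30 correlated sensors driven by 3 hidden factors plus small noise.
rng = np.random.default_rng(2)
factors = rng.normal(size=(500, 3))
loadings = rng.normal(size=(3, 30))
sensors = factors @ loadings + 0.05 * rng.normal(size=(500, 30))

scaled = StandardScaler().fit_transform(sensors)
k = components_for_variance(scaled)
reduced = PCA(n_components=k).fit_transform(scaled)
print(f"{sensors.shape[1]} sensors reduced to {k} components")
```

Passing a float such as `PCA(n_components=0.95)` asks scikit-learn to make the same selection internally. A variance target can still discard low-variance directions where rare regimes live, so it is worth checking flagged cases at more than one setting.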

Streamlit for the operator UI

Fast iteration, native Python, and interactive plots without a full frontend stack.

Per-sensor attribution

A flagged anomaly is only useful if you can answer "which sensor?", so attribution is a first-class output, not an afterthought.
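
One simple way to answer "which sensor?" is to compare a flagged window against the clustered windows, sensor by sensor, using a robust z-score. The sketch below is an assumed approach, not necessarily the attribution method Aeroguard uses. The function name, the median/MAD baseline, and the sensor names are hypothetical.

```python
import numpy as np


def attribute_sensors(features, labels, index, sensor_names, top=3):
    """Rank sensors by how far window `index` sits from the clustered windows."""
    normal = features[labels != -1]
    center = np.median(normal, axis=0)
    # Median absolute deviation, scaled to be comparable to a standard deviation.
    spread = 1.4826 * np.median(np.abs(normal - center), axis=0)
    spread = np.where(spread > 0, spread, 1.0)  # guard against constant sensors
    deviation = np.abs(features[index] - center) / spread
    order = np.argsort(deviation)[::-1][:top]
    return [(sensor_names[i], float(deviation[i])) for i in order]


# Toy example: 200 windows, 5 hypothetical sensors, one window with a fault on one sensor.
rng = np.random.default_rng(3)
names = ["egt", "n1_speed", "oil_pressure", "fuel_flow", "vibration"]
features = rng.normal(size=(200, 5))
features[120, 2] += 12.0           # fault injected on "oil_pressure"
labels = np.zeros(200, dtype=int)  # pretend DBSCAN put every window in cluster 0...
labels[120] = -1                   # ...except the faulty one, marked as noise

ranked = attribute_sensors(features, labels, index=120, sensor_names=names)
for name, score in ranked:
    print(f"{name}: {score:.1f} robust standard deviations from normal")
```

Working in the original sensor space rather than PCA space keeps the output readable for engineers, since each score maps directly to a named sensor.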

Lessons Learned

What I'd tell my past self

  • Feature engineering and windowing matter more than the choice of clustering algorithm.
  • Interpretability drives adoption in safety-critical domains far more than raw accuracy.
  • Unsupervised methods surface unknown unknowns, but only if you invest in visualization.