Shipped

Aeroguard

Flight anomaly detection with unsupervised ML

Detects aircraft anomalies using PCA and DBSCAN on high-dimensional NASA telemetry data, with visualization and sensor-level diagnostics.

Problem

The problem

Aircraft telemetry is high-dimensional, unlabeled, and safety-critical. Supervised models struggle without labeled anomalies. Aeroguard treats this as an unsupervised structure-learning problem: it surfaces statistically unusual flight regimes and points engineers to the specific sensors that drove each anomaly.

System Design

How it's built

Architecture

Unsupervised Anomaly Detection Pipeline

1. NASA Telemetry: multivariate sensors
2. Preprocessing: scaling + windowing
3. PCA: dimensionality reduction
4. DBSCAN: density clustering
5. Anomaly Scoring: distance + noise labels
6. Sensor Attribution: per-feature diagnostics
7. Streamlit Dashboard: interactive visualization
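To make the flow from raw telemetry to cluster labels concrete, here is a minimal sketch of the preprocessing, PCA, and DBSCAN stages using scikit-learn. It is an illustration under assumptions, not the project's actual code: the function names `window_features` and `detect`, the mean/std window summary, and every default value (`width`, `stride`, `variance`, `eps`, `min_samples`) are hypothetical, and the input is random noise standing in for real telemetry.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def window_features(signal: np.ndarray, width: int, stride: int) -> np.ndarray:
    """Summarize each window of a (time, sensors) array by per-sensor mean and std."""
    rows = []
    for start in range(0, signal.shape[0] - width + 1, stride):
        chunk = signal[start:start + width]
        rows.append(np.concatenate([chunk.mean(axis=0), chunk.std(axis=0)]))
    return np.asarray(rows)


def detect(signal, width=50, stride=25, variance=0.95, eps=1.5, min_samples=5):
    """Window -> scale -> PCA -> DBSCAN. All defaults are illustrative guesses."""
    feats = window_features(signal, width, stride)
    scaled = StandardScaler().fit_transform(feats)
    # A float n_components keeps just enough components to reach that variance share.
    reduced = PCA(n_components=variance).fit_transform(scaled)
    # DBSCAN assigns label -1 to points that belong to no dense region.
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(reduced)
    return reduced, labels


# Synthetic stand-in for telemetry: 2000 time steps, 8 sensors.
rng = np.random.default_rng(0)
signal = rng.normal(size=(2000, 8))
reduced, labels = detect(signal)
print(reduced.shape, int((labels == -1).sum()), "windows labeled as noise")
```

Windows labeled `-1` are the candidates passed on to scoring and attribution. On real data, `eps` and `min_samples` would need tuning rather than fixed defaults.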

Tech Stack

What powers it

Python, Scikit-learn, PCA, DBSCAN, Streamlit, NumPy, Pandas, Plotly

Challenges

What was hard

Scaling and windowing high-frequency multivariate sensor data
Choosing PCA components to preserve variance without over-compressing rare regimes
Tuning DBSCAN eps / min_samples for a noisy density landscape
Attributing cluster anomalies back to the responsible raw sensors
Making the pipeline interpretable enough for non-ML domain experts
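
Two of the tuning problems above, picking the PCA component count and picking `eps`, have common heuristics that can be sketched briefly. The helper names `components_for_variance` and `k_distance_eps` are hypothetical, and the quantile rule is a stand-in for the usual practice of reading the elbow off a sorted k-distance plot. Treat both as starting points, not as the settings this project used.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors


def components_for_variance(scaled: np.ndarray, target: float = 0.95) -> int:
    """Smallest number of principal components whose cumulative variance reaches target."""
    cumulative = np.cumsum(PCA().fit(scaled).explained_variance_ratio_)
    # searchsorted finds the first index where cumulative >= target.
    count = int(np.searchsorted(cumulative, target)) + 1
    # Clamp in case rounding leaves the final cumulative value just under target.
    return min(count, len(cumulative))


def k_distance_eps(points: np.ndarray, min_samples: int, quantile: float = 0.95) -> float:
    """Heuristic eps: a high quantile of each point's distance to its k-th neighbor.

    Querying the training points returns each point as its own first neighbor,
    which matches DBSCAN counting the point itself toward min_samples.
    """
    nn = NearestNeighbors(n_neighbors=min_samples).fit(points)
    dist, _ = nn.kneighbors(points)
    return float(np.quantile(dist[:, -1], quantile))
```

A lower `quantile` gives a tighter `eps` and more points labeled as noise. A higher one merges sparse regimes into existing clusters, which is the over-compression risk for rare flight regimes.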

Design Decisions

Why it's built this way

Unsupervised over supervised

Labels are expensive and biased. Density-based clustering surfaces genuine structure without assuming what an anomaly looks like.
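
As a rough illustration of how density clustering yields an anomaly signal without labels, the sketch below combines DBSCAN's noise label with a continuous score: each point's distance to its nearest core sample. The function name `score_points` and the toy data are assumptions made for the example. The pipeline's "distance + noise labels" scoring may be defined differently.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.neighbors import NearestNeighbors


def score_points(points: np.ndarray, eps: float, min_samples: int):
    """Return DBSCAN labels and each point's distance to its nearest core sample."""
    model = DBSCAN(eps=eps, min_samples=min_samples).fit(points)
    if len(model.core_sample_indices_) == 0:
        # No dense region at this eps: every point is noise and there is no reference.
        return model.labels_, np.full(len(points), np.inf)
    core = model.components_  # coordinates of the core samples
    dist, _ = NearestNeighbors(n_neighbors=1).fit(core).kneighbors(points)
    return model.labels_, dist.ravel()


# Toy data: one tight cluster plus a single far-away point (index 200).
rng = np.random.default_rng(0)
points = np.vstack([rng.normal(0.0, 0.3, size=(200, 2)), [[8.0, 8.0]]])
labels, scores = score_points(points, eps=0.5, min_samples=5)
print(labels[-1], round(float(scores[-1]), 2))
```

The noise label gives a yes/no flag, while the distance lets flagged points be ranked by how far they sit from any dense regime.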

PCA before DBSCAN

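DBSCAN struggles in high-dimensional spaces, where pairwise distances concentrate and a single eps no longer separates dense regions from sparse ones. PCA keeps the geometry meaningful while retaining most of the variance.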
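
The difficulty with high dimensions can be seen in a small experiment on uniform random data, which is an assumption standing in for real telemetry. The hypothetical helper `relative_contrast` measures how much farther the farthest neighbor is than the nearest one. As the dimension grows, that gap shrinks, so a fixed `eps` radius has less to distinguish.

```python
import numpy as np
from sklearn.metrics import pairwise_distances


def relative_contrast(dim: int, n: int = 500, seed: int = 0) -> float:
    """Mean of (farthest - nearest) / nearest neighbor distance over n random points."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(size=(n, dim))
    d = pairwise_distances(x)
    np.fill_diagonal(d, np.nan)  # ignore each point's zero distance to itself
    nearest = np.nanmin(d, axis=1)
    farthest = np.nanmax(d, axis=1)
    return float(np.mean((farthest - nearest) / nearest))


for dim in (2, 20, 200):
    print(dim, round(relative_contrast(dim), 2))
```

The printed contrast drops sharply as `dim` increases, which is the motivation for reducing dimensionality before running a density-based method.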

Streamlit for the operator UI

Fast iteration, native Python, and interactive plots without a full frontend stack.

Per-sensor attribution

A flagged anomaly is only useful if you can answer "which sensor?" — attribution is a first-class output, not an afterthought.
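
One plausible way to answer "which sensor?" is to compare each flagged row against the non-flagged rows in the scaled sensor space, before PCA mixes the features together. The sketch below ranks sensors by a robust z-score (median and MAD). The function name `attribute`, the sensor names, and the choice of a robust z-score are all assumptions for illustration, not a description of Aeroguard's actual attribution method.

```python
import numpy as np


def attribute(scaled: np.ndarray, labels: np.ndarray, sensor_names: list, top: int = 3):
    """Rank sensors for each flagged row by robust z-score against non-flagged rows."""
    inliers = labels != -1
    # Fall back to all rows if nothing was clustered, so the baseline is never empty.
    normal = scaled[inliers] if inliers.any() else scaled
    center = np.median(normal, axis=0)
    # MAD scaled by 1.4826 approximates a standard deviation; epsilon avoids divide-by-zero.
    spread = 1.4826 * np.median(np.abs(normal - center), axis=0) + 1e-9
    report = {}
    for row in np.flatnonzero(labels == -1):
        z = np.abs(scaled[row] - center) / spread
        order = np.argsort(z)[::-1][:top]
        report[int(row)] = [(sensor_names[i], float(z[i])) for i in order]
    return report


# Hypothetical sensors; the last row has a fault injected into "egt" only.
names = ["altitude", "airspeed", "egt", "vibration"]
rng = np.random.default_rng(0)
data = rng.normal(size=(300, 4))
data[-1] = [0.0, 0.0, 12.0, 0.0]
flags = np.zeros(300, dtype=int)
flags[-1] = -1  # pretend DBSCAN labeled this row as noise
print(attribute(data, flags, names)[299][0][0])
```

Working in the original sensor space keeps the output readable for domain experts, since each entry names a physical channel instead of a principal component.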

Lessons Learned

What I'd tell my past self

Feature engineering and windowing matter more than the choice of clustering algorithm.
Interpretability drives adoption in safety-critical domains far more than raw accuracy.
Unsupervised methods surface unknown unknowns — but only if you invest in visualization.
