A full-stack business intelligence project analyzing 273,391 records of U.S. border crossing data spanning April 1996 – January 2026 across 114 active ports of entry. Built to demonstrate end-to-end business analysis capability — from raw government data ingestion to executive-ready deliverables.
The dashboard is modeled after GDIT's Common Operating Picture framework used in Intelligence and Homeland Security programs, combining a Power BI-style executive layer with a Tableau-style analyst drill-down and an ML-driven port risk register.
Live dashboard → nithink-pixel.github.io/border-intelligence-dashboard
| Panel | Layer | Description |
|---|---|---|
| Executive View | Power BI style | KPI cards, monthly trend, port rankings, YoY comparison |
| Analyst Drill-Down | Tableau style | Port breakdown, mode share, COVID impact, anomaly table |
| Port Risk Clustering | ML layer | KMeans clustering, scatter plot, full risk register |
| BA Findings Memo | Deliverable | 3 findings, 2 risks, 1 tiered recommendation |
**Finding 1 — San Ysidro is the sole Critical Hub.** 274.4M crossings (2020–2025), nearly 2× the next-highest port (El Paso, 147.2M). KMeans clustering isolated it as the only "Critical Hub" among the 114 ports analyzed; a volatility index of 0.995 indicates consistent, high-density throughput.

**Finding 2 — US-Canada border volume dropped 19.2% in 2025.** US-Canada crossings fell from 74.6M (2024) to 60.3M (2025) while US-Mexico held flat at ~266M. A divergence this sharp warrants root-cause investigation before staffing decisions are made at northern ports.

**Finding 3 — 10 ports flagged with anomalous spikes.** Z-score analysis flagged 10 ports exceeding z > 2.0; 8 of the 10 cluster in July–August 2024 on the US-Canada border, a temporal concentration that exceeds normal summer seasonality.
```
# Core pipeline
pandas         # ETL, aggregation, feature engineering
scikit-learn   # KMeans clustering (k=4), StandardScaler normalization
numpy          # Z-score anomaly detection per port
```

- Algorithm: KMeans (k=4, n_init=10, random_state=42)
- Features: total_volume, avg_monthly_volume, volatility_index, measure_type_diversity
- Normalization: StandardScaler (zero mean, unit variance)
- Anomaly Detection: Per-port z-score on monthly volume (threshold: z > 2.0)
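The clustering step above can be sketched in a few lines. The four feature columns mirror the list above; the port names and feature values here are illustrative placeholders, not figures from the real dataset:

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Illustrative port-level feature table (values are invented for the sketch)
features = pd.DataFrame({
    "port": ["San Ysidro", "El Paso", "Laredo", "Otay Mesa",
             "Buffalo", "Detroit", "Sweetgrass", "Porthill"],
    "total_volume":       [274.4e6, 147.2e6, 90e6, 85e6, 60e6, 55e6, 1.2e6, 0.3e6],
    "avg_monthly_volume": [3.8e6, 2.0e6, 1.25e6, 1.2e6, 0.8e6, 0.75e6, 17e3, 4e3],
    "volatility_index":   [0.995, 0.80, 0.70, 0.72, 0.55, 0.50, 0.40, 0.35],
    "measure_type_diversity": [5, 5, 4, 4, 5, 5, 2, 1],
})

# Zero-mean / unit-variance scaling, then KMeans with the parameters above
X = StandardScaler().fit_transform(features.drop(columns="port"))
km = KMeans(n_clusters=4, n_init=10, random_state=42)
features["cluster"] = km.fit_predict(X)
```

Cluster labels come back as arbitrary integers 0–3; mapping them to the named tiers (Critical Hub / High Volume / Moderate / Low Activity) requires ranking the clusters, e.g. by mean `total_volume`.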
- Vanilla HTML/CSS/JS — zero framework dependencies
- Chart.js 4.4.0 for all visualizations
- IBM Plex font family
- Fully responsive — works on any device
| Attribute | Value |
|---|---|
| Source | Bureau of Transportation Statistics (BTS) |
| Portal | data.gov — CBP Border Crossing Entry Data |
| Records | 273,391 rows |
| Date range | April 1996 – January 2026 |
| Borders | US-Mexico · US-Canada |
| Ports | 114 active ports of entry |
| Measures | Personal Vehicles, Pedestrians, Trucks, Buses, Trains + passengers |
| License | Public domain (U.S. Government work) |
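Loading and aggregating the raw CSV to port-month level is straightforward with pandas. The column names below follow the published BTS schema (`Port Name`, `Border`, `Date`, `Measure`, `Value`) but should be verified against the actual download; the sample rows are invented:

```python
import io
import pandas as pd

# Tiny sample in the shape of the BTS CSV (schema assumed; rows invented)
sample = io.StringIO(
    "Port Name,Border,Date,Measure,Value\n"
    "San Ysidro,US-Mexico Border,Apr 1996,Personal Vehicles,1000000\n"
    "San Ysidro,US-Mexico Border,Apr 1996,Pedestrians,250000\n"
    "Sweetgrass,US-Canada Border,Apr 1996,Trucks,12000\n"
)
df = pd.read_csv(sample)

# BTS dates are month-level strings like "Apr 1996"
df["Date"] = pd.to_datetime(df["Date"], format="%b %Y")

# Collapse measure types into one monthly total per port
monthly = df.groupby(["Port Name", "Date"])["Value"].sum().reset_index()
```

For the real file, replace the `StringIO` buffer with the downloaded CSV path.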
```
Raw CBP Data (273,391 rows)
        │
        ▼
Python ETL Pipeline
 ├── Date parsing & normalization
 └── Port-level feature engineering
      ├── total_volume
      ├── avg_monthly_volume
      ├── std_monthly (volatility)
      └── measure_type_diversity
        │
        ▼
KMeans Clustering (k=4)
 ├── StandardScaler normalization
 ├── Cluster assignment → risk tier labeling
 └── Output: Critical Hub / High Volume / Moderate / Low Activity
        │
        ▼
Z-Score Anomaly Detection
 ├── Per-port monthly baseline (mean, std)
 ├── Z-score computation per port-month
 └── Flag: z > 2.0 → anomaly event
        │
        ▼
Dashboard (HTML/Chart.js)
 ├── Executive View — Power BI layer
 ├── Analyst Drill-Down — Tableau layer
 ├── Risk Clustering — ML layer
 └── BA Findings Memo — deliverable layer
```
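The z-score stage of the pipeline can be sketched with pandas. The monthly volumes below are synthetic, with one deliberate spike planted at port B:

```python
import pandas as pd

# Synthetic monthly volumes: port A is flat, port B spikes in its last month
monthly = pd.DataFrame({
    "port":   ["A"] * 6 + ["B"] * 6,
    "volume": [100, 102, 98, 101, 99, 100,   # port A: stable baseline
               50, 52, 48, 51, 49, 200],     # port B: planted anomaly
})

# Per-port baseline (mean, std), then a z-score for every port-month
stats = monthly.groupby("port")["volume"].agg(["mean", "std"])
monthly = monthly.join(stats, on="port")
monthly["z"] = (monthly["volume"] - monthly["mean"]) / monthly["std"]

# Flag rule from the pipeline: z > 2.0 → anomaly event
anomalies = monthly[monthly["z"] > 2.0]
```

Only the planted spike clears the 2.0 threshold here; port A's small wiggles stay well below it.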
Based on the analysis, a 4-tier resource allocation model is recommended:
| Tier | Classification | Ports | Resource Strategy |
|---|---|---|---|
| 1 | Critical Hub | 1 | AI monitoring · Dedicated analyst · Real-time alerts |
| 2 | High Volume | 11 | Automated anomaly alerts · Monthly review cycles |
| 3 | Moderate | 45 | Quarterly review · Surge capacity on-call |
| 4 | Low Activity | 57 | Annual review · Automated reporting only |
```
border-intelligence-dashboard/
│
├── index.html              # Full dashboard (self-contained)
├── README.md               # This file
│
└── analysis/               # Python scripts (local, not deployed)
    ├── etl_pipeline.py     # Data cleaning & feature engineering
    ├── clustering.py       # KMeans port risk classification
    └── anomaly_detect.py   # Z-score flagging per port
```
Built by Nithin Krishna as part of a business analysis portfolio project.
- MS Business Analytics — UMass Isenberg School of Management (May 2027)
- Skills demonstrated: SQL · Python · Power BI · Tableau · Business Analysis · KMeans Clustering · Z-score Anomaly Detection · Executive Reporting
- LinkedIn: linkedin.com/in/nithin-krishna145
- GitHub: github.com/nithink-pixel
Data source: Bureau of Transportation Statistics · U.S. Customs and Border Protection · data.gov · Public domain