
Architecture

DHANUSH G edited this page Mar 4, 2026 · 1 revision

🧩 System Architecture

Back to Home | Setup-Guide | API-Reference


🗺️ High-Level Architecture

The platform follows a 3-tier architecture with a clear separation between the AI/data layer, the API layer, and the presentation layer.

┌────────────────────────────────────────────────────┐
│              PRESENTATION LAYER                    │
│   Next.js 14 Dashboard (localhost:3000)            │
│   ├── Recharts (2D graphs: traffic, anomalies)     │
│   ├── 3D Threat Globe (React Three Fiber)          │
│   └── 3D Network Topology (nodes & edges)          │
└─────────────────────────▲──────────────────────────┘
                          │ REST API (JSON)
┌─────────────────────────▼──────────────────────────┐
│               API LAYER                            │
│   FastAPI Backend (localhost:8000)                 │
│   ├── GET  /          → Health Check               │
│   ├── POST /logs/     → Log Ingestion              │
│   ├── GET  /logs/     → Log Retrieval + Pagination │
│   └── POST /predict/  → AI Anomaly Score           │
└─────────────────────────┬──────────────────────────┘
                          │
               ┌──────────┼──────────┐
               ▼                     ▼
┌───────────────────┐  ┌───────────────────┐
│    DATA LAYER     │  │     AI LAYER      │
│ SQLite (via ORM)  │  │ Isolation Forest  │
│ SQLAlchemy models │  │ (.pkl model file) │
│ Log records       │  │ Scikit-learn      │
└───────────────────┘  └───────────────────┘

🔄 Data Flow

Log Ingestion Flow

Client/Agent
    │
    │  POST /logs/  {source_ip, dest_ip, protocol, bytes, event_type, details}
    ▼
FastAPI Router
    │
    ├── Pydantic Schema Validation
    │
    ├── SQLAlchemy → SQLite (persist raw log)
    │
    └── Return saved log record (JSON)
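The validate-then-persist flow above can be sketched end to end. The real backend validates with Pydantic models and persists via SQLAlchemy; this standard-library sketch substitutes a plain field check and raw `sqlite3` so it runs anywhere. Column names follow the `logs` table documented on this page; the `ingest_log` helper is illustrative, not the project's actual function.

```python
import sqlite3

# Fields required by the schema (details is optional).
REQUIRED = ("source_ip", "destination_ip", "protocol", "bytes_transferred", "event_type")

def ingest_log(conn: sqlite3.Connection, log: dict) -> dict:
    """Validate a raw log dict, persist it, and return the saved record."""
    missing = [f for f in REQUIRED if f not in log]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    cur = conn.execute(
        "INSERT INTO logs (source_ip, destination_ip, protocol,"
        " bytes_transferred, event_type, details) VALUES (?, ?, ?, ?, ?, ?)",
        [log[f] for f in REQUIRED] + [log.get("details")],
    )
    conn.commit()
    return {**log, "id": cur.lastrowid}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " source_ip VARCHAR NOT NULL, destination_ip VARCHAR NOT NULL,"
    " protocol VARCHAR NOT NULL, bytes_transferred INTEGER NOT NULL,"
    " event_type VARCHAR NOT NULL, details TEXT,"
    " timestamp DATETIME DEFAULT CURRENT_TIMESTAMP)"
)
saved = ingest_log(conn, {
    "source_ip": "10.0.0.5", "destination_ip": "192.168.1.2",
    "protocol": "TCP", "bytes_transferred": 4096, "event_type": "normal",
})
print(saved["id"])  # first inserted row gets id 1
```

FastAPI adds the HTTP routing and automatic 422 responses on validation failure; the persistence logic is the same shape.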

Anomaly Detection Flow

Client
    │
    │  POST /predict/  {feature vector}
    ▼
FastAPI Router
    │
    ├── Load Isolation Forest model (.pkl)
    │
    ├── Feature extraction (NumPy/Pandas)
    │
    ├── model.predict() → anomaly score
    │
    ├── Classify: Normal / Suspicious / Critical
    │
    └── Return {score, label, confidence}

🧠 AI Model: Isolation Forest

Why Isolation Forest?

| Property | Benefit |
|----------|---------|
| Unsupervised | No labeled attack data needed |
| Handles high-dimensional data | Works with IPs, ports, bytes, timing |
| Scales well | Faster than LOF for large log volumes |
| Zero-day friendly | Detects unknown/novel attack patterns |
| Low false-positive rate | Tuned contamination parameter |

How it Works

  1. Training: train_model.py generates synthetic logs (generated_logs.csv) simulating both normal and anomalous traffic patterns
  2. Feature Engineering: Numeric features (bytes transferred, port numbers, protocol encoding) are extracted
  3. Model Fitting: IsolationForest(contamination=0.05) is trained on the dataset
  4. Serialization: Model saved to ai-model/isolation_forest_model.pkl via joblib
  5. Inference: On each /predict/ call, the model scores the input and returns a classification
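Steps 1–5 can be condensed into a runnable sketch. The training data is generated inline here rather than loaded from generated_logs.csv, and the feature columns (bytes, port, protocol code) are illustrative stand-ins for the project's exact feature set; the scikit-learn and joblib calls are the standard ones for this workflow.

```python
import numpy as np
import joblib
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Steps 1-2: synthetic "normal" traffic as [bytes_transferred, port, protocol_code]
normal = np.column_stack([
    rng.normal(500, 50, 500),      # typical payload sizes
    rng.choice([80, 443], 500),    # common ports
    rng.integers(0, 2, 500),       # protocol encoding (0 = TCP, 1 = UDP)
])

# Step 3: fit with the contamination rate quoted above
model = IsolationForest(contamination=0.05, random_state=0).fit(normal)

# Step 4: serialize and reload (the project stores isolation_forest_model.pkl)
joblib.dump(model, "isolation_forest_demo.pkl")
model = joblib.load("isolation_forest_demo.pkl")

# Step 5: inference; decision_function returns the anomaly score
# (lower = more anomalous, negative = flagged as an outlier)
typical = np.array([[510, 443, 0]])
exfil = np.array([[1_000_000, 31337, 1]])  # huge transfer on an odd port
s_normal = model.decision_function(typical)[0]
s_attack = model.decision_function(exfil)[0]
assert s_attack < s_normal
```

`predict()` collapses the same score to +1 (inlier) or -1 (outlier); the API keeps the raw score so it can apply the graded thresholds below.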

Threat Classification Thresholds

| Score Range | Classification | Action |
|-------------|----------------|--------|
| score > -0.1 | Normal | Log and continue |
| -0.3 < score ≤ -0.1 | Suspicious | Flag for review |
| score ≤ -0.3 | Critical | Immediate alert |
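These thresholds translate directly into code; a minimal sketch (the function name `classify` is illustrative, not the backend's actual identifier):

```python
def classify(score: float) -> str:
    """Map an Isolation Forest anomaly score to a threat level."""
    if score > -0.1:
        return "Normal"
    if score > -0.3:      # i.e. -0.3 < score <= -0.1
        return "Suspicious"
    return "Critical"     # score <= -0.3

print(classify(0.05), classify(-0.2), classify(-0.5))
# Normal Suspicious Critical
```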

🗄️ Database Schema

Log Entry Table

CREATE TABLE logs (
    id               INTEGER PRIMARY KEY AUTOINCREMENT,
    source_ip        VARCHAR NOT NULL,
    destination_ip   VARCHAR NOT NULL,
    protocol         VARCHAR NOT NULL,
    bytes_transferred INTEGER NOT NULL,
    event_type       VARCHAR NOT NULL,   -- 'normal' | 'suspicious' | 'critical'
    details          TEXT,
    timestamp        DATETIME DEFAULT CURRENT_TIMESTAMP
);

Scalability Note

SQLite is used for local development. For production, replace the DATABASE_URL with a PostgreSQL connection string; SQLAlchemy handles the transition seamlessly.
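The swap amounts to one line of configuration. This sketch assumes the backend reads `DATABASE_URL` from the environment (the variable name comes from the note above; the default path and example URL are illustrative placeholders):

```python
import os

# Default to the local SQLite file; production deployments override this
# with a PostgreSQL URL, e.g. (placeholder values):
#   export DATABASE_URL="postgresql://user:password@db-host:5432/logs"
DATABASE_URL = os.getenv("DATABASE_URL", "sqlite:///./logs.db")
```

The same string is then passed to SQLAlchemy's `create_engine`, so no model or query code changes.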


🌐 Frontend Components

| Component | Technology | Purpose |
|-----------|------------|---------|
| Traffic Charts | Recharts (LineChart, BarChart) | Visualize log volume and traffic over time |
| Threat Pie Chart | Recharts (PieChart) | Distribution of Normal / Suspicious / Critical |
| 3D Threat Globe | React Three Fiber + drei | Global geographic threat origin map |
| Network Topology | React Three Fiber | Real-time node-edge graph of connections |
| Log Table | Next.js + Tailwind | Paginated, searchable raw log viewer |
| Alert Banner | Lucide + Tailwind | Live critical event notifications |

🔄 CI/CD Pipeline

The pipeline triggers on pushes and pull requests to main, then checks out the code, sets up Python 3.10, installs backend dependencies, and runs the test suite with PYTHONPATH set for backend module resolution:

# .github/workflows/ci.yml
name: CI
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.10"
      - name: Install backend dependencies and run tests
        run: |
          pip install -r requirements.txt
          PYTHONPATH=. pytest backend/tests/

🚀 Future Architecture Extensions

  • WebSockets: Replace REST polling with ws:// streams for real-time push alerts
  • Celery + Redis: Async task queue for background model retraining
  • Kafka / RabbitMQ: Message broker for high-throughput log ingestion
  • Docker Compose: Orchestrate backend, frontend, and DB as containers
  • Autoencoder Model: Deep learning replacement for Isolation Forest for richer embeddings
  • PostgreSQL: Production-grade database with full-text search

Back to Home | Next: Setup-Guide