Named after the divine discus weapon of Lord Vishnu, this system serves as a protective guardian against aerial threats.
A production-grade deep learning system for detecting drone/UAV acoustic signatures using Mel-Spectrogram analysis and Convolutional Neural Networks.
Develop an acoustic surveillance system for ground-based defense units that can:
- Listen to environmental audio in real-time
- Detect the specific acoustic signature of drone propellers
- Classify audio as Threat (Drone) or Safe (Background)
- Audio-to-Vision Pipeline: Converts raw audio waveforms to Mel-Spectrograms for CNN analysis
- Custom CNN Architecture: 4-layer network optimized for acoustic pattern recognition
- Defense-Grade Metrics: Prioritizes Recall (minimizing missed threats) over Precision
- Production Ready: Modular Python package with inference API
- Auto-Ingestion: Automatically downloads and prepares the DroneAudioDataset
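The audio-to-vision step above can be sketched without the full librosa stack: frame the waveform, take magnitude spectra per frame, and pool them through a triangular mel filterbank. Below is a minimal NumPy sketch under stated assumptions — `audio_to_logmel` is a hypothetical helper, the filterbank spacing is simplified, and the plain (non-centered) framing yields slightly fewer frames than librosa's centered STFT, which produces the 87 frames the model consumes.

```python
import numpy as np

def audio_to_logmel(signal, sr=22050, n_fft=2048, hop=512, n_mels=128):
    """Simplified waveform -> log-mel-spectrogram pipeline (sketch only)."""
    # Frame the signal and take the power spectrum of each windowed frame
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop : i * hop + n_fft] for i in range(n_frames)])
    spec = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1)) ** 2  # (frames, bins)

    # Triangular mel filterbank (simplified HTK-style mel scale)
    def hz_to_mel(f): return 2595 * np.log10(1 + f / 700)
    def mel_to_hz(m): return 700 * (10 ** (m / 2595) - 1)
    mel_pts = mel_to_hz(np.linspace(0, hz_to_mel(sr / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * mel_pts / sr).astype(int)
    fb = np.zeros((n_mels, spec.shape[1]))
    for m in range(1, n_mels + 1):
        lo, c, hi = bins[m - 1], bins[m], bins[m + 1]
        if c > lo: fb[m - 1, lo:c] = (np.arange(lo, c) - lo) / (c - lo)
        if hi > c: fb[m - 1, c:hi] = (hi - np.arange(c, hi)) / (hi - c)

    mel_spec = spec @ fb.T                                  # (frames, n_mels)
    return 10 * np.log10(np.maximum(mel_spec, 1e-10)).T    # (n_mels, frames)

# 2 s of synthetic 440 Hz tone stands in for a real recording
signal = np.sin(2 * np.pi * 440 * np.arange(22050 * 2) / 22050)
logmel = audio_to_logmel(signal)
print(logmel.shape)  # (128, 83) with this framing; centered STFT gives (128, 87)
```

The resulting 2-D array is treated as a single-channel image, which is what lets an off-the-shelf CNN do the classification.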
```
SudarshanChakra/
├── configs/
│   └── config.py           # Central configuration parameters
├── src/
│   ├── __init__.py
│   ├── data_ingestion.py   # Auto-clone dataset from GitHub
│   ├── data_loader.py      # PyTorch Dataset & DataLoader
│   ├── model.py            # CNN architectures
│   ├── train.py            # Training pipeline with early stopping
│   └── inference.py        # Real-time threat detection
├── outputs/
│   ├── models/             # Saved model checkpoints
│   ├── plots/              # Confusion matrix, training curves
│   └── logs/               # Training reports (JSON)
├── data/                   # Auto-downloaded dataset
├── main.py                 # Main entry point
├── requirements.txt        # Dependencies
└── README.md
```
```bash
# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate    # Linux/Mac
# or: venv\Scripts\activate # Windows

# Install dependencies
pip install -r requirements.txt

# Complete pipeline: download data -> train -> evaluate
python main.py --train
```

This will:
- Clone the DroneAudioDataset from GitHub
- Print the dataset directory structure
- Auto-discover Drone/Background audio paths
- Train the CNN with early stopping
- Generate confusion matrix and training plots
- Save the best model to `outputs/models/best_model.pth`
```bash
# Analyze a single audio file
python main.py --detect path/to/audio.wav

# With custom threshold (lower = more sensitive)
python main.py --detect recording.wav --threshold 0.3

# Run demo on sample data
python main.py --demo
```

Edit `configs/config.py` to modify system parameters:
| Parameter | Default | Description |
|---|---|---|
| `SAMPLE_RATE` | 22050 Hz | Audio sampling rate |
| `DURATION` | 2.0 s | Analysis window |
| `N_MELS` | 128 | Mel frequency bins |
| `BATCH_SIZE` | 32 | Training batch size |
| `LEARNING_RATE` | 1e-4 | Adam optimizer LR |
| `THREAT_CONFIDENCE_THRESHOLD` | 0.4 | Detection sensitivity |
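A minimal sketch of what such a `configs/config.py` might look like — the names mirror the parameter table above and the values are the documented defaults; the derived `NUM_SAMPLES` constant is an assumption about how the window size is computed, not the project's actual source:

```python
# configs/config.py -- minimal sketch of the central configuration module

SAMPLE_RATE = 22050                 # Hz, audio sampling rate
DURATION = 2.0                      # seconds per analysis window
N_MELS = 128                        # mel frequency bins
BATCH_SIZE = 32                     # training batch size
LEARNING_RATE = 1e-4                # Adam optimizer learning rate
THREAT_CONFIDENCE_THRESHOLD = 0.4   # detection sensitivity (lower = more sensitive)

# Hypothetical derived value: samples per analysis window
NUM_SAMPLES = int(SAMPLE_RATE * DURATION)   # 44100
```

Keeping every tunable in one module means the training and inference pipelines cannot silently disagree on, say, the sample rate.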
Source: DroneAudioDataset by Sara Al-Emadi
The system automatically:
- Clones the repository to `data/DroneAudioDataset/`
- Scans for the `Binary_Drone_Audio` folder structure
- Discovers Drone and Background/Unknown audio directories
- Validates WAV file integrity
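The project's actual validation code is not shown here, but a WAV integrity check of this kind can be done with the standard library alone. A stdlib-only sketch (the helper name `is_valid_wav` is an assumption):

```python
import struct
import tempfile
import wave

def is_valid_wav(path):
    """Return True if the file opens as a PCM WAV and contains audio frames."""
    try:
        with wave.open(path, "rb") as wf:
            return wf.getnframes() > 0 and wf.getframerate() > 0
    except (wave.Error, EOFError, OSError):
        return False

# Write a tiny 16-bit mono WAV to demonstrate the check
probe = tempfile.NamedTemporaryFile(suffix=".wav", delete=False).name
with wave.open(probe, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(22050)
    wf.writeframes(struct.pack("<4h", 0, 1000, -1000, 0))

print(is_valid_wav(probe))          # True
print(is_valid_wav("missing.wav"))  # False
```

Filtering out truncated or mislabeled files at ingestion time avoids hard-to-diagnose failures inside the DataLoader later.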
```
Input: Mel-Spectrogram (1, 128, 87)
          ↓
ConvBlock: 1 → 32 channels, MaxPool
          ↓
ConvBlock: 32 → 64 channels, MaxPool
          ↓
ConvBlock: 64 → 128 channels, MaxPool
          ↓
ConvBlock: 128 → 256 channels, MaxPool
          ↓
Global Average Pooling
          ↓
FC: 256 → 128 → 64 → 2 (with Dropout)
          ↓
Output: [Safe, Threat] logits
```
Total Parameters: ~500K (lightweight for edge deployment)
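One plausible PyTorch rendering of the diagram above. Kernel sizes, BatchNorm placement, and dropout rates are assumptions (the repo's `src/model.py` may differ), but with these choices the parameter count lands in the same ballpark as the quoted ~500K:

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    # Conv -> BatchNorm -> ReLU -> MaxPool, one block per diagram row
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )

class DroneCNN(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            conv_block(1, 32), conv_block(32, 64),
            conv_block(64, 128), conv_block(128, 256),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)        # global average pooling
        self.head = nn.Sequential(
            nn.Linear(256, 128), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(128, 64), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(64, n_classes),              # [Safe, Threat] logits
        )

    def forward(self, x):
        x = self.pool(self.features(x)).flatten(1)
        return self.head(x)

model = DroneCNN()
logits = model(torch.randn(4, 1, 128, 87))   # batch of 4 mel-spectrograms
print(logits.shape)                          # torch.Size([4, 2])
print(sum(p.numel() for p in model.parameters()))
```

Global average pooling is what keeps the head small: the fully-connected layers see a fixed 256-dim vector regardless of the spectrogram's time axis, so the model also tolerates clips of slightly different lengths.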
In defense applications:
- False Negative (missed drone) = Critical → prioritize high Recall
- False Positive (false alarm) = Acceptable → accept lower Precision
The system uses a 0.4 confidence threshold by default, biasing toward threat detection.
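The thresholding logic amounts to a softmax over the two logits followed by a comparison against the threshold. A self-contained sketch (the `classify` helper is hypothetical, not the project's API) showing how 0.4 catches a borderline clip that a symmetric 0.5 cutoff would miss:

```python
import math

def classify(safe_logit, threat_logit, threshold=0.4):
    """Softmax over [safe, threat] logits; flag THREAT when P(threat) >= threshold.
    A threshold below 0.5 deliberately biases the decision toward alerting."""
    m = max(safe_logit, threat_logit)                 # stabilize the exponentials
    e_safe = math.exp(safe_logit - m)
    e_threat = math.exp(threat_logit - m)
    p_threat = e_threat / (e_safe + e_threat)
    return ("THREAT" if p_threat >= threshold else "SAFE"), p_threat

# A borderline clip: P(threat) ≈ 0.45 -- missed at threshold 0.5, caught at 0.4
status, p = classify(safe_logit=0.6, threat_logit=0.4, threshold=0.4)
print(status, round(p, 3))
```

The cost of this bias is more false alarms on ambiguous audio, which the defense-metrics rationale above treats as acceptable.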
```
[TEST RESULTS]
Accuracy:  0.9523
Precision: 0.9412
Recall:    0.9697  ← 97% of drones detected!
F1 Score:  0.9552
```
[DEFENSE METRICS INTERPRETATION]
- Recall (96.97%): Percentage of actual threats detected
  → 3.0% of threats are MISSED (False Negatives)
- Precision (94.12%): Percentage of alerts that are real threats
  → 5.9% of alerts are FALSE (False Positives)
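The interpretation above follows directly from the reported numbers, and the F1 score can be cross-checked from precision and recall alone:

```python
precision, recall = 0.9412, 0.9697

# F1 is the harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))                                # 0.9552, matching the report

print(f"{1 - recall:.1%} of threats missed")       # 3.0% (false negatives)
print(f"{1 - precision:.1%} of alerts are false")  # 5.9% (false positives)
```

Because F1 is a harmonic mean, it is dragged down by whichever of the two metrics is lower, which is why recall-heavy tuning still needs reasonable precision to keep F1 high.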
After training, find these artifacts in `outputs/`:

| File | Description |
|---|---|
| `models/best_model.pth` | Best model checkpoint |
| `plots/confusion_matrix.png` | Test set confusion matrix |
| `plots/training_history.png` | Loss, accuracy, metrics curves |
| `logs/training_report.json` | Full training metrics log |
```python
from src.inference import ThreatDetector

# Initialize detector
detector = ThreatDetector(threshold=0.4)

# Analyze audio file
result = detector.detect("recording.wav")

# Result format:
# {
#   "status": "THREAT",   # or "SAFE"
#   "confidence": 0.89,
#   "probabilities": {"safe": 0.11, "threat": 0.89},
#   "file": "recording.wav"
# }

# Print formatted alert
detector.print_alert(result)
```

```bash
# Show configuration
python main.py --config

# Data ingestion only
python main.py --ingest

# Training pipeline
python main.py --train

# Inference
python main.py --detect audio.wav
python main.py --detect audio.wav --threshold 0.3

# Demo mode
python main.py --demo
```

- Python 3.8+
- PyTorch 2.0+
- librosa 0.10+
- scikit-learn 1.3+
- Git (for dataset cloning)
If using the DroneAudioDataset:
Al-Emadi, Sara, et al. "Audio Based Drone Detection and Identification
using Deep Learning." 2019 15th International Wireless Communications
& Mobile Computing Conference (IWCMC). IEEE, 2019.
This project is for authorized defense research and educational purposes.
SudarshanChakra - Protecting the skies through acoustic intelligence.