An end-to-end kidney disease classification system with experiment tracking, reproducible pipelines, and production-ready ML workflows.
This project is a production-style machine learning system for kidney disease classification, designed with MLOps best practices in mind.
It focuses on:
- Modular ML pipelines
- Experiment tracking with MLflow
- Reproducibility using DVC
- Configuration-driven development
- Deployment-ready Streamlit interface
The project structure closely resembles real-world industry ML systems.
The entire pipeline is configuration-driven and modular:
- Update config.yaml
- Update secrets.yaml (optional)
- Update params.yaml
- Define entities
- Update configuration manager in src/config
- Update individual components
- Update pipeline logic
- Update main.py
- Update dvc.yaml
- Update app.py
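As a rough illustration of the "define entities" and "update configuration manager" steps, the sketch below shows one way a typed entity and a configuration manager could fit together. The names `DataIngestionConfig`, `ConfigurationManager`, `get_data_ingestion_config`, and every key and value in the dictionaries are hypothetical and not taken from this repository. Plain dictionaries stand in for the parsed contents of config.yaml and params.yaml so the example runs without a YAML library.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class DataIngestionConfig:
    """Hypothetical entity: typed, read-only settings for one pipeline stage."""
    root_dir: Path
    source_url: str
    local_data_file: Path


class ConfigurationManager:
    """Hypothetical manager that turns parsed YAML (plain dicts) into entities."""

    def __init__(self, config: dict, params: dict):
        # In a real project these dicts would be loaded from config.yaml and
        # params.yaml; plain dicts keep this sketch dependency-free.
        self.config = config
        self.params = params

    def get_data_ingestion_config(self) -> DataIngestionConfig:
        section = self.config["data_ingestion"]
        return DataIngestionConfig(
            root_dir=Path(section["root_dir"]),
            source_url=section["source_url"],
            local_data_file=Path(section["local_data_file"]),
        )


# Assumed example values, not the project's real configuration.
config = {
    "data_ingestion": {
        "root_dir": "artifacts/data_ingestion",
        "source_url": "https://example.com/data.zip",
        "local_data_file": "artifacts/data_ingestion/data.zip",
    }
}
params = {"EPOCHS": 1, "BATCH_SIZE": 16}

manager = ConfigurationManager(config, params)
ingestion_cfg = manager.get_data_ingestion_config()
print(ingestion_cfg.root_dir)
```

With this split, components receive a single typed object instead of reaching into raw YAML, so changing a path or URL only touches the configuration file.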
This design ensures clean separation of concerns and easy experimentation.
Data Ingestion
↓
Data Validation
↓
Data Transformation
↓
Base Model Preparation (VGG16)
↓
Model Training
↓
Model Evaluation
↓
MLflow Logging & Registry
↓
Streamlit Inference App
DVC orchestrates the pipeline, while MLflow tracks experiments and models.
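The stage order above could be driven from a script such as main.py roughly as follows. This is a minimal sketch, not the repository's actual code: `run_stages`, `noop`, and the `STAGES` list are hypothetical, and the stage bodies are placeholders where the real pipeline classes would be called.

```python
import logging
from typing import Callable

logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger("pipeline")


def run_stages(stages: list[tuple[str, Callable[[], None]]]) -> list[str]:
    """Run each stage in order and stop at the first failure."""
    completed = []
    for name, stage in stages:
        logger.info(">>> stage %s started", name)
        try:
            stage()
        except Exception:
            # Log the traceback, then re-raise so later stages do not run.
            logger.exception("stage %s failed", name)
            raise
        logger.info(">>> stage %s completed", name)
        completed.append(name)
    return completed


def noop() -> None:
    """Placeholder body; a real stage would call the project's pipeline class."""


# Stage names follow the flow described above; the callables are stand-ins.
STAGES = [
    ("Data Ingestion", noop),
    ("Data Validation", noop),
    ("Data Transformation", noop),
    ("Base Model Preparation", noop),
    ("Model Training", noop),
    ("Model Evaluation", noop),
]

completed = run_stages(STAGES)
```

Stopping at the first failure keeps a broken upstream stage from silently feeding stale artifacts into training or evaluation.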
| Layer | Technology |
|---|---|
| Language | Python 3.10 |
| Model | VGG16 (Transfer Learning) |
| ML Framework | TensorFlow / Keras |
| Experiment Tracking | MLflow |
| Pipeline Orchestration | DVC |
| Frontend | Streamlit |
| Configuration | YAML |
| Tracking Server | DagsHub |
- Conda
- Python 3.10
- Git
- DVC
git clone https://github.com/vivek34561/kidney_disease_classification
cd kidney_disease_classification
conda create -n myenv python=3.10 -y
conda activate myenv
pip install -r requirements.txt
python app.py
Open your browser and navigate to the local Streamlit URL shown in the terminal.
- Tracks experiments, parameters, metrics, and artifacts
- Maintains model registry
- Enables reproducibility and comparison
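A minimal sketch of how parameters and metrics might be logged from an evaluation step is shown below. The helper names `flatten_params` and `log_evaluation`, the experiment name, and the example parameter keys are assumptions for illustration, not the project's actual code. The MLflow calls used (`set_experiment`, `start_run`, `log_params`, `log_metrics`) are standard MLflow tracking functions, and MLflow is imported lazily so the flattening helper can be tried without it installed.

```python
def flatten_params(params: dict, prefix: str = "") -> dict:
    """Flatten nested params (as might appear in params.yaml) into dotted keys."""
    flat = {}
    for key, value in params.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten_params(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def log_evaluation(params: dict, metrics: dict,
                   experiment: str = "kidney-classification") -> None:
    """Log one run to the server that MLFLOW_TRACKING_URI points at."""
    import mlflow  # lazy import: only needed when a run is actually logged

    mlflow.set_experiment(experiment)  # hypothetical experiment name
    with mlflow.start_run():
        mlflow.log_params(flatten_params(params))
        mlflow.log_metrics(metrics)


# Assumed example parameters, not the project's real params.yaml.
example_params = {"EPOCHS": 1, "BATCH_SIZE": 16, "augmentation": {"enabled": True}}
flat = flatten_params(example_params)
print(flat)
```

Calling `log_evaluation(example_params, {"loss": 0.42, "accuracy": 0.9})` would then record one run; the metric values here are made up for the example.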
Useful commands:
mlflow ui
Tracking URI:
https://dagshub.com/vivek34561/kidney_disease_classification.mlflow
Set environment variables:
export MLFLOW_TRACKING_URI=https://dagshub.com/vivek34561/kidney_disease_classification.mlflow
export MLFLOW_TRACKING_USERNAME=vivek34561
export MLFLOW_TRACKING_PASSWORD=your_token_here
Then run:
python main.py

dvc init
dvc repro
dvc dag
- dvc repro runs the entire ML pipeline
- dvc dag visualizes pipeline dependencies
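To make the idea of pipeline dependencies concrete, the sketch below orders a set of stages so that each one runs only after its prerequisites. It is an illustration of dependency ordering, not DVC's implementation: DVC builds its graph from the deps and outs declared in dvc.yaml. The stage names and the dependency map are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical map of stage -> stages it depends on.
dependencies = {
    "data_ingestion": set(),
    "data_validation": {"data_ingestion"},
    "data_transformation": {"data_validation"},
    "prepare_base_model": set(),
    "training": {"data_transformation", "prepare_base_model"},
    "evaluation": {"training"},
}

# static_order() yields every stage after all of its prerequisites.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

A tool that knows this graph can also skip stages whose inputs have not changed, which is what makes repeated runs cheap.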
- Production-grade experiment tracking
- Parameter, metric, and artifact logging
- Model versioning and comparison
- Lightweight pipeline orchestration
- Reproducible experiments
- Data and model version control
- Ideal for PoC and research-to-production workflows
- CI/CD integration for ML pipelines
- Automated model promotion rules
- Cloud-based artifact storage
- Model drift detection
- API-based inference service
Vivek Kumar Gupta
AI Engineering Student | ML & MLOps Enthusiast
GitHub: https://github.com/vivek34561
LinkedIn: https://linkedin.com/in/vivek-gupta-0400452b6
Portfolio: https://resume-sepia-seven.vercel.app/
MIT License © 2025 Vivek Kumar Gupta