This project explores techniques for diabetes prediction, comparing classical Machine Learning approaches with Quantum Machine Learning (QML) methods. The goal is to compare how well different models classify the presence or absence of diabetes on real-world data.
.
├── src/ # Source code
│ ├── Dataset_Preprocessing.ipynb # Dataset preprocessing and analysis
│ ├── ML_classification.ipynb # Classical classification models (ML)
│ └── VQC.ipynb # Quantum classifier (Variational Quantum Circuit)
│
├── dataset/ # Datasets used
│ ├── diabetes.csv # Main dataset
│ ├── final_diabetes_dataset_3.csv # Preprocessed dataset with 3 features
│ ├── final_diabetes_dataset_4.csv # Preprocessed dataset with 4 features
│ └── final_diabetes_dataset_5.csv # Preprocessed dataset with 5 features
│
└── README.md
The main objective is to predict diabetes from medical parameters such as glucose, blood pressure, and BMI, testing two families of classification approaches:
- Traditional Machine Learning models (e.g., Decision Tree)
- Quantum Classifier (VQC – Variational Quantum Circuit)
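As a rough illustration of the classical side, the sketch below cross-validates a Decision Tree with scikit-learn. It is not taken from ML_classification.ipynb: the synthetic data from `make_classification` merely stands in for the diabetes data, and `max_depth=4` and the 5-fold setup are assumed values chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the diabetes data (assumption): 8 numeric inputs, binary label.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

# A shallow tree; max_depth=4 is an illustrative choice, not the notebook's setting.
clf = DecisionTreeClassifier(max_depth=4, random_state=0)

# 5-fold cross-validation returns one accuracy score per fold.
scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
print(f"mean accuracy: {scores.mean():.2f} (+/- {scores.std():.2f})")
```

Cross-validation is used here instead of a single split because the dataset is small, so one split can give a noisy estimate.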
The main dataset (diabetes.csv) is based on the well-known Pima Indians Diabetes Dataset. Preprocessed versions are also available, using different normalization and feature selection techniques.
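The README does not say which normalization or feature selection techniques were used, so the following is only one plausible sketch: `SelectKBest` with the ANOVA F-score keeps the k strongest inputs, and `MinMaxScaler` rescales them to [0, 1]. The helper name `select_and_scale` and the synthetic data are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.preprocessing import MinMaxScaler

def select_and_scale(X, y, k):
    """Keep the k highest-scoring columns, then rescale each to [0, 1]."""
    selector = SelectKBest(score_func=f_classif, k=k)
    X_selected = selector.fit_transform(X, y)
    X_scaled = MinMaxScaler().fit_transform(X_selected)
    # get_support(indices=True) returns the positions of the kept columns.
    return X_scaled, selector.get_support(indices=True)

# Synthetic stand-in for the 8 input columns (assumption).
X, y = make_classification(
    n_samples=200, n_features=8, n_informative=4, random_state=1
)

# Build reduced versions with 3, 4, and 5 features, mirroring the dataset files.
reduced = {k: select_and_scale(X, y, k) for k in (3, 4, 5)}
for k, (X_k, kept) in reduced.items():
    print(k, X_k.shape, kept.tolist())
```

For brevity the selector and scaler are fitted on all rows; when evaluating a model, fitting them on the training split only avoids leaking information from the test rows.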
Key features:
- Number of pregnancies
- Glucose
- Blood pressure
- Skin thickness
- Insulin
- BMI
- Diabetes pedigree function
- Age
- Diabetes diagnosis (0 = no, 1 = yes), which is the target label rather than an input feature
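A minimal sketch of separating the eight inputs from the diagnosis column with pandas. The column names are an assumption based on the commonly distributed CSV release of the Pima dataset and should be checked against the header of dataset/diabetes.csv; the inline rows are illustrative stand-ins for the file, and `split_features_label` is a hypothetical helper.

```python
import io

import pandas as pd

# Assumed header; verify against the actual file before relying on it.
SAMPLE_CSV = """Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,DiabetesPedigreeFunction,Age,Outcome
6,148,72,35,0,33.6,0.627,50,1
1,85,66,29,0,26.6,0.351,31,0
8,183,64,0,0,23.3,0.672,32,1
1,89,66,23,94,28.1,0.167,21,0
"""

def split_features_label(df, label="Outcome"):
    """Return (X, y): the input columns and the 0/1 diagnosis column."""
    X = df.drop(columns=[label])
    y = df[label].astype(int)
    return X, y

# In the repository this would be pd.read_csv("dataset/diabetes.csv").
df = pd.read_csv(io.StringIO(SAMPLE_CSV))
X, y = split_features_label(df)
print(X.shape, y.tolist())
```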
To run the notebooks, make sure the following packages are installed:
pip install numpy
pip install pandas
pip install scikit-learn
pip install matplotlib
pip install seaborn
pip install qiskit
- Preprocessing: Run Dataset_Preprocessing.ipynb to analyze and prepare the data.
- Classical Classification: Run ML_classification.ipynb to test various ML models.
- Quantum Classification: Run VQC.ipynb to test a VQC model using Qiskit.
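To give a feel for what a variational quantum classifier computes, here is a small sketch that simulates a two-qubit circuit directly with NumPy. It deliberately does not use Qiskit, so it is not the notebook's implementation: the gate layout (RY feature map, RY/CNOT ansatz), the toy data, the squared-error loss, and the random-search optimizer are all assumptions made to keep the example self-contained.

```python
import numpy as np

def ry(theta):
    """Single-qubit RY rotation matrix."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

# CNOT with qubit 0 as control and qubit 1 as target (basis order |q0 q1>).
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

def circuit_state(x, theta):
    """Statevector after the feature map and a two-layer RY/CNOT ansatz."""
    state = np.array([1.0, 0.0, 0.0, 0.0])               # start in |00>
    state = np.kron(ry(x[0]), ry(x[1])) @ state          # feature map: one angle per feature
    state = np.kron(ry(theta[0]), ry(theta[1])) @ state  # trainable rotations
    state = CNOT @ state                                 # entangle the two qubits
    state = np.kron(ry(theta[2]), ry(theta[3])) @ state  # second trainable layer
    return state

def prob_class_one(x, theta):
    """Probability of measuring qubit 0 in |1>, read as P(label = 1)."""
    probs = circuit_state(x, theta) ** 2  # amplitudes are real for RY/CNOT circuits
    return probs[2] + probs[3]

def loss(theta, X, y):
    """Mean squared error between predicted probabilities and 0/1 labels."""
    preds = np.array([prob_class_one(x, theta) for x in X])
    return float(np.mean((preds - y) ** 2))

# Toy data (assumption): two features scaled to [0, pi], label set by the first one.
rng = np.random.default_rng(0)
X = rng.uniform(0, np.pi, size=(40, 2))
y = (X[:, 0] > np.pi / 2).astype(float)

# Random search stands in for the classical optimizer that tunes the circuit angles.
theta = rng.uniform(-np.pi, np.pi, size=4)
initial_loss = best_loss = loss(theta, X, y)
for _ in range(200):
    candidate = theta + rng.normal(scale=0.3, size=4)
    candidate_loss = loss(candidate, X, y)
    if candidate_loss < best_loss:  # keep a candidate only if it improves the loss
        theta, best_loss = candidate, candidate_loss

print(f"loss: {initial_loss:.3f} -> {best_loss:.3f}")
```

The structure is the point here: features become rotation angles, trainable angles follow, and a measurement probability is read as the class score. Because each feature occupies one qubit in this kind of encoding, the number of features drives the circuit size.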
This project compares:
- Accuracy and metrics of classical models
- Performance of the quantum model on datasets with different numbers of features (3, 4, and 5)
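One way such a comparison could be organized is sketched below: a single `evaluate` helper (hypothetical) computes accuracy, precision, recall, and F1 on a held-out split, and is called once per feature count. Synthetic data stands in for the final_diabetes_dataset_3/4/5 files, and a Decision Tree stands in for whichever model is being evaluated; none of the printed numbers reflect the project's results.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

def evaluate(model, X, y, seed=0):
    """Fit on a stratified 75% split and report metrics on the remaining 25%."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    y_pred = model.fit(X_train, y_train).predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "f1": f1_score(y_test, y_pred, zero_division=0),
    }

results = {}
for k in (3, 4, 5):
    # Synthetic stand-in for the preprocessed dataset with k features (assumption).
    X, y = make_classification(
        n_samples=300, n_features=k, n_informative=k, n_redundant=0, random_state=k
    )
    results[k] = evaluate(DecisionTreeClassifier(max_depth=3, random_state=0), X, y)

for k, metrics in results.items():
    print(k, {name: round(value, 2) for name, value in metrics.items()})
```

Reporting precision and recall alongside accuracy matters for a medical label, where missing a positive case and raising a false alarm have different costs.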
Technologies used:
- Python 3.x
- Scikit-learn
- Qiskit
- Pandas, NumPy, Matplotlib, Seaborn
- Jupyter Notebook
Project developed by Rocco Pio Vardaro and Antonio Pio Francica as part of a study/experimentation on classical and quantum technologies applied to predictive medicine.