API

API Reference

REST API for the HealthWithSevgi ML Visualization Tool. All endpoints are served by FastAPI at http://localhost:8001 (dev) or http://localhost:7860 (HuggingFace Spaces / Docker).

Base URL: /api (endpoints below carry this prefix unless noted otherwise)
Content type: application/json unless noted
Live OpenAPI schema: GET /openapi.json
Interactive Swagger UI: GET /docs
ReDoc: GET /redoc
API version: 1.3.1

Session model: all state (datasets, trained models, SHAP values) is kept in memory on the backend with LRU eviction; there is no database. Call /prepare first to receive a session_id, then pass it to /train, which returns a model_id used by the compare, explain, ethics, insights, and certificate endpoints.
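To picture how an in-memory store with LRU eviction behaves, here is a minimal sketch. The class name `SessionStore`, its methods, and the `max_sessions` capacity are hypothetical and are not taken from the backend; the real eviction limit is not documented here.

```python
from collections import OrderedDict
from typing import Any, Optional
import uuid


class SessionStore:
    """Hypothetical in-memory store with least-recently-used eviction."""

    def __init__(self, max_sessions: int = 3) -> None:
        self.max_sessions = max_sessions  # assumed capacity, not the real limit
        self._items: "OrderedDict[str, Any]" = OrderedDict()

    def put(self, state: Any, session_id: Optional[str] = None) -> str:
        # Reuse the caller's id if one is given, otherwise generate a UUID.
        sid = session_id or str(uuid.uuid4())
        self._items[sid] = state
        self._items.move_to_end(sid)         # mark as most recently used
        while len(self._items) > self.max_sessions:
            self._items.popitem(last=False)  # evict the least recently used
        return sid

    def get(self, session_id: str) -> Any:
        # A miss corresponds to the API's 404 for an unknown id.
        if session_id not in self._items:
            raise KeyError(session_id)
        self._items.move_to_end(session_id)
        return self._items[session_id]
```

One practical consequence for clients: an id that has been evicted, or that predates a backend restart, presumably looks like an unknown id, so a 404 on a previously valid session_id or model_id is a signal to start again from /prepare.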

Health & Root

| Method | Path | Response |
| --- | --- | --- |
| `GET` | `/` | `{ "status": "ok", "project": "HealthWithSevgi", "version": "1.3.1" }` |
| `GET` | `/health` | `{ "status": "healthy" }` |

These endpoints are not prefixed with /api.

Specialties

Registry of the 20 supported medical specialties.

| Method | Path | Response schema | Description |
| --- | --- | --- | --- |
| `GET` | `/api/specialties` | `list[SpecialtyInfo]` | List every registered specialty |
| `GET` | `/api/specialties/{specialty_id}` | `SpecialtyInfo` | Fetch a specialty by id (`endocrinology_diabetes`, `cardiology_heart_failure`, …) |

SpecialtyInfo includes: id, name, clinical_context, target_variable, data_source, what_ai_predicts, feature metadata.

Errors: 404 if specialty_id is unknown.

Data — Explore & Prepare

Both endpoints accept multipart/form-data. file is optional; if omitted, the built-in dataset for specialty_id is loaded.

`POST /api/explore`

Validates the dataset and returns column-level stats used by Step 2 (Data Exploration).

| Form field | Type | Required | Default |
| --- | --- | --- | --- |
| `specialty_id` | string | yes | — |
| `target_col` | string | yes | — |
| `file` | CSV (≤ 50 MB, ≥ 10 rows, ≥ 2 columns) | no | uses built-in dataset |

Response: DataExplorationResponse — per-column types, null counts, class balance, summary stats, sample rows.

Errors:

422 — non-CSV extension, parse failure, fewer than 10 rows / 2 columns, or unknown target_col
413 — file exceeds 50 MB
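The validation rules above can be mirrored on the client to catch problems before uploading. The sketch below is an assumption-laden illustration, not the backend's validator: the function name `precheck_csv` is hypothetical, 50 MB is read as 50 × 1024 × 1024 bytes, the file is assumed to be UTF-8, and the 10-row minimum is assumed to exclude the header row.

```python
import csv
import io
from typing import Optional, Tuple

MAX_BYTES = 50 * 1024 * 1024  # 50 MB limit (interpreted as MiB: an assumption)
MIN_ROWS = 10                 # assumed to count data rows, not the header
MIN_COLS = 2


def precheck_csv(filename: str, payload: bytes, target_col: str) -> Tuple[int, Optional[str]]:
    """Return (status_code, problem); (200, None) means the file passes."""
    if len(payload) > MAX_BYTES:
        return 413, "file exceeds 50 MB"
    if not filename.lower().endswith(".csv"):
        return 422, "non-CSV extension"
    try:
        rows = list(csv.reader(io.StringIO(payload.decode("utf-8"))))
    except (UnicodeDecodeError, csv.Error):
        return 422, "parse failure"
    if not rows:
        return 422, "parse failure"
    header, data = rows[0], rows[1:]
    if len(data) < MIN_ROWS or len(header) < MIN_COLS:
        return 422, "fewer than 10 rows / 2 columns"
    if target_col not in header:
        return 422, f"unknown target_col {target_col!r}"
    return 200, None
```

The server remains the source of truth; a local pre-check like this only avoids a round trip for obviously invalid files.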

`POST /api/prepare`

Applies the Step 3 preprocessing pipeline (split / missing / normalize / SMOTE / outliers) and returns a session_id for use by /train.

| Form field | Type | Default |
| --- | --- | --- |
| `specialty_id` | string | — |
| `target_col` | string | — |
| `test_size` | float (0.1–0.4) | `0.2` |
| `missing_strategy` | `median` \| `mode` \| `drop` | `median` |
| `normalization` | `zscore` \| `minmax` \| `none` | `zscore` |
| `use_smote` | bool | `false` |
| `outlier_handling` | `none` \| `iqr` \| `zscore_clip` | `none` |
| `session_id` | string (reuses existing session if provided) | auto-generated UUID |
| `file` | CSV (same limits as `/explore`) | built-in dataset |

Response: PrepResponse — session_id, train_size, test_size, features_count, class_distribution_before, class_distribution_after, smote_applied, normalization_applied, norm_samples (before/after values for a few features).
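As an illustration of the form fields for /prepare, the following sketch validates the options locally and produces string values suitable for a form submission. The helper name `build_prepare_form` is hypothetical, the range bounds are assumed to be inclusive, and encoding the boolean as `"true"`/`"false"` is an assumption about what the backend accepts.

```python
from typing import Dict, Optional

MISSING = {"median", "mode", "drop"}
NORMALIZATION = {"zscore", "minmax", "none"}
OUTLIERS = {"none", "iqr", "zscore_clip"}


def build_prepare_form(
    specialty_id: str,
    target_col: str,
    test_size: float = 0.2,
    missing_strategy: str = "median",
    normalization: str = "zscore",
    use_smote: bool = False,
    outlier_handling: str = "none",
    session_id: Optional[str] = None,
) -> Dict[str, str]:
    """Check the documented ranges and enums, then return form fields."""
    if not 0.1 <= test_size <= 0.4:
        raise ValueError("test_size must be within 0.1-0.4")
    if missing_strategy not in MISSING:
        raise ValueError(f"missing_strategy must be one of {sorted(MISSING)}")
    if normalization not in NORMALIZATION:
        raise ValueError(f"normalization must be one of {sorted(NORMALIZATION)}")
    if outlier_handling not in OUTLIERS:
        raise ValueError(f"outlier_handling must be one of {sorted(OUTLIERS)}")
    form = {
        "specialty_id": specialty_id,
        "target_col": target_col,
        "test_size": str(test_size),
        "missing_strategy": missing_strategy,
        "normalization": normalization,
        "use_smote": "true" if use_smote else "false",
        "outlier_handling": outlier_handling,
    }
    # Omit session_id so the backend generates a UUID, unless reusing a session.
    if session_id is not None:
        form["session_id"] = session_id
    return form
```

The resulting dictionary can be sent as the form fields of the request, together with an optional `file` part.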

ML — Train & Compare

`POST /api/train`

Trains one of eight models on the prepared session.

Request body (TrainRequest):

{
  "session_id": "uuid-from-/prepare",
  "model_type": "knn",
  "params": { "n_neighbors": 5, "metric": "euclidean" },
  "tune": false,
  "use_feature_selection": false
}

model_type enum: knn, svm, decision_tree, random_forest, logistic_regression, naive_bayes, xgboost, lightgbm.

Parameter schemas per model:

| Model | Params |
| --- | --- |
| `knn` | `n_neighbors` (1–25), `metric` (`euclidean`/`manhattan`) |
| `svm` | `kernel` (`linear`/`rbf`/`poly`/`sigmoid`), `C` (0.01–100) |
| `decision_tree` | `max_depth` (1–20), `criterion` (`gini`/`entropy`) |
| `random_forest` | `n_estimators` (10–500), `max_depth` (1–20) |
| `logistic_regression` | `C` (0.001–100), `max_iter` (50–2000) |
| `naive_bayes` | `var_smoothing` (1e-12–1e-3) |
| `xgboost` | `n_estimators` (10–500), `max_depth` (1–15), `learning_rate` (0.01–0.5) |
| `lightgbm` | `n_estimators` (10–500), `max_depth` (-1 to 15), `learning_rate` (0.01–0.5) |
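The parameter table can be encoded as data and used to check a `params` object before calling /train. This is a client-side sketch only: `PARAM_SCHEMAS` and `validate_params` are hypothetical names, the bounds are assumed to be inclusive, and the backend may apply additional or different checks.

```python
from typing import Any, Dict, List, Tuple, Union

Rule = Union[Tuple[float, float], Tuple[str, ...]]

# Ranges and choices copied from the table above; bounds assumed inclusive.
PARAM_SCHEMAS: Dict[str, Dict[str, Rule]] = {
    "knn": {"n_neighbors": (1, 25), "metric": ("euclidean", "manhattan")},
    "svm": {"kernel": ("linear", "rbf", "poly", "sigmoid"), "C": (0.01, 100)},
    "decision_tree": {"max_depth": (1, 20), "criterion": ("gini", "entropy")},
    "random_forest": {"n_estimators": (10, 500), "max_depth": (1, 20)},
    "logistic_regression": {"C": (0.001, 100), "max_iter": (50, 2000)},
    "naive_bayes": {"var_smoothing": (1e-12, 1e-3)},
    "xgboost": {"n_estimators": (10, 500), "max_depth": (1, 15), "learning_rate": (0.01, 0.5)},
    "lightgbm": {"n_estimators": (10, 500), "max_depth": (-1, 15), "learning_rate": (0.01, 0.5)},
}


def validate_params(model_type: str, params: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the params look valid."""
    schema = PARAM_SCHEMAS.get(model_type)
    if schema is None:
        return [f"unknown model_type {model_type!r}"]
    problems: List[str] = []
    for name, value in params.items():
        rule = schema.get(name)
        if rule is None:
            problems.append(f"{name}: not a documented parameter for {model_type}")
        elif isinstance(rule[0], str):
            # Enumerated choice such as metric or kernel.
            if value not in rule:
                problems.append(f"{name}: expected one of {list(rule)}")
        else:
            # Numeric range; bool is rejected because it is a subclass of int.
            low, high = rule
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if not is_number or not low <= value <= high:
                problems.append(f"{name}: expected a number in [{low}, {high}]")
    return problems
```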

Response: TrainResponse — model_id, metrics (accuracy, sensitivity, specificity, precision, F1, AUC-ROC, MCC), confusion matrix, ROC/PR curves, feature names, training time.
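For readers who want to relate the reported metrics to the confusion matrix, the sketch below computes the threshold-based ones from binary counts using their standard definitions. It assumes a binary problem and returns 0.0 when a denominator is zero; how the backend handles multiclass targets or empty denominators is not documented here. AUC-ROC is omitted because it depends on the predicted scores, not on a single confusion matrix.

```python
import math
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard metrics from binary confusion-matrix counts."""

    def ratio(num: float, den: float) -> float:
        # Zero denominators yield 0.0 here (a convention chosen for this sketch).
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # recall / true positive rate
    specificity = ratio(tn, tn + fp)   # true negative rate
    precision = ratio(tp, tp + fp)     # positive predictive value
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    mcc = ratio(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "mcc": mcc,
    }
```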

Errors: 404 if session_id unknown · 422 on training failure.

Compare

| Method | Path | Description |
| --- | --- | --- |
| `POST` | `/api/compare/{model_id}` | Add a trained model to the comparison list |
| `GET` | `/api/compare/{session_id}` | Get the current comparison list (sorted by AUC-ROC) |
| `DELETE` | `/api/compare/{session_id}` | Clear the comparison list (returns `204`) |
| `GET` | `/api/models/{model_id}` | Get minimal model metadata (`model_type`, `params`, `feature_names`, `classes`) |

Explainability

`GET /api/explain/global/{model_id}` → `GlobalExplainabilityResponse`

SHAP-based global feature importance (descending), clinical names, top-feature clinical note, and cumulative explained variance for Step 6.
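One common way to turn per-patient SHAP values into a global ranking is the mean absolute SHAP value per feature, sorted in descending order. The sketch below shows that approach together with a running cumulative share of total importance. Whether the backend uses exactly this aggregation, and how it defines its cumulative figure, is an assumption; the function name and output keys are hypothetical.

```python
from typing import Dict, List, Sequence


def global_importance(
    shap_rows: Sequence[Sequence[float]], feature_names: Sequence[str]
) -> List[Dict[str, object]]:
    """Rank features by mean |SHAP| and attach a cumulative share."""
    if not shap_rows:
        return []
    n = len(shap_rows)
    # Mean absolute contribution of each feature across all patients.
    mean_abs = [
        sum(abs(row[j]) for row in shap_rows) / n
        for j in range(len(feature_names))
    ]
    total = sum(mean_abs) or 1.0  # avoid dividing by zero if all values are 0
    ranked = sorted(zip(feature_names, mean_abs), key=lambda p: p[1], reverse=True)
    result: List[Dict[str, object]] = []
    running = 0.0
    for name, importance in ranked:
        running += importance / total
        result.append(
            {"feature": name, "importance": importance, "cumulative_share": running}
        )
    return result
```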

`GET /api/explain/patient/{model_id}/{patient_index}` → `SinglePatientExplainResponse`

SHAP waterfall for a single patient — base value, predicted class/probability, and per-feature shap_value with plain-language narration. patient_index must be within [0, len(X_test)-1].

`GET /api/explain/sample-patients/{model_id}` → `SamplePatientsResponse`

Returns three representative patients from the test set — low-risk (min predicted probability), mid-risk (closest to 0.5), and high-risk (max probability). Each entry carries index, risk_level, probability, and a one-line summary used in the Step 6 patient dropdown.
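The selection rule described above (minimum probability, closest to 0.5, maximum probability) can be sketched as follows. The function name and the `risk_level` strings are hypothetical, and ties are broken by taking the first matching index, which may differ from the backend.

```python
from typing import Dict, List, Sequence


def pick_sample_patients(probabilities: Sequence[float]) -> List[Dict[str, object]]:
    """Pick low-, mid- and high-risk test-set indices from predicted probabilities."""
    if not probabilities:
        raise ValueError("at least one prediction is required")
    indices = range(len(probabilities))
    low = min(indices, key=lambda i: probabilities[i])
    mid = min(indices, key=lambda i: abs(probabilities[i] - 0.5))
    high = max(indices, key=lambda i: probabilities[i])
    return [
        {"index": i, "risk_level": level, "probability": probabilities[i]}
        for i, level in ((low, "low"), (mid, "mid"), (high, "high"))
    ]
```

Each returned `index` can then be used as the `patient_index` of the single-patient explanation endpoint.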

`POST /api/explain/what-if` → `WhatIfResponse`

Recomputes predicted probability when a single feature is overridden.

{
  "model_id": "…",
  "patient_index": 12,
  "feature_name": "serum_creatinine",
  "new_value": 1.4
}

Errors: 400 if patient_index out of range or feature_name not in the trained feature list.
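Conceptually, a what-if query copies one test-set row, overrides a single feature, and re-runs the model. The sketch below illustrates that idea with a caller-supplied prediction function. All names, including the keys of the returned dictionary, are hypothetical; in particular, the source does not say whether `new_value` is given in raw or normalized units, and this sketch applies it to the row as-is.

```python
from typing import Callable, Dict, Sequence


class WhatIfError(ValueError):
    """Raised for inputs the API would reject with HTTP 400."""


def what_if(
    predict_proba: Callable[[Sequence[float]], float],
    x_test: Sequence[Sequence[float]],
    feature_names: Sequence[str],
    patient_index: int,
    feature_name: str,
    new_value: float,
) -> Dict[str, float]:
    """Recompute the predicted probability with one feature overridden."""
    if not 0 <= patient_index < len(x_test):
        raise WhatIfError("patient_index out of range")
    if feature_name not in feature_names:
        raise WhatIfError(f"{feature_name!r} is not a trained feature")
    original = list(x_test[patient_index])
    modified = list(original)  # copy so the stored test row is left untouched
    modified[list(feature_names).index(feature_name)] = new_value
    before = predict_proba(original)
    after = predict_proba(modified)
    return {
        "original_probability": before,
        "new_probability": after,
        "delta": after - before,
    }
```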

Ethics & Bias

`GET /api/ethics/{model_id}` → `EthicsResponse`

Subgroup fairness table (by gender and age band), bias warnings (sensitivity gap > 10 percentage points), representation warnings (demographic gap > 15 percentage points), overall sensitivity, and EU AI Act checklist state.
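The sensitivity-gap rule can be illustrated with a short sketch. It assumes the gap is measured between a subgroup's sensitivity and the overall sensitivity, and that only subgroups falling below the overall value are flagged; the source does not spell out either point, and the function and group names are hypothetical.

```python
from typing import Dict, List


def bias_warnings(
    overall_sensitivity: float,
    subgroup_sensitivity: Dict[str, float],
    threshold_pp: float = 10.0,
) -> List[str]:
    """Flag subgroups whose sensitivity trails the overall value by more than threshold_pp."""
    warnings: List[str] = []
    for group, sensitivity in subgroup_sensitivity.items():
        # Gap in percentage points; rounded to avoid floating-point edge cases.
        gap_pp = round((overall_sensitivity - sensitivity) * 100, 6)
        if gap_pp > threshold_pp:
            warnings.append(
                f"{group}: sensitivity {sensitivity:.2f} is {gap_pp:.1f} pp below overall"
            )
    return warnings
```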

`POST /api/ethics/checklist`

Toggles one of the eight EU AI Act checklist items for a given model.

{ "model_id": "…", "item_id": "model_explainability", "checked": true }

Insights (LLM)

`GET /api/insights/{model_id}`

Calls the InsightService (MedGemma / Gemini) with a fully-assembled clinical context (specialty, metrics, SHAP, fairness data, sample patients) and returns three parallel outputs:

{
  "ethics_insight":    "...",
  "case_studies":      [ ... ],
  "eu_ai_act_insights": [ ... ]
}

Errors: 422 if metrics are not available (model never trained) · 500 on LLM failure.

Certificate (PDF)

`POST /api/generate-certificate`

Returns a ReportLab-rendered PDF (application/pdf, Content-Disposition: attachment) with the active domain, model, six core metrics, bias findings, and checklist state.

Request body (CertificateRequest):

{
  "model_id": "…",
  "session_id": "…",
  "checklist_state": { "model_explainability": true, "data_transparency": true },
  "clinician_name": "Healthcare Professional",
  "institution": "Healthcare Institution"
}

clinician_name and institution are optional (defaults shown). Typical generation time: < 1 s (measured 0.69 s in Sprint 4 QA).

Error Format

FastAPI HTTPException responses share the same JSON shape:

{ "detail": "Target column 'age' not found. Available: [\"glucose\", \"bmi\", ...]" }

| Status | Meaning |
| --- | --- |
| `400` | Invalid request (patient index out of range, feature name not in the trained feature list) |
| `404` | Unknown `specialty_id`, `session_id`, or `model_id` |
| `413` | Uploaded CSV exceeds 50 MB |
| `422` | Dataset validation failure, training failure, unknown target column |
| `500` | Unhandled server-side error (explainability, insights, certificate generation) |
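A client can turn the shared error shape into an exception with a few lines of code. The sketch below is transport-agnostic: it takes a status code and the raw response body. `ApiError` and `raise_for_api_error` are hypothetical names, not part of the project.

```python
import json
from typing import Any


class ApiError(Exception):
    """Carries the HTTP status and the `detail` field of an error response."""

    def __init__(self, status: int, detail: Any) -> None:
        super().__init__(f"{status}: {detail}")
        self.status = status
        self.detail = detail


def raise_for_api_error(status: int, body: str) -> None:
    """Raise ApiError for 4xx/5xx responses; do nothing otherwise."""
    if status < 400:
        return
    try:
        detail = json.loads(body).get("detail", body)
    except (ValueError, AttributeError):
        # Body was not JSON, or not a JSON object: fall back to the raw text.
        detail = body
    raise ApiError(status, detail)
```

Note that `detail` is not always a string: FastAPI's built-in request validation reports 422 errors with `detail` as a list of objects, so callers may want to handle both forms.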

Typical End-to-End Flow

POST /api/explore        → validate + stats      (Step 2)
POST /api/prepare        → session_id            (Step 3)
POST /api/train          → model_id              (Step 4)
GET  /api/explain/global/{model_id}              (Step 6)
GET  /api/explain/sample-patients/{model_id}     (Step 6)
GET  /api/explain/patient/{model_id}/{idx}       (Step 6 waterfall)
POST /api/explain/what-if                        (Step 6 what-if)
GET  /api/ethics/{model_id}                      (Step 7)
POST /api/ethics/checklist                       (Step 7 checklist toggle)
POST /api/generate-certificate                   (Step 7 download)
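The flow above could be driven from a script roughly as follows. This is a sketch under assumptions: `http` is any object with `requests.Session`-style `get` and `post` methods (the `requests` library is not part of this reference), `run_flow` is a hypothetical helper, the form fields are sent as multipart parts using the `(None, value)` tuple form of `files=`, and the model choice and checklist values are simply the examples shown earlier on this page.

```python
from typing import Any, Dict


def run_flow(http: Any, base: str, specialty_id: str, target_col: str) -> Dict[str, Any]:
    """Run explore -> prepare -> train -> explain -> ethics -> certificate."""

    def ok(resp: Any) -> Any:
        resp.raise_for_status()  # surfaces 4xx/5xx as exceptions
        return resp.json()

    # (None, value) tuples make requests send plain multipart form fields.
    fields = {
        "specialty_id": (None, specialty_id),
        "target_col": (None, target_col),
    }
    explore = ok(http.post(f"{base}/api/explore", files=fields))
    prep = ok(http.post(f"{base}/api/prepare", files=fields))
    session_id = prep["session_id"]

    train = ok(http.post(f"{base}/api/train", json={
        "session_id": session_id,
        "model_type": "knn",
        "params": {"n_neighbors": 5, "metric": "euclidean"},
        "tune": False,
        "use_feature_selection": False,
    }))
    model_id = train["model_id"]

    importance = ok(http.get(f"{base}/api/explain/global/{model_id}"))
    ethics = ok(http.get(f"{base}/api/ethics/{model_id}"))

    # The certificate endpoint returns PDF bytes rather than JSON.
    pdf = http.post(f"{base}/api/generate-certificate", json={
        "model_id": model_id,
        "session_id": session_id,
        "checklist_state": {"model_explainability": True, "data_transparency": True},
    })
    pdf.raise_for_status()
    return {
        "explore": explore,
        "session_id": session_id,
        "model_id": model_id,
        "importance": importance,
        "ethics": ethics,
        "certificate_pdf": pdf.content,
    }
```

With the `requests` package installed, a call might look like `run_flow(requests.Session(), "http://localhost:8001", "cardiology_heart_failure", target_col)`, where `target_col` is the target column of the chosen dataset.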

See Architecture for the layered view and service map.
