Commit 7f43ed6

update: schema to reflect sample data
1 parent f8d11b4 commit 7f43ed6

9 files changed

Lines changed: 1799 additions & 33 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -44,3 +44,4 @@ data/cohort_analysis_ready_file_template_Identified_01-27-25.xlsx
44 44
data/course_analysis_ready_file_template_Identified_01_27_25.xlsx
45 45
data/financialaid_analysis_ready_file_template.xlsx
46 46
database_summary_20251022.xlsx
47+
database_summary_20251027_002130.xlsx

SCHEMA_UPDATE_SUMMARY.md

Lines changed: 166 additions & 0 deletions
@@ -0,0 +1,166 @@
1+
# Database Schema Update Summary
2+
3+
## Overview
4+
The database has been updated with comprehensive student data schemas. The codebase has been synchronized to match the new database structure.
5+
6+
## Database Structure (Current)
7+
8+
### Cohort Table - 89 Columns
9+
**Core Identifying Fields:**
10+
- `id`, `Institution_ID`, `Cohort`, `Student_GUID`, `Cohort_Term`, `Student_Age`
11+
12+
**Demographics:**
13+
- `Race`, `Ethnicity`, `Gender`, `First_Gen`
14+
15+
**Academic Information:**
16+
- Enrollment details, placement scores, GPA tracking
17+
- Credits attempted/earned across 4 years
18+
- Gateway course completion (Math & English)
19+
- Developmental course tracking
20+
- Retention and persistence metrics
21+
22+
**Completion Tracking:**
23+
- Years to credential at cohort and other institutions
24+
- Separate tracking for Bachelor's, Associate's, and Certificates
25+
- Institution details (STATE, CARNEGIE, LOCALE classifications)
26+
27+
**Special Status:**
28+
- `NASPA_First_Generation`, `Incarcerated_Status`, `Military_Status`
29+
- `Employment_Status`, `Disability_Status`, `Foreign_Language_Completion`
30+
31+
**Metadata:**
32+
- `school`, `dataset_type`, `created_at`
33+
34+
### Course Table - 39 Columns
35+
**Student & Institution:**
36+
- `Student_GUID`, `Institution_ID`, `Cohort`, `Cohort_Term`
37+
- Demographics: `Race`, `Ethnicity`, `Gender`, `Student_Age`
38+
39+
**Course Details:**
40+
- `Course_Prefix`, `Course_Number`, `Section_ID`, `Course_Name`
41+
- `Course_CIP`, `Course_Type`, `Course_Begin_Date`, `Course_End_Date`
42+
43+
**Academic Indicators:**
44+
- `Math_or_English_Gateway`, `Co_requisite_Course`
45+
- `Core_Course`, `Core_Course_Type`, `Core_Competency_Completed`
46+
47+
**Performance:**
48+
- `Grade`, `Number_of_Credits_Attempted`, `Number_of_Credits_Earned`
49+
50+
**Delivery & Transfer:**
51+
- `Delivery_Method`, `Enrolled_at_Other_Institutions`
52+
- External institution tracking (STATE, CARNEGIE, LOCALE)
53+
54+
**Instructor:**
55+
- `Course_Instructor_Employment_Status`, `Course_Instructor_Rank`
56+
57+
**Metadata:**
58+
- `school`, `dataset_type`, `created_at`
59+
60+
### Financial Aid Table - 25 Columns
61+
**Student Identification:**
62+
- `Student_ID`, `Institution_ID`, `Cohort`, `Cohort_Term`, `Academic_Year`
63+
64+
**Personal Information:**
65+
- `First_Name`, `Middle_Name`, `Last_Name`
66+
- `SSN`, `Student_Age`, `Date_of_Birth`
67+
68+
**Financial Status:**
69+
- `Dependency_Status`, `Housing_Status`
70+
71+
**Financial Aid Details:**
72+
- `Cost_of_Attendance`, `EFC` (Expected Family Contribution)
73+
- `Total_Institutional_Grants`, `Total_State_Grants`, `Total_Federal_Grants`
74+
- `Unmet_Need`, `Net_Price`, `Applied_Aid`
75+
76+
**Metadata:**
77+
- `school`, `dataset_type`, `created_at`
78+
79+
## Code Changes Made
80+
81+
### 1. Pydantic Schemas Updated (`api/schemas.py`)
82+
- **CohortRecord**: Updated from 5 fields to 89 fields
83+
- **CourseRecord**: Updated from 6 fields to 39 fields
84+
- **FinancialAidRecord**: Updated from 7 fields to 25 fields
85+
- All fields marked as `Optional` to handle varying data completeness
86+
- Added `Decimal` type import for decimal fields (GPA, completion times, etc.)
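The all-`Optional` pattern described above can be sketched as follows. This is a minimal illustration, not the contents of `api/schemas.py`: the class name `CourseRecordSketch` is hypothetical, only a handful of the 39 course columns are shown, and the field types (`str`, `Decimal`) are assumptions, since the commit summary lists column names but not their types.

```python
from decimal import Decimal
from typing import Optional

from pydantic import BaseModel


class CourseRecordSketch(BaseModel):
    """Hypothetical subset of a course record; every field may be absent."""

    # Column names come from the summary above; the types are assumed.
    Student_GUID: Optional[str] = None
    Course_Prefix: Optional[str] = None
    Course_Number: Optional[str] = None
    Grade: Optional[str] = None
    Number_of_Credits_Attempted: Optional[Decimal] = None
    Number_of_Credits_Earned: Optional[Decimal] = None


# A row with most columns missing still validates because every field
# defaults to None; the numeric string is coerced to Decimal.
partial = CourseRecordSketch(Student_GUID="abc-123", Number_of_Credits_Earned="3.0")
```

Defaulting every field to `None` trades strictness for tolerance: incomplete rows load without errors, but a missing required value will not be caught by validation alone.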
87+
88+
### 2. Database Setup Script (`db_operations/db_setup.py`)
89+
- Added a documentation header explaining that the file contains the legacy schema
90+
- Points to `database_schema.json` and `api/schemas.py` for current schema
91+
- Preserved for historical reference
92+
93+
### 3. Connection Handler (`db_operations/connection.py`)
94+
- Updated `format_records()` to properly handle all `Decimal` types
95+
- Now converts every `Decimal` field to `float` for JSON serialization, not just `amount`
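The real `format_records()` is not shown in this commit view, so the following is only a sketch of the behavior described above. It assumes records arrive as a list of dicts keyed by column name; the sample row is invented for illustration.

```python
import json
from decimal import Decimal
from typing import Any, Dict, List


def format_records(records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return copies of the records with every Decimal value converted to float."""
    return [
        {
            key: float(value) if isinstance(value, Decimal) else value
            for key, value in record.items()
        }
        for record in records
    ]


# Hypothetical sample row; json.dumps would raise TypeError on raw Decimals.
rows = [
    {
        "Student_GUID": "abc-123",
        "Grade": None,
        "Number_of_Credits_Attempted": Decimal("15.0"),
        "Number_of_Credits_Earned": Decimal("12.0"),
    }
]
formatted = format_records(rows)
payload = json.dumps(formatted)
```

Converting to `float` keeps the JSON simple but can lose precision for values that have no exact binary representation; serializing as strings is an alternative if exact decimal values matter.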
96+
97+
### 4. API Routers (No Changes Required)
98+
- Routers use `SELECT *`, which fetches all columns automatically
99+
- Pydantic validation handles field mapping automatically
100+
- All existing endpoints remain compatible
101+
102+
### 5. Schema Export Tool Created
103+
- New file: `export_schema.py` - exports complete schema to JSON
104+
- New file: `database_schema.json` - detailed column definitions with types
105+
- New file: `check_schema.py` - verifies database table structures
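To show the idea behind a schema export, here is a rough sketch. The summary does not name the database engine or show `export_schema.py`, so this uses the standard-library `sqlite3` module with an in-memory database purely as a stand-in; the function signature, the JSON layout, and the column types in the `CREATE TABLE` statement are all assumptions.

```python
import json
import sqlite3
from typing import Dict, List


def export_schema(conn: sqlite3.Connection, tables: List[str]) -> Dict[str, List[Dict[str, str]]]:
    """Collect column names and declared types for each table."""
    schema: Dict[str, List[Dict[str, str]]] = {}
    for table in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        # Table names come from a trusted, hard-coded list, not user input.
        rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
        schema[table] = [{"name": row[1], "type": row[2]} for row in rows]
    return schema


# Stand-in table with a few course columns; types are assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course ("
    "Student_GUID TEXT, Course_Prefix TEXT, Grade TEXT, "
    "Number_of_Credits_Earned NUMERIC)"
)
schema = export_schema(conn, ["course"])
schema_json = json.dumps(schema, indent=2)
```

Against another engine, the same shape could be produced from that engine's own catalog (for example an `information_schema.columns` query) instead of the SQLite pragma.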
106+
107+
## Databases
108+
109+
The system manages 5 institutional databases:
110+
1. **AL** - Bishop_State_Community_College
111+
2. **CSUSB** - California_State_University_San_Bernardino
112+
3. **KCTCS** - Kentucky_Community_and_Technical_College_System
113+
4. **KY** - Thomas_More_University
114+
5. **OH** - University_of_Akron
115+
116+
All databases share the same table structure (cohort, course, financial_aid).
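One way to represent the mapping listed above in code is a plain lookup table. The codes and database names are taken from the list; the constant and function names (`SCHOOL_DATABASES`, `TABLES`, `resolve_database`) are hypothetical and may not match what the codebase actually uses.

```python
from typing import Dict, Tuple

# School code -> database name, as listed in this summary.
SCHOOL_DATABASES: Dict[str, str] = {
    "AL": "Bishop_State_Community_College",
    "CSUSB": "California_State_University_San_Bernardino",
    "KCTCS": "Kentucky_Community_and_Technical_College_System",
    "KY": "Thomas_More_University",
    "OH": "University_of_Akron",
}

# Every database exposes the same three tables.
TABLES: Tuple[str, ...] = ("cohort", "course", "financial_aid")


def resolve_database(school: str) -> str:
    """Map a school code (case-insensitive) to its database name."""
    key = school.strip().upper()
    if key not in SCHOOL_DATABASES:
        known = ", ".join(sorted(SCHOOL_DATABASES))
        raise ValueError(f"Unknown school code {school!r}; expected one of: {known}")
    return SCHOOL_DATABASES[key]
```

Rejecting unknown codes up front keeps an invalid `{school}` path segment from ever reaching a database connection attempt.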
117+
118+
## API Compatibility
119+
120+
All existing API endpoints remain functional:
121+
- `GET /api/{school}/cohorts` - Returns full cohort records with all 89 fields
122+
- `GET /api/{school}/courses` - Returns full course records with all 39 fields
123+
- `GET /api/{school}/financial-aid` - Returns full financial aid records with all 25 fields
124+
- `GET /api/{school}/{table}/count` - Returns record counts
125+
126+
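The Pydantic models validate and serialize the data automatically. Note that Pydantic does not drop `None` values by default: they are omitted from responses only when serialization is configured with `exclude_none=True` (or the framework's equivalent response option).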
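To make the `None` handling concrete, the sketch below shows what omitting `None` values does to a response payload, using only the standard library. The helper name `drop_none` and the sample record are hypothetical; with Pydantic, the comparable switch is the `exclude_none` argument on the model's dump method.

```python
from typing import Any, Dict


def drop_none(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the record without keys whose value is None."""
    return {key: value for key, value in record.items() if value is not None}


# Hypothetical financial aid row with some columns unpopulated.
record = {
    "Student_ID": "S001",
    "Middle_Name": None,
    "EFC": 0,
    "Housing_Status": None,
}
compact = drop_none(record)
```

Only `None` is removed: a legitimate zero such as `EFC: 0` stays in the payload, which a truthiness check (`if value`) would wrongly discard.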
127+
128+
## Files Created/Modified
129+
130+
**New Files:**
131+
- `export_schema.py` - Schema export utility
132+
- `check_schema.py` - Schema verification utility
133+
- `database_schema.json` - Complete schema documentation
134+
- `SCHEMA_UPDATE_SUMMARY.md` - This file
135+
136+
**Modified Files:**
137+
- `api/schemas.py` - Updated all record models
138+
- `db_operations/db_setup.py` - Added documentation header
139+
- `db_operations/connection.py` - Enhanced format_records()
140+
141+
## Next Steps (Optional)
142+
143+
Consider these enhancements:
144+
1. Add field-specific filtering to API endpoints
145+
2. Create aggregate endpoints for common queries
146+
3. Add data validation rules beyond type checking
147+
4. Implement caching for frequently accessed data
148+
5. Add search/filter capabilities for large datasets
149+
150+
## Verification
151+
152+
To verify the updates work correctly:
153+
```bash
154+
# Check schema
155+
python check_schema.py
156+
157+
# Export schema to JSON
158+
python export_schema.py
159+
160+
# Test database connections
161+
python db_operations/connection.py
162+
```
163+
164+
---
165+
**Update Date:** 2025-10-27
166+
**Schema Version:** 2.0 (Comprehensive Student Data)
