|
| 1 | +# Database Schema Update Summary |
| 2 | + |
| 3 | +## Overview |
| 4 | +The database has been updated with comprehensive student data schemas. The codebase has been synchronized to match the new database structure. |
| 5 | + |
| 6 | +## Database Structure (Current) |
| 7 | + |
| 8 | +### Cohort Table - 89 Columns |
| 9 | +**Core Identifying Fields:** |
| 10 | +- `id`, `Institution_ID`, `Cohort`, `Student_GUID`, `Cohort_Term`, `Student_Age` |
| 11 | + |
| 12 | +**Demographics:** |
| 13 | +- `Race`, `Ethnicity`, `Gender`, `First_Gen` |
| 14 | + |
| 15 | +**Academic Information:** |
| 16 | +- Enrollment details, placement scores, GPA tracking |
| 17 | +- Credits attempted/earned across 4 years |
| 18 | +- Gateway course completion (Math & English) |
| 19 | +- Developmental course tracking |
| 20 | +- Retention and persistence metrics |
| 21 | + |
| 22 | +**Completion Tracking:** |
| 23 | +- Years to credential at cohort and other institutions |
| 24 | +- Separate tracking for Bachelor's, Associate's, and Certificates |
| 25 | +- Institution details (STATE, CARNEGIE, LOCALE classifications) |
| 26 | + |
| 27 | +**Special Status:** |
| 28 | +- `NASPA_First_Generation`, `Incarcerated_Status`, `Military_Status` |
| 29 | +- `Employment_Status`, `Disability_Status`, `Foreign_Language_Completion` |
| 30 | + |
| 31 | +**Metadata:** |
| 32 | +- `school`, `dataset_type`, `created_at` |
| 33 | + |
| 34 | +### Course Table - 39 Columns |
| 35 | +**Student & Institution:** |
| 36 | +- `Student_GUID`, `Institution_ID`, `Cohort`, `Cohort_Term` |
| 37 | +- Demographics: `Race`, `Ethnicity`, `Gender`, `Student_Age` |
| 38 | + |
| 39 | +**Course Details:** |
| 40 | +- `Course_Prefix`, `Course_Number`, `Section_ID`, `Course_Name` |
| 41 | +- `Course_CIP`, `Course_Type`, `Course_Begin_Date`, `Course_End_Date` |
| 42 | + |
| 43 | +**Academic Indicators:** |
| 44 | +- `Math_or_English_Gateway`, `Co_requisite_Course` |
| 45 | +- `Core_Course`, `Core_Course_Type`, `Core_Competency_Completed` |
| 46 | + |
| 47 | +**Performance:** |
| 48 | +- `Grade`, `Number_of_Credits_Attempted`, `Number_of_Credits_Earned` |
| 49 | + |
| 50 | +**Delivery & Transfer:** |
| 51 | +- `Delivery_Method`, `Enrolled_at_Other_Institutions` |
| 52 | +- External institution tracking (STATE, CARNEGIE, LOCALE) |
| 53 | + |
| 54 | +**Instructor:** |
| 55 | +- `Course_Instructor_Employment_Status`, `Course_Instructor_Rank` |
| 56 | + |
| 57 | +**Metadata:** |
| 58 | +- `school`, `dataset_type`, `created_at` |
| 59 | + |
| 60 | +### Financial Aid Table - 25 Columns |
| 61 | +**Student Identification:** |
| 62 | +- `Student_ID`, `Institution_ID`, `Cohort`, `Cohort_Term`, `Academic_Year` |
| 63 | + |
| 64 | +**Personal Information:** |
| 65 | +- `First_Name`, `Middle_Name`, `Last_Name` |
| 66 | +- `SSN`, `Student_Age`, `Date_of_Birth` |
| 67 | + |
| 68 | +**Financial Status:** |
| 69 | +- `Dependency_Status`, `Housing_Status` |
| 70 | + |
| 71 | +**Financial Aid Details:** |
| 72 | +- `Cost_of_Attendance`, `EFC` (Expected Family Contribution) |
| 73 | +- `Total_Institutional_Grants`, `Total_State_Grants`, `Total_Federal_Grants` |
| 74 | +- `Unmet_Need`, `Net_Price`, `Applied_Aid` |
| 75 | + |
| 76 | +**Metadata:** |
| 77 | +- `school`, `dataset_type`, `created_at` |
| 78 | + |
| 79 | +## Code Changes Made |
| 80 | + |
| 81 | +### 1. Pydantic Schemas Updated (`api/schemas.py`) |
| 82 | +- **CohortRecord**: Updated from 5 fields to 89 fields |
| 83 | +- **CourseRecord**: Updated from 6 fields to 39 fields |
| 84 | +- **FinancialAidRecord**: Updated from 7 fields to 25 fields |
| 85 | +- All fields marked as `Optional` to handle varying data completeness |
| 86 | +- Added `Decimal` type import for decimal fields (GPA, completion times, etc.) |
| 87 | + |
| 88 | +### 2. Database Setup Script (`db_operations/db_setup.py`) |
| 89 | +- Added documentation header explaining the file contains legacy schema |
| 90 | +- Points to `database_schema.json` and `api/schemas.py` for current schema |
| 91 | +- Preserved for historical reference |
| 92 | + |
| 93 | +### 3. Connection Handler (`db_operations/connection.py`) |
| 94 | +- Updated `format_records()` to properly handle all `Decimal` types |
| 95 | +- Converts every `Decimal` field to `float` for JSON serialization; previously only `amount` was converted |
| 96 | + |
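The `Decimal` handling described for `format_records()` can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the function body, the sample row, and the assumption that `Net_Price` arrives as a `Decimal` are all hypothetical.

```python
import json
from decimal import Decimal


def format_records(records):
    """Return copies of the rows with every Decimal value converted to float."""
    return [
        {key: float(value) if isinstance(value, Decimal) else value
         for key, value in row.items()}
        for row in records
    ]


# Hypothetical financial aid row; which columns are Decimal is an assumption.
rows = [{"Student_ID": "abc-123", "Net_Price": Decimal("1250.50"), "EFC": None}]
formatted = format_records(rows)

# json.dumps would raise TypeError on a raw Decimal; after conversion it succeeds.
print(json.dumps(formatted))
```

Converting to `float` keeps the JSON output simple but can lose precision on values with many decimal places; serializing as strings would be the alternative if exact values matter.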
| 97 | +### 4. API Routers (No Changes Required) |
| 98 | +- Routers use `SELECT *` which automatically fetches all columns |
| 99 | +- Pydantic validation handles field mapping automatically |
| 100 | +- All existing endpoints remain compatible |
| 101 | + |
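To illustrate why `SELECT *` needs no router changes when columns are added, here is a small sketch. The project's real driver and router code are not shown in this summary, so the example uses an in-memory SQLite database as a stand-in, and `fetch_all` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (Student_GUID TEXT, Course_Prefix TEXT)")
conn.execute("INSERT INTO course VALUES ('abc-123', 'MAT')")


def fetch_all(connection, table):
    """Run SELECT * and return each row as a dict keyed by column name."""
    # The table name is interpolated directly; real code should validate it
    # against the known table names before building the query.
    cursor = connection.execute(f"SELECT * FROM {table}")
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


before = fetch_all(conn, "course")

# Simulate the schema growing: the new column appears without any query change.
conn.execute("ALTER TABLE course ADD COLUMN Grade TEXT")
after = fetch_all(conn, "course")

print(sorted(before[0]))
print(sorted(after[0]))
```

The trade-off is that every new column, including sensitive ones, is returned automatically unless the response model or query restricts it.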
| 102 | +### 5. Schema Export Tool Created |
| 103 | +- New file: `export_schema.py` - exports complete schema to JSON |
| 104 | +- New file: `database_schema.json` - detailed column definitions with types |
| 105 | +- New file: `check_schema.py` - verifies database table structures |
| 106 | + |
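A schema export of this kind might look like the sketch below. It is not the contents of `export_schema.py`: it assumes SQLite (whose `PRAGMA table_info` is SQLite-specific) purely so the example runs on its own, and the `export_schema` function and three-column table are hypothetical.

```python
import json
import sqlite3


def export_schema(connection, tables):
    """Return {table: [{"name": ..., "type": ...}, ...]} for the given tables."""
    schema = {}
    for table in tables:
        # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
        rows = connection.execute(f"PRAGMA table_info({table})").fetchall()
        schema[table] = [{"name": row[1], "type": row[2]} for row in rows]
    return schema


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE financial_aid (Student_ID TEXT, Net_Price NUMERIC, created_at TEXT)"
)

schema = export_schema(conn, ["financial_aid"])
print(json.dumps(schema, indent=2))
```

On other database engines the same idea would typically query the engine's own catalog (for example `information_schema.columns`) instead of `PRAGMA table_info`.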
| 107 | +## Databases |
| 108 | + |
| 109 | +The system manages 5 institutional databases: |
| 110 | +1. **AL** - Bishop_State_Community_College |
| 111 | +2. **CSUSB** - California_State_University_San_Bernardino |
| 112 | +3. **KCTCS** - Kentucky_Community_and_Technical_College_System |
| 113 | +4. **KY** - Thomas_More_University |
| 114 | +5. **OH** - University_of_Akron |
| 115 | + |
| 116 | +All databases share the same table structure (cohort, course, financial_aid). |
| 117 | + |
| 118 | +## API Compatibility |
| 119 | + |
| 120 | +All existing API endpoints remain functional: |
| 121 | +- `GET /api/{school}/cohorts` - Returns full cohort records with all 89 fields |
| 122 | +- `GET /api/{school}/courses` - Returns full course records with all 39 fields |
| 123 | +- `GET /api/{school}/financial-aid` - Returns full financial aid records with all 25 fields |
| 124 | +- `GET /api/{school}/{table}/count` - Returns record counts |
| 125 | + |
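| 126 | +The Pydantic models validate and serialize the data automatically. `None` values are omitted from responses only when exclusion is requested (for example, `exclude_none=True` when dumping a model); otherwise they are serialized as `null`. |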
| 127 | + |
| 128 | +## Files Created/Modified |
| 129 | + |
| 130 | +**New Files:** |
| 131 | +- `export_schema.py` - Schema export utility |
| 132 | +- `check_schema.py` - Schema verification utility |
| 133 | +- `database_schema.json` - Complete schema documentation |
| 134 | +- `SCHEMA_UPDATE_SUMMARY.md` - This file |
| 135 | + |
| 136 | +**Modified Files:** |
| 137 | +- `api/schemas.py` - Updated all record models |
| 138 | +- `db_operations/db_setup.py` - Added documentation header |
| 139 | +- `db_operations/connection.py` - Enhanced format_records() |
| 140 | + |
| 141 | +## Next Steps (Optional) |
| 142 | + |
| 143 | +Consider these enhancements: |
| 144 | +1. Add field-specific filtering to API endpoints |
| 145 | +2. Create aggregate endpoints for common queries |
| 146 | +3. Add data validation rules beyond type checking |
| 147 | +4. Implement caching for frequently accessed data |
| 148 | +5. Add search/filter capabilities for large datasets |
| 149 | + |
| 150 | +## Verification |
| 151 | + |
| 152 | +To verify the updates work correctly: |
| 153 | +```bash |
| 154 | +# Check schema |
| 155 | +python check_schema.py |
| 156 | + |
| 157 | +# Export schema to JSON |
| 158 | +python export_schema.py |
| 159 | + |
| 160 | +# Test database connections |
| 161 | +python db_operations/connection.py |
| 162 | +``` |
| 163 | + |
| 164 | +--- |
| 165 | +**Update Date:** 2025-10-27 |
| 166 | +**Schema Version:** 2.0 (Comprehensive Student Data) |