CSCI323 (UOWD) project, in partnership with IRA Realty (Hyderabad).
How have residential property prices appreciated across IRA Realty's key micro-markets between 2018 and 2023, and what do current listing prices suggest about future valuation trends?
Six micro-markets are studied: Adibatla, Kokapet, Kollur, Mamidipally, Pocharam, Shamshabad.
A 2-minute walkthrough of the pipeline, charts, and headline findings: demo.mp4 (committed to this repo).
The pipeline lives in the ira_realty_analysis/ subfolder. From the repo root:
cd ira_realty_analysis
pip install -r requirements.txt
python main.py

Outputs land in ira_realty_analysis/outputs/charts/ (9 PNGs) and
ira_realty_analysis/outputs/summary_report.txt.
| # | Script | What it does |
|---|---|---|
| 1 | step1_load_clean.py | Load 3 sheets; impute 84+37 missing Bedrooms with the median per property type; merge sales with demographics on Location + Year. |
| 2 | step2_eda.py | Six EDA charts (price trend, total appreciation, CAGR, historical vs listings, demographic correlation, price by property type). |
| 3 | step3_features.py | Aggregate to one row per (Location, Year), giving a 36 x 14 feature matrix (6 location dummies + Year + 5 demographic features + target Avg_Price_Per_SqFt). |
| 4 | step4_models.py | Time-based split (train 2018-2021, test 2022-2023). Train Linear Regression and Random Forest; report MAE, RMSE, R2, MAPE; pick the lower-RMSE model. |
| 5 | step5_future.py | Linearly extrapolate each demographic feature to 2024-2026 per location; predict prices with the chosen model. |
| 6 | step6_validation.py | Compare predicted 2024-2025 average against current listing averages; flag UNDERPRICED / FAIR / OVERPRICED at +/-10%. |
| 7 | step7_report.py | Write outputs/summary_report.txt. |
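
The imputation in step 1 can be sketched as follows. This is a dependency-free illustration of "median per property type", not the code in step1_load_clean.py (which may well use pandas). The `Property_Type` key and the toy rows are assumptions made for the example.

```python
from statistics import median


def impute_bedrooms(rows, type_key="Property_Type", target="Bedrooms"):
    """Fill missing bedroom counts with the median of the same property type."""
    # Collect the observed (non-missing) values for each property type.
    observed = {}
    for row in rows:
        if row[target] is not None:
            observed.setdefault(row[type_key], []).append(row[target])
    medians = {ptype: median(values) for ptype, values in observed.items()}

    # Work on copies so the input rows are left untouched.
    filled = []
    for row in rows:
        new_row = dict(row)
        if new_row[target] is None:
            # Remains None when a type has no observed values at all.
            new_row[target] = medians.get(new_row[type_key])
        filled.append(new_row)
    return filled


# Toy rows (hypothetical values, not taken from the IRA Realty dataset).
sample = [
    {"Property_Type": "Villa", "Bedrooms": 4},
    {"Property_Type": "Villa", "Bedrooms": 5},
    {"Property_Type": "Villa", "Bedrooms": None},
    {"Property_Type": "Apartment", "Bedrooms": 2},
    {"Property_Type": "Apartment", "Bedrooms": None},
]
cleaned = impute_bedrooms(sample)
```

Grouping by property type before taking the median keeps a villa's missing value from being filled with an apartment-sized bedroom count.
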
Price trend. All six markets rose smoothly across 2018-2023; Mamidipally and Adibatla are the premium markets, Shamshabad the cheapest.
Total appreciation and CAGR. Shamshabad led growth (+63.8% total, 10.38% CAGR); Pocharam was slowest (+39.8%, 6.93% CAGR). The fastest-vs-slowest spread is ~24 percentage points.
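
The link between total appreciation and CAGR can be checked with a few lines of Python. The sketch assumes that 2018 to 2023 counts as five annual compounding periods; the function name is illustrative and not taken from the repo.

```python
def cagr_from_total(total_growth, periods):
    """Annualised growth rate implied by a total fractional growth over `periods` years."""
    return (1.0 + total_growth) ** (1.0 / periods) - 1.0


# Assumption: 2018 -> 2023 spans five annual compounding periods.
PERIODS = 5

shamshabad = cagr_from_total(0.638, PERIODS)  # total appreciation +63.8%
pocharam = cagr_from_total(0.398, PERIODS)    # total appreciation +39.8%

print(f"Shamshabad CAGR: {shamshabad:.2%}")
print(f"Pocharam CAGR:   {pocharam:.2%}")
```

Under that assumption the results come out at roughly 10.4% and 6.9%, in line with the reported figures; small differences in the last digit are expected because the total-appreciation inputs are themselves rounded.
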
Historical vs current listings. Current asking prices sit below the 2018-2023 historical mean in every market; listings are not pricing in the appreciation already observed.
What drives price? Among the demographic variables, Employment_Rate_Percent correlates most strongly with Price_Per_SqFt (r ≈ +0.32). Other features are weaker drivers individually; Location itself does most of the heavy lifting.
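
For readers unfamiliar with the r statistic quoted above, a minimal Pearson correlation in plain Python looks like this. The function is a generic sketch rather than the repo's code, and the input values are placeholders, not project data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Placeholder inputs: a perfectly linear pair gives r = 1 (up to rounding).
r_perfect = pearson_r([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

A value of r around +0.32 therefore indicates a positive but fairly loose linear association, far from the r = 1 of the placeholder pair.
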
By property type. Villas command the highest median INR/sqft (~INR 14,800), Apartments mid-range, Plots lowest (~INR 6,500).
| Model | MAE (INR/sqft) | RMSE (INR/sqft) | R2 | MAPE |
|---|---|---|---|---|
| Linear Regression | 785.07 | 920.70 | 0.939 | 6.79% |
| Random Forest | 2,050.39 | 2,906.74 | 0.390 | 13.98% |
Linear Regression wins decisively on the 2022-2023 holdout. With only 24 training rows, Random Forest overfits and extrapolates poorly to unseen years. Linear Regression captures the dominant signal cleanly: a per-location level plus a steady annual drift.
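
The evaluation protocol can be sketched without any ML library: split by year, compute the four reported metrics, and keep the model with the lower RMSE. The helper names and the `Year` key are assumptions for illustration. The RMSE values in the last step are the ones reported in the table above.

```python
from math import sqrt


def time_split(rows, last_train_year=2021, year_key="Year"):
    """Train on years up to `last_train_year`, hold out everything after."""
    train = [r for r in rows if r[year_key] <= last_train_year]
    holdout = [r for r in rows if r[year_key] > last_train_year]
    return train, holdout


def regression_metrics(actual, predicted):
    """MAE, RMSE, R2 and MAPE (in percent) for paired actual/predicted values."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": sqrt(ss_res / n),
        "R2": 1.0 - ss_res / ss_tot,
        "MAPE": 100.0 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n,
    }


# Six locations x six years gives the 36 location-year rows.
locations = ["Adibatla", "Kokapet", "Kollur", "Mamidipally", "Pocharam", "Shamshabad"]
rows = [{"Location": loc, "Year": year} for loc in locations for year in range(2018, 2024)]
train, holdout = time_split(rows)  # 24 training rows, 12 holdout rows

# Model selection on the reported holdout RMSE values.
reported_rmse = {"Linear Regression": 920.70, "Random Forest": 2906.74}
chosen = min(reported_rmse, key=reported_rmse.get)
```

Splitting by year rather than at random means the holdout years are never seen in training, which is what exposes the Random Forest's weakness at extrapolating beyond its training range.
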
| Location | 2023 Actual | 2024 | 2025 | 2026 | 3-Yr Growth |
|---|---|---|---|---|---|
| Adibatla | INR 17,596 | INR 17,557 | INR 18,370 | INR 19,183 | +9.0% |
| Kokapet | INR 10,838 | INR 11,956 | INR 12,804 | INR 13,652 | +26.0% |
| Kollur | INR 10,915 | INR 12,115 | INR 12,954 | INR 13,792 | +26.4% |
| Mamidipally | INR 17,808 | INR 17,386 | INR 18,234 | INR 19,082 | +7.2% |
| Pocharam | INR 9,680 | INR 12,304 | INR 13,213 | INR 14,122 | +45.9% |
| Shamshabad | INR 8,341 | INR 9,798 | INR 10,615 | INR 11,433 | +37.1% |
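Lower-tier markets (Pocharam, Shamshabad) show the largest forecast upside. For Shamshabad this is consistent with its leading 2018-2023 CAGR. Pocharam, by contrast, had the slowest historical CAGR; most of its forecast gain comes from the step between the 2023 actual (INR 9,680) and the 2024 forecast (INR 12,304) rather than from growth within 2024-2026.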
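
The 3-Yr Growth column appears to be the 2026 forecast relative to the 2023 actual. That reading is an assumption, but it reproduces every value in the table, as this short check using the table's own figures shows:

```python
# (2023 actual, 2026 forecast) in INR/sqft, copied from the forecast table.
forecast = {
    "Adibatla": (17596, 19183),
    "Kokapet": (10838, 13652),
    "Kollur": (10915, 13792),
    "Mamidipally": (17808, 19082),
    "Pocharam": (9680, 14122),
    "Shamshabad": (8341, 11433),
}

# Percentage growth from the 2023 actual to the 2026 forecast.
growth = {
    loc: (f2026 / a2023 - 1.0) * 100.0
    for loc, (a2023, f2026) in forecast.items()
}

# Markets ordered from largest to smallest forecast upside.
ranked = sorted(growth, key=growth.get, reverse=True)

for loc in ranked:
    print(f"{loc:<12} {growth[loc]:+.1f}%")
```
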
Comparing the 2024-2025 forecast average vs current listing averages (+/-10% threshold):
| Location | Model 2024-25 Avg | Listing Avg | Diff % | Verdict |
|---|---|---|---|---|
| Adibatla | INR 17,963 | INR 15,078 | -16.1% | UNDERPRICED |
| Kokapet | INR 12,380 | INR 9,608 | -22.4% | UNDERPRICED |
| Kollur | INR 12,534 | INR 9,179 | -26.8% | UNDERPRICED |
| Mamidipally | INR 17,810 | INR 14,076 | -21.0% | UNDERPRICED |
| Pocharam | INR 12,758 | INR 9,324 | -26.9% | UNDERPRICED |
| Shamshabad | INR 10,207 | INR 6,123 | -40.0% | UNDERPRICED |
Every market reads as underpriced relative to the model's forecast. Caveats: listings are asking prices, not transacted prices; the forecasts start a year ahead of those asking prices, so some gap is expected; and the demographic extrapolation assumes the 2018-2023 trend continues unbroken.
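
The verdict rule can be sketched as below. Two points are assumptions rather than confirmed details of step6_validation.py: that Diff % is computed as (listing - model) / model, which reproduces the table, and that a difference of exactly +/-10% counts as FAIR.

```python
def classify_listing(listing_avg, model_avg, threshold_pct=10.0):
    """Return (diff_pct, verdict) for a listing average against the model average."""
    diff_pct = (listing_avg - model_avg) / model_avg * 100.0
    if diff_pct < -threshold_pct:
        verdict = "UNDERPRICED"
    elif diff_pct > threshold_pct:
        verdict = "OVERPRICED"
    else:
        # Assumption: the boundary itself is treated as FAIR.
        verdict = "FAIR"
    return diff_pct, verdict


# (model 2024-25 average, listing average) in INR/sqft, from the table above.
table = {
    "Adibatla": (17963, 15078),
    "Kokapet": (12380, 9608),
    "Kollur": (12534, 9179),
    "Mamidipally": (17810, 14076),
    "Pocharam": (12758, 9324),
    "Shamshabad": (10207, 6123),
}

results = {
    loc: classify_listing(listing, model)
    for loc, (model, listing) in table.items()
}
```
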
- Only 36 location-year rows for modeling, which keeps models simple and penalises complex ones.
- Sales data has only Sale_Year, no exact dates, so within-year seasonality is invisible.
- Single builder (IRA Realty); won't generalise to all of Hyderabad.
- Demographic projection is a naive linear trend; it won't react to policy, metro, or macro shocks.
- Listing prices are not transacted prices, so the "underpriced" verdict is partly bid-ask spread.
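
To make the "naive linear trend" limitation concrete, here is a minimal sketch of per-feature linear extrapolation using an ordinary least-squares line. The function names and the employment-rate series are hypothetical; the real step5_future.py may differ.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def extrapolate(years, values, future_years):
    """Project a feature forward by extending its fitted straight line."""
    slope, intercept = fit_line(years, values)
    return {year: intercept + slope * year for year in future_years}


# Hypothetical employment-rate history for one location (not project data).
history_years = list(range(2018, 2024))
employment_rate = [60.0, 61.0, 62.0, 63.0, 64.0, 65.0]

projected = extrapolate(history_years, employment_rate, [2024, 2025, 2026])
```

Because the projection only extends the fitted line, any break in the trend after 2023 (a policy change, a new metro line, a macro shock) is invisible to it, which is exactly the limitation noted in the list above.
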
323_assignment/
├── README.md
├── demo.mp4                      2-minute walkthrough video
└── ira_realty_analysis/
    ├── data/IRA_Realty_Project_Datasets.xlsx
    ├── steps/                    step1...step7 modules + config.py
    ├── outputs/
    │   ├── charts/               9 PNG charts
    │   └── summary_report.txt
    ├── main.py                   runs the whole pipeline
    ├── requirements.txt
    └── LICENSE
Code is MIT-licensed (see ira_realty_analysis/LICENSE). The dataset under
ira_realty_analysis/data/ is proprietary to IRA Realty and is included for
academic use only; it is not covered by the MIT license and may not be
redistributed.








