AitchEm-bot/323_assignment

IRA Realty: Hyderabad Micro-Market Price Analysis

CSCI323 (UOWD) project, in partnership with IRA Realty (Hyderabad).

How have residential property prices appreciated across IRA Realty's key micro-markets between 2018 and 2023, and what do current listing prices suggest about future valuation trends?

Six micro-markets are studied: Adibatla, Kokapet, Kollur, Mamidipally, Pocharam, Shamshabad.


Demo video

A 2-minute walkthrough of the pipeline, charts, and headline findings: demo.mp4 (committed to this repo).


Quick start

The pipeline lives in the ira_realty_analysis/ subfolder. From the repo root:

cd ira_realty_analysis
pip install -r requirements.txt
python main.py

Outputs land in ira_realty_analysis/outputs/charts/ (9 PNGs) and ira_realty_analysis/outputs/summary_report.txt.


Pipeline (one step per script in ira_realty_analysis/steps/)

| # | Script | What it does |
|---|--------|--------------|
| 1 | step1_load_clean.py | Load 3 sheets; impute 84+37 missing Bedrooms with the median per property type; merge sales with demographics on Location + Year. |
| 2 | step2_eda.py | Six EDA charts (price trend, total appreciation, CAGR, historical vs listings, demographic correlation, price by property type). |
| 3 | step3_features.py | Aggregate to one row per (Location, Year), giving a 36 x 14 feature matrix (6 location dummies + Year + 5 demographic features + target Avg_Price_Per_SqFt). |
| 4 | step4_models.py | Time-based split (train 2018-2021, test 2022-2023). Train Linear Regression and Random Forest; report MAE, RMSE, R2, MAPE; pick the lower-RMSE model. |
| 5 | step5_future.py | Linearly extrapolate each demographic feature to 2024-2026 per location; predict prices with the chosen model. |
| 6 | step6_validation.py | Compare predicted 2024-2025 average against current listing averages; flag UNDERPRICED / FAIR / OVERPRICED at +/-10%. |
| 7 | step7_report.py | Write outputs/summary_report.txt. |
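The per-type median imputation in step 1 can be sketched as follows. This is a minimal, dependency-free illustration, not the code in step1_load_clean.py; the Property_Type column name and the sample rows are assumptions made up for the example.

```python
from collections import defaultdict
from statistics import median


def impute_bedrooms(rows, group_key="Property_Type", target="Bedrooms"):
    """Fill missing `target` values with the median of the row's group.

    `rows` is a list of dicts; a missing value is represented by None.
    The column names are assumptions, not checked against the real sheets.
    """
    observed = defaultdict(list)
    for row in rows:
        if row[target] is not None:
            observed[row[group_key]].append(row[target])

    medians = {group: median(values) for group, values in observed.items()}

    filled = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        if new_row[target] is None:
            # A group with no observed values stays missing (None).
            new_row[target] = medians.get(new_row[group_key])
        filled.append(new_row)
    return filled


# Made-up sample rows, for illustration only.
sample = [
    {"Property_Type": "Villa", "Bedrooms": 4},
    {"Property_Type": "Villa", "Bedrooms": 5},
    {"Property_Type": "Villa", "Bedrooms": None},
    {"Property_Type": "Apartment", "Bedrooms": 2},
    {"Property_Type": "Apartment", "Bedrooms": 3},
    {"Property_Type": "Apartment", "Bedrooms": 3},
    {"Property_Type": "Apartment", "Bedrooms": None},
]
cleaned = impute_bedrooms(sample)
```

In pandas the same idea is usually written as a groupby on the property type with transform("median"), followed by fillna on the Bedrooms column.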

Step 2: EDA findings

Price trend. All six markets rose smoothly across 2018-2023; Mamidipally and Adibatla are the premium markets, Shamshabad the cheapest.

[Figure: Price trend by location]

Total appreciation and CAGR. Shamshabad led growth (+63.8% total, 10.38% CAGR); Pocharam was slowest (+39.8%, 6.93% CAGR). The fastest-vs-slowest spread is ~24 percentage points.
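The relationship between total appreciation and CAGR can be checked with a few lines. This sketch assumes five annual compounding steps between 2018 and 2023; the function name is illustrative and is not taken from step2_eda.py.

```python
def cagr(total_growth, periods):
    """Compound annual growth rate from total fractional growth.

    total_growth: fractional growth over the whole span, e.g. 0.638 for +63.8%
    periods: number of annual compounding steps
    """
    if periods <= 0:
        raise ValueError("periods must be positive")
    return (1.0 + total_growth) ** (1.0 / periods) - 1.0


PERIODS = 2023 - 2018  # assumption: five year-on-year steps

# Total appreciation figures as reported in this README.
for name, total in [("Shamshabad", 0.638), ("Pocharam", 0.398)]:
    print(f"{name}: {cagr(total, PERIODS) * 100:.2f}% CAGR")
```

Because the totals used here are already rounded to one decimal place, the result can differ from the reported CAGR in the last digit.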

[Figures: Total appreciation; CAGR by location]

Historical vs current listings. Current asking prices sit below the 2018-2023 historical mean in every market; listings are not pricing in the appreciation already observed.

[Figure: Historical vs listing average]

What drives price? Among the demographic variables, Employment_Rate_Percent correlates most strongly with Price_Per_SqFt (r ≈ +0.32). Other features are weaker drivers individually; Location itself does most of the heavy lifting.

[Figure: Correlation heatmap]

By property type. Villas command the highest median INR/sqft (~INR 14,800), Apartments mid-range, Plots lowest (~INR 6,500).

[Figure: Price by property type]


Step 4: Model results

| Model | MAE (INR/sqft) | RMSE (INR/sqft) | R2 | MAPE |
|-------|----------------|-----------------|----|------|
| Linear Regression | 785.07 | 920.70 | 0.939 | 6.79% |
| Random Forest | 2,050.39 | 2,906.74 | 0.390 | 13.98% |

Linear Regression wins decisively on the 2022-2023 holdout. With only 24 training rows, Random Forest overfits, and because tree ensembles cannot predict beyond the range of target values seen in training, it extrapolates poorly to later years. Linear Regression captures the dominant signal cleanly: a per-location level plus a steady annual drift.
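For reference, the four reported metrics and the lower-RMSE selection rule can be sketched in plain Python. The function below is an illustration with standard definitions, not the code in step4_models.py (which may well use scikit-learn's metric helpers); only the two RMSE values are taken from the table above.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, R2 and MAPE (in percent) for paired values.

    Assumes equal-length, non-empty inputs, non-zero true values (for MAPE)
    and true values that are not all identical (for R2).
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    mape = sum(abs(e / t) for e, t in zip(errors, y_true)) / n * 100.0
    return {"MAE": mae, "RMSE": rmse, "R2": r2, "MAPE": mape}


# Model selection: keep whichever model has the lower holdout RMSE.
reported_rmse = {"Linear Regression": 920.70, "Random Forest": 2906.74}
best_model = min(reported_rmse, key=reported_rmse.get)
```

Selecting on RMSE rather than MAE penalises large misses more heavily, which matters when a single badly predicted location-year can dominate a 12-row test set.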

[Figure: Predicted vs Actual]


Step 5: Forecast 2024-2026 (Linear Regression)

| Location | 2023 Actual | 2024 | 2025 | 2026 | 3-Yr Growth |
|----------|-------------|------|------|------|-------------|
| Adibatla | INR 17,596 | INR 17,557 | INR 18,370 | INR 19,183 | +9.0% |
| Kokapet | INR 10,838 | INR 11,956 | INR 12,804 | INR 13,652 | +26.0% |
| Kollur | INR 10,915 | INR 12,115 | INR 12,954 | INR 13,792 | +26.4% |
| Mamidipally | INR 17,808 | INR 17,386 | INR 18,234 | INR 19,082 | +7.2% |
| Pocharam | INR 9,680 | INR 12,304 | INR 13,213 | INR 14,122 | +45.9% |
| Shamshabad | INR 8,341 | INR 9,798 | INR 10,615 | INR 11,433 | +37.1% |

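Lower-priced markets (Pocharam, Shamshabad) show the largest forecast upside. For Shamshabad this is consistent with its leading 2018-2023 CAGR. Pocharam, however, had the slowest historical CAGR; most of its upside comes from the jump between its 2023 actual (INR 9,680) and its 2024 prediction (INR 12,304), so it should be read with more caution.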
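The demographic inputs behind these forecasts come from a straight-line extrapolation per location. A minimal sketch of that idea, using ordinary least squares on (year, value) pairs, is shown here; the function names and the sample employment-rate series are made up for illustration and are not taken from step5_future.py.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def extrapolate(history, future_years):
    """Project one feature for one location onto future years.

    history: dict mapping year -> observed value
    """
    years = sorted(history)
    slope, intercept = fit_line(years, [history[y] for y in years])
    return {year: slope * year + intercept for year in future_years}


# Made-up series for a single location, for illustration only.
employment_rate = {
    2018: 60.0, 2019: 60.5, 2020: 61.0,
    2021: 61.5, 2022: 62.0, 2023: 62.5,
}
projected = extrapolate(employment_rate, [2024, 2025, 2026])
```

The projected feature values would then be fed, together with the location dummies and the year, into the fitted price model. Any break in the historical trend is invisible to this approach.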

[Figure: Forecast 2024-2026]


Step 6: Listing assessment

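Each market's 2024-2025 forecast average is compared with its current listing average. Diff % is the listing average relative to the forecast average, and the verdict uses a +/-10% threshold: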

| Location | Model 2024-25 Avg | Listing Avg | Diff % | Verdict |
|----------|-------------------|-------------|--------|---------|
| Adibatla | INR 17,963 | INR 15,078 | -16.1% | UNDERPRICED |
| Kokapet | INR 12,380 | INR 9,608 | -22.4% | UNDERPRICED |
| Kollur | INR 12,534 | INR 9,179 | -26.8% | UNDERPRICED |
| Mamidipally | INR 17,810 | INR 14,076 | -21.0% | UNDERPRICED |
| Pocharam | INR 12,758 | INR 9,324 | -26.9% | UNDERPRICED |
| Shamshabad | INR 10,207 | INR 6,123 | -40.0% | UNDERPRICED |

[Figure: Listing vs Model gap]

Every market reads as underpriced relative to the model's forecast. Caveats: listings are asking prices (not transacted), the 2024 forecast is one year ahead of those asking prices anyway, and demographic extrapolation assumes the 2018-2023 trend continues unbroken.
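The verdict rule can be sketched as a small function. The function name is hypothetical, and treating a gap of exactly 10% as FAIR is an assumption; the inputs in the example are the Adibatla and Shamshabad figures from the table above.

```python
def assess_listing(model_avg, listing_avg, threshold=0.10):
    """Return (diff_percent, verdict) for one market.

    diff_percent is the listing average relative to the model average.
    Boundary handling (exactly +/-threshold counts as FAIR) is an assumption.
    """
    if model_avg <= 0:
        raise ValueError("model_avg must be positive")
    diff = (listing_avg - model_avg) / model_avg
    if diff < -threshold:
        verdict = "UNDERPRICED"
    elif diff > threshold:
        verdict = "OVERPRICED"
    else:
        verdict = "FAIR"
    return diff * 100.0, verdict


adibatla = assess_listing(17963, 15078)
shamshabad = assess_listing(10207, 6123)
```

Measuring the gap relative to the model average (rather than the listing average) matches the Diff % values reported in the table.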


Limitations

  • Only 36 location-year rows for modeling, which keeps models simple and penalises complex ones.
  • Sales data has only Sale_Year, no exact dates, so within-year seasonality is invisible.
  • Single builder (IRA Realty); won't generalise to all of Hyderabad.
  • Demographic projection is a naive linear trend; it won't react to policy, metro, or macro shocks.
  • Listing prices are not transacted prices, so the "underpriced" verdict is partly bid-ask spread.

Repo layout

323_assignment/
├── README.md
├── demo.mp4                       2-minute walkthrough video
└── ira_realty_analysis/
    ├── data/IRA_Realty_Project_Datasets.xlsx
    ├── steps/                     step1...step7 modules + config.py
    ├── outputs/
    │   ├── charts/                9 PNG charts
    │   └── summary_report.txt
    ├── main.py                    runs the whole pipeline
    ├── requirements.txt
    └── LICENSE

License

Code is MIT-licensed (see ira_realty_analysis/LICENSE). The dataset under ira_realty_analysis/data/ is proprietary to IRA Realty and is included for academic use only; it is not covered by the MIT license and may not be redistributed.

About

Real estate price analysis & forecasting for 6 Hyderabad micro-markets — data cleaning, EDA, Linear Regression vs Random Forest, demographic-driven 2024–2026 projections, and current listing fair-value validation. Built with pandas, scikit-learn, matplotlib.
