This project explores the use of Machine Learning (ML) and Deep Learning (DL) models to predict apartment rental prices in the U.S. based on various features. In addition to regression-based price prediction, the project also includes a classification task that categorizes apartments into three pricing tiers: low, average, and high.
The dataset is widely used and has already been analyzed by many others. While I drew inspiration from existing analyses (especially https://www.kaggle.com/code/harleenkaurvt1930/apartment-rent-data-regression-classification/notebook), this project also incorporates novel approaches and more advanced techniques, most notably neural networks for both the regression and classification tasks and the use of optimization algorithms.
Objectives:
- Perform exploratory data analysis (EDA) to find which features affect the rental price of the listings.
- Filter and transform the data.
- Create preprocessing pipelines that transform the data and make it suitable for machine learning models.
- Apply dimensionality reduction.
- Perform feature engineering.
- Train regression models to predict the price.
- Train classification models to assign the listings to pricing tiers.
- Optimize the models.
- Perform regression and error analysis.
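As a rough illustration of how the preprocessing, dimensionality reduction, and regression objectives can fit together, here is a minimal scikit-learn sketch. It is not the notebook's actual code: the data is synthetic, the column names (`square_feet`, `bedrooms`, `bathrooms`, `state`, `price`) are assumptions, and `Ridge` and `PCA` are stand-ins for whichever models and reduction method the notebook uses.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data; column names are assumptions, not the real schema.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "square_feet": rng.uniform(300, 2000, n),
    "bedrooms": rng.integers(0, 5, n).astype(float),
    "bathrooms": rng.integers(1, 4, n).astype(float),
    "state": rng.choice(["CA", "TX", "NY"], n),
})
df["price"] = (500 + 1.2 * df["square_feet"] + 150 * df["bedrooms"]
               + rng.normal(0, 100, n))
# Simulate a few missing values so the imputer has something to do.
df.loc[rng.choice(n, 20, replace=False), "bathrooms"] = np.nan

numeric = ["square_feet", "bedrooms", "bathrooms"]
categorical = ["state"]

# Impute and scale numeric columns; one-hot encode categorical columns.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
preprocess = ColumnTransformer(
    [
        ("num", numeric_steps, numeric),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ],
    sparse_threshold=0.0,  # force dense output, which PCA requires
)

# Preprocess, keep the components explaining 95% of the variance, then regress.
model = Pipeline([
    ("preprocess", preprocess),
    ("pca", PCA(n_components=0.95)),
    ("regressor", Ridge(alpha=1.0)),
])

X = df[numeric + categorical]
y = df["price"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model.fit(X_train, y_train)
preds = model.predict(X_test)
mae = mean_absolute_error(y_test, preds)
print(f"Test MAE: {mae:.2f}")
```

Keeping every step inside one `Pipeline` means the imputer, scaler, encoder, and PCA are fitted on the training split only, which avoids leaking information from the test split.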
Methodology:
- Data Preparation
- Data exploration and preprocessing
- Regression Analysis
- Classification Analysis
- Interpretation and meaningful insights
All of the above is covered in detail in the report file.
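For the classification task, the three pricing tiers have to be derived from the continuous price. This README does not say how the cut-offs are chosen, so the sketch below assumes quantile-based tiers (equal-sized thirds) built with `pandas.qcut`. The helper name `price_tiers` and the sample prices are hypothetical.

```python
import pandas as pd


def price_tiers(prices: pd.Series) -> pd.Series:
    """Label each price as low, average, or high using tercile cut-offs.

    Assumption: tiers are equal-frequency bins; the notebook may use
    different thresholds.
    """
    return pd.qcut(prices, q=3, labels=["low", "average", "high"])


# Made-up monthly rents, for illustration only.
prices = pd.Series([700, 850, 990, 1200, 1450, 1600, 2100, 2500, 3900])
tiers = price_tiers(prices)
print(pd.DataFrame({"price": prices, "tier": tiers}))
```

Quantile-based tiers give balanced classes by construction, which keeps accuracy meaningful as a metric. Fixed dollar thresholds would be easier to interpret but can produce imbalanced classes.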
- Download the dataset from the Kaggle link given in the Dataset Description file.
- Download the notebook and run it in Google Colab or an IDE of your choice.
- Run this command to install any required packages you don't already have:
pip install numpy pandas matplotlib seaborn scikit-learn xgboost tensorflow prettytable scipy ydata-profiling
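To confirm that the installation worked before opening the notebook, a small check like the one below can help. It is a sketch only: the helper `find_missing` is hypothetical, and the import names are the ones these packages normally expose, which for `scikit-learn` and `ydata-profiling` differ from the pip names (`sklearn` and `ydata_profiling`).

```python
from importlib.util import find_spec

# Import names for the packages in the pip command above.
REQUIRED_MODULES = [
    "numpy", "pandas", "matplotlib", "seaborn", "sklearn",
    "xgboost", "tensorflow", "prettytable", "scipy", "ydata_profiling",
]


def find_missing(modules):
    """Return the module names that cannot be found, without importing them."""
    return [name for name in modules if find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are installed.")
```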