Introduction
Hey there! In today’s blog post, I want to delve into the intriguing world of gentrification. As our cities continue to evolve and urbanize, the process of gentrification has become a prominent topic of discussion. It involves significant socioeconomic changes in an area over time, often leading to the displacement of existing communities. In this article, I will share insights from our research, where we built a model to classify cities as gentrified or not. By analyzing various features such as rent, age, racial demographics, and income, we aimed to enhance our understanding of this phenomenon. So, let’s dive right in!
Data and Problem Statement
To tackle the complex issue of gentrification, we collected data on 13 distinct features and one label. Our objective was to identify cities experiencing gentrification so that governments and policymakers can take appropriate action. Our initial model was trained using the Random Forest algorithm. However, it overfit the training data and reached a subpar accuracy of around 60%. To overcome these obstacles, we switched to the Logistic Regression algorithm, which yielded higher accuracy.
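One way to spot the kind of overfitting described above is to compare each model's accuracy on the training data with its accuracy on held-out data. The snippet below is a minimal sketch using scikit-learn on synthetic data, not our actual dataset or notebook code; the sample size, hyperparameters, and the helper name `train_test_gap` are assumptions chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real data: 13 features and one binary label.
X, y = make_classification(
    n_samples=200, n_features=13, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)


def train_test_gap(model):
    """Fit the model and return (train accuracy, test accuracy)."""
    model.fit(X_train, y_train)
    return model.score(X_train, y_train), model.score(X_test, y_test)


# Hypothetical settings; the original hyperparameters are not stated.
models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

scores = {name: train_test_gap(model) for name, model in models.items()}
for name, (train_acc, test_acc) in scores.items():
    # A large gap between the two numbers is a sign of overfitting.
    print(f"{name}: train={train_acc:.2f} test={test_acc:.2f}")
```

A training accuracy far above the test accuracy suggests the model has memorized the training cities instead of learning a pattern that generalizes.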
Model Building and Results
After fine-tuning the logistic regression model, we combined and eliminated features to enhance performance, ultimately reducing the feature set from 13 to 10 variables. These variables included population density, median household income, average one-bedroom apartment rent, poverty rate, average age, and racial demographic percentages. We then split the data 80-20 into training and test sets to evaluate our model.
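The 80-20 split and model fit described above could look roughly like this. This is a sketch under stated assumptions: the column names are hypothetical, the values are randomly generated, the label is an arbitrary rule so the example runs end to end, and the feature scaling step is our own addition for illustration. Only five of the features are shown.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n_cities = 100

# Hypothetical column names with made-up values (not the real dataset).
cities = pd.DataFrame({
    "population_density": rng.uniform(500, 20000, n_cities),
    "median_household_income": rng.uniform(30000, 150000, n_cities),
    "avg_one_bedroom_rent": rng.uniform(700, 3500, n_cities),
    "poverty_rate": rng.uniform(0.03, 0.35, n_cities),
    "average_age": rng.uniform(28, 45, n_cities),
})
# Arbitrary synthetic label: 1 if rent is above the median, else 0.
median_rent = cities["avg_one_bedroom_rent"].median()
cities["gentrified"] = (cities["avg_one_bedroom_rent"] > median_rent).astype(int)

X = cities.drop(columns="gentrified")
y = cities["gentrified"]

# 80% of the cities for training, 20% held out for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Scale the features first, since they are on very different scales.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
predictions = model.predict(X_test)
print(f"Test accuracy on synthetic data: {model.score(X_test, y_test):.2f}")
```

Passing `stratify=y` keeps the share of gentrified cities the same in both sets, which matters when the dataset is small.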
The results were promising, with our model achieving a prediction accuracy of approximately 72%. This outcome signifies the potential for accurately identifying gentrified areas based on the selected features. However, there is always room for improvement, and we have identified several avenues for future enhancements.
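An accuracy figure is easier to interpret next to a simple baseline and a confusion matrix. The sketch below uses ten made-up labels and predictions, not our results, to show how these numbers can be computed with scikit-learn; the variable names are illustrative.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Made-up labels and predictions for ten cities (1 = gentrified).
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0, 1, 0])

# Fraction of cities classified correctly.
accuracy = accuracy_score(y_true, y_pred)

# Baseline: always predict the most common class.
positive_share = y_true.mean()
baseline = max(positive_share, 1 - positive_share)

# Rows are true classes (0, 1); columns are predicted classes (0, 1).
matrix = confusion_matrix(y_true, y_pred, labels=[0, 1])

print(f"accuracy={accuracy:.2f} baseline={baseline:.2f}")
print(matrix)
```

The off-diagonal cells of the matrix separate cities wrongly flagged as gentrified from gentrified cities that were missed, two kinds of error that can matter differently to policymakers.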
Future Improvements and Potential Bias
To further enhance the accuracy and scope of our model, we propose several future improvements. Firstly, expanding the dataset to include more cities would offer a more comprehensive understanding of gentrification patterns across various urban areas. Additionally, collecting data from multiple years, such as from 2017 to the present, would provide valuable insights into the rate of change, offering a more accurate assessment of the phenomenon.
Furthermore, it is crucial to acknowledge potential biases that may exist within our model. By ensuring comprehensive representation of diverse racial demographics, we can strive to eliminate any unintentional bias that may arise from the data. It is essential to approach this issue with sensitivity and adopt strategies to minimize bias and enhance the fairness of our predictions.
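One simple check for this kind of bias is to compare accuracy across groups of cities. The sketch below uses made-up data, and the `group` column is a hypothetical stand-in for something like each city's largest demographic group. It illustrates the idea and is not an analysis we have run.

```python
import pandas as pd

# Made-up predictions for eight cities in two hypothetical groups.
results = pd.DataFrame({
    "group": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "y_true": [1, 0, 1, 0, 1, 0, 1, 0],
    "y_pred": [1, 0, 1, 0, 0, 0, 1, 1],
})

# Mark each prediction as correct or not, then average within each group.
correct = results["y_true"] == results["y_pred"]
per_group_accuracy = correct.groupby(results["group"]).mean()

print(per_group_accuracy)
```

A large accuracy gap between groups would suggest the model serves some communities worse than others and that the data for the weaker group deserves a closer look.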
Here is a video of my team presenting this project in a colloquium at ASDRP (Aspiring Student Directed Research Program) in Fremont, California on Aug 29, 2022.
Conclusion
In conclusion, our research sheds light on gentrification and its implications for urban communities. By developing a logistic regression model, we were able to classify cities as gentrified or not, based on crucial socioeconomic features. Our model achieved a prediction accuracy of approximately 72%, which could make it a useful tool for helping policymakers and governments make informed decisions.
Understanding and predicting gentrification is a continuous journey, and our study has identified various opportunities for further improvements. By expanding our dataset, incorporating historical data, and addressing potential biases, we can refine our model’s accuracy and insights.
If you’re interested in exploring the fascinating field of gentrification prediction yourself, you can download the Jupyter Notebook from our GitHub repository here.
I hope you found this article insightful and engaging. Understanding gentrification is vital as we navigate the ever-changing dynamics of our urban landscapes. Let’s continue to explore, learn, and create positive change in our cities.