Introduction:
Crime is an unfortunate reality that affects our communities, our safety, and our well-being. As someone deeply concerned about the issue, I embarked on a journey to understand crime patterns better and explore ways to predict its occurrence. In this blog post, I will share with you the key concepts I’ve learned and how city crime data can be leveraged to make informed predictions. By the end of this article, you’ll have a deeper understanding of the subject and the tools to explore crime prediction on your own.
I did this research project at ASDRP (Aspiring Scholars Directed Research Program), based in Fremont, California, under the guidance of my research mentor, Mr. Suresh Subramaniam.
Understanding the Significance of Crime:
Crime is not just a statistic; it profoundly impacts individuals’ lives and their overall health outcomes. The presence of crime in our neighborhoods can create fear, hinder community development, and jeopardize the well-being of innocent people. It is essential for us to tackle this issue head-on and find ways to address it effectively.
The Power of Data Collection:
To predict crime accurately, we need reliable and comprehensive data. For this project, I collected data on approximately 200 cities from publicly available sources such as the US Census and other websites. The dataset covers crime rate, race demographics, median household income, political affiliations, population density, employment rate, education rate, percentage of the population with disabilities, homelessness rate, police count per population, average age, and stress score. This diverse set of variables lets us examine which factors are associated with crime rates.
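Several of these features, such as crime rate and police count per population, are rates rather than raw counts, which is what makes a city of 50,000 comparable to a city of 500,000. Here is a minimal sketch of that normalization. The `CityRecord` class, the `per_100k` helper, the choice of a per-100,000 scale, and all of the numbers are made up for illustration; the notebook may prepare its data differently.

```python
from dataclasses import dataclass


@dataclass
class CityRecord:
    """One row of raw counts for a city (hypothetical schema)."""
    name: str
    population: int
    reported_crimes: int
    police_officers: int


def per_100k(count: int, population: int) -> float:
    """Convert a raw count into a rate per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


# Made-up cities and numbers, for illustration only.
cities = [
    CityRecord("City A", 500_000, 20_000, 1_000),
    CityRecord("City B", 50_000, 2_500, 75),
]

# Build one feature row per city using rates instead of raw counts.
rows = [
    {
        "city": c.name,
        "crime_rate": per_100k(c.reported_crimes, c.population),
        "police_rate": per_100k(c.police_officers, c.population),
    }
    for c in cities
]

for row in rows:
    print(row)
```

In this toy example City A reports far more crimes in absolute terms, but once both are expressed per 100,000 residents, City B has the higher crime rate. Raw counts would have hidden that.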
Building the Crime Prediction Model:
Using this dataset, I built a predictive model to forecast crime rates. In the initial stage, my model achieved an accuracy score of 59%. While this is a promising start, I’m continuously working to improve its performance. Here are a few steps I’m taking:
- Collecting More Data: To strengthen the model, I aim to increase the dataset size by including data from an additional 200-300 cities. A larger dataset gives the model more examples to learn patterns and correlations from, which generally leads to better predictions.
- Refining Feature Selection: Feature selection is a critical step in developing an accurate model. I’m experimenting with different features such as high school education, job distribution, and population changes over recent years. By fine-tuning the selection process, we can identify the most influential variables and eliminate noise from the dataset.
- Exploring Different Algorithms: I’m not limiting myself to a single algorithm. In my quest for better accuracy, I’m testing other methods, including XGBoost and neural networks. This exploration will help us determine which algorithm works best for our specific crime prediction task.
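One way to compare algorithms fairly is to score each of them on the same cross-validation folds. The sketch below shows the idea with scikit-learn under several assumptions: the data is synthetic (generated with `make_regression`, not the city dataset), crime rate is treated as a numeric target, and models are scored with R² rather than the accuracy score mentioned above. `GradientBoostingRegressor` stands in for XGBoost so the example needs no extra dependency; `xgboost.XGBRegressor` follows the same scikit-learn estimator interface and could be added to the dictionary in the same way.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for a table of about 200 cities with 12 features.
X, y = make_regression(
    n_samples=200, n_features=12, n_informative=5, noise=20.0, random_state=0
)

# Candidate models to compare (an illustrative selection, not the notebook's).
models = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
}

# Use the same shuffled 5-fold split for every model so scores are comparable.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="r2")
    results[name] = (scores.mean(), scores.std())
    print(f"{name}: R^2 = {scores.mean():.2f} +/- {scores.std():.2f}")
```

With only a couple hundred rows, a single train/test split can swing a lot depending on which cities land in the test set. Averaging over folds, and looking at the spread as well as the mean, gives a steadier basis for choosing between models.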
Results and Interpretation:
While my current model offers valuable insights, it is essential to interpret the results cautiously. The accuracy score of 59% indicates that there is room for improvement. As I continue to refine the model, these results will likely change. However, some key findings from the current model are worth noting:
- Feature Importance: By analyzing the model’s feature importances, I found that the five features the model relied on most when predicting crime rates were median household income, police count per population, percentage of black population, percentage of the population with disabilities, and percentage of white population. Importance scores show how much the model depends on a feature, not whether that feature causes crime, so these rankings are best treated as a starting point that may help policymakers decide where to investigate further.
- Correlation and Causation: It’s important to understand that correlation does not imply causation. While certain variables show a strong correlation with crime rates, we must exercise caution in inferring causal relationships. Further research and analysis are necessary to establish causation accurately.
- Bias in Data: Machine learning models are only as good as the data they are trained on. It’s crucial to acknowledge that any unintentional bias in the collected data can lead to biased results. As we work towards a fair and accurate model, we must ensure our data collection processes are inclusive and unbiased.
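To make the feature importance point concrete, here is a small sketch of how such a ranking can be produced. It assumes a random forest and uses entirely synthetic data with hypothetical column names; the post does not state which model or importance method the notebook uses. It reports both the forest's built-in impurity-based importances and permutation importance measured on held-out data, which is often a more reliable check.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_cities = 200

# Hypothetical feature names; the values are random, not real city data.
feature_names = ["median_income", "police_per_capita", "pct_disability", "noise_feature"]
X = rng.normal(size=(n_cities, len(feature_names)))

# Synthetic target: depends strongly on column 0, weakly on column 1,
# and not at all on the remaining columns.
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n_cities)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Impurity-based importances come from the training process and sum to 1.
impurity = dict(zip(feature_names, model.feature_importances_))

# Permutation importance: shuffle one column of the held-out data at a time
# and measure how much the model's score drops.
perm = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
ranking = sorted(
    zip(feature_names, perm.importances_mean), key=lambda pair: pair[1], reverse=True
)

for name, score in ranking:
    print(f"{name}: permutation={score:.3f}, impurity={impurity[name]:.3f}")
```

Both numbers describe what the fitted model leans on. A feature can rank highly because it is correlated with something the dataset does not measure, which is why a ranking like this cannot establish causation on its own.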
Conclusion:
Predicting crime rates is a complex task, but with the power of data and advanced modeling techniques, we can make significant strides in understanding and combating crime. By harnessing city crime data and continuously refining our models, we can create a more secure and prosperous future for our communities.
I invite you to explore the GitHub repository (https://github.com/Harshaan-Chugh/crime-prediction/), where you can download the Jupyter Notebook and delve deeper into the process of crime prediction.
Remember, predicting crime is just the beginning. The real impact lies in how we use these predictions to inform policies, allocate resources, and create effective strategies to prevent crime and build thriving communities.
Download the notebook and start your own crime prediction project today. Together, let’s work towards a safer and more harmonious society.
References:
- WalletHub – Most & Least Stressed Cities (https://wallethub.com/edu/most-least-stressed-cities/22759)
- Ballotpedia – Largest Cities in the United States by Population (https://ballotpedia.org/Largest_cities_in_the_United_States_by_population)
- World Population Review – US Cities (https://worldpopulationreview.com/us-cities)
- US Census Bureau (https://www.census.gov)
- Governing – Police Officers per Capita (https://www.governing.com/archive/police-officers-per-capita-rates-employment-for-city-departments.html)
- Homeless Shelter Directory (https://www.homelessshelterdirectory.org/)
- UAB Institute for Human Rights Blog – Thoughts on Homelessness in Birmingham (https://www.uab.edu/humanrights/homelessness-in-birmingham)
- Rocket City Now – City Leaders, Law Enforcement Working to Help with Homelessness (https://www.rocketcitynow.com/)
- Juneau Empire – Juneau’s Homeless Population Dips Slightly (https://www.juneauempire.com/)
- Anchorage Coalition to End Homelessness – 2021 Point in Time Count and Quality By Name List Announcement (https://www.acenh.org/)
- AZ Central – Scottsdale’s Response to Homelessness is Falling Short, Some Say (https://www.azcentral.com/)
- East Valley Tribune – Mesa Police Take New Approach to Homeless People Downtown (https://www.eastvalleytribune.com/)
- KGUN 9 – The State of Homelessness in Arizona and Pima County (https://www.kgun9.com/)
- US Interagency Council on Homelessness – Homelessness Statistics (https://www.usich.gov/homelessness-statistics/ar/)
- Spectrum News – Fayetteville City Council Look into Housing for the Homeless (https://spectrumlocalnews.com/)
- Riverview Hope Campus – Homelessness in Fort Smith (https://riverviewhopecampus.org/)
- Los Angeles Times – LA Voters Are Frustrated, Impatient Over Persistent Homelessness Crisis (https://www.latimes.com/)
- KOAA News – Homeless Population Growing in Aurora, Mayor Plans to Reintroduce Camping Ban (https://kdvr.com/)
- Fort Collins Rescue Mission – Homelessness in Our City (https://fortcollinsrescuemission.org/)
- Press Telegram – Long Beach Announces New Date for Homeless Count (https://www.presstelegram.com/)
- Hari VCO – Homeless Point In Time (PIT) Count (https://harivco.org/)
- Wikipedia – Chula Vista, California (https://en.wikipedia.org/wiki/Chula_Vista,_California)
- Fremont – Homeless Newsletter (https://www.fremont.gov/)
- San Francisco Public Press – Lawsuit Claims San Francisco Inflated Homeless Count Numbers (https://www.sfpublicpress.org/)
- The Daily World – 2021 Pacific County Homeless Count (https://www.thedailyworld.com/)
- Wikipedia – Quincy, Massachusetts (https://en.wikipedia.org/wiki/Quincy,_Massachusetts)
- The Washington Post – In Rockville, A ‘Flood’ of Concerns About Rising Homelessness (https://www.washingtonpost.com/)
- National Alliance to End Homelessness (https://endhomelessness.org/)
- Gwinnett Daily Post – Slightly Fewer Homeless People in Gwinnett Counted in Recent Point-in-Time Survey (https://www.gwinnettdailypost.com/)
- Indianapolis Recorder – Homeless Population Increase Challenges Local Service Providers (https://indianapolisrecorder.com/)
- Hartford Courant – Homeless Population in Connecticut Inches Downward in 2021, but Still Higher Than Previous Years (https://www.courant.com/)
- Scottsdale Progress – Scottsdale Officially Names New Police Chief (https://www.scottsdaleprogress.com/)
- Charleston Gazette-Mail – Justice Appoints Grafton Native Powell to Lead State Police (https://www.wvgazettemail.com/)
- Myrtle Beach Sun News – Horry County Council Names New Police Chief (https://www.myrtlebeachonline.com/)
- The Baltimore Sun – Baltimore Police Commissioner Harrison Names New Deputy Commissioner, Director of Professional Standards and Accountability (https://www.baltimoresun.com/)
- El Paso Times – El Paso Police Department Promotes Six Officers (https://www.elpasotimes.com/)
- The Boston Globe – New Cohasset Police Chief Aims to Connect with Community (https://www.bostonglobe.com/)
- St. Louis Post-Dispatch – After 2021 Rioting and Looting, Police and City Officials Reflect on Security Lessons Learned (https://www.stltoday.com/)
- Chattanooga Times Free Press – Chattanooga Police Looking for Men Who Robbed, Shot Man in Dayton (https://www.timesfreepress.com/)
- The Arizona Republic – Phoenix Police Investigating Double Homicide in Cesar Chavez Park (https://www.azcentral.com/)
- The Washington Post – One Man Killed, Another Injured in Laurel Shooting (https://www.washingtonpost.com/)
- NBC New York – Man Found Fatally Shot Inside Abandoned Vehicle in Brooklyn: Police (https://www.nbcnewyork.com/)
- The Denver Post – Two People Injured in Early Morning Downtown Denver Shooting (https://www.denverpost.com/)
- Austin American-Statesman – Man Dies After Being Shot in Northeast Austin, Police Say (https://www.statesman.com/)
- Los Angeles Times – Man Killed, Two Others Wounded in Shooting Outside Hollywood Strip Club (https://www.latimes.com/)
- The San Diego Union-Tribune – Man Shot in Leg at Apartment Complex in Spring Valley (https://www.sandiegouniontribune.com/)
- The Atlanta Journal-Constitution – Wounded Woman Drives Herself to Grady Hospital After Shooting (https://www.ajc.com/)
- NBC Chicago – 2 Men Killed, 6 Others Injured in Separate Shootings Across Chicago (https://www.nbcchicago.com/)
- Detroit Free Press – Man, 27, Fatally Shot Outside Detroit Gas Station (https://www.freep.com/)
- Centers for Disease Control and Prevention – Youth Risk Behavior Surveillance System (https://www.cdc.gov/healthyyouth/data/yrbs/index.htm)
- Centers for Disease Control and Prevention – Data & Statistics (https://www.cdc.gov/injury/wisqars/overview/key_data.html)
- The National Campaign to Prevent Teen and Unplanned Pregnancy (https://thenationalcampaign.org/)
- Substance Abuse and Mental Health Services Administration – Data, Outcomes, and Quality (https://www.samhsa.gov/data/)
- The Trevor Project – Research (https://www.thetrevorproject.org/research/)
- The Office of Adolescent Health – Teen Pregnancy Prevention Data and Statistics (https://www.hhs.gov/ash/oah/adolescent-development/reproductive-health-and-teen-pregnancy/teen-pregnancy-prevention-data-and-statistics/index.html)
- The Annie E. Casey Foundation – Kids Count Data Center (https://datacenter.kidscount.org/)
- Census Bureau – Median Household Income (https://www.census.gov/topics/income-poverty/income.html)
- United States Department of Labor – Bureau of Labor Statistics (https://www.bls.gov/)
- American FactFinder (https://data.census.gov/cedsci/)
- Data.gov (https://www.data.gov/)
- GitHub Repository – Crime Prediction (https://github.com/Harshaan-Chugh/crime-prediction/)