In the next stage of my analysis, I used geospatial data and mapping libraries to plot police shootings across the United States. I extracted the ‘latitude’ and ‘longitude’ columns from the dataset and dropped every row with a null value in either one. I then drew a map of the United States and placed a red marker at each incident location, producing a geospatial scatter plot. The plot shows where each incident occurred, which makes it easy to see where police shootings are concentrated and to spot regional trends and patterns.
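The filtering and plotting steps described above can be sketched roughly as follows. This is a minimal illustration, not the exact code behind the map. It assumes the data is loaded into a pandas DataFrame with the ‘latitude’ and ‘longitude’ columns. The function names are hypothetical, the sample rows are made-up placeholders rather than records from the dataset, and the plot draws red points on plain longitude/latitude axes instead of a real basemap.

```python
import pandas as pd


def drop_missing_coordinates(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only the rows that have both a latitude and a longitude."""
    return df.dropna(subset=["latitude", "longitude"])


def plot_incidents(df: pd.DataFrame, ax=None):
    """Draw one red marker per incident on longitude/latitude axes.

    Assumption: plain matplotlib axes stand in for a proper basemap here.
    """
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    if ax is None:
        _, ax = plt.subplots(figsize=(10, 6))
    ax.scatter(df["longitude"], df["latitude"], s=4, c="red", alpha=0.5)
    ax.set_xlabel("Longitude")
    ax.set_ylabel("Latitude")
    ax.set_title("Police shootings by location")
    return ax


# Made-up placeholder rows (not taken from the real dataset).
sample = pd.DataFrame(
    {
        "latitude": [42.36, 34.05, None, 29.76],
        "longitude": [-71.06, -118.24, -97.74, None],
    }
)

# Rows 2 and 3 are dropped because one of their coordinates is missing.
located = drop_missing_coordinates(sample)
```

Calling `plot_incidents(located)` would then produce the scatter plot. A mapping library could replace the plain axes to add state and country outlines.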
Drawing the same scatter plot for an individual state lets policymakers, researchers, and the public examine the geography of police shootings at a finer scale, which can support better-informed discussion and action on this issue. As an example, I created a similar geospatial scatter plot for the state of Massachusetts, which is included below.
Next, I plan to explore GeoHistograms and clustering algorithms, following our professor’s guidance in the previous class. The professor introduced two clustering techniques: K-Means and DBSCAN. K-Means requires the number of clusters, K, to be chosen in advance, which can be a limiting factor. My goal is to implement both algorithms in Python and evaluate whether they produce meaningful clusters when applied to the geographic locations in the shooting data.
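As a starting point, here is a rough sketch of how the two algorithms might be applied to latitude/longitude pairs. It assumes scikit-learn is used (the class did not prescribe a library). The helper names, the 50 km radius, and the `min_samples` value are illustrative guesses, and the coordinates are made-up placeholders rather than real records. K-Means is run directly on degrees, which treats them as flat Euclidean coordinates and is only an approximation. DBSCAN uses the haversine metric, which expects (latitude, longitude) in radians.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans

EARTH_RADIUS_KM = 6371.0


def kmeans_labels(coords_deg: np.ndarray, k: int, seed: int = 0) -> np.ndarray:
    """Cluster (lat, lon) pairs with K-Means; k has to be chosen up front."""
    model = KMeans(n_clusters=k, n_init=10, random_state=seed)
    return model.fit_predict(coords_deg)


def dbscan_labels(
    coords_deg: np.ndarray, eps_km: float = 50.0, min_samples: int = 3
) -> np.ndarray:
    """Cluster (lat, lon) pairs with DBSCAN using great-circle distance.

    eps is converted from kilometres to radians because the haversine
    metric works on a unit sphere. Points labelled -1 are treated as noise.
    """
    model = DBSCAN(
        eps=eps_km / EARTH_RADIUS_KM,
        min_samples=min_samples,
        metric="haversine",
        algorithm="ball_tree",
    )
    return model.fit_predict(np.radians(coords_deg))


# Made-up placeholder points: two tight groups and one isolated point.
coords = np.array(
    [
        [42.36, -71.06], [42.37, -71.05], [42.35, -71.07],
        [34.05, -118.24], [34.06, -118.25], [34.04, -118.23],
        [39.00, -98.00],
    ]
)

km = kmeans_labels(coords, k=2)  # every point is forced into one of 2 clusters
db = dbscan_labels(coords)       # the isolated point is expected to be noise (-1)
```

The contrast worth checking on the real data is that K-Means assigns every point to some cluster, while DBSCAN needs no K but depends on the chosen radius and minimum neighbour count, and can leave sparse points unclustered.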