I used a dataset from The Washington Post covering 2015 to 2023 to investigate age-related trends in weapon use in fatal incidents. I analysed the data in Python, and an ANOVA test showed a statistically significant relationship between age and the type of weapon used. An initial box plot showed how the age distribution varied across weapon types, although the distributions overlapped to some degree.
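To make the ANOVA step concrete, here is a minimal sketch of the F statistic computed by hand, assuming a one-way design with weapon type as the single factor. The ages and weapon labels are invented for illustration and are not taken from the Washington Post data; the function name `one_way_anova_f` is also hypothetical. In practice a library routine such as `scipy.stats.f_oneway` would return the p-value as well.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of numeric groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of observations

    # Between-group sum of squares: distance of each group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Hypothetical ages grouped by weapon type (illustrative values only).
ages_by_weapon = {
    "gun": [25, 30, 35, 40],
    "knife": [35, 40, 45, 50],
    "unarmed": [20, 25, 30, 35],
}
f_stat = one_way_anova_f(list(ages_by_weapon.values()))
print(f"F = {f_stat:.3f}")  # F = 5.600
```

A large F means the group means differ by more than the spread within each group would suggest, which is the sense in which age "differs by weapon type".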
Because the dataset is sizable, I switched to a violin plot, which shows the shape of each distribution more clearly. After encoding the categorical features, I fitted a logistic regression model to predict the likelihood that a person was carrying a weapon based on their age. Problems surfaced, including convergence warnings and unclear metrics, which led to changes such as feature scaling and adjustments to the classification report.
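The modelling step can be sketched roughly as follows, assuming scikit-learn as the library. The data here is synthetic, and the binary `armed` label and the variable names are placeholders rather than columns from the real dataset. Putting the scaler and the model in one pipeline is one way to address convergence warnings, since standardised features usually let the solver converge within its iteration budget.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: ages and a binary "armed" label (not the real dataset).
rng = np.random.default_rng(0)
ages = rng.integers(18, 80, size=200).reshape(-1, 1).astype(float)
armed = (rng.random(200) < 0.6).astype(int)

# Scale the feature, then fit the classifier; max_iter is raised as a safety margin.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(ages, armed)

# Predicted probability of the positive class for a hypothetical 30-year-old.
prob_armed = model.predict_proba([[30.0]])[0, 1]
print(f"P(armed | age=30) = {prob_armed:.2f}")
```

Because the labels above are random, the predicted probability carries no meaning; the point is only the shape of the pipeline.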
The final model reached 58% accuracy, and the resulting classification report pointed to areas needing work, especially the handling of imbalanced classes and multi-label prediction. My recommendations for improvement were to correct the class imbalance, try more complex models, and add further features to improve prediction accuracy. The investigation of weapon usage patterns highlighted how iterative both data analysis and model building are.
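One common way to correct for class imbalance is to weight rare classes more heavily during training. The sketch below computes inverse-frequency weights by hand; it is the same heuristic scikit-learn applies when `class_weight="balanced"` is passed to `LogisticRegression`. The label counts are made up for illustration, and `balanced_class_weights` is a hypothetical helper name.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Hypothetical label distribution (illustrative counts only).
labels = ["gun"] * 60 + ["knife"] * 30 + ["unarmed"] * 10
weights = balanced_class_weights(labels)
for cls, w in sorted(weights.items()):
    print(f"{cls}: {w:.3f}")
# gun: 0.556
# knife: 1.111
# unarmed: 3.333
```

With these weights, a mistake on the rarest class costs six times as much as a mistake on the most common one, so the model can no longer score well by predicting the majority class alone.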
The classification report comprises three key metrics: precision (the share of positive predictions that are correct), recall (sensitivity, the share of actual positives that are captured), and F1-score (the harmonic mean of precision and recall). Together, these metrics evaluate how well the model identifies and categorises instances, which is essential for understanding its performance across the different classes in the dataset.
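As a worked example of these three metrics, the following sketch computes them from scratch for a single class treated as the positive one. The true and predicted labels are invented for illustration, and `precision_recall_f1` is a hypothetical helper, not part of any library.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical true and predicted labels (illustrative only).
y_true = ["gun", "gun", "gun", "knife", "knife", "unarmed"]
y_pred = ["gun", "gun", "knife", "knife", "gun", "gun"]

p, r, f = precision_recall_f1(y_true, y_pred, positive="gun")
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
# precision=0.50 recall=0.67 f1=0.57
```

Here two of the four "gun" predictions are correct (precision 0.50) and two of the three actual "gun" cases are found (recall 0.67); the F1-score sits between the two, closer to the lower value.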