Analysis of Variance or ANOVA is a statistical method that is used to analyze the differences among group means in a sample. In the washington post data that I have, can it help me determine if there are any significant differences in the ages of people shot across different races? Let’s see.
- Null Hypothesis (H0): There is no significant difference in the means of age across different races.
- (In the frequentist interpretation, a small p-value suggests that the observed data is unlikely to have occurred by random chance, leading to the rejection of the null hypothesis. It’s called ‘frequentist’ because it considers probabilities as frequencies of events occurring over repeated experiments. It’s contrasting with the Bayesian approach, where probabilities can also represent degrees of uncertainty.)
- Alternative Hypothesis (H1): Significant difference in the means of age across different races.
There are few assumptions of ANOVA
- The data is normally distributed.
- The variance between each groups is approximately equal.
Let’s do the Shapiro-Wilk test for normality and Levene’s test for Homogeneity of Variances.
Here the homogeneity of variances is not met and normality is not detected in any major group, it raises concerns about the robustness of the ANOVA results. I still proceeded with ANOVA out of curiosity and obtained the following results.
Considering the violations of assumptions, I must interpret these results with caution. While the results suggest significant differences, the reliability is questionable although I intuitively believe them after eyeballing the data for a long time.
I am considering additional analyses, such as Welch’s ANOVA or non-parametric tests, to see if the results are consistent across different methods.