Chi-squared tests with p-value heatmapping as a post hoc

Author

Dr Richard Timmerman

Modified

March 11, 2025

Abstract

The purpose of this document is to briefly explain Chi-squared (\(\chi^2\)) tests for independence and their implementation in R. The example uses readily available data on homicides in London and is subject to revision; a more palatable dataset may be selected in the near future.

The approach includes the creation of a heatmap intended to unpack how individual cells of the contingency table contribute to the Chi-squared test statistic.

1 What is a Chi-square (\(\chi^2\)) test?

A Chi-squared (\(\chi^2\)) test is used to test whether the values recorded in one variable are related to (or dependent on) the values captured in another variable. Normally, this technique is applied to aggregated counts of nominal or ordinal data.

The test works from the null hypothesis (\(H_0\)) that the variables are independent (unrelated/no association). This matters when it comes to interpreting the p-value: a p-value that is \(\lt \alpha\) (typically 0.05) gives evidence to reject the \(H_0\) in favour of the variables being related (\(H_1\)), whereas a p-value \(\geq \alpha\) means the \(H_0\) cannot be rejected.

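The test statistic (Equation 1) can also be examined against its critical counterpart (\(\chi^2_c\)) from a critical values table, looked up for the chosen \(\alpha\) and the degrees of freedom, \((r - 1)(c - 1)\) for a table with \(r\) rows and \(c\) columns. If the calculated \(\chi^2 \gt \chi^2_c\), the \(H_0\) is rejected.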

\[ \chi^2 = \sum \frac{\left ( O_{i} - E_{i}\right)^2}{E_{i}} \tag{1}\]

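\(O_i\) refers to the observed count in cell \(i\) of the contingency table, and \(E_i\) is the count expected in that cell under the \(H_0\) (row total × column total ÷ grand total).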
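To make Equation 1 concrete, the following is a minimal sketch that computes the statistic by hand and compares it with chisq.test(...). The 2 × 2 table and its labels are made up purely for illustration and have nothing to do with the homicide data.

```r
# Hypothetical 2 x 2 table of counts (illustrative only)
obs <- matrix(c(30, 10, 20, 40), nrow = 2,
              dimnames = list(group = c("A", "B"), outcome = c("yes", "no")))

# Expected counts under independence: row total * column total / grand total
expd <- outer(rowSums(obs), colSums(obs)) / sum(obs)

# Equation 1: sum over all cells of (O - E)^2 / E
chi2 <- sum((obs - expd)^2 / expd)

# chisq.test() applies Yates' continuity correction to 2 x 2 tables by default,
# so it is switched off here to match the plain formula
builtin <- chisq.test(obs, correct = FALSE)$statistic

chi2
unname(builtin)
```

Both values should agree, which confirms that chisq.test(...) is evaluating Equation 1 over every cell of the table.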

1.1 Practical example

For the sake of demonstration, let’s conduct a \(\chi^2\) test on London homicides recorded between 2003 and 2023 (source: https://data.london.gov.uk/dataset/mps-homicide-dashboard-data). There are 2,885 observations.

# Loading the data...
homicides <- read.csv("Homicide Victim 2003 - September 2023.csv")
names(homicides) <- abbreviate(names(homicides))

# For reference
print(names(homicides))
 [1] "C..V" "Ag.G" "Sex"  "M..K" "Dm.A" "Rc.D" "H.O." "Sl.S" "Brgh" "O.O."

A loose hypothesis based on trending news media is that black populations are more likely to be homicide victims in London. A quick examination of the number of homicides by ethnicity (Figure 1) shows that the majority of homicide victims are white; however, only 13.5% of London’s population is ‘Black British’, behind ‘Asian’ (20.5%); further investigation is needed.

Figure 1: Frequency of homicides by ethnic group (recorded by the on-site officer).

Although simple, the hypothesis is loaded and warrants extended study. However, we can at least try to understand whether certain homicide types are associated with certain ethnic groups. We can achieve this using a \(\chi^2\) test.

1.2 Contingency table synthesis

The first step is to create a contingency table (cross-tabulation) using the xtabs(...) function in R. Here, we will examine the relationship between observed ethnicity (O.O.) and method of killing (M..K).

ct_homicides <- xtabs(~O.O.+M..K, data = homicides)
ct_homicides
                        M..K
O.O.                     Blunt Implement  Blunt instrument
  Asian                                 0               45
  Black                                 0               31
  Not Reported / Unknown                0                2
  Other                                 0                5
  White                                 1              116
                        M..K
O.O.                     Knife or Sharp Implement  Not known/Not Recorded 
  Asian                                        205                      26
  Black                                        629                      29
  Not Reported / Unknown                        10                       2
  Other                                         28                       3
  White                                        616                      93
                        M..K
O.O.                     Other Method of Killing  Physical Assault, no weapon 
  Asian                                        80                           62
  Black                                        73                           43
  Not Reported / Unknown                        6                            1
  Other                                         5                            1
  White                                       251                          190
                        M..K
O.O.                     Shooting
  Asian                        14
  Black                       231
  Not Reported / Unknown        3
  Other                         6
  White                        78
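For readers without access to the CSV file, the table can be rebuilt from the counts printed above. This is only a sketch: the object name ct is an assumption, and the column labels have been trimmed of trailing spaces.

```r
# Counts copied from the printed contingency table (rows: O.O., columns: M..K)
ct <- matrix(
  c(0,  45, 205, 26,  80,  62,  14,
    0,  31, 629, 29,  73,  43, 231,
    0,   2,  10,  2,   6,   1,   3,
    0,   5,  28,  3,   5,   1,   6,
    1, 116, 616, 93, 251, 190,  78),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    O.O. = c("Asian", "Black", "Not Reported / Unknown", "Other", "White"),
    M..K = c("Blunt Implement", "Blunt instrument", "Knife or Sharp Implement",
             "Not known/Not Recorded", "Other Method of Killing",
             "Physical Assault, no weapon", "Shooting")
  )
)

# Add row and column totals to check the table against the 2,885 observations
addmargins(as.table(ct))
```

The margins also show that the 'Blunt Implement' column holds a single record, which looks like a labelling variant of 'Blunt instrument' in the raw data.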

1.3 The \(\chi^2\) test result

Now we can perform the \(\chi^2\) test using the chisq.test(...) function:

chisq.test(ct_homicides)

    Pearson's Chi-squared test

data:  ct_homicides
X-squared = 386.22, df = 24, p-value < 2.2e-16
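The reported statistic can be checked against the \(\chi^2\) distribution directly rather than a printed table. The sketch below takes the statistic and degrees of freedom from the output above; the variable names are illustrative.

```r
alpha <- 0.05
dof   <- 24        # (5 - 1) * (7 - 1) for the 5 x 7 table
x2    <- 386.22    # statistic reported by chisq.test()

# Critical value: the point with probability alpha in the upper tail
crit <- qchisq(1 - alpha, df = dof)

# p-value: upper-tail probability of the observed statistic
p_val <- pchisq(x2, df = dof, lower.tail = FALSE)

crit
p_val
x2 > crit   # TRUE means the H0 of independence is rejected
```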

The test returns a large \(\chi^2\) statistic (386.22, well above the critical value of 36.4150285 for 24 degrees of freedom at \(\alpha\) = 0.05) and a p-value that is far smaller than \(\alpha\). We therefore reject the \(H_0\) and conclude that the variables are not independent; there is an association between recorded ethnicity and method of killing. A more detailed view of how this result is arrived at can be obtained using the xchisq.test(...) function from the mosaic package, which shows the contribution each cell makes to the \(\chi^2\) statistic.

Note

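Several cells in this table have expected counts below 5 (as the expected values in the output below show), so the \(\chi^2\) approximation may be unreliable, which can lead to type-I and/or type-II errors. The p-value is therefore estimated by Monte Carlo simulation. Setting simulate.p.value = TRUE computes the \(\chi^2\) statistic for 2000 randomly generated tables with the same row and column totals, and the number of replicates can be altered using the B = argument. A simulated p-value cannot be smaller than 1/(B + 1), so the value reported below (0.0004998 ≈ 1/2001) is the smallest possible with 2000 replicates.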
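The concern about small expected counts can be checked directly. The sketch below rebuilds the table from the counts printed in Section 1.2 (the name ct and the trimmed labels are assumptions), counts the cells with expected values below 5, and reruns the simulation with more replicates through the B argument.

```r
ct <- matrix(
  c(0,  45, 205, 26,  80,  62,  14,
    0,  31, 629, 29,  73,  43, 231,
    0,   2,  10,  2,   6,   1,   3,
    0,   5,  28,  3,   5,   1,   6,
    1, 116, 616, 93, 251, 190,  78),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    O.O. = c("Asian", "Black", "Not Reported / Unknown", "Other", "White"),
    M..K = c("Blunt Implement", "Blunt instrument", "Knife or Sharp Implement",
             "Not known/Not Recorded", "Other Method of Killing",
             "Physical Assault, no weapon", "Shooting")
  )
)

# The asymptotic test warns about small expected counts; silence it here
res <- suppressWarnings(chisq.test(ct))
low <- res$expected < 5
sum(low)    # number of cells with an expected count below 5
mean(low)   # share of cells affected

# Monte Carlo p-value with more replicates than the default of 2000
set.seed(1)  # arbitrary seed, only for reproducibility
sim <- chisq.test(ct, simulate.p.value = TRUE, B = 10000)
sim$p.value
```

Raising B lowers the smallest p-value the simulation can report, at the cost of a longer run time.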

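# Install mosaic on first use, then load it
if (!require(mosaic)) { install.packages("mosaic"); library(mosaic) }
options(scipen = 9999)  # print small numbers in fixed notation
xchisq.test(ct_homicides, simulate.p.value = TRUE)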

    Pearson's Chi-squared test with simulated p-value (based on 2000
    replicates)

data:  x
X-squared = 386.22, df = NA, p-value = 0.0004998

    0       45      205       26       80       62       14   
(  0.1497) ( 29.7983) (222.8132) ( 22.9102) ( 62.1421) ( 44.4728) ( 49.7137)
[  0.1497] [  7.7552] [  1.4241] [  0.4167] [  5.1319] [  6.9077] [ 25.6563]
<-0.387> < 2.785> <-1.193> < 0.646> < 2.265> < 2.628> <-5.065>
             
    0       31      629       29       73       43      231   
(  0.3591) ( 71.4607) (534.3390) ( 54.9421) (149.0260) (106.6523) (119.2208)
[  0.3591] [ 22.9086] [ 16.7697] [ 12.2491] [ 38.7849] [ 37.9890] [104.8021]
<-0.599> <-4.786> < 4.095> <-3.500> <-6.228> <-6.164> <10.237>
             
    0        2       10        2        6        1        3   
(  0.0083) (  1.6555) ( 12.3785) (  1.2728) (  3.4523) (  2.4707) (  2.7619)
[  0.0083] [  0.0717] [  0.4570] [  0.4155] [  1.8801] [  0.8755] [  0.0205]
<-0.091> < 0.268> <-0.676> < 0.645> < 1.371> <-0.936> < 0.143>
             
    0        5       28        3        5        1        6   
(  0.0166) (  3.3109) ( 24.7570) (  2.5456) (  6.9047) (  4.9414) (  5.5237)
[  0.0166] [  0.8617] [  0.4248] [  0.0811] [  0.5254] [  3.1438] [  0.0411]
<-0.129> < 0.928> < 0.652> < 0.285> <-0.725> <-1.773> < 0.203>
             
    1      116      616       93      251      190       78   
(  0.4662) ( 92.7747) (693.7123) ( 71.3293) (193.4749) (138.4627) (154.7799)
[  0.6112] [  5.8142] [  8.7056] [  6.5838] [ 17.1037] [ 19.1827] [ 38.0873]
< 0.782> < 2.411> <-2.951> < 2.566> < 4.136> < 4.380> <-6.171>
             
key:
    observed
    (expected)
    [contribution to X-squared]
    <Pearson residual>

Careful observation of the output above reveals that the largest contributions to the \(\chi^2\) come from the recorded Black ethnic group: shootings (104.80), ‘other’ methods (38.78) and physical assault (37.99). Also noteworthy is the contribution of White homicide victims associated with shootings (38.09). Because the overall p-value is \(\lt \alpha\), it is worth picking out which cells drive the association, using a post hoc examination of each cell's contribution to the \(\chi^2\) statistic.
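The bracketed values in the output are linked by simple arithmetic. As a sketch, the Black/Shooting cell can be reproduced from its observed and expected counts as printed above; the variable names are illustrative.

```r
o <- 231        # observed count, Black / Shooting
e <- 119.2208   # expected count printed in the xchisq.test() output

# Pearson residual: signed distance from the expected count in standard units
pearson_resid <- (o - e) / sqrt(e)

# Contribution to X-squared is the squared Pearson residual, (O - E)^2 / E
contrib <- pearson_resid^2

# Share of the overall statistic (386.22) accounted for by this one cell
share <- contrib / 386.22

round(c(residual = pearson_resid, contribution = contrib, share = share), 3)
```

The sign of the residual carries the direction (more or fewer cases than expected), which is lost once it is squared into a contribution.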

2 Post hoc heat mapping

Although the \(\chi^2\) test tells us whether or not variables are related, it does not tell us which combinations of categories are responsible for the relationship.

The most straightforward way to examine these cell-level contributions in scenarios of bivariate dis/association is with a heat map matrix. This is possible with the pheatmap package.

Prior to this, we need to calculate each cell's share of the \(\chi^2\) statistic. We begin by saving the xchisq.test(...) output as an object, and then extracting the following elements from it:

  • \(\chi^2\) contribution (let’s call it \(\psi\))
  • The \(\chi^2\) itself

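Dividing one by the other (\(p =\frac{\psi}{\chi^2}\)) gives, for each cell of the contingency table, the proportion of the \(\chi^2\) statistic that it accounts for. These proportions sum to 1; they are shares of the test statistic rather than p-values in the hypothesis-testing sense.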

chi_sq_result <- xchisq.test(ct_homicides, simulate.p.value = TRUE)
# Cell contributions to X-squared are the squared Pearson residuals
contribution <- chi_sq_result$residuals^2
propchi <- contribution / chi_sq_result$statistic
propchi

Already, we can see the sizeable contributions to the \(\chi^2\) statistic made by cells involving Black victims of homicide; let’s visualise this using the pheatmap package (see Figure 2).

if (!require(pheatmap)) { install.packages("pheatmap"); library(pheatmap) }

pheatmap(propchi)
Figure 2: Heat map showing each cell's proportional contribution to the \(\chi^2\) statistic
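A heat map of proportions shows how much each cell matters but not in which direction. One possible variant, sketched below, attaches the sign of the Pearson residual to each share and uses a diverging palette. The table is rebuilt from the counts printed in Section 1.2; the object names, colours and plotting options are assumptions rather than part of the original analysis.

```r
ct <- matrix(
  c(0,  45, 205, 26,  80,  62,  14,
    0,  31, 629, 29,  73,  43, 231,
    0,   2,  10,  2,   6,   1,   3,
    0,   5,  28,  3,   5,   1,   6,
    1, 116, 616, 93, 251, 190,  78),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    O.O. = c("Asian", "Black", "Not Reported / Unknown", "Other", "White"),
    M..K = c("Blunt Implement", "Blunt instrument", "Knife or Sharp Implement",
             "Not known/Not Recorded", "Other Method of Killing",
             "Physical Assault, no weapon", "Shooting")
  )
)

res <- suppressWarnings(chisq.test(ct))

# Share of X-squared per cell, then signed by the direction of the residual
share        <- res$residuals^2 / unname(res$statistic)
signed_share <- sign(res$residuals) * share

# Locate the single largest contributor
top <- which(share == max(share), arr.ind = TRUE)

# Plot only if pheatmap is installed; clustering is off to keep the table order
if (requireNamespace("pheatmap", quietly = TRUE)) {
  lim <- max(abs(signed_share))
  pheatmap::pheatmap(
    signed_share,
    cluster_rows = FALSE, cluster_cols = FALSE,
    color  = colorRampPalette(c("steelblue", "white", "firebrick"))(101),
    breaks = seq(-lim, lim, length.out = 102),  # symmetric, so white sits at zero
    display_numbers = TRUE
  )
}
```

With this colouring, red cells hold more cases than independence predicts and blue cells fewer, so the plot can be read without referring back to the residuals.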

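Taking a share of roughly 0.05 or more of the \(\chi^2\) statistic as an informal threshold for a notable cell, the largest shares belong to Black victims in shootings, ‘other method’ and physical assault (no weapon), and to White victims in shootings. The heat map shows only the size of each contribution; the direction comes from the Pearson residuals in the xchisq.test(...) output. Between 2003 and 2023, Black victims were over-represented among shootings (residual 10.237) and under-represented among ‘other method’ (-6.228) and physical assault (-6.164) relative to what independence would predict, while White victims were under-represented among shootings (-6.171).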

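The overarching finding here is that there is an association between method of killing and the recorded ethnic background of the victim; the \(H_0\) of independence is rejected. Although Black and White victims account for most of the records in this dataset, the association is driven mainly by how the methods are distributed within those groups, particularly shootings. This is an association in the recorded counts rather than evidence of cause, and it does not by itself settle the loose hypothesis raised in Section 1.1, which would require comparison against population sizes.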