Summary table split by Hp detection with lax filtering: only one technology (RNAseq, WXS or WGS) needs to have at least one read aligned to the Hp genome. A simple test for R-loop mutation enrichment in Hp-infected tumors is to test the contingency table of Hp status against R-loop mutation status. In the R_loop_mutation row the p-value for the chi-squared test is 0.7, so there is no evidence of enrichment. I also filtered the mutations to keep only SNVs with a predicted deleterious or damaging effect, since a mutation can fall in a gene and still be benign. This reduced the number of tumors with an R-loop mutation from 69 to 48, but the p-value is still not significant (p = 0.5).
## There was an error in 'add_p()' for variable 'TNM.Stage' and test 'fisher.test', p-value omitted:
## Error in stats::fisher.test(data[[variable]], as.factor(data[[by]])): FEXACT error 7(location). LDSTP=18600 is too small for this problem,
## (pastp=85.2035, ipn_0:=ipoin[itp=416]=158, stp[ipn_0]=84.542).
## Increase workspace or consider using 'simulate.p.value=TRUE'
| Characteristic¹ | N | FALSE, N = 175 | TRUE, N = 92 | p-value² |
|---|---|---|---|---|
| Hp_total_str | 267 | 0 (0%) | 36 (39%) | <0.001 |
| R_loop_mutation | 267 | 47 (27%) | 22 (24%) | 0.7 |
| R_loop_mut_damaging | 267 | 29 (17%) | 19 (21%) | 0.5 |
| Subtype | 267 | | | >0.9 |
| CIN | | 88 (50%) | 48 (52%) | |
| EBV | | 18 (10%) | 7 (7.6%) | |
| GS | | 33 (19%) | 17 (18%) | |
| MSI | | 36 (21%) | 20 (22%) | |
| TP53.mutation | 267 | | | 0.8 |
| 0 | | 88 (50%) | 49 (53%) | |
| 1 | | 85 (49%) | 43 (47%) | |
| NA | | 2 (1.1%) | 0 (0%) | |
| PIK3CA.mutation | 267 | | | 0.5 |
| 0 | | 141 (81%) | 71 (77%) | |
| 1 | | 32 (18%) | 21 (23%) | |
| NA | | 2 (1.1%) | 0 (0%) | |
| KRAS.mutation | 267 | | | 0.2 |
| 0 | | 161 (92%) | 80 (87%) | |
| 1 | | 12 (6.9%) | 12 (13%) | |
| NA | | 2 (1.1%) | 0 (0%) | |
| MSI.status | 267 | | | >0.9 |
| MSI-H | | 36 (21%) | 20 (22%) | |
| MSI-L | | 28 (16%) | 15 (16%) | |
| MSS | | 111 (63%) | 57 (62%) | |
| Hypermutated | 267 | | | 0.7 |
| 0 | | 136 (78%) | 74 (80%) | |
| 1 | | 37 (21%) | 18 (20%) | |
| NA | | 2 (1.1%) | 0 (0%) | |
| TNM.Stage | 267 | | | |
| Stage_IA | | 4 (2.3%) | 4 (4.3%) | |
| Stage_IB | | 11 (6.3%) | 8 (8.7%) | |
| Stage_IIA | | 34 (19%) | 19 (21%) | |
| Stage_IIB | | 38 (22%) | 17 (18%) | |
| Stage_IIIA | | 22 (13%) | 13 (14%) | |
| Stage_IIIB | | 37 (21%) | 14 (15%) | |
| Stage_IIIC | | 10 (5.7%) | 3 (3.3%) | |
| Stage_IV | | 12 (6.9%) | 6 (6.5%) | |
| X | | 7 (4.0%) | 8 (8.7%) | |
| Lauren.Class | 267 | | | 0.3 |
| Diffuse | | 42 (24%) | 19 (21%) | |
| Intestinal | | 112 (64%) | 66 (72%) | |
| Mixed | | 15 (8.6%) | 3 (3.3%) | |
| NA | | 6 (3.4%) | 4 (4.3%) | |

¹ Statistics presented: n (%)

² Statistical tests performed: chi-square test of independence; Fisher's exact test
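
The R_loop_mutation and R_loop_mut_damaging rows can be re-derived from the counts in the table. The sketch below is a plain-Python version of a Pearson chi-squared test on a 2x2 table with Yates' continuity correction. It assumes the table was produced with R's `chisq.test` defaults (which apply the correction to 2x2 tables); the function and variable names are mine, not part of the original analysis.

```python
import math


def chi2_2x2_yates(table):
    """Pearson chi-squared test with Yates' correction for a 2x2 table.

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, obs_row in enumerate(table):
        for j, obs in enumerate(obs_row):
            expected = row_totals[i] * col_totals[j] / n
            # Yates: shrink |observed - expected| by 0.5, never below zero
            diff = max(abs(obs - expected) - 0.5, 0.0)
            stat += diff * diff / expected
    # Upper tail of the chi-squared distribution with 1 df
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Rows: Hp FALSE, Hp TRUE; columns: mutated, not mutated (lax filtering table)
r_loop_all = [[47, 175 - 47], [22, 92 - 22]]
r_loop_damaging = [[29, 175 - 29], [19, 92 - 19]]

for name, tab in [("R_loop_mutation", r_loop_all),
                  ("R_loop_mut_damaging", r_loop_damaging)]:
    stat, p = chi2_2x2_yates(tab)
    print(f"{name}: chi2 = {stat:.3f}, p = {p:.2f}")
```

Under that assumption both p-values round to the 0.7 and 0.5 shown in the table.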
Summary table split by Hp detection with stringent filtering: at least two technologies (RNAseq, WXS or WGS) need to have at least one read aligned to the Hp genome. In the R_loop_mutation row the p-value for the chi-squared test is 0.6, so again there is no evidence of enrichment. For damaging R-loop mutations the p-value is also not significant (p = 0.3).
## There was an error in 'add_p()' for variable 'TNM.Stage' and test 'fisher.test', p-value omitted:
## Error in stats::fisher.test(data[[variable]], as.factor(data[[by]])): FEXACT error 7(location). LDSTP=18600 is too small for this problem,
## (pastp=29.7503, ipn_0:=ipoin[itp=131]=2803, stp[ipn_0]=29.2407).
## Increase workspace or consider using 'simulate.p.value=TRUE'
| Characteristic¹ | N | FALSE, N = 231 | TRUE, N = 36 | p-value² |
|---|---|---|---|---|
| Hp_total_lax | 267 | 56 (24%) | 36 (100%) | <0.001 |
| R_loop_mutation | 267 | 58 (25%) | 11 (31%) | 0.6 |
| R_loop_mut_damaging | 267 | 39 (17%) | 9 (25%) | 0.3 |
| Subtype | 267 | | | 0.3 |
| CIN | | 121 (52%) | 15 (42%) | |
| EBV | | 23 (10.0%) | 2 (5.6%) | |
| GS | | 43 (19%) | 7 (19%) | |
| MSI | | 44 (19%) | 12 (33%) | |
| TP53.mutation | 267 | | | 0.12 |
| 0 | | 113 (49%) | 24 (67%) | |
| 1 | | 116 (50%) | 12 (33%) | |
| NA | | 2 (0.9%) | 0 (0%) | |
| PIK3CA.mutation | 267 | | | 0.4 |
| 0 | | 186 (81%) | 26 (72%) | |
| 1 | | 43 (19%) | 10 (28%) | |
| NA | | 2 (0.9%) | 0 (0%) | |
| KRAS.mutation | 267 | | | 0.064 |
| 0 | | 212 (92%) | 29 (81%) | |
| 1 | | 17 (7.4%) | 7 (19%) | |
| NA | | 2 (0.9%) | 0 (0%) | |
| MSI.status | 267 | | | 0.13 |
| MSI-H | | 44 (19%) | 12 (33%) | |
| MSI-L | | 39 (17%) | 4 (11%) | |
| MSS | | 148 (64%) | 20 (56%) | |
| Hypermutated | 267 | | | 0.3 |
| 0 | | 185 (80%) | 25 (69%) | |
| 1 | | 44 (19%) | 11 (31%) | |
| NA | | 2 (0.9%) | 0 (0%) | |
| TNM.Stage | 267 | | | |
| Stage_IA | | 5 (2.2%) | 3 (8.3%) | |
| Stage_IB | | 17 (7.4%) | 2 (5.6%) | |
| Stage_IIA | | 44 (19%) | 9 (25%) | |
| Stage_IIB | | 49 (21%) | 6 (17%) | |
| Stage_IIIA | | 34 (15%) | 1 (2.8%) | |
| Stage_IIIB | | 47 (20%) | 4 (11%) | |
| Stage_IIIC | | 12 (5.2%) | 1 (2.8%) | |
| Stage_IV | | 16 (6.9%) | 2 (5.6%) | |
| X | | 7 (3.0%) | 8 (22%) | |
| Lauren.Class | 267 | | | 0.4 |
| Diffuse | | 53 (23%) | 8 (22%) | |
| Intestinal | | 154 (67%) | 24 (67%) | |
| Mixed | | 17 (7.4%) | 1 (2.8%) | |
| NA | | 7 (3.0%) | 3 (8.3%) | |

¹ Statistics presented: n (%)

² Statistical tests performed: chi-square test of independence; Fisher's exact test
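
The TNM.Stage p-value is missing in both tables because the exact Fisher computation ran out of workspace on a 9 x 2 table, and the error message itself suggests a simulated p-value instead. As a rough illustration of that idea, here is a label-permutation test on the Pearson chi-squared statistic in plain Python. This is not the same procedure as `fisher.test(..., simulate.p.value = TRUE)` in R, only a comparable Monte Carlo approach; the function names, the number of permutations and the seed are my own choices.

```python
import random


def chi2_stat(table):
    """Pearson chi-squared statistic for an r x c table (list of rows)."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (obs - expected) ** 2 / expected
    return stat


def permutation_p_value(table, n_perm=2000, seed=1):
    """Monte Carlo p-value: shuffle Hp labels while keeping both margins fixed.

    table rows are stages, columns are (Hp FALSE, Hp TRUE) counts.
    """
    rng = random.Random(seed)
    # One entry per tumor: the index of its stage
    stages = [i for i, (neg, pos) in enumerate(table) for _ in range(neg + pos)]
    n_pos = sum(pos for _, pos in table)
    stage_totals = [neg + pos for neg, pos in table]
    observed = chi2_stat(table)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(stages)
        pos_counts = [0] * len(table)
        for s in stages[:n_pos]:          # first n_pos tumors play "Hp TRUE"
            pos_counts[s] += 1
        permuted = [[tot - p, p] for tot, p in zip(stage_totals, pos_counts)]
        if chi2_stat(permuted) >= observed - 1e-9:
            hits += 1
    # Add-one correction so the estimate is never exactly zero
    return (hits + 1) / (n_perm + 1)


# TNM.Stage counts from the stringent table: (Hp FALSE, Hp TRUE) per stage
tnm_stage = [[5, 3], [17, 2], [44, 9], [49, 6], [34, 1],
             [47, 4], [12, 1], [16, 2], [7, 8]]
print(permutation_p_value(tnm_stage))
```

The estimate depends on the seed and on `n_perm`, so it should be read as approximate.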
To check whether there is a subtype-specific effect on R-loop mutation and Hp infection, we can run the same tests of Hp against R-loop mutation (chi-squared or Fisher's exact, depending on the counts) after subsetting by subtype. Again, there is no significant enrichment for R-loop mutations in Hp-positive tumors in any particular subtype.
## There was an error in 'add_p()' for variable 'R_loop_mut_damaging' and test 'chisq.test', p-value omitted:
## Error in stats::chisq.test(data[[variable]], as.factor(data[[by]])): 'x' and 'y' must have at least 2 levels
## There was an error in 'add_p()' for variable 'MSI.status' and test 'chisq.test', p-value omitted:
## Error in stats::chisq.test(data[[variable]], as.factor(data[[by]])): 'x' and 'y' must have at least 2 levels
| Characteristic | CIN: N | CIN: FALSE, N = 88¹ | CIN: TRUE, N = 48¹ | CIN: p-value² | EBV: N | EBV: FALSE, N = 18¹ | EBV: TRUE, N = 7¹ | EBV: p-value³ | GS: N | GS: FALSE, N = 33¹ | GS: TRUE, N = 17¹ | GS: p-value³ | MSI: N | MSI: FALSE, N = 36¹ | MSI: TRUE, N = 20¹ | MSI: p-value⁴ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Hp_total_str | 136 | 0 (0%) | 15 (31%) | <0.001 | 25 | 0 (0%) | 2 (29%) | 0.070 | 50 | 0 (0%) | 7 (41%) | <0.001 | 56 | 0 (0%) | 12 (60%) | <0.001 |
| R_loop_mutation | 136 | 12 (14%) | 6 (12%) | >0.9 | 25 | 3 (17%) | 0 (0%) | 0.5 | 50 | 4 (12%) | 1 (5.9%) | 0.6 | 56 | 28 (78%) | 15 (75%) | >0.9 |
| R_loop_mut_damaging | 136 | 7 (8.0%) | 4 (8.3%) | >0.9 | 25 | 0 (0%) | 0 (0%) | | 50 | 1 (3.0%) | 1 (5.9%) | >0.9 | 56 | 21 (58%) | 14 (70%) | 0.6 |
| MSI.status | 136 | | | 0.6 | 25 | | | 0.5 | 50 | | | 0.7 | 56 | | | |
| MSI-L | | 19 (22%) | 13 (27%) | | | 3 (17%) | 0 (0%) | | | 6 (18%) | 2 (12%) | | | | | |
| MSS | | 69 (78%) | 35 (73%) | | | 15 (83%) | 7 (100%) | | | 27 (82%) | 15 (88%) | | | | | |
| MSI-H | | | | | | | | | | | | | | 36 (100%) | 20 (100%) | |
| Hypermutated | 136 | | | 0.7 | 25 | | | 0.5 | 50 | | | 0.5 | 56 | | | 0.5 |
| 0 | | 85 (97%) | 45 (94%) | | | 15 (83%) | 7 (100%) | | | 31 (94%) | 17 (100%) | | | 5 (14%) | 5 (25%) | |
| 1 | | 3 (3.4%) | 3 (6.2%) | | | 3 (17%) | 0 (0%) | | | | | | | 31 (86%) | 15 (75%) | |
| NA | | | | | | | | | | 2 (6.1%) | 0 (0%) | | | | | |

¹ Statistics presented: n (%)

² Statistical tests performed: chi-square test of independence; Fisher's exact test

³ Statistical tests performed: Fisher's exact test

⁴ Statistical tests performed: Fisher's exact test; chi-square test of independence
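
For the small subtypes the table falls back to Fisher's exact test. A minimal sketch of the two-sided test on a 2x2 table, using only the per-subtype R_loop_mutation counts shown above, could look like this. The helper name is mine, and the tie tolerance mimics what R's `fisher.test` does as far as I know.

```python
from math import comb


def fisher_exact_2x2(table):
    """Two-sided Fisher's exact test for a 2x2 table of counts."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is at least as unlikely as the observed one
    p = sum(q for q in map(prob, range(lo, hi + 1)) if q <= p_obs * (1 + 1e-7))
    return min(1.0, p)


# Rows: Hp FALSE, Hp TRUE; columns: R-loop mutated, not mutated
r_loop_by_subtype = {
    "CIN": [[12, 88 - 12], [6, 48 - 6]],
    "EBV": [[3, 18 - 3], [0, 7 - 0]],
    "GS": [[4, 33 - 4], [1, 17 - 1]],
    "MSI": [[28, 36 - 28], [15, 20 - 15]],
}

for subtype, tab in r_loop_by_subtype.items():
    print(subtype, round(fisher_exact_2x2(tab), 3))
```

For EBV and GS, where the table used Fisher's exact test, the results round to the 0.5 and 0.6 reported above. The CIN and MSI entries may have come from a chi-squared test instead, so they need not match exactly.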
I added a new track that marks the “damaging” R-loop mutations. As you can see, R-loop mutations do not cluster with Hp presence.
## All mutation types: Missense_Mutation, Amplification,
## Frame_Shift_InDel, Deletion, Splice_Site, Nonsense_Mutation,
## In_Frame_InDel
Comparing mutated genes between 92 Hp-positive and 175 Hp-negative tumors. If a gene is mutated more than once in the same tumor, it is counted only once. Fisher's exact test is used to assess whether the proportion of tumors carrying a mutation in a gene differs between the two groups. 443 genes have p < 0.05.
A blank cell means the gene was not mutated in either group
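
The per-gene comparison (each gene counted once per tumor, then Fisher's exact test between the two Hp groups) can be sketched as follows. The sample names, gene names and the `compare_genes` helper are hypothetical toy values for illustration, not the real data or the function used for this report.

```python
from collections import defaultdict
from math import comb


def fisher_p(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2, c1 = a + b, c + d, a + c
    total = comb(r1 + r2, c1)
    probs = [comb(r1, x) * comb(r2, c1 - x) / total
             for x in range(max(0, c1 - r2), min(r1, c1) + 1)]
    p_obs = comb(r1, a) * comb(r2, c1 - a) / total
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-7)))


def compare_genes(mutations, hp_status):
    """mutations: iterable of (sample, gene); hp_status: sample -> bool.

    Returns gene -> (n mutated Hp+, n mutated Hp-, Fisher p-value).
    """
    mutated = defaultdict(set)           # gene -> samples with >= 1 mutation
    for sample, gene in mutations:
        mutated[gene].add(sample)        # a set counts each sample only once
    n_pos = sum(hp_status.values())
    n_neg = len(hp_status) - n_pos
    results = {}
    for gene, samples in mutated.items():
        pos = sum(hp_status[s] for s in samples)
        neg = len(samples) - pos
        results[gene] = (pos, neg, fisher_p(pos, n_pos - pos, neg, n_neg - neg))
    return results


# Hypothetical toy data: 4 Hp-positive (P1..P4) and 4 Hp-negative (N1..N4)
hp_status = {**{f"P{i}": True for i in range(1, 5)},
             **{f"N{i}": False for i in range(1, 5)}}
mutations = [("P1", "GENE_A"), ("P1", "GENE_A"), ("P2", "GENE_A"),
             ("P3", "GENE_A"), ("P4", "GENE_A"),
             ("P1", "GENE_B"), ("N1", "GENE_B")]

results = compare_genes(mutations, hp_status)
for gene, (pos, neg, p) in sorted(results.items()):
    print(gene, pos, neg, round(p, 4))
```

In the toy data GENE_A is hit twice in P1 but still contributes a single count for that sample, which is the behaviour described above.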
I plotted the genes with p < 0.05 in a heatmap. There could be a mutation pattern in MSI-subtype patients with Hp.
We can run the same Fisher test within each subtype. Only the MSI and CIN subtypes had significant genes: 410 in MSI and 92 in CIN.
| Characteristic | CIN: N | CIN: FALSE, N = 88¹ | CIN: TRUE, N = 48¹ | CIN: p-value² | EBV: N | EBV: FALSE, N = 18¹ | EBV: TRUE, N = 7¹ | EBV: p-value² | GS: N | GS: FALSE, N = 33¹ | GS: TRUE, N = 17¹ | GS: p-value² | MSI: N | MSI: FALSE, N = 36¹ | MSI: TRUE, N = 20¹ | MSI: p-value² |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TMB | 136 | 1.52 (0.94) | 2.12 (1.63) | 0.024 | 25 | 2.48 (3.25) | 1.04 (0.30) | 0.080 | 50 | 3.30 (13.16) | 0.84 (0.77) | 0.3 | 56 | 16 (7) | 21 (17) | 0.3 |

¹ Statistics presented: mean (SD)

² Statistical tests performed: t-test
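
As a sanity check on the TMB row, the t statistic can be approximated from the rounded means and SDs alone. The sketch below assumes a Welch (unequal variance) t-test, which is the default of R's `t.test`, and assumes every tumor in each column has a TMB value so that the column Ns are the group sizes. The function name is mine.

```python
import math


def welch_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic and Satterthwaite df from group summary statistics."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2      # squared standard errors
    t = (mean2 - mean1) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# CIN subtype: Hp FALSE = 1.52 (0.94), n = 88; Hp TRUE = 2.12 (1.63), n = 48
t_cin, df_cin = welch_from_summary(1.52, 0.94, 88, 2.12, 1.63, 48)
print(f"t = {t_cin:.2f}, df = {df_cin:.1f}")
```

Because the inputs are rounded, the statistic will only approximately match the one behind the reported p-value.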