Summary table split by Hp detection with lax filtering: only one technology (RNAseq, WXS or WGS) needs to have at least one read aligned to the Hp genome. A simple test for R-loop enrichment in Hp-infected tumors is to test the 2×2 contingency table of Hp status against R-loop mutation status. In the R_loop_mutation row the p-value for the chi-squared test is 0.6, so there is no enrichment. I also filtered the mutations, keeping only SNVs with a predicted deleterious or damaging effect, since a mutation in a gene can be benign. This reduced the number of R-loop-mutated tumors from 69 to 48, but the p-value is still not significant (p = 0.5).
## There was an error in 'add_p()' for variable 'TNM.Stage' and test 'fisher.test', p-value omitted:
## Error in stats::fisher.test(data[[variable]], as.factor(data[[by]])): FEXACT error 7(location). LDSTP=18600 is too small for this problem,
## (pastp=62.9435, ipn_0:=ipoin[itp=221]=21, stp[ipn_0]=60.5559).
## Increase workspace or consider using 'simulate.p.value=TRUE'
| Characteristic¹ | N | FALSE, N = 153 | TRUE, N = 114 | p-value² |
|---|---|---|---|---|
| Hp_total_str | 267 | 0 (0%) | 53 (46%) | <0.001 |
| R_loop_mutation | 267 | 42 (27%) | 27 (24%) | 0.6 |
| R_loop_mut_damaging | 267 | 25 (16%) | 23 (20%) | 0.5 |
| Subtype | 267 | | | >0.9 |
| CIN | | 79 (52%) | 57 (50%) | |
| EBV | | 13 (8.5%) | 12 (11%) | |
| GS | | 28 (18%) | 22 (19%) | |
| MSI | | 33 (22%) | 23 (20%) | |
| TP53.mutation | 267 | | | 0.4 |
| 0 | | 75 (49%) | 62 (54%) | |
| 1 | | 76 (50%) | 52 (46%) | |
| NA | | 2 (1.3%) | 0 (0%) | |
| PIK3CA.mutation | 267 | | | 0.2 |
| 0 | | 125 (82%) | 87 (76%) | |
| 1 | | 26 (17%) | 27 (24%) | |
| NA | | 2 (1.3%) | 0 (0%) | |
| KRAS.mutation | 267 | | | 0.5 |
| 0 | | 139 (91%) | 102 (89%) | |
| 1 | | 12 (7.8%) | 12 (11%) | |
| NA | | 2 (1.3%) | 0 (0%) | |
| MSI.status | 267 | | | 0.8 |
| MSI-H | | 33 (22%) | 23 (20%) | |
| MSI-L | | 26 (17%) | 17 (15%) | |
| MSS | | 94 (61%) | 74 (65%) | |
| Hypermutated | 267 | | | 0.3 |
| 0 | | 116 (76%) | 94 (82%) | |
| 1 | | 35 (23%) | 20 (18%) | |
| NA | | 2 (1.3%) | 0 (0%) | |
| TNM.Stage | 267 | | | |
| Stage_IA | | 2 (1.3%) | 6 (5.3%) | |
| Stage_IB | | 11 (7.2%) | 8 (7.0%) | |
| Stage_IIA | | 30 (20%) | 23 (20%) | |
| Stage_IIB | | 35 (23%) | 20 (18%) | |
| Stage_IIIA | | 22 (14%) | 13 (11%) | |
| Stage_IIIB | | 30 (20%) | 21 (18%) | |
| Stage_IIIC | | 9 (5.9%) | 4 (3.5%) | |
| Stage_IV | | 8 (5.2%) | 10 (8.8%) | |
| X | | 6 (3.9%) | 9 (7.9%) | |
| Lauren.Class | 267 | | | 0.2 |
| Diffuse | | 37 (24%) | 24 (21%) | |
| Intestinal | | 97 (63%) | 81 (71%) | |
| Mixed | | 14 (9.2%) | 4 (3.5%) | |
| NA | | 5 (3.3%) | 5 (4.4%) | |

¹ Statistics presented: n (%)
² Statistical tests performed: chi-square test of independence; Fisher's exact test
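As a cross-check on the R_loop_mutation and R_loop_mut_damaging rows of the lax table above, here is a minimal Python sketch of the 2×2 chi-squared test. It is not the code that produced the report. It assumes the test was run with Yates' continuity correction, which is the default of R's `chisq.test` for 2×2 tables, and it rebuilds the counts from the table (mutated tumors and group sizes N = 153 and N = 114). The function name `chi2_yates_2x2` is made up for this illustration.

```python
import math

def chi2_yates_2x2(table):
    """Pearson chi-squared test with Yates' continuity correction (2x2 only).

    Returns (statistic, p_value). With 1 degree of freedom the chi-squared
    survival function reduces to erfc(sqrt(x / 2)), so no SciPy is needed.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = rows[i] * cols[j] / n
            diff = abs(observed - expected)
            # The correction is capped so it never exceeds the deviation itself
            diff -= min(0.5, diff)
            stat += diff * diff / expected
    return stat, math.erfc(math.sqrt(stat / 2.0))

# Rows: Hp FALSE, Hp TRUE (lax calls). Columns: mutated, not mutated.
r_loop_any = [[42, 153 - 42], [27, 114 - 27]]
r_loop_damaging = [[25, 153 - 25], [23, 114 - 23]]

for name, tab in [("R_loop_mutation", r_loop_any),
                  ("R_loop_mut_damaging", r_loop_damaging)]:
    stat, p = chi2_yates_2x2(tab)
    print(f"{name}: chi2 = {stat:.3f}, p = {p:.2f}")
```

Rounded to one decimal place, the two p-values should agree with the 0.6 and 0.5 shown in the table. Without the continuity correction the first p-value would round to 0.5 instead, so the correction matters when comparing against the table.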
Summary table split by Hp detection with stringent filtering: at least two technologies (RNAseq, WXS or WGS) need to have at least one read aligned to the Hp genome. In the R_loop_mutation row the p-value for the chi-squared test is >0.9, so there is no enrichment. For damaging R-loop mutations the p-value is also not significant (p = 0.7).
## There was an error in 'add_p()' for variable 'TNM.Stage' and test 'fisher.test', p-value omitted:
## Error in stats::fisher.test(data[[variable]], as.factor(data[[by]])): FEXACT error 7(location). LDSTP=18600 is too small for this problem,
## (pastp=38.9765, ipn_0:=ipoin[itp=70]=2395, stp[ipn_0]=38.2125).
## Increase workspace or consider using 'simulate.p.value=TRUE'
| Characteristic¹ | N | FALSE, N = 214 | TRUE, N = 53 | p-value² |
|---|---|---|---|---|
| Hp_total_lax | 267 | 61 (29%) | 53 (100%) | <0.001 |
| R_loop_mutation | 267 | 55 (26%) | 14 (26%) | >0.9 |
| R_loop_mut_damaging | 267 | 37 (17%) | 11 (21%) | 0.7 |
| Subtype | 267 | | | 0.7 |
| CIN | | 111 (52%) | 25 (47%) | |
| EBV | | 21 (9.8%) | 4 (7.5%) | |
| GS | | 40 (19%) | 10 (19%) | |
| MSI | | 42 (20%) | 14 (26%) | |
| TP53.mutation | 267 | | | 0.2 |
| 0 | | 104 (49%) | 33 (62%) | |
| 1 | | 108 (50%) | 20 (38%) | |
| NA | | 2 (0.9%) | 0 (0%) | |
| PIK3CA.mutation | 267 | | | 0.6 |
| 0 | | 172 (80%) | 40 (75%) | |
| 1 | | 40 (19%) | 13 (25%) | |
| NA | | 2 (0.9%) | 0 (0%) | |
| KRAS.mutation | 267 | | | 0.2 |
| 0 | | 196 (92%) | 45 (85%) | |
| 1 | | 16 (7.5%) | 8 (15%) | |
| NA | | 2 (0.9%) | 0 (0%) | |
| MSI.status | 267 | | | 0.2 |
| MSI-H | | 42 (20%) | 14 (26%) | |
| MSI-L | | 32 (15%) | 11 (21%) | |
| MSS | | 140 (65%) | 28 (53%) | |
| Hypermutated | 267 | | | 0.5 |
| 0 | | 171 (80%) | 39 (74%) | |
| 1 | | 41 (19%) | 14 (26%) | |
| NA | | 2 (0.9%) | 0 (0%) | |
| TNM.Stage | 267 | | | |
| Stage_IA | | 4 (1.9%) | 4 (7.5%) | |
| Stage_IB | | 14 (6.5%) | 5 (9.4%) | |
| Stage_IIA | | 41 (19%) | 12 (23%) | |
| Stage_IIB | | 45 (21%) | 10 (19%) | |
| Stage_IIIA | | 31 (14%) | 4 (7.5%) | |
| Stage_IIIB | | 45 (21%) | 6 (11%) | |
| Stage_IIIC | | 12 (5.6%) | 1 (1.9%) | |
| Stage_IV | | 13 (6.1%) | 5 (9.4%) | |
| X | | 9 (4.2%) | 6 (11%) | |
| Lauren.Class | 267 | | | 0.2 |
| Diffuse | | 50 (23%) | 11 (21%) | |
| Intestinal | | 141 (66%) | 37 (70%) | |
| Mixed | | 17 (7.9%) | 1 (1.9%) | |
| NA | | 6 (2.8%) | 4 (7.5%) | |

¹ Statistics presented: n (%)
² Statistical tests performed: chi-square test of independence; Fisher's exact test
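The TNM.Stage p-value is missing from both tables because the exact Fisher computation ran out of workspace (the FEXACT error above). The error message itself suggests the two remedies that R's `fisher.test` offers: a larger `workspace`, or `simulate.p.value = TRUE`. The sketch below illustrates the second idea in plain Python, using the TNM.Stage counts from the stringent table. It is an independent illustration under stated assumptions, not the report's code: the function names, the 2000 simulations and the seed are arbitrary choices, and the resulting p-value is not part of the original analysis.

```python
import math
import random

def log_table_weight(cells):
    """-sum(log n_ij!); with fixed margins this orders tables by probability."""
    return -sum(math.lgamma(n + 1) for n in cells)

def monte_carlo_fisher(counts_a, counts_b, n_sim=2000, seed=1):
    """Monte Carlo p-value for independence in a 2 x k table.

    counts_a, counts_b: per-category counts for the two groups.
    Shuffling which tumors belong to group A keeps both margins fixed; the
    p-value is the share of shuffled tables that are no more probable than
    the observed one, with the usual +1 in numerator and denominator.
    """
    k = len(counts_a)
    # One category index per tumor, group A first
    categories = [j for j, n in enumerate(counts_a) for _ in range(n)]
    categories += [j for j, n in enumerate(counts_b) for _ in range(n)]
    size_a = sum(counts_a)
    totals = [a + b for a, b in zip(counts_a, counts_b)]

    observed = log_table_weight(list(counts_a) + list(counts_b))
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        rng.shuffle(categories)
        sim_a = [0] * k
        for j in categories[:size_a]:
            sim_a[j] += 1
        sim_b = [t - a for t, a in zip(totals, sim_a)]
        # Small tolerance so that ties with the observed table are counted
        if log_table_weight(sim_a + sim_b) <= observed + 1e-9:
            hits += 1
    return (hits + 1) / (n_sim + 1)

# TNM.Stage counts (Stage_IA ... Stage_IV, X) from the stringent table
hp_false = [4, 14, 41, 45, 31, 45, 12, 13, 9]   # N = 214
hp_true = [4, 5, 12, 10, 4, 6, 1, 5, 6]         # N = 53

p_tnm = monte_carlo_fisher(hp_false, hp_true)
print(f"TNM.Stage, simulated p = {p_tnm:.3f}")
```

A simulated p-value varies slightly with the seed and the number of simulations, so it is best reported together with both.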
To check whether there is a subtype-specific effect on R-loop mutation and Hp infection, we can run the same tests of Hp (lax calls) vs R-loop mutation after subsetting by subtype. Again, there is no significant enrichment for R-loop mutations in Hp-positive tumors in any particular subtype.
## There was an error in 'add_p()' for variable 'R_loop_mut_damaging' and test 'chisq.test', p-value omitted:
## Error in stats::chisq.test(data[[variable]], as.factor(data[[by]])): 'x' and 'y' must have at least 2 levels
## There was an error in 'add_p()' for variable 'MSI.status' and test 'chisq.test', p-value omitted:
## Error in stats::chisq.test(data[[variable]], as.factor(data[[by]])): 'x' and 'y' must have at least 2 levels
## Styling functions `bold_labels()`, `bold_levels()`, `italicize_labels()`, and `italicize_levels()` need to be re-applied after `tbl_merge()`.
| Characteristic | CIN: N | CIN: FALSE, N = 79¹ | CIN: TRUE, N = 57¹ | CIN: p-value² | EBV: N | EBV: FALSE, N = 13¹ | EBV: TRUE, N = 12¹ | EBV: p-value³ | GS: N | GS: FALSE, N = 28¹ | GS: TRUE, N = 22¹ | GS: p-value³ | MSI: N | MSI: FALSE, N = 33¹ | MSI: TRUE, N = 23¹ | MSI: p-value² |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Hp_total_str | 136 | 0 (0%) | 25 (44%) | <0.001 | 25 | 0 (0%) | 4 (33%) | 0.039 | 50 | 0 (0%) | 10 (45%) | <0.001 | 56 | 0 (0%) | 14 (61%) | <0.001 |
| R_loop_mutation | 136 | 11 (14%) | 7 (12%) | >0.9 | 25 | 3 (23%) | 0 (0%) | 0.2 | 50 | 3 (11%) | 2 (9.1%) | >0.9 | 56 | 25 (76%) | 18 (78%) | >0.9 |
| R_loop_mut_damaging | 136 | 6 (7.6%) | 5 (8.8%) | >0.9 | 25 | 0 (0%) | 0 (0%) | | 50 | 0 (0%) | 2 (9.1%) | 0.2 | 56 | 19 (58%) | 16 (70%) | 0.5 |
| MSI.status | 136 | | | >0.9 | 25 | | | 0.2 | 50 | | | >0.9 | 56 | | | |
| MSI-L | | 18 (23%) | 14 (25%) | | | 3 (23%) | 0 (0%) | | | 5 (18%) | 3 (14%) | | | | | |
| MSS | | 61 (77%) | 43 (75%) | | | 10 (77%) | 12 (100%) | | | 23 (82%) | 19 (86%) | | | | | |
| MSI-H | | | | | | | | | | | | | | 33 (100%) | 23 (100%) | |
| Hypermutated | 136 | | | 0.7 | 25 | | | 0.2 | 50 | | | 0.5 | 56 | | | 0.3 |
| 0 | | 76 (96%) | 54 (95%) | | | 10 (77%) | 12 (100%) | | | 26 (93%) | 22 (100%) | | | 4 (12%) | 6 (26%) | |
| 1 | | 3 (3.8%) | 3 (5.3%) | | | 3 (23%) | 0 (0%) | | | | | | | 29 (88%) | 17 (74%) | |
| NA | | | | | | | | | | 2 (7.1%) | 0 (0%) | | | | | |

¹ Statistics presented: n (%)
² Statistical tests performed: chi-square test of independence; Fisher's exact test
³ Statistical tests performed: Fisher's exact test
Comparing mutated genes between the 114 Hp-positive and 153 Hp-negative tumors. A gene that is mutated more than once is counted only once. Fisher's exact test is used to assess whether the proportion of tumors with a mutation in a given gene differs between the two groups. 232 genes have p < 0.05.
## Gene list complete
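To make the per-gene comparison concrete, here is a minimal Python sketch of a two-sided Fisher's exact test on one gene's 2×2 table (mutated vs not mutated, Hp-positive vs Hp-negative). The function name and the example counts are hypothetical and are not taken from the data. The two-sided rule used here (sum the probabilities of all tables no more likely than the observed one) is the one R's `fisher.test` uses for 2×2 tables.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Relative tolerance so that ties with the observed table are included
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)

# Hypothetical gene: mutated in 20 of 114 Hp-positive and 10 of 153 Hp-negative tumors
mutated_pos, n_pos = 20, 114
mutated_neg, n_neg = 10, 153
p_gene = fisher_exact_2x2(mutated_pos, n_pos - mutated_pos,
                          mutated_neg, n_neg - mutated_neg)
print(f"p = {p_gene:.4f}")
```

Running this once per gene, with each tumor contributing at most one count per gene, would give the kind of p-value list that the 232 genes were selected from.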
I plotted the genes with p < 0.05 in a heatmap. There could be a mutation pattern in MSI-subtype patients with Hp.
We can run the same Fisher test within each subtype. The EBV, MSI and CIN subtypes had significant genes: 2 in EBV, 181 in MSI and 39 in CIN.
| Characteristic | CIN: N | CIN: FALSE, N = 79¹ | CIN: TRUE, N = 57¹ | CIN: p-value² | EBV: N | EBV: FALSE, N = 13¹ | EBV: TRUE, N = 12¹ | EBV: p-value² | GS: N | GS: FALSE, N = 28¹ | GS: TRUE, N = 22¹ | GS: p-value² | MSI: N | MSI: FALSE, N = 33¹ | MSI: TRUE, N = 23¹ | MSI: p-value² |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TMB | 136 | 1.54 (0.98) | 2.00 (1.53) | 0.052 | 25 | 3.00 (3.73) | 1.08 (0.28) | 0.088 | 50 | 3.77 (14.27) | 0.81 (0.68) | 0.3 | 56 | 16 (7) | 20 (16) | 0.2 |

¹ Statistics presented: mean (SD)
² Statistical tests performed: t-test