Hp gene detection and R-loop (lax filtering)

Summary table split by Hp detection with lax filtering: only one technology (RNAseq, WXS or WGS) needs to have at least one read aligned to the Hp genome. A simple test for R-loop enrichment in Hp-infected tumors is to test the contingency table of Hp status against R-loop mutation status. In the R_loop_mutation row the p-value for the chi-squared test is 0.6, so there is no enrichment. I also filtered the mutations, keeping only SNVs with a predicted deleterious or damaging effect, since a mutation in a gene can have a benign effect. This reduced the number of tumors with an R-loop mutation from 69 to 48, but the p-value is still not significant (p = 0.5).

## There was an error in 'add_p()' for variable 'TNM.Stage' and test 'fisher.test', p-value omitted:
## Error in stats::fisher.test(data[[variable]], as.factor(data[[by]])): FEXACT error 7(location). LDSTP=18600 is too small for this problem,
##   (pastp=62.9435, ipn_0:=ipoin[itp=221]=21, stp[ipn_0]=60.5559).
## Increase workspace or consider using 'simulate.p.value=TRUE'
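| Characteristic¹ | N | FALSE, N = 153 | TRUE, N = 114 | p-value² |
|---|---|---|---|---|
| Hp_total_str | 267 | 0 (0%) | 53 (46%) | <0.001 |
| R_loop_mutation | 267 | 42 (27%) | 27 (24%) | 0.6 |
| R_loop_mut_damaging | 267 | 25 (16%) | 23 (20%) | 0.5 |
| Subtype | 267 | | | >0.9 |
| CIN | | 79 (52%) | 57 (50%) | |
| EBV | | 13 (8.5%) | 12 (11%) | |
| GS | | 28 (18%) | 22 (19%) | |
| MSI | | 33 (22%) | 23 (20%) | |
| TP53.mutation | 267 | | | 0.4 |
| 0 | | 75 (49%) | 62 (54%) | |
| 1 | | 76 (50%) | 52 (46%) | |
| NA | | 2 (1.3%) | 0 (0%) | |
| PIK3CA.mutation | 267 | | | 0.2 |
| 0 | | 125 (82%) | 87 (76%) | |
| 1 | | 26 (17%) | 27 (24%) | |
| NA | | 2 (1.3%) | 0 (0%) | |
| KRAS.mutation | 267 | | | 0.5 |
| 0 | | 139 (91%) | 102 (89%) | |
| 1 | | 12 (7.8%) | 12 (11%) | |
| NA | | 2 (1.3%) | 0 (0%) | |
| MSI.status | 267 | | | 0.8 |
| MSI-H | | 33 (22%) | 23 (20%) | |
| MSI-L | | 26 (17%) | 17 (15%) | |
| MSS | | 94 (61%) | 74 (65%) | |
| Hypermutated | 267 | | | 0.3 |
| 0 | | 116 (76%) | 94 (82%) | |
| 1 | | 35 (23%) | 20 (18%) | |
| NA | | 2 (1.3%) | 0 (0%) | |
| TNM.Stage | 267 | | | |
| Stage_IA | | 2 (1.3%) | 6 (5.3%) | |
| Stage_IB | | 11 (7.2%) | 8 (7.0%) | |
| Stage_IIA | | 30 (20%) | 23 (20%) | |
| Stage_IIB | | 35 (23%) | 20 (18%) | |
| Stage_IIIA | | 22 (14%) | 13 (11%) | |
| Stage_IIIB | | 30 (20%) | 21 (18%) | |
| Stage_IIIC | | 9 (5.9%) | 4 (3.5%) | |
| Stage_IV | | 8 (5.2%) | 10 (8.8%) | |
| X | | 6 (3.9%) | 9 (7.9%) | |
| Lauren.Class | 267 | | | 0.2 |
| Diffuse | | 37 (24%) | 24 (21%) | |
| Intestinal | | 97 (63%) | 81 (71%) | |
| Mixed | | 14 (9.2%) | 4 (3.5%) | |
| NA | | 5 (3.3%) | 5 (4.4%) | |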

1 Statistics presented: n (%)

2 Statistical tests performed: chi-square test of independence; Fisher's exact test
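
As a cross-check of the Hp vs R-loop comparison, here is a minimal Python sketch of a chi-squared test on a 2x2 contingency table, using the R_loop_mutation and R_loop_mut_damaging counts from the lax-filtering table. It assumes Yates' continuity correction (the default of R's `chisq.test` for 2x2 tables); the function name `chi2_yates_2x2` is made up for this illustration and is not part of the original analysis.

```python
import math

def chi2_yates_2x2(a, b, c, d):
    """Chi-squared test with Yates' correction for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row = (a + b, c + d)
    col = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            # Yates: shrink |O - E| by 0.5, but never below zero
            diff = max(abs(observed[i][j] - expected) - 0.5, 0.0)
            stat += diff * diff / expected
    # Survival function of the chi-squared distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Rows: Hp negative (N = 153), Hp positive (N = 114)
# Columns: with R-loop mutation, without R-loop mutation
stat, p = chi2_yates_2x2(42, 153 - 42, 27, 114 - 27)
print(f"R_loop_mutation:     chi2 = {stat:.3f}, p = {p:.2f}")

# Same comparison restricted to predicted damaging mutations
stat_dmg, p_dmg = chi2_yates_2x2(25, 153 - 25, 23, 114 - 23)
print(f"R_loop_mut_damaging: chi2 = {stat_dmg:.3f}, p = {p_dmg:.2f}")
```

Both p-values should round to the values shown in the table, which is a quick way to confirm that the rows really are plain 2x2 tests of Hp status against mutation status.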

Hp gene detection and R-loop (stringent filtering)

Summary table split by Hp detection with stringent filtering: at least two technologies (RNAseq, WXS or WGS) need to have at least one read aligned to the Hp genome. In the R_loop_mutation row the p-value for the chi-squared test is >0.9, so there is no enrichment. For damaging R-loop mutations the p-value is also not significant (p = 0.7).

## There was an error in 'add_p()' for variable 'TNM.Stage' and test 'fisher.test', p-value omitted:
## Error in stats::fisher.test(data[[variable]], as.factor(data[[by]])): FEXACT error 7(location). LDSTP=18600 is too small for this problem,
##   (pastp=38.9765, ipn_0:=ipoin[itp=70]=2395, stp[ipn_0]=38.2125).
## Increase workspace or consider using 'simulate.p.value=TRUE'
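| Characteristic¹ | N | FALSE, N = 214 | TRUE, N = 53 | p-value² |
|---|---|---|---|---|
| Hp_total_lax | 267 | 61 (29%) | 53 (100%) | <0.001 |
| R_loop_mutation | 267 | 55 (26%) | 14 (26%) | >0.9 |
| R_loop_mut_damaging | 267 | 37 (17%) | 11 (21%) | 0.7 |
| Subtype | 267 | | | 0.7 |
| CIN | | 111 (52%) | 25 (47%) | |
| EBV | | 21 (9.8%) | 4 (7.5%) | |
| GS | | 40 (19%) | 10 (19%) | |
| MSI | | 42 (20%) | 14 (26%) | |
| TP53.mutation | 267 | | | 0.2 |
| 0 | | 104 (49%) | 33 (62%) | |
| 1 | | 108 (50%) | 20 (38%) | |
| NA | | 2 (0.9%) | 0 (0%) | |
| PIK3CA.mutation | 267 | | | 0.6 |
| 0 | | 172 (80%) | 40 (75%) | |
| 1 | | 40 (19%) | 13 (25%) | |
| NA | | 2 (0.9%) | 0 (0%) | |
| KRAS.mutation | 267 | | | 0.2 |
| 0 | | 196 (92%) | 45 (85%) | |
| 1 | | 16 (7.5%) | 8 (15%) | |
| NA | | 2 (0.9%) | 0 (0%) | |
| MSI.status | 267 | | | 0.2 |
| MSI-H | | 42 (20%) | 14 (26%) | |
| MSI-L | | 32 (15%) | 11 (21%) | |
| MSS | | 140 (65%) | 28 (53%) | |
| Hypermutated | 267 | | | 0.5 |
| 0 | | 171 (80%) | 39 (74%) | |
| 1 | | 41 (19%) | 14 (26%) | |
| NA | | 2 (0.9%) | 0 (0%) | |
| TNM.Stage | 267 | | | |
| Stage_IA | | 4 (1.9%) | 4 (7.5%) | |
| Stage_IB | | 14 (6.5%) | 5 (9.4%) | |
| Stage_IIA | | 41 (19%) | 12 (23%) | |
| Stage_IIB | | 45 (21%) | 10 (19%) | |
| Stage_IIIA | | 31 (14%) | 4 (7.5%) | |
| Stage_IIIB | | 45 (21%) | 6 (11%) | |
| Stage_IIIC | | 12 (5.6%) | 1 (1.9%) | |
| Stage_IV | | 13 (6.1%) | 5 (9.4%) | |
| X | | 9 (4.2%) | 6 (11%) | |
| Lauren.Class | 267 | | | 0.2 |
| Diffuse | | 50 (23%) | 11 (21%) | |
| Intestinal | | 141 (66%) | 37 (70%) | |
| Mixed | | 17 (7.9%) | 1 (1.9%) | |
| NA | | 6 (2.8%) | 4 (7.5%) | |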

1 Statistics presented: n (%)

2 Statistical tests performed: chi-square test of independence; Fisher's exact test
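
The TNM.Stage p-value is missing because the exact Fisher computation ran out of workspace, and the error message itself suggests a simulated p-value instead. The sketch below illustrates that idea in Python on the TNM.Stage counts from the stringent-filtering table. Note that it is a plain permutation test on the Pearson chi-squared statistic, which is in the same spirit as, but not identical to, what R's `fisher.test(..., simulate.p.value = TRUE)` computes. The function names, the number of permutations and the seed are arbitrary choices for this illustration.

```python
import random

def chi2_stat(table):
    """Pearson chi-squared statistic for an r x c table given as a list of rows."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, r in enumerate(table):
        for j, obs in enumerate(r):
            exp = row_tot[i] * col_tot[j] / n
            stat += (obs - exp) ** 2 / exp
    return stat

def permutation_p_value(table, n_perm=2000, seed=1):
    """Monte Carlo p-value: shuffle column labels across patients, margins stay fixed."""
    rng = random.Random(seed)
    # Expand the table into one (row label, column label) pair per patient
    rows, cols = [], []
    for i, r in enumerate(table):
        for j, count in enumerate(r):
            rows.extend([i] * count)
            cols.extend([j] * count)
    observed = chi2_stat(table)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(cols)  # breaks any association between Hp status and stage
        perm = [[0] * len(table[0]) for _ in table]
        for i, j in zip(rows, cols):
            perm[i][j] += 1
        if chi2_stat(perm) >= observed - 1e-9:
            hits += 1
    # Add-one correction so the estimate is never exactly zero
    return (hits + 1) / (n_perm + 1)

# Rows: Hp negative (N = 214), Hp positive (N = 53)
# Columns: Stage_IA, IB, IIA, IIB, IIIA, IIIB, IIIC, IV, X
tnm_stage = [
    [4, 14, 41, 45, 31, 45, 12, 13, 9],
    [4, 5, 12, 10, 4, 6, 1, 5, 6],
]
p_stage = permutation_p_value(tnm_stage)
print(f"TNM.Stage permutation p-value: {p_stage:.3f}")
```

Because the estimate is random, it changes slightly with the seed and with `n_perm`; increasing `n_perm` narrows that spread at the cost of run time.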

Hp status and R-loop mutation by Subtype

To check for a subtype-specific effect on R-loop mutation and Hp infection, we can repeat the Hp vs R-loop tests after subsetting by subtype. Again, there is no significant enrichment for R-loop mutations in Hp positive tumors in any particular subtype.

## There was an error in 'add_p()' for variable 'R_loop_mut_damaging' and test 'chisq.test', p-value omitted:
## Error in stats::chisq.test(data[[variable]], as.factor(data[[by]])): 'x' and 'y' must have at least 2 levels
## There was an error in 'add_p()' for variable 'MSI.status' and test 'chisq.test', p-value omitted:
## Error in stats::chisq.test(data[[variable]], as.factor(data[[by]])): 'x' and 'y' must have at least 2 levels
## Styling functions `bold_labels()`, `bold_levels()`, `italicize_labels()`, and `italicize_levels()` need to be re-applied after `tbl_merge()`.
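| Characteristic | CIN: N | CIN: FALSE, N = 79¹ | CIN: TRUE, N = 57¹ | CIN: p-value² | EBV: N | EBV: FALSE, N = 13¹ | EBV: TRUE, N = 12¹ | EBV: p-value³ | GS: N | GS: FALSE, N = 28¹ | GS: TRUE, N = 22¹ | GS: p-value³ | MSI: N | MSI: FALSE, N = 33¹ | MSI: TRUE, N = 23¹ | MSI: p-value² |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Hp_total_str | 136 | 0 (0%) | 25 (44%) | <0.001 | 25 | 0 (0%) | 4 (33%) | 0.039 | 50 | 0 (0%) | 10 (45%) | <0.001 | 56 | 0 (0%) | 14 (61%) | <0.001 |
| R_loop_mutation | 136 | 11 (14%) | 7 (12%) | >0.9 | 25 | 3 (23%) | 0 (0%) | 0.2 | 50 | 3 (11%) | 2 (9.1%) | >0.9 | 56 | 25 (76%) | 18 (78%) | >0.9 |
| R_loop_mut_damaging | 136 | 6 (7.6%) | 5 (8.8%) | >0.9 | 25 | 0 (0%) | 0 (0%) | | 50 | 0 (0%) | 2 (9.1%) | 0.2 | 56 | 19 (58%) | 16 (70%) | 0.5 |
| MSI.status | 136 | | | >0.9 | 25 | | | 0.2 | 50 | | | >0.9 | 56 | | | |
| MSI-L | | 18 (23%) | 14 (25%) | | | 3 (23%) | 0 (0%) | | | 5 (18%) | 3 (14%) | | | | | |
| MSS | | 61 (77%) | 43 (75%) | | | 10 (77%) | 12 (100%) | | | 23 (82%) | 19 (86%) | | | | | |
| MSI-H | | | | | | | | | | | | | | 33 (100%) | 23 (100%) | |
| Hypermutated | 136 | | | 0.7 | 25 | | | 0.2 | 50 | | | 0.5 | 56 | | | 0.3 |
| 0 | | 76 (96%) | 54 (95%) | | | 10 (77%) | 12 (100%) | | | 26 (93%) | 22 (100%) | | | 4 (12%) | 6 (26%) | |
| 1 | | 3 (3.8%) | 3 (5.3%) | | | 3 (23%) | 0 (0%) | | | | | | | 29 (88%) | 17 (74%) | |
| NA | | | | | | | | | | 2 (7.1%) | 0 (0%) | | | | | |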

1 Statistics presented: n (%)

2 Statistical tests performed: chi-square test of independence; Fisher's exact test

3 Statistical tests performed: Fisher's exact test

Analyze mutation patterns of Hp positive and negative patients

Comparing mutated genes between 114 Hp positive and 153 Hp negative patients. If a gene is mutated more than once in a patient, it is counted only once. Fisher's exact test is used to assess whether the proportion of patients with a mutation in a given gene differs between the two groups. 232 genes have p < 0.05.
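
To make the counting rule concrete, here is a small Python sketch of how the per-gene 2x2 tables could be built so that a gene mutated several times in the same patient is counted once. The patient IDs, the toy mutation records and the function name `gene_tables` are all hypothetical; each resulting table would then go into Fisher's exact test.

```python
from collections import defaultdict

# Hypothetical toy input: one (patient, gene) record per called mutation
mutations = [
    ("P1", "TP53"), ("P1", "TP53"), ("P1", "KRAS"),
    ("P2", "TP53"), ("P3", "PIK3CA"), ("P4", "KRAS"), ("P4", "KRAS"),
]
# Hypothetical Hp status per patient (True = Hp positive)
hp_status = {"P1": True, "P2": True, "P3": False, "P4": False, "P5": False}

def gene_tables(mutations, hp_status):
    """Return {gene: ((mut Hp+, wt Hp+), (mut Hp-, wt Hp-))} counted per patient."""
    patients_by_gene = defaultdict(set)
    for patient, gene in mutations:
        # A set keeps each patient once per gene, however many mutations they have
        patients_by_gene[gene].add(patient)
    n_pos = sum(hp_status.values())
    n_neg = len(hp_status) - n_pos
    tables = {}
    for gene, patients in patients_by_gene.items():
        mut_pos = sum(1 for p in patients if hp_status[p])
        mut_neg = len(patients) - mut_pos
        tables[gene] = ((mut_pos, n_pos - mut_pos), (mut_neg, n_neg - mut_neg))
    return tables

tables = gene_tables(mutations, hp_status)
for gene, table in sorted(tables.items()):
    print(gene, table)
```

Patients without any mutation record (P5 here) still count toward the group sizes, so the unmutated cells stay correct.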

## Gene list complete

There are some genes that have different mutation frequencies between the two groups

I plotted the genes that have p < 0.05 in a heatmap. There could be a mutation pattern in MSI subtype patients with Hp.

Looking at subtype-specific mutation patterns between Hp positive and negative patients

We can do the same Fisher test within each subtype. The EBV, MSI and CIN subtypes had significant genes: 2 in EBV, 181 in MSI and 39 in CIN.

EBV table

MSI table

Heatmap just for MSI subtype with significant mutation frequencies between Hp positive and negative

CIN table

Heatmap just for MSI subtype with significant mutation frequencies between Hp positive and negative

Within CIN, the difference in average TMB between Hp positive and negative patients is no longer significant, and the same holds for MSI.

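| Characteristic | CIN: N | CIN: FALSE, N = 79¹ | CIN: TRUE, N = 57¹ | CIN: p-value² | EBV: N | EBV: FALSE, N = 13¹ | EBV: TRUE, N = 12¹ | EBV: p-value² | GS: N | GS: FALSE, N = 28¹ | GS: TRUE, N = 22¹ | GS: p-value² | MSI: N | MSI: FALSE, N = 33¹ | MSI: TRUE, N = 23¹ | MSI: p-value² |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TMB | 136 | 1.54 (0.98) | 2.00 (1.53) | 0.052 | 25 | 3.00 (3.73) | 1.08 (0.28) | 0.088 | 50 | 3.77 (14.27) | 0.81 (0.68) | 0.3 | 56 | 16 (7) | 20 (16) | 0.2 |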

1 Statistics presented: mean (SD)

2 Statistical tests performed: t-test
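
The TMB comparison can be roughly re-derived from the summary statistics alone. The Python sketch below runs a two-sample t-test from the means, SDs and group sizes of the CIN column, assuming the Welch (unequal-variance) form, which is the default of R's `t.test`. The helper names are made up for this illustration, and since the inputs are the rounded values printed in the table, the result should land close to, though not necessarily exactly on, the reported 0.052.

```python
import math

def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value of Student's t, by Simpson integration of the density."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central = total * h / 3  # P(0 <= T <= |t|)
    return max(0.0, 1.0 - 2.0 * central)

def welch_from_summary(m1, s1, n1, m2, s2, n2):
    """Welch t statistic, Welch-Satterthwaite df and two-sided p from summaries."""
    v1, v2 = s1 * s1 / n1, s2 * s2 / n2
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 * v1 / (n1 - 1) + v2 * v2 / (n2 - 1))
    return t, df, t_two_sided_p(t, df)

# CIN: Hp positive mean 2.00 (SD 1.53, N = 57) vs Hp negative mean 1.54 (SD 0.98, N = 79)
t_cin, df_cin, p_cin = welch_from_summary(2.00, 1.53, 57, 1.54, 0.98, 79)
print(f"CIN TMB: t = {t_cin:.3f}, df = {df_cin:.1f}, p = {p_cin:.3f}")
```

With SDs as large as the means (for example 3.77 (14.27) in GS), TMB is clearly skewed, so the t-test p-values here are best read as a rough guide.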

CAG pathogenicity