Hp gene detection and R-loop (lax filtering)

Summary table split by Hp detection with lax filtering: only one technology (RNAseq, WXS or WGS) needs to have at least one read aligned to the Hp genome. A simple test for R-loop enrichment in Hp-infected tumors is to compare the 2×2 contingency table of Hp status against R-loop mutation status. In the R_loop_mutation row the p-value for the chi-squared test is 0.7, so there is no enrichment. I also filtered the mutations to keep only SNVs with a predicted deleterious or damaging effect, since a mutation can fall in an R-loop gene and still be benign. This reduced the number of tumors carrying an R-loop mutation from 69 to 48, but the p-value is still not significant (p = 0.5).
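As a sanity check on the contingency-table comparison, here is a minimal sketch of a Pearson chi-squared test on a 2×2 table, written with the Python standard library only. It assumes Yates' continuity correction (the default of R's `chisq.test` for 2×2 tables); the function name `chi2_yates_2x2` is made up for this illustration. The counts are taken from the R_loop_mutation and R_loop_mut_damaging rows of the lax-filtering table, and the resulting p-values should land close to the reported 0.7 and 0.5.

```python
import math

def chi2_yates_2x2(a, b, c, d):
    """Pearson chi-squared test with Yates' continuity correction for the
    2x2 table [[a, b], [c, d]]. Returns (statistic, p_value)."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            # Yates: shrink |O - E| by 0.5, but never below zero
            diff = max(abs(observed[i][j] - expected) - 0.5, 0.0)
            stat += diff * diff / expected
    # Survival function of the chi-squared distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Rows: Hp FALSE (N = 175), Hp TRUE (N = 92); columns: mutated, not mutated
stat_all, p_all = chi2_yates_2x2(47, 175 - 47, 22, 92 - 22)
stat_dmg, p_dmg = chi2_yates_2x2(29, 175 - 29, 19, 92 - 19)
print(f"any R-loop mutation: chi2 = {stat_all:.3f}, p = {p_all:.2f}")
print(f"damaging only:       chi2 = {stat_dmg:.3f}, p = {p_dmg:.2f}")
```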

## There was an error in 'add_p()' for variable 'TNM.Stage' and test 'fisher.test', p-value omitted:
## Error in stats::fisher.test(data[[variable]], as.factor(data[[by]])): FEXACT error 7(location). LDSTP=18600 is too small for this problem,
##   (pastp=85.2035, ipn_0:=ipoin[itp=416]=158, stp[ipn_0]=84.542).
## Increase workspace or consider using 'simulate.p.value=TRUE'
| Characteristic [1] | N | FALSE, N = 175 | TRUE, N = 92 | p-value [2] |
|---|---|---|---|---|
| Hp_total_str | 267 | 0 (0%) | 36 (39%) | <0.001 |
| R_loop_mutation | 267 | 47 (27%) | 22 (24%) | 0.7 |
| R_loop_mut_damaging | 267 | 29 (17%) | 19 (21%) | 0.5 |
| Subtype | 267 | | | >0.9 |
| CIN | | 88 (50%) | 48 (52%) | |
| EBV | | 18 (10%) | 7 (7.6%) | |
| GS | | 33 (19%) | 17 (18%) | |
| MSI | | 36 (21%) | 20 (22%) | |
| TP53.mutation | 267 | | | 0.8 |
| 0 | | 88 (50%) | 49 (53%) | |
| 1 | | 85 (49%) | 43 (47%) | |
| NA | | 2 (1.1%) | 0 (0%) | |
| PIK3CA.mutation | 267 | | | 0.5 |
| 0 | | 141 (81%) | 71 (77%) | |
| 1 | | 32 (18%) | 21 (23%) | |
| NA | | 2 (1.1%) | 0 (0%) | |
| KRAS.mutation | 267 | | | 0.2 |
| 0 | | 161 (92%) | 80 (87%) | |
| 1 | | 12 (6.9%) | 12 (13%) | |
| NA | | 2 (1.1%) | 0 (0%) | |
| MSI.status | 267 | | | >0.9 |
| MSI-H | | 36 (21%) | 20 (22%) | |
| MSI-L | | 28 (16%) | 15 (16%) | |
| MSS | | 111 (63%) | 57 (62%) | |
| Hypermutated | 267 | | | 0.7 |
| 0 | | 136 (78%) | 74 (80%) | |
| 1 | | 37 (21%) | 18 (20%) | |
| NA | | 2 (1.1%) | 0 (0%) | |
| TNM.Stage | 267 | | | |
| Stage_IA | | 4 (2.3%) | 4 (4.3%) | |
| Stage_IB | | 11 (6.3%) | 8 (8.7%) | |
| Stage_IIA | | 34 (19%) | 19 (21%) | |
| Stage_IIB | | 38 (22%) | 17 (18%) | |
| Stage_IIIA | | 22 (13%) | 13 (14%) | |
| Stage_IIIB | | 37 (21%) | 14 (15%) | |
| Stage_IIIC | | 10 (5.7%) | 3 (3.3%) | |
| Stage_IV | | 12 (6.9%) | 6 (6.5%) | |
| X | | 7 (4.0%) | 8 (8.7%) | |
| Lauren.Class | 267 | | | 0.3 |
| Diffuse | | 42 (24%) | 19 (21%) | |
| Intestinal | | 112 (64%) | 66 (72%) | |
| Mixed | | 15 (8.6%) | 3 (3.3%) | |
| NA | | 6 (3.4%) | 4 (4.3%) | |

[1] Statistics presented: n (%)

[2] Statistical tests performed: chi-square test of independence; Fisher's exact test
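The TNM.Stage p-value is missing because the exact Fisher computation ran out of workspace, and the error message itself suggests a simulated p-value instead. The sketch below illustrates that idea in standard-library Python. It is a stand-in rather than a reimplementation of R's `simulate.p.value`: it permutes the Hp labels (which keeps both table margins fixed) and uses the Pearson chi-squared statistic rather than the table probability. The function names, the number of simulations and the seed are arbitrary choices for this illustration; the counts come from the TNM.Stage rows of the lax-filtering table.

```python
import random
from collections import Counter

STAGES = ["IA", "IB", "IIA", "IIB", "IIIA", "IIIB", "IIIC", "IV", "X"]
HP_NEG = [4, 11, 34, 38, 22, 37, 10, 12, 7]   # Hp FALSE, N = 175
HP_POS = [4, 8, 19, 17, 13, 14, 3, 6, 8]      # Hp TRUE,  N = 92

def chi2_statistic(stage, hp):
    """Pearson chi-squared statistic for two parallel label lists."""
    n = len(stage)
    stage_tot = Counter(stage)
    hp_tot = Counter(hp)
    cells = Counter(zip(stage, hp))  # missing cells count as 0
    stat = 0.0
    for s, ns in stage_tot.items():
        for h, nh in hp_tot.items():
            expected = ns * nh / n
            stat += (cells[(s, h)] - expected) ** 2 / expected
    return stat

def monte_carlo_p(stage, hp, n_sim=2000, seed=1):
    """Permutation p-value: share of shuffles at least as extreme as observed."""
    rng = random.Random(seed)
    observed = chi2_statistic(stage, hp)
    shuffled = list(hp)
    hits = 0
    for _ in range(n_sim):
        rng.shuffle(shuffled)
        if chi2_statistic(stage, shuffled) >= observed - 1e-12:
            hits += 1
    # Add-one correction so the estimate is never exactly zero
    return (hits + 1) / (n_sim + 1)

# Expand the table counts into one (stage, Hp) pair per patient
stage, hp = [], []
for name, neg, pos in zip(STAGES, HP_NEG, HP_POS):
    stage += [name] * (neg + pos)
    hp += [False] * neg + [True] * pos

p = monte_carlo_p(stage, hp)
print(f"Monte Carlo p-value for TNM.Stage vs Hp: {p:.3f}")
```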

Hp gene detection and R-loop (stringent filtering)

Summary table split by Hp detection with stringent filtering: at least two technologies (RNAseq, WXS or WGS) need to have at least one read aligned to the Hp genome. In the R_loop_mutation row the p-value for the chi-squared test is 0.6, so there is no enrichment. For damaging R-loop mutations the p-value is also not significant (p = 0.3).
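The two detection rules differ only in how many technologies must support the call. A minimal sketch, assuming a per-tumor mapping from technology to the number of reads aligned to the Hp genome (the input layout, the function name `hp_calls` and the example read counts are hypothetical):

```python
TECHNOLOGIES = ("RNAseq", "WXS", "WGS")

def hp_calls(aligned_reads):
    """aligned_reads maps a technology name to the number of reads aligned
    to the Hp genome for one tumor (hypothetical input layout).
    Returns (lax_positive, stringent_positive)."""
    n_supporting = sum(
        1 for tech in TECHNOLOGIES if aligned_reads.get(tech, 0) >= 1
    )
    # Lax: at least one technology; stringent: at least two
    return n_supporting >= 1, n_supporting >= 2

# Made-up read counts for three tumors
examples = {
    "tumor_a": {"RNAseq": 0, "WXS": 0, "WGS": 0},
    "tumor_b": {"RNAseq": 3, "WXS": 0},
    "tumor_c": {"RNAseq": 1, "WXS": 0, "WGS": 12},
}
calls = {name: hp_calls(reads) for name, reads in examples.items()}
for name, (lax, stringent) in calls.items():
    print(name, "lax:", lax, "stringent:", stringent)
```

Every stringent-positive tumor is also lax-positive by construction, which matches the Hp_total_lax row of the stringent table (36 of 36).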

## There was an error in 'add_p()' for variable 'TNM.Stage' and test 'fisher.test', p-value omitted:
## Error in stats::fisher.test(data[[variable]], as.factor(data[[by]])): FEXACT error 7(location). LDSTP=18600 is too small for this problem,
##   (pastp=29.7503, ipn_0:=ipoin[itp=131]=2803, stp[ipn_0]=29.2407).
## Increase workspace or consider using 'simulate.p.value=TRUE'
| Characteristic [1] | N | FALSE, N = 231 | TRUE, N = 36 | p-value [2] |
|---|---|---|---|---|
| Hp_total_lax | 267 | 56 (24%) | 36 (100%) | <0.001 |
| R_loop_mutation | 267 | 58 (25%) | 11 (31%) | 0.6 |
| R_loop_mut_damaging | 267 | 39 (17%) | 9 (25%) | 0.3 |
| Subtype | 267 | | | 0.3 |
| CIN | | 121 (52%) | 15 (42%) | |
| EBV | | 23 (10.0%) | 2 (5.6%) | |
| GS | | 43 (19%) | 7 (19%) | |
| MSI | | 44 (19%) | 12 (33%) | |
| TP53.mutation | 267 | | | 0.12 |
| 0 | | 113 (49%) | 24 (67%) | |
| 1 | | 116 (50%) | 12 (33%) | |
| NA | | 2 (0.9%) | 0 (0%) | |
| PIK3CA.mutation | 267 | | | 0.4 |
| 0 | | 186 (81%) | 26 (72%) | |
| 1 | | 43 (19%) | 10 (28%) | |
| NA | | 2 (0.9%) | 0 (0%) | |
| KRAS.mutation | 267 | | | 0.064 |
| 0 | | 212 (92%) | 29 (81%) | |
| 1 | | 17 (7.4%) | 7 (19%) | |
| NA | | 2 (0.9%) | 0 (0%) | |
| MSI.status | 267 | | | 0.13 |
| MSI-H | | 44 (19%) | 12 (33%) | |
| MSI-L | | 39 (17%) | 4 (11%) | |
| MSS | | 148 (64%) | 20 (56%) | |
| Hypermutated | 267 | | | 0.3 |
| 0 | | 185 (80%) | 25 (69%) | |
| 1 | | 44 (19%) | 11 (31%) | |
| NA | | 2 (0.9%) | 0 (0%) | |
| TNM.Stage | 267 | | | |
| Stage_IA | | 5 (2.2%) | 3 (8.3%) | |
| Stage_IB | | 17 (7.4%) | 2 (5.6%) | |
| Stage_IIA | | 44 (19%) | 9 (25%) | |
| Stage_IIB | | 49 (21%) | 6 (17%) | |
| Stage_IIIA | | 34 (15%) | 1 (2.8%) | |
| Stage_IIIB | | 47 (20%) | 4 (11%) | |
| Stage_IIIC | | 12 (5.2%) | 1 (2.8%) | |
| Stage_IV | | 16 (6.9%) | 2 (5.6%) | |
| X | | 7 (3.0%) | 8 (22%) | |
| Lauren.Class | 267 | | | 0.4 |
| Diffuse | | 53 (23%) | 8 (22%) | |
| Intestinal | | 154 (67%) | 24 (67%) | |
| Mixed | | 17 (7.4%) | 1 (2.8%) | |
| NA | | 7 (3.0%) | 3 (8.3%) | |

[1] Statistics presented: n (%)

[2] Statistical tests performed: chi-square test of independence; Fisher's exact test

Hp status and R-loop mutation by Subtype

To check for a subtype-specific association between R-loop mutation and Hp infection, we can run the same chi-squared or Fisher's exact tests on Hp vs R-loop after subsetting by subtype. Again, there is no significant enrichment for R-loop mutations in Hp-positive tumors within any subtype.

## There was an error in 'add_p()' for variable 'R_loop_mut_damaging' and test 'chisq.test', p-value omitted:
## Error in stats::chisq.test(data[[variable]], as.factor(data[[by]])): 'x' and 'y' must have at least 2 levels
## There was an error in 'add_p()' for variable 'MSI.status' and test 'chisq.test', p-value omitted:
## Error in stats::chisq.test(data[[variable]], as.factor(data[[by]])): 'x' and 'y' must have at least 2 levels
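| Characteristic | CIN: N | CIN: FALSE, N = 88 [1] | CIN: TRUE, N = 48 [1] | CIN: p-value [2] | EBV: N | EBV: FALSE, N = 18 [1] | EBV: TRUE, N = 7 [1] | EBV: p-value [3] | GS: N | GS: FALSE, N = 33 [1] | GS: TRUE, N = 17 [1] | GS: p-value [3] | MSI: N | MSI: FALSE, N = 36 [1] | MSI: TRUE, N = 20 [1] | MSI: p-value [4] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Hp_total_str | 136 | 0 (0%) | 15 (31%) | <0.001 | 25 | 0 (0%) | 2 (29%) | 0.070 | 50 | 0 (0%) | 7 (41%) | <0.001 | 56 | 0 (0%) | 12 (60%) | <0.001 |
| R_loop_mutation | 136 | 12 (14%) | 6 (12%) | >0.9 | 25 | 3 (17%) | 0 (0%) | 0.5 | 50 | 4 (12%) | 1 (5.9%) | 0.6 | 56 | 28 (78%) | 15 (75%) | >0.9 |
| R_loop_mut_damaging | 136 | 7 (8.0%) | 4 (8.3%) | >0.9 | 25 | 0 (0%) | 0 (0%) | | 50 | 1 (3.0%) | 1 (5.9%) | >0.9 | 56 | 21 (58%) | 14 (70%) | 0.6 |
| MSI.status | 136 | | | 0.6 | 25 | | | 0.5 | 50 | | | 0.7 | 56 | | | |
| MSI-L | | 19 (22%) | 13 (27%) | | | 3 (17%) | 0 (0%) | | | 6 (18%) | 2 (12%) | | | | | |
| MSS | | 69 (78%) | 35 (73%) | | | 15 (83%) | 7 (100%) | | | 27 (82%) | 15 (88%) | | | | | |
| MSI-H | | | | | | | | | | | | | | 36 (100%) | 20 (100%) | |
| Hypermutated | 136 | | | 0.7 | 25 | | | 0.5 | 50 | | | 0.5 | 56 | | | 0.5 |
| 0 | | 85 (97%) | 45 (94%) | | | 15 (83%) | 7 (100%) | | | 31 (94%) | 17 (100%) | | | 5 (14%) | 5 (25%) | |
| 1 | | 3 (3.4%) | 3 (6.2%) | | | 3 (17%) | 0 (0%) | | | | | | | 31 (86%) | 15 (75%) | |
| NA | | | | | | | | | | 2 (6.1%) | 0 (0%) | | | | | |

[1] Statistics presented: n (%)

[2] Statistical tests performed: chi-square test of independence; Fisher's exact test

[3] Statistical tests performed: Fisher's exact test

[4] Statistical tests performed: Fisher's exact test; chi-square test of independence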
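The per-subtype comparison of Hp status against R-loop mutation can be sketched as a loop over 2×2 tables. The example below uses a two-sided Fisher's exact test for every subtype, which is a simplification (the table above mixes chi-squared and Fisher's tests), so the CIN and MSI values need not match the table exactly. The function name `fisher_exact_2x2` is made up for this illustration; the counts are read off the R_loop_mutation row.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].
    Sums the probabilities of all tables (with the same margins) that are
    no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-7)   # small tolerance for float ties
    )
    return min(1.0, total)

# (mutated Hp FALSE, N Hp FALSE, mutated Hp TRUE, N Hp TRUE) per subtype
r_loop_counts = {
    "CIN": (12, 88, 6, 48),
    "EBV": (3, 18, 0, 7),
    "GS": (4, 33, 1, 17),
    "MSI": (28, 36, 15, 20),
}

p_by_subtype = {}
for subtype, (mut_neg, n_neg, mut_pos, n_pos) in r_loop_counts.items():
    p_by_subtype[subtype] = fisher_exact_2x2(
        mut_neg, n_neg - mut_neg, mut_pos, n_pos - mut_pos
    )
    print(f"{subtype}: p = {p_by_subtype[subtype]:.2f}")
```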

Oncoprint with everything together

I added a new track that marks the “damaging” R-loop mutations. As you can see, the R-loop mutations do not cluster with Hp presence.

## All mutation types: Missense_Mutation, Amplification,
## Frame_Shift_InDel, Deletion, Splice_Site, Nonsense_Mutation,
## In_Frame_InDel

Mutation patterns of Hp-positive and Hp-negative patients

Comparing mutated genes between 92 Hp-positive and 175 Hp-negative patients. A gene that is mutated more than once in the same patient is counted only once. Fisher's exact test is used to assess whether the proportion of patients with a mutated gene differs between the two groups. 443 genes have p < 0.05.
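The counting rule and the per-gene test can be sketched as follows, assuming the mutation calls are available as (patient, gene) pairs and Hp status as a per-patient boolean. The input layout, the function names and the toy data (patients P1..P6 and N1..N6, genes GENE_A and GENE_B) are all hypothetical. Collecting carriers in a set is what ensures that a gene mutated several times in one patient is counted once.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)
    probs = [
        comb(row1, x) * comb(row2, col1 - x) / denom
        for x in range(max(0, col1 - row2), min(row1, col1) + 1)
    ]
    p_obs = comb(row1, a) * comb(row2, col1 - a) / denom
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-7)))

def per_gene_tests(mutations, hp_status):
    """mutations: iterable of (patient_id, gene) pairs, repeats allowed.
    hp_status: dict mapping patient_id to True (Hp-positive) or False.
    Returns a dict mapping each gene to its Fisher p-value."""
    carriers = {}
    for patient, gene in mutations:
        if patient in hp_status:
            # A set counts each patient once per gene
            carriers.setdefault(gene, set()).add(patient)
    n_pos = sum(hp_status.values())
    n_neg = len(hp_status) - n_pos
    results = {}
    for gene, patients in carriers.items():
        mut_pos = sum(1 for p in patients if hp_status[p])
        mut_neg = len(patients) - mut_pos
        results[gene] = fisher_exact_2x2(
            mut_pos, n_pos - mut_pos, mut_neg, n_neg - mut_neg
        )
    return results

# Toy cohort: six Hp-positive (P*) and six Hp-negative (N*) patients
hp_status = {f"P{i}": True for i in range(1, 7)}
hp_status.update({f"N{i}": False for i in range(1, 7)})
mutations = [(f"P{i}", "GENE_A") for i in range(1, 7)]
mutations.append(("P1", "GENE_A"))  # second hit in P1, still counted once
mutations += [(p, "GENE_B") for p in ("P1", "P2", "P3", "N1", "N2", "N3")]

results = per_gene_tests(mutations, hp_status)
for gene, p in sorted(results.items()):
    print(f"{gene}: p = {p:.4f}")
```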

## Gene list complete

No significant NFkB genes

A blank cell means the gene was not mutated in either group

Some genes have different mutation frequencies between the two groups.

I plotted the genes with p < 0.05 in a heatmap. There could be a mutation pattern in MSI subtype patients with Hp.

Subtype-specific mutation patterns between Hp-positive and Hp-negative patients

We can run the same Fisher's exact test within each subtype. Only the MSI and CIN subtypes had significant genes: 410 in MSI and 92 in CIN.

MSI table

No significant NFkB genes among the MSI Hp mutated genes

Heatmap just for MSI subtype with significant mutation frequencies between Hp positive and negative

CIN table

No significant NFkB genes among the CIN Hp mutated genes

Heatmap just for CIN subtype with significant mutation frequencies between Hp positive and negative

Interestingly, for CIN the difference in mean TMB between Hp-positive and Hp-negative tumors is significant (p = 0.024), but for MSI it is not (p = 0.3).

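| Characteristic | CIN: N | CIN: FALSE, N = 88 [1] | CIN: TRUE, N = 48 [1] | CIN: p-value [2] | EBV: N | EBV: FALSE, N = 18 [1] | EBV: TRUE, N = 7 [1] | EBV: p-value [2] | GS: N | GS: FALSE, N = 33 [1] | GS: TRUE, N = 17 [1] | GS: p-value [2] | MSI: N | MSI: FALSE, N = 36 [1] | MSI: TRUE, N = 20 [1] | MSI: p-value [2] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TMB | 136 | 1.52 (0.94) | 2.12 (1.63) | 0.024 | 25 | 2.48 (3.25) | 1.04 (0.30) | 0.080 | 50 | 3.30 (13.16) | 0.84 (0.77) | 0.3 | 56 | 16 (7) | 21 (17) | 0.3 |

[1] Statistics presented: mean (SD)

[2] Statistical tests performed: t-test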
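The CIN result can be roughly re-derived from the summary statistics alone. The sketch below assumes an unequal-variance (Welch) t-test, which is the default of R's `t.test`; whether the report used it is an assumption. Because the inputs are the rounded means and SDs from the table, the p-value only approximates the reported 0.024. The function names are made up for this illustration, and the tail probability is obtained by simple numerical integration of the t density to avoid any dependency outside the standard library.

```python
import math

def welch_from_summary(m1, s1, n1, m2, s2, n2):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom
    from group means, standard deviations and sizes."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    t = (m2 - m1) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

def t_two_sided_p(t, df, steps=4000):
    """Two-sided p-value via Simpson integration of the Student t
    density over [0, |t|]; steps must be even."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    upper = abs(t)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3          # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)

# CIN row: Hp FALSE 1.52 (0.94), N = 88; Hp TRUE 2.12 (1.63), N = 48
t_cin, df_cin = welch_from_summary(1.52, 0.94, 88, 2.12, 1.63, 48)
p_cin = t_two_sided_p(t_cin, df_cin)
print(f"CIN TMB: t = {t_cin:.2f}, df = {df_cin:.1f}, p = {p_cin:.3f}")
```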

Unfortunately, pathway analysis of the Hp mutation signature (overall, in MSI and in CIN) did not show overlap with any signaling pathways.