Are cousins informative?

Author

Michel Nivard

Can cousins inform us about the alleles we didn't inherit from our parents?

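This is simulation code to show that cousins inform us about the alleles we did not inherit from our parents. The intuition: a cousin's parent is a sibling of one of our parents, so the cousin may carry a copy of a parental allele that was not passed on to us.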

Set the basic parameters: 1000 families and one locus with a 25% minor allele frequency.

n <- 1000
maf <- .25
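
The code on this page does not set a random seed, so the estimates further down change a little with every run. A minimal sketch for making a run repeatable, assuming base R's `set.seed()`; the seed value `2024` is an arbitrary choice and is not the one behind the output shown here:

```r
# Fix the random number generator state so the draws below are repeatable.
# The seed value is arbitrary (an assumption, not taken from the original run).
set.seed(2024)

n <- 1000
maf <- .25

first.draw <- rbinom(n, 2, maf)

# Resetting the same seed reproduces exactly the same genotypes.
set.seed(2024)
second.draw <- rbinom(n, 2, maf)
identical(first.draw, second.draw)
```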

Draw the grandparents that the two cousins share:

gf.common <- rbinom(n,2,maf)
gm.common <- rbinom(n,2,maf) 

Make the focal child's parent and that parent's sibling, both children of the shared grandparents (we call them father and uncle; the choice of sex is arbitrary):

# father: one allele drawn from each shared grandparent
ft <- rbinom(n,size = 1,prob = gf.common /2)
mt <- rbinom(n,size = 1,prob = gm.common /2)
father <- ft + mt
# grandparental alleles that the father did not receive
father.non.transmitted <- (gf.common - ft) + (gm.common - mt)

# make the father's sibling (the uncle) from the same grandparents
ft <- rbinom(n,size = 1,prob = gf.common /2)
mt <- rbinom(n,size = 1,prob = gm.common /2)
uncle <- ft + mt
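
The same pattern (draw one allele from each parent, then add them up) recurs for every person in the pedigree. As a sketch, it could be wrapped in a small helper; `transmit()` and `make.child()` are hypothetical names that do not appear in the original code:

```r
# Hypothetical helper: a parent with genotype g (0, 1 or 2 minor alleles)
# passes on a minor allele with probability g / 2.
transmit <- function(parent) {
  rbinom(length(parent), size = 1, prob = parent / 2)
}

# Hypothetical helper: build a child and keep track of what was not passed on.
make.child <- function(pa, ma) {
  from.pa <- transmit(pa)
  from.ma <- transmit(ma)
  list(genotype        = from.pa + from.ma,
       non.transmitted = (pa - from.pa) + (ma - from.ma))
}

n <- 1000
maf <- .25
gf.common <- rbinom(n, 2, maf)
gm.common <- rbinom(n, 2, maf)

father <- make.child(gf.common, gm.common)

# Transmitted plus non-transmitted alleles add up to the parental total.
all(father$genotype + father$non.transmitted == gf.common + gm.common)
```

Keeping the non-transmitted count next to the genotype makes the bookkeeping explicit: every parental allele ends up in exactly one of the two.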

Make the focal child's other parent (the mother), whose own parents are unrelated to the shared grandparents:

# make a mother for the focal grandchild
gf.1 <- rbinom(n,2,maf)
gm.1 <- rbinom(n,2,maf) 

ft <- rbinom(n,size = 1,prob = gf.1 /2)
mt <- rbinom(n,size = 1,prob = gm.1 /2)
mother <- ft + mt

Make the focal grandchild, keeping track of the parental alleles that were not transmitted:

ft <- rbinom(n,size = 1,prob = father/2)
mt <- rbinom(n,size = 1,prob = mother/2)
children <- ft + mt
children.non.transmitted <- (father - ft) + (mother - mt)

Make the aunt (the uncle's partner, unrelated to the shared grandparents) and the cousin:

# make aunt and her parents
gf.2 <- rbinom(n,2,maf)
gm.2 <- rbinom(n,2,maf) 

ft <- rbinom(n,size = 1,prob = gf.2 /2)
mt <- rbinom(n,size = 1,prob = gm.2 /2)
aunt <- ft + mt

ft <- rbinom(n,size = 1,prob = uncle/2)
mt <- rbinom(n,size = 1,prob = aunt/2)
cousin <- ft + mt
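
With the pedigree complete, a quick sanity check is to compare genotype correlations with the expected relatedness: about 1/2 for siblings and for parent and child, and about 1/8 for first cousins. The sketch below reruns the steps above inside one function so that it is self-contained; `simulate.pedigree()` and `transmit()` are hypothetical names, and the larger `n` is only there to make the correlations stable:

```r
# Hypothetical helper: pass on one allele from a parent with genotype 0/1/2.
transmit <- function(parent) {
  rbinom(length(parent), size = 1, prob = parent / 2)
}

# Hypothetical wrapper around the simulation steps on this page.
simulate.pedigree <- function(n, maf) {
  founder <- function() rbinom(n, 2, maf)

  # shared grandparents, and their two children (father and uncle)
  gf.common <- founder()
  gm.common <- founder()
  father <- transmit(gf.common) + transmit(gm.common)
  uncle  <- transmit(gf.common) + transmit(gm.common)

  # unrelated partners, each with their own pair of founder parents
  mother <- transmit(founder()) + transmit(founder())
  aunt   <- transmit(founder()) + transmit(founder())

  # focal child, with the alleles its parents did not pass on
  ft <- transmit(father)
  mt <- transmit(mother)
  children <- ft + mt
  children.non.transmitted <- (father - ft) + (mother - mt)

  # cousin: child of the uncle and the aunt
  cousin <- transmit(uncle) + transmit(aunt)

  data.frame(father, uncle, mother, aunt,
             children, children.non.transmitted, cousin)
}

ped <- simulate.pedigree(n = 1e5, maf = .25)
round(cor(ped[, c("father", "uncle", "children", "cousin")]), 2)
```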

Predict the focal child's non-transmitted alleles from the cousin's genotype, controlling for the child's own genotype:

# cousin explains variance in the non-transmitted alleles
summary(lm(children.non.transmitted ~ children + cousin))

Call:
lm(formula = children.non.transmitted ~ children + cousin)

Residuals:
    Min      1Q  Median      3Q     Max 
-0.6311 -0.4279 -0.4270  0.4964  1.5730 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 0.427018   0.029144  14.652  < 2e-16 ***
children    0.000675   0.032928   0.020  0.98365    
cousin      0.101721   0.032654   3.115  0.00189 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.6123 on 997 degrees of freedom
Multiple R-squared:  0.009769,  Adjusted R-squared:  0.007783 
F-statistic: 4.918 on 2 and 997 DF,  p-value: 0.007493
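
The size of the cousin coefficient can be checked against theory. Under random mating, the genotype covariance between two relatives is r * 2 * maf * (1 - maf), with r their relatedness, and r is 1/8 for first cousins. The same argument gives a covariance of maf * (1 - maf) / 4 between the cousin and the father's non-transmitted allele, while transmitted and non-transmitted alleles are uncorrelated. A sketch of the implied regression coefficients, derived here rather than taken from the original analysis:

```r
maf <- .25
pq <- maf * (1 - maf)

# Variance-covariance matrix of the two predictors (children, cousin):
# each genotype has variance 2pq, first cousins covary by (1/8) * 2pq.
S.xx <- matrix(c(2 * pq, pq / 4,
                 pq / 4, 2 * pq), nrow = 2)

# Covariance of each predictor with the non-transmitted alleles:
# zero for the child's own genotype, pq / 4 for the cousin.
s.xy <- c(0, pq / 4)

# Population regression coefficients: solve S.xx %*% beta = s.xy.
expected.beta <- setNames(solve(S.xx, s.xy), c("children", "cousin"))
round(expected.beta, 3)
```

Under these assumptions the expected cousin coefficient is 8/63, about 0.127, whatever the allele frequency, and the expected coefficient for `children` is -1/63. The estimate of 0.102 with a standard error of 0.033 is within sampling error of that value.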

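Check with a phenotype that is driven by the father's genotype, so that both his transmitted and his non-transmitted alleles matter. Here the cousin should predict the phenotype beyond the child's own genotype:

pheno <- father + rnorm(n)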

summary(lm(pheno ~ children + cousin))

Call:
lm(formula = pheno ~ children + cousin)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.0597 -0.7745 -0.0166  0.7594  4.1176 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.06422    0.05408   1.187    0.235    
children     0.52718    0.06111   8.627  < 2e-16 ***
cousin       0.23776    0.06060   3.923 9.33e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.136 on 997 degrees of freedom
Multiple R-squared:  0.08983,   Adjusted R-squared:  0.088 
F-statistic:  49.2 on 2 and 997 DF,  p-value: < 2.2e-16
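
As a control, use a phenotype that depends only on the child's own genotype. Here the cousin should add nothing once the child's genotype is in the model:

pheno2 <- children + rnorm(n)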

summary(lm(pheno2 ~ children + cousin))

Call:
lm(formula = pheno2 ~ children + cousin)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.2891 -0.7009  0.0342  0.6923  3.2530 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.04408    0.04851  -0.909    0.364    
children     1.02423    0.05481  18.689   <2e-16 ***
cousin       0.03888    0.05435   0.715    0.475    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.019 on 997 degrees of freedom
Multiple R-squared:  0.2636,    Adjusted R-squared:  0.2622 
F-statistic: 178.5 on 2 and 997 DF,  p-value: < 2.2e-16
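
The two phenotype regressions can be compared with their expected coefficients in the same way. The sketch below is an addition to the original analysis: it assumes random mating and uses the standard genotype covariance between relatives, r * 2 * maf * (1 - maf), with relatedness r of 1/2 for father and child, 1/4 for the father and the cousin (his niece or nephew) and 1/8 for the two cousins. The helper name `expected.coefs()` is hypothetical:

```r
maf <- .25
v <- 2 * maf * (1 - maf)   # variance of a genotype coded 0/1/2

# Predictor covariance matrix for (children, cousin), relatedness 1/8.
S.xx <- v * matrix(c(1,   1/8,
                     1/8, 1), nrow = 2)

# Hypothetical helper: population coefficients for outcome ~ children + cousin,
# given the covariance of the outcome with each of the two predictors.
expected.coefs <- function(s.xy) {
  setNames(solve(S.xx, s.xy), c("children", "cousin"))
}

# pheno = father + noise: covariances follow relatedness to the father.
beta.pheno <- expected.coefs(v * c(1/2, 1/4))
round(beta.pheno, 3)

# pheno2 = children + noise: covariances follow relatedness to the child.
beta.pheno2 <- expected.coefs(v * c(1, 1/8))
round(beta.pheno2, 3)
```

Under these assumptions the first set of coefficients is roughly 0.48 for `children` and 0.19 for `cousin`, and the second is 1 and 0, in line with the estimates above given their standard errors. The cousin carries information about the father's genotype beyond what the child carries, but none about the child's own genotype once that genotype is in the model.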