Description:
The effect of intelligence on success has been debated - some argue that it’s overstated (e.g. Gladwell, Taleb); others say it has a decent impact, but that other factors must have a large effect based on the fact that IQ only explains so much variance (10-20%) in most outcomes (this is the position of most mainstream intelligence researchers). I’ve only seen AR Jensen emphasize the importance of accurate measurement – it seems his instincts were in the right place.
In this piece I show that it is the third perspective that is correct, and the association between IQ and socioeconomic status is deflated by the fact there is no perfect indicator of SES. Using a composite of education, occupation, assets, and income earned over 20 years, I find that about 40% of the permanent variation in socioeconomic status can be accounted for by intelligence, and this reduces by about 20% when the non-causal variation is parsed out.
Note that all of the causal explanatory variables of SES must add up to 100% - if intelligence only explains 10-20% of the variation in socioeconomic status, then 80-90% of the variation remains. Most of the research I have seen on “inherited unfair advantages” (e.g. attractiveness, height, voice tone, skin color) is that these may have a small causal effect, but that they explain very little variance in success – say 5%. This still leaves 75-85% of the variation explained – which must be some kind of random luck or personality traits. I don’t deny that these are valuable, but given the findings from twin studies, I find it hard to believe that these factors can account fully for the 80%.
DATA CLEANING
Preliminary data cleaning:
#loading data
setwd('~')
Warning: The working directory was changed to /home/asuka inside a notebook chunk. The working directory will be reset when the chunk is finished running. Use the knitr root.dir option in the setup chunk to change the working directory for notebook chunks.
setwd('Documents/rstuff/predsucc')
new_data <- read.csv('new_data.csv')
#demographics
new_data$Female <- new_data$R0536300-1
new_data$race <- 'Other'
new_data$race[new_data$R0538700==1] <- 'White'
new_data$race[new_data$R0538600==1] <- 'Hispanic'
new_data$race[new_data$R0538700==4] <- 'Asian'
new_data$race[new_data$R0538700==2] <- 'Black'
#iq data
subtestlist <- c('GS', 'AR', 'WK', 'PC', 'NO', 'CS', 'AI', 'SI', 'MK', 'MC', 'EI', 'AO')
new_data$GS = NA
new_data$AR = NA
new_data$WK = NA
new_data$PC = NA
new_data$NO = NA
new_data$CS = NA
new_data$AI = NA
new_data$SI = NA
new_data$MK = NA
new_data$MC = NA
new_data$EI = NA
new_data$AO = NA
new_data$testday = new_data$R9708601*1/12+new_data$R9708602
new_data$testday[is.na(new_data$testday)] <- 1997.661
new_data$bdate = new_data$R0536402 + new_data$R0536401*1/12
new_data$ageat = new_data$testday-new_data$bdate
cor.test(new_data$R9705300, new_data$ageat)
Pearson's product-moment correlation
data: new_data$R9705300 and new_data$ageat
t = 7.3547, df = 2380, p-value = 2.62e-13
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.1095651 0.1881073
sample estimates:
cor
0.1490713
j = 0
for(stest in subtestlist) {
posstring = paste("R", 9705200+j*100, sep="")
negstring = paste("R", 9706400+j*100, sep="")
stcolumnindex = getcolindex(stest, new_data)
negcolumnindex = getcolindex(negstring, new_data)
poscolumnindex = getcolindex(posstring, new_data)
new_data[, stcolumnindex] = as.numeric(pmax(new_data[, negcolumnindex]*-1, new_data[, poscolumnindex], na.rm=TRUE))
new_data[, stcolumnindex] = normalise(new_data[, stcolumnindex])
new_data[, stcolumnindex][!is.na(new_data[, stcolumnindex])] <- agecorrect(stest, agevectorname='ageat', datafr = new_data, normalizeit=T, splinex = 6)
j = j+1
}
cor.test(new_data$GS, new_data$ageat)
Pearson's product-moment correlation
data: new_data$GS and new_data$ageat
t = 2.3196e-14, df = 7125, p-value = 1
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.02321711 0.02321711
sample estimates:
cor
2.748013e-16
iq <- subset(new_data, select = c(GS, AR, WK, PC, NO, CS, AI, SI, MK, MC, EI, AO))
new_data$g2 = getpc(iq, normalizeit=T, fillmissing=F, dofa=F)
new_data$IQ <- new_data$g2*15+100
new_data$age = 2021 - new_data$bdate
Parental SES data cleaning (assets + family income + father/mother education). Severe outliers (+5SD above the mean) were removed.
poldat <- subset(new_data, select = c(R1204500, R1204700, R1302600, R1302700))
new_data$pses = getpc(poldat, normalizeit=T, fillmissing=T, dofa=F)
new_data$pses[new_data$pses > 5] <- NA
pcalol <- pca(poldat, nfactors=1, rotate="none", missing=TRUE)
print(pcalol)
Principal Components Analysis
Call: principal(r = r, nfactors = nfactors, residuals = residuals,
rotate = rotate, n.obs = n.obs, covar = covar, scores = scores,
missing = missing, impute = impute, oblique.scores = oblique.scores,
method = method, use = use, cor = cor, correct = 0.5, weight = NULL)
Standardized loadings (pattern matrix) based upon correlation matrix
PC1
SS loadings 2.30
Proportion Var 0.57
Mean item complexity = 1
Test of the hypothesis that 1 component is sufficient.
The root mean square of the residuals (RMSR) is 0.19
with the empirical chi square 3797.21 with prob < 0
Fit based upon off diagonal values = 0.83
Child income, net worth and education data cleaning – pre-2000 data is not considered. For a few children, net worth at 20 would have been measured in 1999 – I judge this as a nonfactor, so I included this variable.
column_mappings <- c("inc2000", R6827500 = "inc2001", S1055800 = "inc2002",
S3134600 = "inc2003", S4799600 = "inc2004", S6501000 = "inc2005",
S8496500 = "inc2006", T0889800 = "inc2007", T3003000 = "inc2008",
T4406000 = "inc2009", T6055500 = "inc2010", T7545600 = "inc2011",
T8976700 = "inc2013", U0956900 = "inc2015", U2857200 = "inc2017",
U4282300 = "inc2019", U5753500 = "inc2021")
names_to_change <- names(new_data) %in% names(column_mappings)
names(new_data)[names_to_change] <- column_mappings[names(new_data)[names_to_change]]
##############
degree_mappings <- c(
S2261100 = "deg2003", S4032600 = "deg2004", S5613000 = "deg2005",
S7683300 = "deg2006", T0149600 = "deg2007", T2120000 = "deg2008",
T3731100 = "deg2009", T5322200 = "deg2010", T6767100 = "deg2011",
T8241400 = "deg2013", U0137100 = "deg2015", U1990700 = "deg2017",
U3572700 = "deg2019", U5072600 = "deg2021"
)
names_to_change <- names(new_data) %in% names(degree_mappings)
names(new_data)[names_to_change] <- degree_mappings[names(new_data)[names_to_change]]
new_data <- data.frame(new_data)
deg_columns <- c("deg2003", "deg2004", "deg2005",
"deg2006", "deg2007", "deg2008", "deg2009", "deg2010", "deg2011",
"deg2013", "deg2015", "deg2017", "deg2019", "deg2021")
existing_deg_columns <- deg_columns[deg_columns %in% names(new_data)]
yearly_averages <- sapply(new_data[, existing_deg_columns], mean, na.rm = TRUE)
net_worth_mappings <- c(
Z9048900 = "nwage20",
Z9049000 = "nwage25",
Z9121900 = "nwage30",
Z9141400 = "nwage35",
Z9164500 = "nwage40"
)
names_to_change <- names(new_data) %in% names(net_worth_mappings)
names(new_data)[names_to_change] <- net_worth_mappings[names(new_data)[names_to_change]]
income_columns <- c("inc2000", "inc2001", "inc2002", "inc2003", "inc2004", "inc2005",
"inc2006", "inc2007", "inc2008", "inc2009", "inc2010", "inc2011",
"inc2013", "inc2015", "inc2017", "inc2019", "inc2021")
existing_income_columns <- income_columns[income_columns %in% names(new_data)]
yearly_averages <- sapply(new_data[, existing_income_columns], mean, na.rm = TRUE)
new_data$weighted_mean_income <- apply(new_data[, existing_income_columns], 1, function(row) {
non_missing_indices <- !is.na(row)
total_income <- sum(row[non_missing_indices], na.rm = TRUE)
total_average <- sum(yearly_averages[non_missing_indices], na.rm = TRUE)
weighted_mean <- ifelse(total_average > 0, total_income / total_average, NA)
return(weighted_mean)
})
Occupation data cleaning – placed separately, as the occupation data was more complex than the other data. Occupation rank was assigned based on the average income of the job. ‘job2000_01’ refers to the respondent’s first job in 2000, ‘job2007_03’ refers to the respondent’s third job in 2007.
occupation_mappings <- c(
S3713000 = "job2000_01", S3713100 = "job2000_02", S3713200 = "job2000_03", S3713300 = "job2000_04",
S3729000 = "job2001_01", S3729100 = "job2001_02", S3729200 = "job2001_03", S3729300 = "job2001_04",
S3757000 = "job2003_01", S3757100 = "job2003_02", S3757200 = "job2003_03", S3757300 = "job2003_04",
S5041700 = "job2004_01", S5041800 = "job2004_02", S5041900 = "job2004_03", S5042000 = "job2004_04",
S8689700 = "job2006_01", S8689800 = "job2006_02", S8689900 = "job2006_03", S8690000 = "job2006_04",
T1109400 = "job2007_01", T1109500 = "job2007_02", T1109600 = "job2007_03", T1109700 = "job2007_04",
T3186900 = "job2008_01", T3187000 = "job2008_02", T3187100 = "job2008_03", T3187200 = "job2008_04",
T4597800 = "job2009_01", T4597900 = "job2009_02", T4598000 = "job2009_03", T4598100 = "job2009_04",
T6231000 = "job2010_01", T6231100 = "job2010_02", T6231200 = "job2010_03", T6231300 = "job2010_04",
T7732100 = "job2011_01", T7732200 = "job2011_02", T7732300 = "job2011_03", T7732400 = "job2011_04",
T9133500 = "job2013_01", T9133600 = "job2013_02", T9133700 = "job2013_03", T9133800 = "job2013_04",
U1127100 = "job2015_01", U1127200 = "job2015_02", U1127300 = "job2015_03", U1127400 = "job2015_04",
U1719400 = "job2017_01", U1719500 = "job2017_02", U1719600 = "job2017_03", U1719700 = "job2017_04",
U3315700 = "job2019_01", U3315800 = "job2019_02", U3315900 = "job2019_03", U3316000 = "job2019_04",
U4820200 = "job2021_01", U4820300 = "job2021_02", U4820400 = "job2021_03", U4820500 = "job2021_04"
)
names(new_data) <- ifelse(names(new_data) %in% names(occupation_mappings),
occupation_mappings[names(new_data)],
names(new_data))
###################
new_data$temp <- normalise(new_data$weighted_mean_income)
job_columns <- grep("job", names(new_data), value = TRUE)
rankings <- list()
#ranking the variables by income
for (job in job_columns) {
new_data$temp2 <- new_data[[job]]
rankings[[job]] <- new_data %>% group_by(temp2) %>% summarise(status = mean(temp, na.rm=T))
}
#converting this to a list
average_status_by_temp2 <- list()
for (job_name in names(rankings)) {
average_status_by_temp2[[job_name]] <- aggregate(status ~ temp2, data = rankings[[job_name]], mean)
}
#combining years to reduce unreliability in ranking
combined_averages <- Reduce(function(x, y) merge(x, y, by = "temp2", all = TRUE), average_status_by_temp2)
combined_averages$sum <- rowMeans(combined_averages[, 2:61], na.rm=T)
pol <- combined_averages %>% select(sum, temp2)
Then, weighed averages were made from the years between 2000-2021 based on the data. Because SES measured later in life is higher on average, the years must be weighed based on this bias. This makes the scale a bit uninterpretable, but it has a more accurate rank order. For occupation, the highest status job an individual held in a given year was used as the main variable within years.
#####income data cleaning
income_columns <- c("inc2000", "inc2001", "inc2002", "inc2003", "inc2004", "inc2005",
"inc2006", "inc2007", "inc2008", "inc2009", "inc2010", "inc2011",
"inc2013", "inc2015", "inc2017", "inc2019", "inc2021")
existing_income_columns <- income_columns[income_columns %in% names(new_data)]
yearly_averages <- sapply(new_data[, existing_income_columns], mean, na.rm = TRUE)
new_data$weighted_mean_income <- apply(new_data[, existing_income_columns], 1, function(row) {
non_missing_indices <- !is.na(row)
total_income <- sum(row[non_missing_indices], na.rm = TRUE)
total_average <- sum(yearly_averages[non_missing_indices], na.rm = TRUE)
weighted_mean <- ifelse(total_average > 0, total_income / total_average, NA)
return(weighted_mean)
})
#####education data cleaning
new_data <- data.frame(new_data)
deg_columns <- c("deg2003", "deg2004", "deg2005",
"deg2006", "deg2007", "deg2008", "deg2009", "deg2010", "deg2011",
"deg2013", "deg2015", "deg2017", "deg2019", "deg2021")
existing_deg_columns <- deg_columns[deg_columns %in% names(new_data)]
yearly_averages <- sapply(new_data[, existing_deg_columns], mean, na.rm = TRUE)
new_data$weighted_degree <- apply(new_data[, existing_deg_columns], 1, function(row) {
non_missing_indices <- !is.na(row)
total_deg <- sum(row[non_missing_indices], na.rm = TRUE)
total_average <- sum(yearly_averages[non_missing_indices], na.rm = TRUE)
weighted_mean <- ifelse(total_deg > 0, total_deg / total_average, NA)
return(weighted_mean)
})
#########occupation data cleaning
for (jobcol in job_columns) {
temp <- paste0(jobcol, '_rank')
new_data$temp_key <- new_data[[jobcol]]
new_data <- new_data %>%
left_join(pol, by = c("temp_key" = "temp2")) %>%
mutate(!!temp := sum) %>%
select(-sum)
new_or.new_datadata <- select(new_data, -temp_key)
}
years <- seq(2000, 2021, by = 1)
for (year in years) {
rank_cols <- grep(paste0("job", year, "_\\d{2}_rank"), names(new_data), value = TRUE)
if (length(rank_cols) == 4) {
new_data <- new_data %>%
mutate(!!paste0("highest_rank_job_", year) := pmax(!!!rlang::syms(rank_cols), na.rm = TRUE))
} else {
warning(paste("Expected 4 job rank columns for year", year, "but found", length(rank_cols)))
}
}
Warning: Expected 4 job rank columns for year 2002 but found 0Warning: Expected 4 job rank columns for year 2005 but found 0Warning: Expected 4 job rank columns for year 2012 but found 0Warning: Expected 4 job rank columns for year 2014 but found 0Warning: Expected 4 job rank columns for year 2016 but found 0Warning: Expected 4 job rank columns for year 2018 but found 0Warning: Expected 4 job rank columns for year 2020 but found 0
new_data <- data.frame(new_data)
for(col in occ_columns) {
new_data[[col]] <- new_data[[col]] - min(new_data[[col]], na.rm=T)
}
occ_columns <- grep(paste0("highest"), names(new_data), value = TRUE)
existing_occ_columns <- occ_columns[occ_columns %in% names(new_data)]
yearly_averages <- sapply(new_data[, existing_occ_columns], mean, na.rm = TRUE)
new_data$weighted_occ <- apply(new_data[, existing_occ_columns], 1, function(row) {
non_missing_indices <- !is.na(row)
total_occ <- sum(row[non_missing_indices], na.rm = TRUE)
total_average <- sum(yearly_averages[non_missing_indices], na.rm = TRUE)
weighted_mean <- ifelse(total_occ > 0, total_occ / total_average, NA)
return(weighted_mean)
})
new_data <- data.frame(new_data)
for(col in occ_columns) {
new_data[[col]] <- new_data[[col]] - min(new_data[[col]], na.rm=T)
}
#net worth data cleaning
nw_columns <- net_worth_mappings <- c("nwage20", "nwage25", "nwage30","nwage35","nwage40")
existing_nw_columns <- nw_columns[nw_columns %in% names(new_data)]
yearly_averages <- sapply(new_data[, existing_nw_columns], mean, na.rm = TRUE)
new_data$weighted_nw <- apply(new_data[, existing_nw_columns], 1, function(row) {
non_missing_indices <- !is.na(row)
total_occ <- sum(row[non_missing_indices], na.rm = TRUE)
total_average <- sum(yearly_averages[non_missing_indices], na.rm = TRUE)
weighted_mean <- ifelse(total_occ > 0, total_occ / total_average, NA)
return(weighted_mean)
})
Age adjustments were made, as not everybody was of the same age.
new_data$weighted_occ[!is.na(new_data$weighted_occ)] <- agecorrect('weighted_occ', agevectorname='age', datafr = new_data, normalizeit=T, splinex = 6)
new_data$weighted_mean_income[!is.na(new_data$weighted_mean_income)] <- agecorrect('weighted_mean_income', agevectorname='age', datafr = new_data, normalizeit=T, splinex = 6)
new_data$weighted_degree[!is.na(new_data$weighted_degree)] <- agecorrect('weighted_degree', agevectorname='age', datafr = new_data, normalizeit=T, splinex = 6)
new_data$weighted_nw[!is.na(new_data$weighted_nw)] <- agecorrect('weighted_nw', agevectorname='age', datafr = new_data, normalizeit=T, splinex = 6)
Finally, the holy grail: ses – a composite of semi-permanent assets, education, income, and occupational status.
new_data$ses <- getpc(new_data %>% select(weighted_occ, weighted_mean_income, weighted_degree, weighted_nw), normalizeit = T, fillmissing=F, dofa=F)
RESULTS:
The correlation matrix of the main indicators: parental socioeconomic status (pses), child socioeconomic status (ses), child IQ (IQ), semi-permanent occupational status (weighted_occ), semi-permanent income (weighted_mean_income), semi-permanent education (weighted_degree), and semi-permanent assets (weighted_nw).
correlation_matrix(new_data %>% select(weighted_occ, weighted_mean_income, weighted_degree, weighted_nw, ses, pses, IQ))
weighted_occ weighted_mean_income weighted_degree weighted_nw ses pses IQ
weighted_occ "NA" "0.626 ***" "0.451 ***" "0.35 ***" "0.823 ***" "0.298 ***" "0.449 ***"
weighted_mean_income "0.626 ***" "NA" "0.395 ***" "0.433 ***" "0.836 ***" "0.25 ***" "0.364 ***"
weighted_degree "0.451 ***" "0.395 ***" "NA" "0.328 ***" "0.702 ***" "0.444 ***" "0.557 ***"
weighted_nw "0.35 ***" "0.433 ***" "0.328 ***" "NA" "0.67 ***" "0.29 ***" "0.275 ***"
ses "0.823 ***" "0.836 ***" "0.702 ***" "0.67 ***" "NA" "0.43 ***" "0.557 ***"
pses "0.298 ***" "0.25 ***" "0.444 ***" "0.29 ***" "0.43 ***" "NA" "0.458 ***"
IQ "0.449 ***" "0.364 ***" "0.557 ***" "0.275 ***" "0.557 ***" "0.458 ***" "NA"
The scatterplots for the semi-permanent variables. Note the linear association between IQ and every variable with the exception of parental SES.
GG_scatter(new_data, 'IQ', 'ses') + geom_smooth()
GG_scatter(new_data, 'pses', 'IQ') + geom_smooth()
GG_scatter(new_data, 'IQ', 'weighted_occ') + geom_smooth()
GG_scatter(new_data, 'IQ', 'weighted_mean_income') + geom_smooth()
GG_scatter(new_data, 'IQ', 'weighted_degree') + geom_smooth()
GG_scatter(new_data, 'IQ', 'weighted_nw') + geom_smooth()
Compare this to the ones generated for just the year 2010:
new_data$ses2010 <- getpc(new_data %>% select(highest_rank_job_2010, inc2010, deg2010, nwage30))
GG_scatter(new_data, 'IQ', 'ses2010') + geom_smooth()
GG_scatter(new_data, 'IQ', 'highest_rank_job_2010') + geom_smooth()
GG_scatter(new_data, 'IQ', 'inc2010') + geom_smooth()
GG_scatter(new_data, 'IQ', 'deg2010') + geom_smooth()
GG_scatter(new_data, 'IQ', 'nwage30') + geom_smooth()
There are several ways to estimate the true correlation between IQ and SES: - Jensen vector method - Latent correlation - Adjusting for observed unreliability.
Jensen vector method estimates the correlation between g and SES to be .56:
p <- pca(new_data %>% select(subtestlist), rotate='none', nfactors=1)
debi <- data.frame(v = rep('', length(subtestlist)), r = rep(0, length(subtestlist)))
debi$v <- NA
i = 1
for(vec in subtestlist) {
debi$v[i] <- vec
debi$r[i] <- cor.test(new_data[[vec]], new_data$ses)$estimate
i = i + 1
}
debi$v
[1] "GS" "AR" "WK" "PC" "NO" "CS" "AI" "SI" "MK" "MC" "EI" "AO"
debi$l <- p$loadings
GG_scatter(df=debi, x_var='l', y_var='r', case_names = 'v')
lr <- lm(data=debi, r ~ l)
lr
Call:
lm(formula = r ~ l, data = debi)
Coefficients:
(Intercept) l
-0.01948 0.58374
Latent correlation is .64
lat0 <- "
#LATENTS
S =~ weighted_occ + weighted_mean_income + weighted_degree + weighted_nw
G =~ GS+AR+WK+PC+NO+CS+AI+SI+MK+MC+EI+AO
S ~~ G
"
latfit1 <- sem(model = lat0, data=new_data)
summary(latfit1, fit.measures=T, standardize=T)
lavaan 0.6.17 ended normally after 29 iterations
Estimator ML
Optimization method NLMINB
Number of model parameters 33
Used Total
Number of observations 5685 8984
Model Test User Model:
Test statistic 9278.057
Degrees of freedom 103
P-value (Chi-square) 0.000
Model Test Baseline Model:
Test statistic 62419.878
Degrees of freedom 120
P-value 0.000
User Model versus Baseline Model:
Comparative Fit Index (CFI) 0.853
Tucker-Lewis Index (TLI) 0.828
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -103149.265
Loglikelihood unrestricted model (H1) -98510.237
Akaike (AIC) 206364.531
Bayesian (BIC) 206583.835
Sample-size adjusted Bayesian (SABIC) 206478.971
Root Mean Square Error of Approximation:
RMSEA 0.125
90 Percent confidence interval - lower 0.123
90 Percent confidence interval - upper 0.127
P-value H_0: RMSEA <= 0.050 0.000
P-value H_0: RMSEA >= 0.080 1.000
Standardized Root Mean Square Residual:
SRMR 0.077
Parameter Estimates:
Standard errors Standard
Information Expected
Information saturated (h1) model Structured
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
S =~
weighted_occ 1.000 0.789 0.776
weightd_mn_ncm 0.953 0.019 49.047 0.000 0.752 0.744
weighted_degre 0.785 0.018 42.848 0.000 0.619 0.632
weighted_nw 0.655 0.019 34.318 0.000 0.517 0.503
G =~
GS 1.000 0.877 0.873
AR 0.995 0.011 90.555 0.000 0.873 0.865
WK 0.980 0.011 89.462 0.000 0.860 0.860
PC 0.971 0.011 87.402 0.000 0.852 0.850
NO 0.703 0.013 52.490 0.000 0.617 0.615
CS 0.632 0.014 45.925 0.000 0.555 0.555
AI 0.650 0.014 46.617 0.000 0.571 0.561
SI 0.723 0.013 53.846 0.000 0.634 0.626
MK 0.970 0.011 86.013 0.000 0.851 0.843
MC 0.927 0.012 79.430 0.000 0.814 0.807
EI 0.911 0.012 76.292 0.000 0.799 0.789
AO 0.797 0.013 62.835 0.000 0.699 0.698
Covariances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
S ~~
G 0.442 0.013 33.163 0.000 0.638 0.638
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.weighted_occ 0.412 0.012 33.984 0.000 0.412 0.398
.weightd_mn_ncm 0.456 0.012 37.322 0.000 0.456 0.447
.weighted_degre 0.577 0.013 44.982 0.000 0.577 0.601
.weighted_nw 0.787 0.016 49.129 0.000 0.787 0.747
.GS 0.241 0.005 45.201 0.000 0.241 0.239
.AR 0.256 0.006 45.751 0.000 0.256 0.252
.WK 0.261 0.006 46.104 0.000 0.261 0.261
.PC 0.280 0.006 46.717 0.000 0.280 0.278
.NO 0.626 0.012 51.780 0.000 0.626 0.622
.CS 0.693 0.013 52.192 0.000 0.693 0.692
.AI 0.708 0.014 52.153 0.000 0.708 0.685
.SI 0.623 0.012 51.683 0.000 0.623 0.608
.MK 0.296 0.006 47.094 0.000 0.296 0.290
.MC 0.354 0.007 48.578 0.000 0.354 0.349
.EI 0.388 0.008 49.145 0.000 0.388 0.378
.AO 0.514 0.010 50.907 0.000 0.514 0.512
S 0.622 0.020 30.957 0.000 1.000 1.000
G 0.770 0.019 41.310 0.000 1.000 1.000
Adjusting for the observed unreliability (omega total), a correlation of .64 is obtained.
reliability(new_data %>% select(weighted_occ, weighted_mean_income, weighted_degree, weighted_nw))
keys not specified, all items will be scored
Measures of reliability
reliability(keys = new_data %>% select(weighted_occ, weighted_mean_income,
weighted_degree, weighted_nw))
omega_h alpha omega.tot Uni tau cong max.split min.split mean.r med.r n.items CFI ECV
All_items 0.56 0.75 0.81 0.95 0.95 0.99 0.8 0.71 0.43 0.41 4 0.97 0.87
reliability(new_data %>% select(subtestlist))
keys not specified, all items will be scored
Measures of reliability
reliability(keys = new_data %>% select(subtestlist))
omega_h alpha omega.tot Uni tau cong max.split min.split mean.r med.r n.items CFI ECV
All_items 0.71 0.94 0.95 0.92 0.93 0.98 0.96 0.85 0.55 0.56 12 0.86 0.85
corrforatt(new_data, r1=0.95, r2=0.81, 'IQ', 'ses')
[1] "Corrected corr: 0.635085625039677"
[1] "Upper lim: 0.655230749020211"
[1] "Lower lim: 0.614348432898653"
Given the correlation between IQ and SES is so high, for the relationship to be noncausal, the confounders in the relationship must have extremely strong relationships with intelligence to bias it. The most obvious one is parental SES - adjusting for it does decrease the observed relationship, but only by about 80%.
#zero order IQ ~ SES (g2 = IQ standardized to mean = 0 and SD = 1)
lr <- lm(data=new_data, ses ~ g2)
summary(lr)
Call:
lm(formula = ses ~ g2, data = new_data)
Residuals:
Min 1Q Median 3Q Max
-2.2999 -0.5506 -0.1043 0.4225 6.5533
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.05145 0.01102 4.668 3.12e-06 ***
g2 0.55089 0.01089 50.573 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.8311 on 5683 degrees of freedom
(3299 observations deleted due to missingness)
Multiple R-squared: 0.3104, Adjusted R-squared: 0.3102
F-statistic: 2558 on 1 and 5683 DF, p-value: < 2.2e-16
#zero order parental SES ~ SES
lr <- lm(data=new_data, ses ~ pses)
summary(lr)
Call:
lm(formula = ses ~ pses, data = new_data)
Residuals:
Min 1Q Median 3Q Max
-2.9023 -0.6180 -0.1263 0.4488 6.3260
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0004859 0.0106215 0.046 0.964
pses 0.4379165 0.0108104 40.509 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.9031 on 7228 degrees of freedom
(1754 observations deleted due to missingness)
Multiple R-squared: 0.185, Adjusted R-squared: 0.1849
F-statistic: 1641 on 1 and 7228 DF, p-value: < 2.2e-16
#parental SES + IQ
lr <- lm(data=new_data, ses ~ pses + g2)
summary(lr)
Call:
lm(formula = ses ~ pses + g2, data = new_data)
Residuals:
Min 1Q Median 3Q Max
-2.1434 -0.5340 -0.0971 0.4054 6.3898
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.03589 0.01075 3.34 0.000844 ***
pses 0.22793 0.01226 18.59 < 2e-16 ***
g2 0.44642 0.01199 37.23 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.8074 on 5676 degrees of freedom
(3305 observations deleted due to missingness)
Multiple R-squared: 0.3499, Adjusted R-squared: 0.3496
F-statistic: 1527 on 2 and 5676 DF, p-value: < 2.2e-16
#parental SES + IQ + demographics (reference race is Asian)
lr <- lm(data=new_data, ses ~ g2 + race + Female + pses)
summary(lr)
Call:
lm(formula = ses ~ g2 + race + Female + pses, data = new_data)
Residuals:
Min 1Q Median 3Q Max
-2.2751 -0.5131 -0.0918 0.4045 6.1967
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.63595 0.07943 8.006 1.43e-15 ***
g2 0.42177 0.01283 32.862 < 2e-16 ***
raceBlack -0.59841 0.08277 -7.230 5.46e-13 ***
raceHispanic -0.41654 0.08334 -4.998 5.96e-07 ***
raceOther -0.52113 0.11130 -4.682 2.91e-06 ***
raceWhite -0.49289 0.07974 -6.181 6.81e-10 ***
Female -0.22103 0.02116 -10.444 < 2e-16 ***
pses 0.23175 0.01260 18.395 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.7951 on 5671 degrees of freedom
(3305 observations deleted due to missingness)
Multiple R-squared: 0.37, Adjusted R-squared: 0.3692
F-statistic: 475.8 on 7 and 5671 DF, p-value: < 2.2e-16
Causality can also be tested with kin controls – using NLSYlinks. I find roughly the same association (B = .40)
library(NlsyLinks)
nl <- NlsyLinks::Links97PairExpanded
ndreduced <- new_data %>% select('X', 'IQ', 'ses')
temp <- full_join(ndreduced, nl, by = c("X" = "SubjectID_S2"))
temp2 <- temp %>% select('X', 'IQ', 'ses', 'R', 'SubjectID_S1')
temp2$SubjectID_S2 <- temp2$X
kin <- full_join(new_data, temp2, by = c("X" = "SubjectID_S1")) %>% filter(!is.na(SubjectID_S2)) %>% select('IQ.x', 'ses.x', 'IQ.y', 'ses.y', 'R', 'SubjectID_S2', 'X')
kin$sesdiff[!is.na(kin$ses.x) & !is.na(kin$ses.y)] <- kin$ses.y[!is.na(kin$ses.x) & !is.na(kin$ses.y)] - kin$ses.x[!is.na(kin$ses.x) & !is.na(kin$ses.y)]
kin$iqdiff[!is.na(kin$IQ.x) & !is.na(kin$IQ.y)] <- kin$IQ.y[!is.na(kin$IQ.x) & !is.na(kin$IQ.y)] - kin$IQ.x[!is.na(kin$IQ.x) & !is.na(kin$IQ.y)]
lr <- lm(data=kin, normalise(sesdiff) ~ normalise(iqdiff))
summary(lr)
Call:
lm(formula = normalise(sesdiff) ~ normalise(iqdiff), data = kin)
Residuals:
Min 1Q Median 3Q Max
-4.6883 -0.5006 -0.0007 0.5050 3.8940
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.01181 0.02687 -0.439 0.66
normalise(iqdiff) 0.40229 0.02653 15.165 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.901 on 1123 degrees of freedom
(8221 observations deleted due to missingness)
Multiple R-squared: 0.17, Adjusted R-squared: 0.1692
F-statistic: 230 on 1 and 1123 DF, p-value: < 2.2e-16
Random notes:
correlation_matrix(subset(new_data, select = c(GS, AR, WK, PC, NO, CS, AI, SI, MK, MC, EI, AO)))
GS AR WK PC NO CS AI SI MK MC EI AO
GS "NA" "0.717 ***" "0.81 ***" "0.732 ***" "0.465 ***" "0.42 ***" "0.524 ***" "0.597 ***" "0.699 ***" "0.707 ***" "0.73 ***" "0.562 ***"
AR "0.717 ***" "NA" "0.717 ***" "0.741 ***" "0.606 ***" "0.504 ***" "0.445 ***" "0.488 ***" "0.803 ***" "0.682 ***" "0.637 ***" "0.631 ***"
WK "0.81 ***" "0.717 ***" "NA" "0.766 ***" "0.505 ***" "0.445 ***" "0.477 ***" "0.544 ***" "0.696 ***" "0.663 ***" "0.7 ***" "0.536 ***"
PC "0.732 ***" "0.741 ***" "0.766 ***" "NA" "0.543 ***" "0.522 ***" "0.414 ***" "0.436 ***" "0.732 ***" "0.661 ***" "0.648 ***" "0.616 ***"
NO "0.465 ***" "0.606 ***" "0.505 ***" "0.543 ***" "NA" "0.571 ***" "0.216 ***" "0.213 ***" "0.665 ***" "0.403 ***" "0.4 ***" "0.407 ***"
CS "0.42 ***" "0.504 ***" "0.445 ***" "0.522 ***" "0.571 ***" "NA" "0.191 ***" "0.206 ***" "0.554 ***" "0.397 ***" "0.361 ***" "0.462 ***"
AI "0.524 ***" "0.445 ***" "0.477 ***" "0.414 ***" "0.216 ***" "0.191 ***" "NA" "0.589 ***" "0.363 ***" "0.535 ***" "0.566 ***" "0.335 ***"
SI "0.597 ***" "0.488 ***" "0.544 ***" "0.436 ***" "0.213 ***" "0.206 ***" "0.589 ***" "NA" "0.404 ***" "0.611 ***" "0.619 ***" "0.398 ***"
MK "0.699 ***" "0.803 ***" "0.696 ***" "0.732 ***" "0.665 ***" "0.554 ***" "0.363 ***" "0.404 ***" "NA" "0.64 ***" "0.596 ***" "0.632 ***"
MC "0.707 ***" "0.682 ***" "0.663 ***" "0.661 ***" "0.403 ***" "0.397 ***" "0.535 ***" "0.611 ***" "0.64 ***" "NA" "0.683 ***" "0.632 ***"
EI "0.73 ***" "0.637 ***" "0.7 ***" "0.648 ***" "0.4 ***" "0.361 ***" "0.566 ***" "0.619 ***" "0.596 ***" "0.683 ***" "NA" "0.513 ***"
AO "0.562 ***" "0.631 ***" "0.536 ***" "0.616 ***" "0.407 ***" "0.462 ***" "0.335 ***" "0.398 ***" "0.632 ***" "0.632 ***" "0.513 ***" "NA"
GG_denhist(new_data, 'ses')
Warning: Removed 1747 rows containing non-finite outside the scale range (`stat_bin()`).Warning: Removed 1747 rows containing non-finite outside the scale range (`stat_density()`).
lr <- lm(data=new_data, ses ~ rcs(IQ, 6))
lr2 <- lm(data=new_data, ses ~ IQ)
summary(lr)
Call:
lm(formula = ses ~ rcs(IQ, 6), data = new_data)
Residuals:
Min 1Q Median 3Q Max
-2.3844 -0.5415 -0.1028 0.4213 6.5849
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.918271 0.322454 -9.050 < 2e-16 ***
rcs(IQ, 6)IQ 0.028206 0.004135 6.821 9.99e-12 ***
rcs(IQ, 6)IQ' 0.022039 0.027666 0.797 0.426
rcs(IQ, 6)IQ'' -0.089581 0.159116 -0.563 0.573
rcs(IQ, 6)IQ''' 0.262472 0.375416 0.699 0.484
rcs(IQ, 6)IQ'''' -0.399247 0.423308 -0.943 0.346
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.8301 on 5679 degrees of freedom
(3299 observations deleted due to missingness)
Multiple R-squared: 0.3126, Adjusted R-squared: 0.312
F-statistic: 516.6 on 5 and 5679 DF, p-value: < 2.2e-16
summary(lr2)
Call:
lm(formula = ses ~ IQ, data = new_data)
Residuals:
Min 1Q Median 3Q Max
-2.2999 -0.5506 -0.1043 0.4225 6.5533
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.6211743 0.0734583 -49.30 <2e-16 ***
IQ 0.0367263 0.0007262 50.57 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.8311 on 5683 degrees of freedom
(3299 observations deleted due to missingness)
Multiple R-squared: 0.3104, Adjusted R-squared: 0.3102
F-statistic: 2558 on 1 and 5683 DF, p-value: < 2.2e-16
anova(lr, lr2)
Analysis of Variance Table
Model 1: ses ~ rcs(IQ, 6)
Model 2: ses ~ IQ
Res.Df RSS Df Sum of Sq F Pr(>F)
1 5679 3912.8
2 5683 3925.7 -4 -12.906 4.6829 0.0008985 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
lr <- lm(data=new_data, weighted_occ ~ rcs(IQ, 6))
lr2 <- lm(data=new_data, weighted_occ ~ IQ)
summary(lr)
Call:
lm(formula = weighted_occ ~ rcs(IQ, 6), data = new_data)
Residuals:
Min 1Q Median 3Q Max
-2.6554 -0.5686 -0.1272 0.4037 11.6442
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.333661 0.325612 -7.167 8.47e-13 ***
rcs(IQ, 6)IQ 0.022597 0.004169 5.421 6.14e-08 ***
rcs(IQ, 6)IQ' 0.008587 0.027403 0.313 0.754
rcs(IQ, 6)IQ'' -0.012960 0.156218 -0.083 0.934
rcs(IQ, 6)IQ''' 0.023765 0.367076 0.065 0.948
rcs(IQ, 6)IQ'''' 0.035494 0.414546 0.086 0.932
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.8937 on 6867 degrees of freedom
(2111 observations deleted due to missingness)
Multiple R-squared: 0.2063, Adjusted R-squared: 0.2058
F-statistic: 357.1 on 5 and 6867 DF, p-value: < 2.2e-16
summary(lr2)
Call:
lm(formula = weighted_occ ~ IQ, data = new_data)
Residuals:
Min 1Q Median 3Q Max
-2.4591 -0.5728 -0.1365 0.3972 11.6179
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.9819070 0.0732491 -40.71 <2e-16 ***
IQ 0.0301350 0.0007238 41.63 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.8962 on 6871 degrees of freedom
(2111 observations deleted due to missingness)
Multiple R-squared: 0.2014, Adjusted R-squared: 0.2013
F-statistic: 1733 on 1 and 6871 DF, p-value: < 2.2e-16
anova(lr, lr2)
Analysis of Variance Table
Model 1: weighted_occ ~ rcs(IQ, 6)
Model 2: weighted_occ ~ IQ
Res.Df RSS Df Sum of Sq F Pr(>F)
1 6867 5484.6
2 6871 5518.4 -4 -33.817 10.585 1.504e-08 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
lr <- lm(data=new_data, weighted_mean_income ~ rcs(IQ, 6))
lr2 <- lm(data=new_data, weighted_mean_income ~ IQ)
summary(lr)
Call:
lm(formula = weighted_mean_income ~ rcs(IQ, 6), data = new_data)
Residuals:
Min 1Q Median 3Q Max
-2.3353 -0.5749 -0.1548 0.3632 7.3451
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.776596 0.345574 -8.035 1.1e-15 ***
rcs(IQ, 6)IQ 0.029233 0.004422 6.611 4.1e-11 ***
rcs(IQ, 6)IQ' -0.034379 0.028976 -1.186 0.235
rcs(IQ, 6)IQ'' 0.138421 0.164896 0.839 0.401
rcs(IQ, 6)IQ''' -0.164734 0.386764 -0.426 0.670
rcs(IQ, 6)IQ'''' 0.088057 0.436068 0.202 0.840
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.9313 on 6720 degrees of freedom
(2258 observations deleted due to missingness)
Multiple R-squared: 0.1336, Adjusted R-squared: 0.133
F-statistic: 207.2 on 5 and 6720 DF, p-value: < 2.2e-16
summary(lr2)
Call:
lm(formula = weighted_mean_income ~ IQ, data = new_data)
Residuals:
Min 1Q Median 3Q Max
-2.2069 -0.5738 -0.1542 0.3694 7.3192
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.4283238 0.0773067 -31.41 <2e-16 ***
IQ 0.0244714 0.0007626 32.09 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.9314 on 6724 degrees of freedom
(2258 observations deleted due to missingness)
Multiple R-squared: 0.1328, Adjusted R-squared: 0.1327
F-statistic: 1030 on 1 and 6724 DF, p-value: < 2.2e-16
anova(lr, lr2)
Analysis of Variance Table
Model 1: weighted_mean_income ~ rcs(IQ, 6)
Model 2: weighted_mean_income ~ IQ
Res.Df RSS Df Sum of Sq F Pr(>F)
1 6720 5828.0
2 6724 5833.4 -4 -5.3466 1.5412 0.1873
lr <- lm(data=new_data, weighted_degree ~ rcs(IQ, 6))
lr2 <- lm(data=new_data, weighted_degree ~ IQ)
summary(lr)
Call:
lm(formula = weighted_degree ~ rcs(IQ, 6), data = new_data)
Residuals:
Min 1Q Median 3Q Max
-2.90593 -0.56529 0.00559 0.57516 2.83367
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.798243 0.293098 -9.547 < 2e-16 ***
rcs(IQ, 6)IQ 0.026233 0.003756 6.984 3.15e-12 ***
rcs(IQ, 6)IQ' 0.062558 0.024935 2.509 0.0121 *
rcs(IQ, 6)IQ'' -0.307091 0.142919 -2.149 0.0317 *
rcs(IQ, 6)IQ''' 0.648306 0.337196 1.923 0.0546 .
rcs(IQ, 6)IQ'''' -0.745258 0.381496 -1.954 0.0508 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.8191 on 6825 degrees of freedom
(2153 observations deleted due to missingness)
Multiple R-squared: 0.3117, Adjusted R-squared: 0.3112
F-statistic: 618.2 on 5 and 6825 DF, p-value: < 2.2e-16
summary(lr2)
Call:
lm(formula = weighted_degree ~ IQ, data = new_data)
Residuals:
Min 1Q Median 3Q Max
-2.97677 -0.56357 -0.01213 0.58718 2.93416
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.5935964 0.0667844 -53.81 <2e-16 ***
IQ 0.0366114 0.0006607 55.41 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.8197 on 6829 degrees of freedom
(2153 observations deleted due to missingness)
Multiple R-squared: 0.3102, Adjusted R-squared: 0.3101
F-statistic: 3070 on 1 and 6829 DF, p-value: < 2.2e-16
anova(lr, lr2)
Analysis of Variance Table
Model 1: weighted_degree ~ rcs(IQ, 6)
Model 2: weighted_degree ~ IQ
Res.Df RSS Df Sum of Sq F Pr(>F)
1 6825 4578.6
2 6829 4588.8 -4 -10.249 3.8194 0.004187 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
lr <- lm(data=new_data, weighted_nw ~ rcs(IQ, 6))
lr2 <- lm(data=new_data, weighted_nw ~ IQ)
summary(lr)
Call:
lm(formula = weighted_nw ~ rcs(IQ, 6), data = new_data)
Residuals:
Min 1Q Median 3Q Max
-1.3100 -0.5217 -0.2644 0.2234 21.6795
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.044175 0.366523 -2.849 0.0044 **
rcs(IQ, 6)IQ 0.008985 0.004709 1.908 0.0564 .
rcs(IQ, 6)IQ' 0.018696 0.031903 0.586 0.5579
rcs(IQ, 6)IQ'' 0.014569 0.184709 0.079 0.9371
rcs(IQ, 6)IQ''' -0.110611 0.438322 -0.252 0.8008
rcs(IQ, 6)IQ'''' -0.017266 0.496242 -0.035 0.9722
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.9852 on 5857 degrees of freedom
(3121 observations deleted due to missingness)
Multiple R-squared: 0.07801, Adjusted R-squared: 0.07722
F-statistic: 99.11 on 5 and 5857 DF, p-value: < 2.2e-16
summary(lr2)
Call:
lm(formula = weighted_nw ~ IQ, data = new_data)
Residuals:
Min 1Q Median 3Q Max
-1.3886 -0.5493 -0.2566 0.2162 21.6864
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.8110157 0.0850398 -21.30 <2e-16 ***
IQ 0.0184757 0.0008432 21.91 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.9861 on 5861 degrees of freedom
(3121 observations deleted due to missingness)
Multiple R-squared: 0.07571, Adjusted R-squared: 0.07555
F-statistic: 480.1 on 1 and 5861 DF, p-value: < 2.2e-16
anova(lr, lr2)
Analysis of Variance Table
Model 1: weighted_nw ~ rcs(IQ, 6)
Model 2: weighted_nw ~ IQ
Res.Df RSS Df Sum of Sq F Pr(>F)
1 5857 5684.8
2 5861 5699.0 -4 -14.189 3.6547 0.005599 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
kin %>% group_by(R) %>% summarise(rses = cor.test(ses.x, ses.y)$estimate/0.81, cor.test(ses.x, ses.y)$parameter+2)