結婚暮らし班のデータをもとに、どのような人がモテるのかを分析する。
library(tidyverse)
## -- Attaching packages -------------------------------------- tidyverse 1.2.1 --
## √ ggplot2 3.2.1 √ purrr 0.3.2
## √ tibble 2.1.3 √ dplyr 0.8.3
## √ tidyr 1.0.0 √ stringr 1.4.0
## √ readr 1.3.1 √ forcats 0.4.0
## -- Conflicts ----------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
read_csv("20191112_Rawdata_4.csv") %>%
mutate(
kosai_ninzu = recode(
q20,`1` = 0,
`2` = 1,
`3` = 2.5,
`4` = 4.5,
`5` = 7,
.default = NA_real_),
eduyear = recode(q2,
`1` = 9,
`2` = 12,
`3` = 12,
`4` = 14,
`5` = 14,
`6` = 16,
`7` = 18,
.default = NA_real_),
SEX2 = recode_factor(SEX,`1` = "男",
`2` = "女",
.default = NA_character_),
kao_jishin = na_if(q15s1, 6),
kao_jishin = 6 - kao_jishin,
style_jishin = na_if(q15s2, 6),
style_jishin = 6 - style_jishin,
fashion_jishin = na_if(q15s3, 6),
fashion_jishin = 6 - fashion_jishin,
commu_jishin = na_if(q15s4, 6),
commu_jishin = 6 - commu_jishin,
omoiyari_jishin = na_if(q15s5, 6),
omoiyari_jishin = 6 - omoiyari_jishin
) -> data
## Parsed with column specification:
## cols(
## .default = col_double(),
## MID = col_character(),
## DAY = col_character(),
## q2t8 = col_character(),
## q3t14 = col_character(),
## q3t14_coded = col_character(),
## q3_coded = col_character(),
## q4t7 = col_character(),
## q5t13 = col_character(),
## q5t13_coded = col_character(),
## q6t8 = col_character(),
## q6t8_coded = col_character(),
## q8t13 = col_character(),
## q8t13_coded = col_character(),
## q9t8 = col_character(),
## q31t1 = col_character()
## )
## See spec(...) for full column specifications.
交際人数のヒストグラム
hist(data$kosai_ninzu)
男女別のボックスプロット
boxplot(kosai_ninzu ~ SEX2, data = data)
男女別t検定
t.test(kosai_ninzu ~ SEX2, data = data)
##
## Welch Two Sample t-test
##
## data: kosai_ninzu by SEX2
## t = -4.0161, df = 595.49, p-value = 6.674e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -1.1735986 -0.4027368
## sample estimates:
## mean in group 男 mean in group 女
## 2.646259 3.434426
女性のほうが有意にモテる。
性別ごと年齢と交際人数の関係
ggplot(data,aes(x = AGE, y = kosai_ninzu,
shape = SEX2, color = SEX2)) +
geom_point() +
geom_smooth(method = "lm")
## Warning: Removed 63 rows containing non-finite values (stat_smooth).
## Warning: Removed 63 rows containing missing values (geom_point).
summary(lm(kosai_ninzu ~ AGE + SEX2, data = data))
##
## Call:
## lm(formula = kosai_ninzu ~ AGE + SEX2, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.9595 -2.0296 -0.4503 1.6362 4.9165
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.75893 0.52988 1.432 0.152597
## AGE 0.06308 0.01709 3.690 0.000245 ***
## SEX2女 0.80371 0.19421 4.138 4e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.376 on 596 degrees of freedom
## (63 observations deleted due to missingness)
## Multiple R-squared: 0.04807, Adjusted R-squared: 0.04487
## F-statistic: 15.05 on 2 and 596 DF, p-value: 4.212e-07
年齢が一定の場合女性であることが平均交際人数を0.8人押し上げる。性別に関係なく年齢が1歳上がるごとに交際人数は0.06人増える。
性別ごとの学歴の関係
ggplot(data,aes(x = eduyear, y = kosai_ninzu,
shape = SEX2, color = SEX2)) +
geom_point() +
geom_smooth(method = "lm")
## Warning: Removed 76 rows containing non-finite values (stat_smooth).
## Warning: Removed 76 rows containing missing values (geom_point).
summary(lm(kosai_ninzu ~ eduyear + SEX2, data = data))
##
## Call:
## lm(formula = kosai_ninzu ~ eduyear + SEX2, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.5285 -2.3458 -0.7848 1.8143 4.5038
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.25248 0.63543 3.545 0.000424 ***
## eduyear 0.02708 0.04375 0.619 0.536268
## SEX2女 0.78868 0.19755 3.992 7.38e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.39 on 583 degrees of freedom
## (76 observations deleted due to missingness)
## Multiple R-squared: 0.02704, Adjusted R-squared: 0.02371
## F-statistic: 8.103 on 2 and 583 DF, p-value: 0.0003382
教育年数の効果はないようだ。
顔やスタイル、コミュニケーションに自信がある人ほどモテるだろう(あるいは逆の因果?)。まずは相関関係を確認。
data %>%
select(kao_jishin,style_jishin,fashion_jishin,commu_jishin,omoiyari_jishin) %>%
cor(., use = "pairwise")
## kao_jishin style_jishin fashion_jishin commu_jishin
## kao_jishin 1.0000000 0.6184152 0.4137306 0.4639191
## style_jishin 0.6184152 1.0000000 0.4227878 0.4141882
## fashion_jishin 0.4137306 0.4227878 1.0000000 0.5163264
## commu_jishin 0.4639191 0.4141882 0.5163264 1.0000000
## omoiyari_jishin 0.3438046 0.3229733 0.4348087 0.5066071
## omoiyari_jishin
## kao_jishin 0.3438046
## style_jishin 0.3229733
## fashion_jishin 0.4348087
## commu_jishin 0.5066071
## omoiyari_jishin 1.0000000
年齢性別教育年数という基本変数に各自信項目をぶちこんで回帰分析。
summary(lm(kosai_ninzu ~ AGE + SEX2 + eduyear +
kao_jishin + style_jishin + fashion_jishin +
commu_jishin + omoiyari_jishin, data = data))
##
## Call:
## lm(formula = kosai_ninzu ~ AGE + SEX2 + eduyear + kao_jishin +
## style_jishin + fashion_jishin + commu_jishin + omoiyari_jishin,
## data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.4491 -1.6791 -0.3033 1.6760 5.3566
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.916732 0.891013 -2.151 0.03190 *
## AGE 0.079067 0.016948 4.665 3.88e-06 ***
## SEX2女 0.539689 0.198980 2.712 0.00689 **
## eduyear -0.003870 0.042956 -0.090 0.92825
## kao_jishin 0.193151 0.109479 1.764 0.07825 .
## style_jishin -0.003388 0.102746 -0.033 0.97371
## fashion_jishin 0.284801 0.100315 2.839 0.00469 **
## commu_jishin 0.270875 0.099240 2.729 0.00655 **
## omoiyari_jishin 0.068941 0.110797 0.622 0.53405
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.246 on 543 degrees of freedom
## (110 observations deleted due to missingness)
## Multiple R-squared: 0.1494, Adjusted R-squared: 0.1368
## F-statistic: 11.92 on 8 and 543 DF, p-value: 1.015e-15
年齢が高いこと、女性であること、ファッションの自信があること、コミュニケーションの自信があることが有意に交際人数を高める。