homework4

An Exploratory Analysis of Social Stratification and Environmental Awareness

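We use multiple correspondence analysis (MCA) to examine whether environmental behavior differs across social strata defined by gender, education level, party preference, and personal and household income. To that end, we selected variables on how often respondents use disposable tableware, bottled water, and handkerchiefs, and on whether they are willing to pay higher prices and taxes for the sake of the environment. We use these variables to explore whether there are latent associations among them.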

Running Code

Data loading and cleaning

sjPlot::set_theme(theme.font="PingFang TC")
library(dplyr)


Attaching package: 'dplyr'

The following objects are masked from 'package:stats':

    filter, lag

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

load("tscs202r.rda")
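tscs202rforMCA <- select(homework, c(
  # core variables
  a1, a13, h16, h18, h19,  # gender, education, party preference, personal income, household income
  a16, a17, a18, a19,      # environmental behaviors
  b11a, b11b               # willingness to pay higher prices / taxes
))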

##Data cleaning and descriptive plots

library(sjlabelled)


Attaching package: 'sjlabelled'

The following object is masked from 'package:dplyr':

    as_label

tscs202 <- na.omit(tscs202rforMCA)
names(tscs202)

 [1] "a1"   "a13"  "h16"  "h18"  "h19"  "a16"  "a17"  "a18"  "a19"  "b11a"
[11] "b11b"

par(mfrow=c(2,3))
for (i in 1:ncol(tscs202)) {
  plot(tscs202[,i], main=colnames(tscs202)[i],
       ylab = "Count", col="steelblue", las = 2, ylim=c(0,1500))}

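Cleaning and recoding the data file yields 11 descriptive plots, one per variable. The variables are coded as follows.

a1: gender. 1 = male, 2 = female.

a13: education level. 1 = elementary school or below, 2 = junior or senior high school, 3 = college or university and above, 4 = other.

h16: party preference. 1 = pan-Blue, 2 = pan-Green, 3 = neutral, 4 = other.

h18: personal monthly income. 1 = under NT$50,000, 2 = NT$50,000 to 100,000, 3 = NT$100,000 to 200,000, 4 = NT$200,000 and above.

h19: household monthly income. 1 = under NT$50,000, 2 = NT$50,000 to 100,000, 3 = NT$100,000 to 200,000, 4 = NT$200,000 to 1,000,000, 5 = NT$1,000,000 and above.

a16: When shopping, how often do you buy or ask for a shopping bag? 1 = often, 2 = sometimes, 3 = never.

a17: When eating out, how often do you use the disposable tableware provided by the restaurant? 1 = often, 2 = sometimes, 3 = never.

a18: When drinking water away from home, how often do you use bottled water or paper cups? 1 = often, 2 = sometimes, 3 = never.

a19: When eating, how often do you use a handkerchief in place of tissues or wet wipes? 1 = often, 2 = sometimes, 3 = never.

b11a: Would you be willing to pay much higher prices to protect the environment? 1 = willing, 2 = unwilling, 3 = indifferent.

b11b: Would you be willing to pay much higher taxes to protect the environment? 1 = willing, 2 = unwilling, 3 = indifferent.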
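The coding scheme above can be attached to the data as factor labels, which makes the category names on later plots easier to read. The following is a minimal sketch in base R, assuming the columns hold the numeric codes listed above; the toy data frame and the `codebook` list are hypothetical and cover only two of the variables.

```r
# Hypothetical toy data using the numeric codes described above
toy <- data.frame(
  a1  = c(1, 2, 2, 1),
  a17 = c(1, 3, 2, 3)
)

# Hypothetical codebook: one vector of labels per variable, in code order
codebook <- list(
  a1  = c("Male", "Female"),
  a17 = c("Often", "Sometimes", "Never")
)

# Convert each coded column to a labelled factor
labelled <- toy
for (v in names(codebook)) {
  labelled[[v]] <- factor(toy[[v]],
                          levels = seq_along(codebook[[v]]),
                          labels = codebook[[v]])
}

# Frequency table with readable category names
table(labelled$a17)
```

Keeping the labels in one list means the same codebook can be reused when interpreting category names such as a17_3 later in the analysis.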

##Scree plot of dimensions

library(FactoMineR)
library(factoextra)

Loading required package: ggplot2


Attaching package: 'ggplot2'

The following object is masked from 'package:sjlabelled':

    as_label

Welcome! Want to learn more? See two factoextra-related books at https://goo.gl/ve3WBa

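res <- MCA(tscs202, ncp = 6, graph = FALSE)
# ncp: number of dimensions to keep, chosen subjectively
# quanti.sup: supplementary quantitative variables
# quali.sup: supplementary categorical variables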

fviz_screeplot(res, ncp=14)

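The scree plot suggests that the variables are mainly structured along three dimensions: after the third dimension the curve flattens out and is no longer as steep as before. We therefore judge that the dimensions from the fourth onward contribute less to the total shared variance of the variables than the first three do.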
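The percentages on a scree plot are each dimension's share of the total inertia. As a rough illustration of where those numbers come from, the sketch below runs a correspondence analysis on the indicator matrix of a small simulated data set in base R. The data and variable names are hypothetical; for the fitted model the corresponding values are stored in `res$eig`.

```r
set.seed(1)

# Hypothetical categorical data: 3 variables with 2, 3 and 3 categories
toy <- data.frame(
  x = factor(sample(1:2, 200, replace = TRUE)),
  y = factor(sample(1:3, 200, replace = TRUE)),
  z = factor(sample(1:3, 200, replace = TRUE))
)

# Indicator (dummy) matrix: one column per category
Z <- do.call(cbind, lapply(toy, function(f) model.matrix(~ f - 1)))

# Correspondence analysis of the indicator matrix
P <- Z / sum(Z)            # correspondence matrix
r <- rowSums(P)            # row masses
cc <- colSums(P)           # column masses
E <- r %o% cc              # expected values under independence
S <- (P - E) / sqrt(E)     # standardized residuals

# Eigenvalues are the squared singular values; drop the numerically zero ones
eig <- svd(S)$d^2
eig <- eig[eig > 1e-10]

# Share of total inertia per dimension, as shown on a scree plot
pct <- 100 * eig / sum(eig)
round(cbind(eigenvalue = eig, percent = pct, cumulative = cumsum(pct)), 2)
```

With J categories spread over Q variables, the total inertia equals J/Q - 1 and there are at most J - Q non-trivial dimensions, which is why the percentage explained by any single MCA dimension tends to look small.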

##Summary of dimensions

library(corrplot)

corrplot 0.92 loaded

corrplot(res$var$cos2, is.corr=FALSE, tl.cex=.6)

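From this plot, the variables that matter most for each dimension can be summarized as follows. Dimension 1: a13, h18, h19, b11a, b11b. Dimension 2: a1, a17, a18. Dimension 3: a17, a18, b11a, b11b.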
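Reading the largest circles off a cos2 plot can also be done numerically. The sketch below ranks categories by cos2 within each dimension, using a small matrix of made-up values; in the actual analysis the matrix would be `res$var$cos2`, whose rows are categories rather than whole variables.

```r
# Hypothetical cos2 values (rows = categories, columns = dimensions)
cos2 <- matrix(
  c(0.40, 0.05, 0.02,
    0.35, 0.10, 0.01,
    0.03, 0.30, 0.20,
    0.02, 0.25, 0.28,
    0.10, 0.02, 0.33),
  ncol = 3, byrow = TRUE,
  dimnames = list(c("a13_3", "h18_1", "a17_3", "a18_3", "b11a_1"),
                  c("Dim 1", "Dim 2", "Dim 3"))
)

# Keep the top_n categories with the highest cos2 on each dimension
top_n <- 2
top_categories <- lapply(colnames(cos2), function(d) {
  names(sort(cos2[, d], decreasing = TRUE))[seq_len(top_n)]
})
names(top_categories) <- colnames(cos2)

top_categories
```

A category can rank highly on more than one dimension, as a18_3 does in this toy matrix, which matches the overlap between the second and third dimensions noted above.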

##Variable association map

plot(res, axes=c(1, 2), new.plot=TRUE, choix="var", 
     col.var="red", col.quali.sup="darkgreen", 
     label=c("quali.sup", "quanti.sup", "var"), 
     invisible=c("ind"), 
     autoLab = "yes",
     #  title="The Distribution of Variables on the MCA Factor Map",
     title="", cex=0.8,
     xlim=c(0,0.6), ylim=c(0, 0.6))

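This plot shows that the most important variable on the first dimension is party preference (h16), while the most important variable on the second dimension is "When eating, how often do you use a handkerchief in place of tissues or wet wipes?" (a19).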
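On this kind of variable map, each variable's position on a dimension is its squared correlation ratio (eta squared) with that dimension, that is, the share of the variance in respondents' scores explained by the variable's categories. The sketch below computes that quantity by hand in base R. The helper `eta2()`, the scores, and the two grouping variables are all hypothetical; FactoMineR reports the fitted values in `res$var$eta2`.

```r
# Squared correlation ratio between a numeric score and a categorical variable
eta2 <- function(score, group) {
  group <- factor(group)
  grand_mean  <- mean(score)
  group_means <- tapply(score, group, mean)
  group_sizes <- tapply(score, group, length)
  between <- sum(group_sizes * (group_means - grand_mean)^2)  # between-group sum of squares
  total   <- sum((score - grand_mean)^2)                      # total sum of squares
  between / total
}

# Hypothetical scores on one dimension and two categorical variables
score  <- c(-1.2, -0.8, -1.0, 0.9, 1.1, 1.0)
party  <- c(1, 1, 1, 2, 2, 2)  # separates low and high scores well
gender <- c(1, 2, 1, 2, 1, 2)  # only weakly related to the scores

eta2(score, party)
eta2(score, gender)
```

A value close to 1 means the variable's categories are well separated along that dimension, and a value close to 0 means the dimension says little about the variable.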

###Variable category map

plot(res, axes=c(1, 2), new.plot=TRUE, 
     col.var="red", col.ind="black", col.ind.sup="black",
     col.quali.sup="darkgreen", col.quanti.sup="blue",
     label=c("var"), cex=0.8, 
     selectMod = "cos2 70",
     invisible=c("ind", "quali.sup"), 
     autoLab = "yes",
     #title="Distribution of Elements on the MCA Factor Map") 
     title="")

Warning: ggrepel: 13 unlabeled data points (too many overlaps). Consider
increasing max.overlaps

plot(res, axes=c(1, 2), new.plot=TRUE, 
     col.var="red", col.ind="black", col.ind.sup="black",
     col.quali.sup="darkgreen", col.quanti.sup="blue",
     label=c("var"), cex=0.8, 
     selectMod = "cos2 70",
     invisible=c("ind", "quali.sup"), 
     xlim = c(-1,1), ylim = c(-0.5,0.5),
     autoLab = "yes",
     title="")

Warning: Removed 13 rows containing missing values or values outside the scale range
(`geom_point()`).

Warning: Removed 13 rows containing missing values or values outside the scale range
(`geom_text_repel()`).

Variable categories that best discriminate between dimensions

plot(res, axes=c(1, 2), new.plot=TRUE, 
     col.var="red", col.ind="black", col.ind.sup="black",
     col.quali.sup="darkgreen", col.quanti.sup="blue",
     label=c("var"), cex=0.8, 
     selectMod = "cos2 30",  # 52 variables in total
     invisible=c("ind", "quali.sup"), 
     xlim=c(-2,2), ylim=c(-2,2), 
     autoLab = "yes",
     # title="Top 30 Critical Elements on the MCA Factor Map") 
     title="")

Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_point()`).

Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_text_repel()`).

Warning: ggrepel: 2 unlabeled data points (too many overlaps). Consider
increasing max.overlaps

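Quadrant I: the categories in the first quadrant describe a group with a high level of education (a13_3) and high personal and household income (h18_3, h18_4, h19_4, h19_5) that is politically neutral (h16_3). Surprisingly, this group is willing to pay higher prices (b11a_1) and higher taxes (b11b_1) for the environment.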

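Quadrant II: the categories in the second quadrant describe a group with a low level of education (a13_1) and low household income (h19_1). They never buy shopping bags (a16_3), do not use disposable tableware (a17_3), do not use bottled water (a18_3), and often use a handkerchief in place of tissues (a19_1).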

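Quadrant III: the categories in the third quadrant describe a group with a medium level of education (a13_2) but low personal income (h18_1). They are unwilling to pay higher prices (b11a_2) or higher taxes (b11b_3) for the environment.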

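Quadrant IV: the categories in the fourth quadrant describe a group with medium personal income (h18_2) and medium household income (h19_2, h19_3). The plot shows that they use bottled water relatively often.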
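The quadrant descriptions above were read off the plot. The same grouping can be derived from the category coordinates, which helps when labels overlap or fall outside the plotted range. The sketch below is a minimal base R example with made-up coordinates for four categories; the helper `quadrant()` is hypothetical, and in the actual analysis the coordinates would come from `res$var$coord`.

```r
# Hypothetical category coordinates on the first two dimensions
coords <- matrix(
  c( 0.8,  0.6,
    -0.7,  0.9,
    -0.5, -0.4,
     0.6, -0.3),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("a13_3", "a13_1", "h18_1", "h18_2"),
                  c("Dim 1", "Dim 2"))
)

# Assign each point to a quadrant (points on an axis go to the
# quadrant on the non-negative side)
quadrant <- function(x, y) {
  ifelse(x >= 0 & y >= 0, "I",
  ifelse(x <  0 & y >= 0, "II",
  ifelse(x <  0 & y <  0, "III", "IV")))
}

q <- quadrant(coords[, "Dim 1"], coords[, "Dim 2"])

# List the categories that fall in each quadrant
split(rownames(coords), q)
```

Categories near the origin are poorly represented on these two dimensions, so it can be worth filtering by cos2 before grouping by quadrant.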

###Distribution of respondents on dimensions 1 and 2

plot(res, axes=c(1, 2), new.plot=TRUE, choix="ind", 
     col.var="red", col.quali.sup="darkgreen",
     label=c("var"),
     selectMod ="cos2 15", select="cos2 1",
     xlim=c(-2,2),
     invisible=c("quali.sup", "var"), 
     #title="The Distribution of Individuals on the MCA Factor Map")
     title="")