What is the relationship between crime rate and level of education?
A dataset privides the demographic information of the 50 states of the United States of America. Reproduce the graph shown below but report the correlation coefficients in the upper panels.
Source: U.S. Department of Commerce, Bureau of the Census (1977). Statistical Abstract of the United States.
Reported in Becker, R.A., Chambers, J.M., & Wilks, A.R. (1988). The New S Language. Wadsworth & Brooks/Cole.
Column 1: State ID
Column 2: Population estmate as of July 1, 1975
Column 3: Income per capita in 1974
Column 4: Illiteracy, percent of population in 1970
Column 5: Life expectancy in years (1969-1971)
Column 6: Murder and non-negligent manslaughter rate per 100,000 population (1976)
Column 7: Percent high-school graduates in 1970
dirr<-"K:/Dropbox/1042_dataM/state77.txt"
dta<-read.table(dirr,header=T)
panel.cor <- function(x, y, digits = 2, prefix = "", cex.cor, ...) {
usr <- par("usr"); on.exit(par(usr))
par(usr = c(0, 1, 0, 1))
r <- abs(cor(x, y, use="pair"))
txt <- format(c(r, 0.123456789), digits = digits)[1]
txt <- paste0(prefix, txt)
if(missing(cex.cor)) cex.cor <- 0.8/strwidth(txt)
text(0.5, 0.5, txt, cex = cex.cor * r)
}
pairs(dta[,c(6,4,7)],upper.panel = panel.cor, lower.panel = panel.smooth,main="Correlation Coefficients and Relationship")
Deaths per 100,000 from male suicides for 5 age groups and 15 countries are given in the table below. Construct side-by-side box plots for the data from different age groups and comment briefly.
ANS: 此筆資料綜合15個國家來看,男性自殺人數的中位數隨著年齡增加而增加。
dta<-read.table("suicide.txt",h=T)
colnames(dta)<-c("Country","25-34","35-44","45-54","55-64","65-74")
boxplot(dta[-1],horizontal = T,main="Deaths per 100,000 and Age",xlab="Deaths per 100,000",ylab="Age")
A repeated measures design is a common experimental design in educational and behavioral studies. Typically, one has a sample of subjects (4 students, say) whose scores on a given test (a number series test, say) are repeatedly recorded at three time points (grade level I, II, and III, say). The R script illustrates how to use low level graphical commands to plot individual observations and to add a mean curve to the graph. Provide comments on what each of the script lines does.
edu <- rep(c("I","II","III"),4)
sbj <- paste("S", 1:4, sep="")
sbj <- rep(sbj, rep(3,4))
score <- c(19,15,19,18,11,15,21,16,19,21,8,13)
dta <- data.frame(sbj=sbj, edu=edu, score=score)#建構資料,內含subject,edu,還有score。
meandta <- aggregate(formula =score ~ edu, data = dta, FUN = mean)
meandta <- data.frame(sbj='mean',meandta)
dta <- rbind(dta, meandta)#建構包含edu三個level的資料
plot(dta$score ~ as.numeric(dta$edu), type="n", axes=F,
xlim=c(0, 4), ylim=c(min(dta$score)-1, max(dta$score+1)),
xlab="Education level", ylab="Test score" )
grid()
#連結每個subject的三個education level的點,以及平均的點
points(as.numeric(dta$edu), dta$score)
lines(as.numeric(dta$edu[dta$sbj=="S1"]), dta$score[dta$sbj=="S1"],col="cyan")
lines(as.numeric(dta$edu[dta$sbj=="S2"]), dta$score[dta$sbj=="S2"],col="cyan2")
lines(as.numeric(dta$edu[dta$sbj=="S3"]), dta$score[dta$sbj=="S3"],col="cyan3")
lines(as.numeric(dta$edu[dta$sbj=="S4"]), dta$score[dta$sbj=="S4"],col="cyan4")
lines(as.numeric(dta$edu[dta$sbj=="mean"]), dta$score[dta$sbj=="mean"],
col="blue", lwd=2)
#標記不同顏色的線的代表的subject
legend("bottomright", c("S1","S2","S3", "S4", "Mean"),
lty=1, lwd=c(rep(1,4),2), bty="n", cex=0.6,
col=c("cyan","cyan2", "cyan3", "cyan4", "blue"))
#標記三個不同的education level
text(1,7, "I")
text(2,7, "II")
text(3,7, "III")
#繪製座標軸
axis(2, min(dta$score):max(dta$score))
Draw the following figure using the variable ‘math’ in the hs0.txt dataset.
dir<-"http://titan.ccunix.ccu.edu.tw/~psycfs/dataM/Data/hs0.txt"
dta<-read.table(dir,h=T)
dta$ses<-factor(dta$ses,levels(dta$ses)[c(2,3,1)])
dta<-dta[order(dta$ses,dta$math),]
nlow<-table(dta$ses)[1]
nmid<-table(dta$ses)[2]
nhigh<-table(dta$ses)[3]
plot.new()
plot.window(xlim = c(1,200), ylim = c(30, 75))
points(dta$math,pch=20,col="skyblue")
points(dta$math)
lines((1:nlow),dta$math[dta$ses=="low"],col="blue",lty=2)
lines((nlow+1):(nlow+nmid),dta$math[dta$ses=="middle"],col="blue",lty=2)
lines((nlow+nmid+1):(nlow+nmid+nhigh),dta$math[dta$ses=="high"],col="blue",lty=2,lwd=2)
abline(h=mean(dta$math),col="grey")
grid()
axis(1,at=c(nlow/2,(nlow+nmid/2),(nlow+nmid+nhigh/2)),labels=c("low","middle","high"))
axis(2)
box()
title(main="Math scores by socioeconomic status",ylab="Math score",xlab="SES")
Researchers were interested in determining the association between family members on a measure of liberalism. They took a simple random sample of families and obtained the liberalism ratings for all family members of each family selected.
Use R to create the graph shown below. The vertical line is the grand mean. The short dash line segments indicate family means. The filled circles are observed ratings.
Source: Lindeman, R.H., et al. (1980). Introduction to Bivariate and Multivariate Analysis, p.88.
Column 1: Family ID
Column 2: Rating of liberalism
dir<-"http://titan.ccunix.ccu.edu.tw/~psycfs/dataM/Data/family.txt"
dta<-read.table(dir,h=T)
plot(dta$liberalism,dta$family,pch=16,main="Family Resemblance",xlab="Rating of Liberalism",ylab="Family",ylim=c(1,4))
abline(v=mean(dta$liberalism),lty=2)
segments(mean(dta[dta$family==1,]$liberalism), 1, x1 = mean(dta$liberalism), y1 = 1,lty=2)
segments(mean(dta[dta$family==2,]$liberalism), 2, x1 = mean(dta$liberalism), y1 = 2,lty=2)
segments(mean(dta[dta$family==3,]$liberalism), 3, x1 = mean(dta$liberalism), y1 = 3,lty=2)
segments(mean(dta[dta$family==4,]$liberalism), 4, x1 = mean(dta$liberalism), y1 = 4,lty=2)
Doll (1955) showed per capita consumption of cigarettes in 11 countries in 1930, and the death rates from lung cancer for men in 1950. Plot the graph shown below.
Source: Freedman, et al. (1997). Statistics. pp. 148-150.
Column 1: Country names
Column 2: Cigarettes consumption
Column 3: Death rate
dir<-"http://titan.ccunix.ccu.edu.tw/~psycfs/dataM/Data/cigarettes.txt"
dta<-read.table(dir,h=T)
plot(dta$consumption,dta$death,type="n",xlab="Consumption",ylab="Death rate",main="Death rate and Consumption for Each Country")
abline(g<-lm(death~consumption-1,data=dta),lty=2)
text(dta$consumption,dta$death,labels=dta$Country,cex=0.7)
The dataset consists of a sample of 14 primary school children between 8 and 12 years old. The children were asked to respond on 8 emotions and coping strategies scales for each of 6 situations: fail to fulfill assingments in class, not allowed to play with other children, forbidden to do something by the teacher, victim of bullying, too much school work, forbidden to do something by the mother.
Plot the data in some meaningful ways. You may have to manipulate data into a different format first.
Source: Roeder, I., Boekaerts, M., & Kroonenberg, P. M. (2002). The stress and coping questionnaire for children (School version and Asthma version): Construction, factor structure, and psychometric properties. Psychological Reports, 91, 29-36.
Column 1: Unpleasant (Annoy)
Column 2: Sad
Column 3: Afraid
Column 4: Angry
Column 5: Approach coping
Column 6: Avoidant coping
Column 7: Social support seeking
Column 8: Emotional reaction, especially agression
Column 9: Situation ID
Column 10: Children ID
dir<-"http://titan.ccunix.ccu.edu.tw/~psycfs/dataM/Data/coping.txt"
dta<-read.table(dir,h=T)
ls<-split(dta,dta$situation)
lsdta<-t(sapply(ls,function(x) apply(as.matrix(x[,1:8]),2,mean)))
plot.new()
plot.window(xlim = c(0,9), ylim = c(0.5,3.5))
box()
axis(1,at=c(1:8),labels=c(colnames(lsdta)),las = 2)
axis(2,las=1)
for (i in 1:6){
points(lsdta[i,],pch=i,col=i,cex=1.5)
lines(lsdta[i,],lty=i,col=i,lwd=1.5)
}
title(main="Mean score of emotion with situations",ylab="Emotion score mean(N=14)",xlab="Emotion")
grid()
legend("bottomright", legend = c(rownames(lsdta)), pch =c(1:6),col = c(1:6), text.col = c(1:6), bty = "n" )
Use basic R graphics to create a plot like the one below by the following steps:
1. Draw two independent samples of 512 values from the standard normal distribution.
2. Mark those points (x, y) > 1.96 and (x, y) < -1.96 in solid red.
3. Draw the boundary lines.
x<-rnorm(512)
y<-rnorm(512)
plot(x,y,xlim=c(-4,4),ylim=c(-4,4))
judge<-abs(x)>1.96&abs(y)>1.96
points(x[judge],y[judge],col="red",pch=16,xlab="Standard normal variate",ylab="Standard normal variate", main="Outliers in red")
abline(v=1.96,h=1.96,lty=2,col="grey")
abline(v=-1.96,h=-1.96,lty=2,col="grey")
Two measures of bullying behavior have been constructed. The first uses self-report on a 9-item Illinois Bully Scale. The second is peer nomination in which children list any person they perceive as a bully. The total number of nominations a person receives is a measure of bullying behavior.
Use R to reproduce the following plots.
Source: Espelage, D.L., Holt, M.K., & Henkel, R.R. (2003). Examination of Peer-Group Contextual Effects on Aggression During Early Adolescence. Child Development, 74, 205-220.
Column 1: Score on the Illinois Bully Scale
Column 2: Total number of peer nominations
dir<-"http://titan.ccunix.ccu.edu.tw/~psycfs/dataM/Data/bullying.txt"
dta<-read.table(dir,h=T)
hist(dta$nomination,freq = F,main="Histogram of Nomination",xlab="Numbers of Nomination",ylab="Density")
plot(density(dta$nomination),main="Nomination Distribution",xlab="N=291 Bandwidth=0.4319",ylab="Density")
hist(dta$score,freq = F,main="Histogram of Score",xlab="Score",ylab="Density")
plot(density(dta$score),main="Score Distribution",xlab="N=291 Bandwidth=0.4319",ylab="Density")
10
Use basic R graphics to create a plot like the following one:
plot.new()
plot.window(xlim = c(-1.5,1.25), ylim = c(-1, 1))
points(-0.5,0.5,cex=3.5,pch=16)
points(-0.25,0.5,cex=3.5,pch=16)
points(-0,0.5,cex=3.5,pch=16)
points(0.25,0.5,cex=3.5,pch=16)
rect(xleft=-0.55, ybottom=-0.1, xright=-0.45, ytop=0.35,col = "black")
rect(xleft=-0.3, ybottom=-0.1, xright=-0.2, ytop=0.35,col = "black")
rect(xleft=-0.05, ybottom=-0.1, xright=0.05, ytop=0.35,col = "black")
rect(xleft=0.2, ybottom=-0.1, xright=0.3, ytop=0.35,col = "black")
rect(xleft=-1.25, ybottom=0.2, xright=-0.65, ytop=0.25,col = "black")
rect(xleft=0.4, ybottom=0.2, xright=1, ytop=0.25,col = "black")
rect(xleft=-1.25, ybottom=-0.35,xright=-1.2, ytop=0.2,col = "black")
rect(xleft=0.95, ybottom=-0.35, xright=1, ytop=0.2,col = "black")