1. What is the relationship between crime rate and level of education? A dataset provides demographic information on the 50 states of the United States of America. Reproduce the graph shown below, but report the correlation coefficients in the upper panels.
sta <- read.table("data/state77.txt", header = TRUE)
head(sta)
State Population Income Illiteracy LifeExp Murder HSGrad
1 Alabama 3615 3624 2.1 69.05 15.1 41.3
2 Alaska 365 6315 1.5 69.31 11.3 66.7
3 Arizona 2212 4530 1.8 70.55 7.8 58.1
4 Arkansas 2110 3378 1.9 70.66 10.1 39.9
5 California 21198 5114 1.1 71.71 10.3 62.6
6 Colorado 2541 4884 0.7 72.06 6.8 63.9
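# Upper-panel function: print the correlation, with text size scaled by |r|
panel.cor <- function(x, y, digits = 2, prefix = "", cex.cor, ...) {
  # Switch to unit coordinates for this panel and restore the old ones on exit
  old <- par(usr = c(0, 1, 0, 1)); on.exit(par(old))
  # Keep the sign so that negative associations are reported as negative
  r <- cor(x, y, use = "pairwise.complete.obs")
  txt <- format(c(r, 0.123456789), digits = digits)[1]
  txt <- paste0(prefix, txt)
  if (missing(cex.cor)) cex.cor <- 0.8 / strwidth(txt)
  text(0.5, 0.5, txt, cex = cex.cor * abs(r))
}
pairs(~ Murder + Illiteracy + HSGrad, data = sta,
      upper.panel = panel.cor, lower.panel = panel.smooth)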
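As a numerical cross-check on the upper panels, the signed correlations can also be printed directly. This is a minimal sketch that assumes data/state77.txt holds the same figures as R's built-in state.x77 matrix (the rows printed by head(sta) above match it); the names sta_builtin and cor_mat are introduced here for illustration only.

```r
# Assumption: the built-in state.x77 matrix stands in for data/state77.txt.
sta_builtin <- as.data.frame(state.x77)

# Rename the columns that contain spaces so they match the file's headers.
names(sta_builtin)[names(sta_builtin) == "Life Exp"] <- "LifeExp"
names(sta_builtin)[names(sta_builtin) == "HS Grad"] <- "HSGrad"

# Signed Pearson correlations among the three plotted variables.
cor_mat <- cor(sta_builtin[, c("Murder", "Illiteracy", "HSGrad")])
print(round(cor_mat, 2))
```

The sign is worth keeping in view: a panel that prints only the magnitude cannot distinguish a positive association from a negative one.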
2. Deaths per 100,000 from male suicides for 5 age groups and 15 countries are given in the table below. Construct side-by-side box plots for the data from the different age groups and comment briefly.
sui <- read.table("data/de.txt", header = TRUE)
head(sui)
Country from25to34 from35to44 from45to54 from55to64 from65to74
1 Canada 22 27 31 34 24
2 Israel 9 19 10 14 27
3 Japan 22 19 21 31 49
4 Austria 29 40 52 53 69
5 France 16 25 36 47 56
6 Germany 28 35 41 49 52
# draw side-by-side box plots
boxplot(sui[, 2:6], ylab = "Deaths per 100,000", col = "lightblue")
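To support the brief comment the question asks for, per-age-group medians and spreads can be tabulated alongside the plot. The sketch below rebuilds only the six rows printed by head(sui) above, so its numbers describe that subset and not all 15 countries; sui_head, age_medians and age_iqr are names introduced here.

```r
# Only the six rows shown by head(sui); the full file has 15 countries.
sui_head <- data.frame(
  Country    = c("Canada", "Israel", "Japan", "Austria", "France", "Germany"),
  from25to34 = c(22, 9, 22, 29, 16, 28),
  from35to44 = c(27, 19, 19, 40, 25, 35),
  from45to54 = c(31, 10, 21, 52, 36, 41),
  from55to64 = c(34, 14, 31, 53, 47, 49),
  from65to74 = c(24, 27, 49, 69, 56, 52)
)

# Median and interquartile range per age group (columns 2 to 6).
age_medians <- sapply(sui_head[, 2:6], median)
age_iqr     <- sapply(sui_head[, 2:6], IQR)
print(rbind(median = age_medians, IQR = age_iqr))
```

With the full data loaded, replacing sui_head by sui gives the summaries for all 15 countries.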
5. Researchers were interested in determining the association between family members on a measure of liberalism. They took a simple random sample of families and obtained the liberalism ratings of all members of each selected family. Use R to create the graph shown below. The vertical line is the grand mean, the short dashed line segments indicate family means, and the filled circles are observed ratings.
fa <- read.table("data/family.txt", header = TRUE)
plot(fa$liberalism,fa$family,type="n",yaxt="n",xlab = "",ylab = "")
axis(2,at = seq(1, 4, 1) )
title("Family Resemblance", xlab="Rating of liberalism",ylab="Family")
set.seed(33)
points(fa$liberalism, jitter(fa$family,1), pch = 19)
grand_mean <- mean(fa$liberalism)
abline(v=grand_mean,lty=2)
# Mean rating per family, computed in one pass instead of once per family
family_means <- tapply(fa$liberalism, fa$family, mean)
family_ids <- as.numeric(names(family_means))
# Horizontal dashed segment from the grand mean to each family mean
segments(grand_mean, family_ids, family_means, family_ids, lty = 2)
# Short vertical tick marking each family mean
segments(family_means, family_ids - 0.1, family_means, family_ids + 0.1, lty = 2)
6. Doll (1955) reported the per capita consumption of cigarettes in 11 countries in 1930 and the death rates from lung cancer for men in 1950. Plot the graph shown below.
cig <- read.table("data/cigarettes.txt", header = TRUE)
head(cig)
Country consumption death
1 Australia 480 180
2 Canada 500 150
3 Denmark 380 170
4 Finland 1100 350
5 UK 1100 460
6 Iceland 230 60
with(cig,plot(consumption,death,type="n",xlab = "Consumption",ylab = "Death rate"))
abline(lm(death~consumption,data=cig),lty=3)
with(cig,text(consumption,death,labels =Country))
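The dotted line drawn by abline() is the least-squares fit, and its coefficients can be inspected directly. This is a sketch on the six rows printed by head(cig) above, not the full 11 countries, so the estimates are illustrative only; cig_head and fit are names introduced here.

```r
# Only the six rows shown by head(cig); the full file has 11 countries.
cig_head <- data.frame(
  Country     = c("Australia", "Canada", "Denmark", "Finland", "UK", "Iceland"),
  consumption = c(480, 500, 380, 1100, 1100, 230),
  death       = c(180, 150, 170, 350, 460, 60)
)

# Same model formula as the one passed to abline() in the plot.
fit <- lm(death ~ consumption, data = cig_head)
print(coef(fit))  # intercept and slope of the fitted line

# Strength of the linear association on this subset.
print(cor(cig_head$consumption, cig_head$death))
```

Running the same two calls on cig gives the intercept and slope of the line actually drawn in the figure.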
8. Use base R graphics to create a plot like the one below by the following steps: 1. Draw two independent samples of 512 values from the standard normal distribution. 2. Mark in solid red those points (x, y) whose coordinates both lie beyond 1.96 in absolute value. 3. Draw the boundary lines.
set.seed(123)
aa <- rnorm(512)
bb <- rnorm(512)
# Flag points whose coordinates both lie beyond 1.96 in absolute value
out <- abs(aa) > 1.96 & abs(bb) > 1.96
plot(aa, bb, main = "Outliers in red",
     xlab = "Standard normal variate", ylab = "Standard normal variate",
     xlim = c(-4, 4), ylim = c(-4, 4),
     col = ifelse(out, "red", "black"), pch = ifelse(out, 20, 1))
abline(h=1.96,v=1.96,lty=2,col="gray")
abline(h=-1.96,v=-1.96,lty=2,col="gray")
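How many red points to expect follows from the normal tail probability: each coordinate exceeds 1.96 in absolute value with probability of about 0.05, so for independent samples both do with probability of about 0.0025. A minimal sketch of that arithmetic, reusing the seed and sample size from the code above; p_one, p_both, expected_red and observed_red are names introduced here.

```r
n <- 512

# Two-sided tail probability for a single standard normal coordinate.
p_one <- 2 * pnorm(-1.96)
# Independence: both coordinates are extreme with the product probability.
p_both <- p_one^2
expected_red <- n * p_both

# Count the points actually flagged for the seed used in the plot.
set.seed(123)
aa <- rnorm(n)
bb <- rnorm(n)
observed_red <- sum(abs(aa) > 1.96 & abs(bb) > 1.96)

cat("expected:", round(expected_red, 2), " observed:", observed_red, "\n")
```

With an expected count of roughly 1.3, a plot showing only a handful of red points, or none at all, is consistent with the design.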
10. Use base R graphics to create a plot like the following one.
plot.new()
plot.window(xlim = c(1, 10), ylim = c(1, 10))
# x positions shared by the four dots and the four thick bars
xs <- c(3, 4.5, 6, 7.5)
# Four large filled dots
points(xs, rep(7, 4), pch = 20, cex = 5)
# Four thick vertical bars
segments(xs, 2, xs, 5.5, lty = 1, lwd = 15)
# Thin outline: two vertical sides, then two short horizontal pieces
segments(c(1, 9.5, 1, 8), c(1, 1, 4.5, 4.5),
         c(1, 9.5, 2.5, 9.5), c(4.5, 4.5, 4.5, 4.5), lty = 1, lwd = 4)