Exercise

1. What is the relationship between crime rate and level of education? A dataset provides demographic information on the 50 states of the United States of America. Reproduce the graph shown below, but report the correlation coefficients in the upper panels.

sta <- read.table("data/state77.txt", header = TRUE)
head(sta)

       State Population Income Illiteracy LifeExp Murder HSGrad
1    Alabama       3615   3624        2.1   69.05   15.1   41.3
2     Alaska        365   6315        1.5   69.31   11.3   66.7
3    Arizona       2212   4530        1.8   70.55    7.8   58.1
4   Arkansas       2110   3378        1.9   70.66   10.1   39.9
5 California      21198   5114        1.1   71.71   10.3   62.6
6   Colorado       2541   4884        0.7   72.06    6.8   63.9

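# Upper-panel function: print the correlation coefficient, sized by its magnitude
panel.cor <- function(x, y, digits = 2, prefix = "", cex.cor, ...) {
  usr <- par("usr"); on.exit(par(usr = usr))      # restore user coordinates on exit
  par(usr = c(0, 1, 0, 1))
  r <- cor(x, y, use = "pairwise.complete.obs")   # signed coefficient
  txt <- format(c(r, 0.123456789), digits = digits)[1]
  txt <- paste0(prefix, txt)
  if (missing(cex.cor)) cex.cor <- 0.8 / strwidth(txt)
  text(0.5, 0.5, txt, cex = cex.cor * abs(r))     # text size follows |r|
}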
pairs( ~ Murder + Illiteracy + HSGrad, data=sta,
       upper.panel=panel.cor,lower.panel=panel.smooth)
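
As a numeric cross-check on the upper panels, the same coefficients can be printed as a matrix. This is a minimal sketch that assumes R's built-in `state.x77` matrix can stand in for `data/state77.txt`; its first rows appear to hold the same figures, but its education column is named `HS Grad` rather than `HSGrad`.

```r
# Assumption: the built-in state.x77 matrix stands in for data/state77.txt.
vars <- c("Murder", "Illiteracy", "HS Grad")
cor_mat <- cor(state.x77[, vars], use = "pairwise.complete.obs")
print(round(cor_mat, 2))
```

A negative entry for Murder against HS Grad, next to a positive one for Murder against Illiteracy, would suggest that states with more schooling tend to report lower murder rates.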

2. Deaths per 100,000 from male suicides for 5 age groups and 15 countries are given in the table below. Construct side-by-side box plots for the data from the different age groups and comment briefly.

sui <- read.table("data/de.txt", header = TRUE)
head(sui)

  Country from25to34 from35to44 from45to54 from55to64 from65to74
1  Canada         22         27         31         34         24
2  Israel          9         19         10         14         27
3   Japan         22         19         21         31         49
4 Austria         29         40         52         53         69
5  France         16         25         36         47         56
6 Germany         28         35         41         49         52

# draw side-by-side box plots
boxplot(sui[, 2:6], ylab = "Deaths per 100,000", col = "lightblue")
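
For the brief comment, a numeric summary per age group can back up what the box plots show. The sketch below uses only the six rows printed by `head(sui)`, retyped into a hypothetical data frame `sui_head`, so it illustrates the approach rather than summarising all 15 countries.

```r
# Only the six rows printed by head(sui) above; the full file has 15 countries.
sui_head <- data.frame(
  Country    = c("Canada", "Israel", "Japan", "Austria", "France", "Germany"),
  from25to34 = c(22, 9, 22, 29, 16, 28),
  from35to44 = c(27, 19, 19, 40, 25, 35),
  from45to54 = c(31, 10, 21, 52, 36, 41),
  from55to64 = c(34, 14, 31, 53, 47, 49),
  from65to74 = c(24, 27, 49, 69, 56, 52)
)
# Median and interquartile range per age group (columns 2 to 6)
age_median <- sapply(sui_head[, 2:6], median)
age_iqr    <- sapply(sui_head[, 2:6], IQR)
print(rbind(median = age_median, IQR = age_iqr))
```

On these six rows the medians rise steadily with age (22, 26, 33.5, 40.5, 50.5); whether the pattern holds for all 15 countries has to be read off the full data.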

5. Researchers were interested in determining the association between family members on a measure of liberalism. They took a simple random sample of families and obtained the liberalism ratings of all members of each selected family. Use R to create the graph shown below. The vertical line is the grand mean. The short dashed line segments indicate family means. The filled circles are observed ratings.

fa <- read.table("data/family.txt", header = TRUE)
plot(fa$liberalism,fa$family,type="n",yaxt="n",xlab = "",ylab = "")
axis(2,at = seq(1, 4, 1) )
title("Family Resemblance", xlab="Rating of liberalism",ylab="Family")
set.seed(33)
points(fa$liberalism, jitter(fa$family,1), pch = 19)
grand_mean <- mean(fa$liberalism)
abline(v=grand_mean,lty=2)
family1_mean <- mean(fa[fa$family==1,]$liberalism)
family2_mean <- mean(fa[fa$family==2,]$liberalism)
family3_mean <- mean(fa[fa$family==3,]$liberalism)
family4_mean <- mean(fa[fa$family==4,]$liberalism)
segments(grand_mean,1,family1_mean,1, lty=2)
segments(family1_mean,0.9,family1_mean,1.1, lty=2)
segments(grand_mean,2,family2_mean,2, lty=2)
segments(family2_mean,1.9,family2_mean,2.1, lty=2)
segments(grand_mean,3,family3_mean,3, lty=2)
segments(family3_mean,2.9,family3_mean,3.1, lty=2)
segments(grand_mean,4,family4_mean,4, lty=2)
segments(family4_mean,3.9,family4_mean,4.1, lty=2)
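
The per-family means and their line segments can also be produced without naming each family, since `tapply()` returns all group means at once and `segments()` accepts vectors. The sketch below runs on a hypothetical toy data frame `fa_toy` with made-up ratings, because the contents of `data/family.txt` are not shown here.

```r
# Hypothetical toy data standing in for data/family.txt (values are made up).
fa_toy <- data.frame(
  family     = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4),
  liberalism = c(25, 29, 34, 18, 21, 21, 21, 26, 30, 35)
)
grand_mean   <- mean(fa_toy$liberalism)
family_means <- tapply(fa_toy$liberalism, fa_toy$family, mean)
ids <- as.numeric(names(family_means))   # family labels double as y positions

plot(fa_toy$liberalism, fa_toy$family, pch = 19, yaxt = "n",
     xlab = "Rating of liberalism", ylab = "Family")
axis(2, at = ids)
abline(v = grand_mean, lty = 2)
# segments() is vectorized: one call draws every horizontal link to the
# grand mean, a second draws the short vertical tick at each family mean.
segments(grand_mean, ids, family_means, ids, lty = 2)
segments(family_means, ids - 0.1, family_means, ids + 0.1, lty = 2)
```

This version keeps working unchanged if the number of families differs from four.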

6. Doll (1955) reported per capita consumption of cigarettes in 11 countries in 1930 and the death rates from lung cancer for men in 1950. Plot the graph shown below.

cig <- read.table("data/cigarettes.txt", header = TRUE)
head(cig)

    Country consumption death
1 Australia         480   180
2    Canada         500   150
3   Denmark         380   170
4   Finland        1100   350
5        UK        1100   460
6   Iceland         230    60

with(cig,plot(consumption,death,type="n",xlab = "Consumption",ylab = "Death rate"))
abline(lm(death~consumption,data=cig),lty=3)
with(cig,text(consumption,death,labels =Country))
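
The dotted line drawn by `abline(lm(...))` is an ordinary least-squares fit, and its coefficients can be inspected directly. The sketch below refits the model on only the six rows printed by `head(cig)`, retyped into a hypothetical data frame `cig_head`, so the numbers it prints will differ from a fit on all 11 countries.

```r
# Only the six rows printed by head(cig) above; the full file has 11 countries.
cig_head <- data.frame(
  Country     = c("Australia", "Canada", "Denmark", "Finland", "UK", "Iceland"),
  consumption = c(480, 500, 380, 1100, 1100, 230),
  death       = c(180, 150, 170, 350, 460, 60)
)
fit <- lm(death ~ consumption, data = cig_head)
print(coef(fit))                                   # intercept and slope of the fitted line
print(cor(cig_head$consumption, cig_head$death))   # strength of the linear association
```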

8. Use basic R graphics to create a plot like the one below in three steps: (1) draw two independent samples of 512 values from the standard normal distribution; (2) mark the points with (x, y) > 1.96 and those with (x, y) < -1.96 in solid red; (3) draw the boundary lines.

set.seed(123)
aa <- rnorm(512)
bb <- rnorm(512)
out <- abs(aa) > 1.96 & abs(bb) > 1.96   # both coordinates beyond +/-1.96
plot(aa, bb, main = "Outliers in red",
     xlab = "Standard normal variate", ylab = "Standard normal variate",
     xlim = c(-4, 4), ylim = c(-4, 4),
     col = ifelse(out, "red", "black"), pch = ifelse(out, 20, 1))
abline(h=1.96,v=1.96,lty=2,col="gray")
abline(h=-1.96,v=-1.96,lty=2,col="gray")
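
The marking rule in step 2 can be read in two ways, and counting the flagged points makes the difference concrete. The sketch below compares the reading used in the solution (both coordinates beyond ±1.96, in any corner) with a stricter reading that is only an assumption here (both above 1.96, or both below -1.96). It also works out how many points the first reading is expected to flag: 1.96 cuts off about 5% of a standard normal in two tails, so roughly 512 × 0.05² ≈ 1.3 points.

```r
set.seed(123)
aa <- rnorm(512)
bb <- rnorm(512)
# Reading A (used in the solution): both coordinates beyond +/-1.96, any corner
any_corner <- abs(aa) > 1.96 & abs(bb) > 1.96
# Reading B (assumption): both above 1.96, or both below -1.96
same_sign <- (aa > 1.96 & bb > 1.96) | (aa < -1.96 & bb < -1.96)
# Expected count under reading A: n * P(|Z| > 1.96)^2
expected_any <- 512 * (2 * pnorm(-1.96))^2
print(c(any_corner = sum(any_corner), same_sign = sum(same_sign),
        expected_any = expected_any))
```

Reading B can never flag more points than reading A, because every point it marks also satisfies reading A.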

10. Use basic R graphics to create a plot like the following one.

plot.new()
plot.window(xlim = c(1, 10), ylim = c(1, 10))
points(3,7,pch=20,cex=5)
points(4.5,7,pch=20,cex=5)
points(6,7,pch=20,cex=5)
points(7.5,7,pch=20,cex=5)
segments(3,2,3,5.5, lty=1,lwd=15)
segments(4.5,2,4.5,5.5, lty=1,lwd=15)
segments(6,2,6,5.5, lty=1,lwd=15)
segments(7.5,2,7.5,5.5, lty=1,lwd=15)
segments(1,1,1,4.5, lty=1,lwd=4)
segments(9.5,1,9.5,4.5, lty=1,lwd=4)
segments(1,4.5,2.5,4.5, lty=1,lwd=4)
segments(8,4.5,9.5,4.5, lty=1,lwd=4)
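
Because `points()` and `segments()` recycle vector arguments, the same picture can be drawn with one call per group of shapes. This is a sketch of an equivalent, more compact form; the vector name `xs` is introduced here for illustration.

```r
xs <- c(3, 4.5, 6, 7.5)                              # shared x positions
plot.new()
plot.window(xlim = c(1, 10), ylim = c(1, 10))
points(xs, rep(7, length(xs)), pch = 20, cex = 5)    # four filled dots
segments(xs, 2, xs, 5.5, lty = 1, lwd = 15)          # four thick vertical bars
# Thin outer lines: two verticals, then two short horizontals at y = 4.5
segments(x0 = c(1, 9.5, 1, 8),   y0 = c(1, 1, 4.5, 4.5),
         x1 = c(1, 9.5, 2.5, 9.5), y1 = c(4.5, 4.5, 4.5, 4.5),
         lty = 1, lwd = 4)
```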
