DataM: In-class Exercise 0413 - Trellis 1
Render the R script that replicates the figures in Chapter 4 of Lattice: Multivariate Data Visualization with R (Sarkar, D. 2008) to an HTML document, with comments at each code chunk indicated by '##'.
Chunk 1
Display the data set VADeaths.
      Rural Male Rural Female Urban Male Urban Female
50-54       11.7          8.7       15.4          8.4
55-59       18.1         11.7       24.3         13.6
60-64       26.9         20.3       37.0         19.3
65-69       41.0         30.9       54.6         35.1
70-74       66.0         54.3       71.1         50.0
Chunk 3
List the methods available for the generic function dotplot in the lattice package.
[1] dotplot.array* dotplot.default* dotplot.formula* dotplot.matrix*
[5] dotplot.numeric* dotplot.table*
see '?methods' for accessing help and source code
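Which of the listed methods is used depends on the class of the data passed to dotplot. A minimal sketch of calls that would produce such a listing, assuming only that the lattice package is installed (the variable name m is introduced here for illustration):

```r
library(lattice)

# VADeaths ships with base R (package datasets) and is stored as a matrix,
# so a call such as dotplot(VADeaths, ...) is handled by the matrix method.
is.matrix(VADeaths)

# List the methods registered for the generic function dotplot
m <- methods("dotplot")
m
```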
Chunk 4
Use dotplot to draw four dot plots, one for each of the four columns (groups) in VADeaths. Setting groups=FALSE places each column in its own panel instead of overlaying the columns in a single panel.
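The call for this chunk is not shown in the rendered text. A minimal sketch of what it might look like, assuming the lattice package is installed (the object name p is introduced here for illustration):

```r
library(lattice)

# groups = FALSE: each column of the matrix gets its own panel
p <- dotplot(VADeaths, groups = FALSE)

# lattice plots are objects; they are drawn when printed
print(p)
```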
Chunk 5
Draw four dot plots, one for each of the four columns (groups) in VADeaths.
- Do not group the data in the plots.
- Arrange the panels in one column and four rows.
- Set the physical aspect ratio of the panels.
- Set the origin (the baseline from which the lines are drawn) to 0.
- Set the plot types to “points” and “horizontal lines”.
- Add a main title and an x-axis label.
dotplot(VADeaths, groups=FALSE,                 #1
        layout=c(1, 4),                         #2
        aspect=0.7,                             #3
        origin=0,                               #4
        type=c("p", "h"),                       #5
        main="Death Rates in Virginia - 1940",  #6
        xlab="Rate (per 1000)")                 #6
Chunk 6
Draw a single dot plot using the four columns (groups) in VADeaths.
- Keep the default groups=TRUE so that all groups are shown in a single panel.
- Set the plot type to "o" (points overplotted with connecting lines).
- Place the legend of group names to the right of the plot, showing both points and lines.
- Add a main title and an x-axis label.
dotplot(VADeaths, type="o",                        #2
        auto.key=list(lines=TRUE, space="right"),  #3
        main="Death Rates in Virginia - 1940",     #4
        xlab="Rate (per 1000)")                    #4
Chunk 7
Draw four bar charts, one for each of the four columns (groups) in VADeaths.
- Do not group the data in the plots.
- Arrange the panels in one column and four rows.
- Set the physical aspect ratio of the panels.
- Do not show the reference lines.
- Add a main title and an x-axis label.
barchart(VADeaths, groups=FALSE,                 #1
         layout=c(1, 4),                         #2
         aspect=0.7,                             #3
         reference=FALSE,                        #4
         main="Death Rates in Virginia - 1940",  #5
         xlab="Rate (per 1000)")                 #5
Chunk 9
Draw a bar chart using postdoc, a table of dimension 8x5.
- Compute the proportions of each group along the rows (margin=1).
- Label the x-axis.
- Display a legend for the groups (the columns of the table), with the text justification set by adj=1.
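The call for this chunk is not shown in the rendered text. The steps listed above can be sketched as follows. This is a sketch under assumptions: it takes postdoc to be the data set distributed with the latticeExtra package (which must be installed), and the names props and p are introduced here for illustration.

```r
library(lattice)

# Assumption: postdoc is the data set shipped with latticeExtra
data(postdoc, package = "latticeExtra")

# Row-wise proportions: after this step each row of the table sums to 1
props <- prop.table(postdoc, margin = 1)

p <- barchart(props,
              xlab = "Proportion",       # x-axis label
              auto.key = list(adj = 1))  # legend for the table's columns
print(p)
```

Because the proportions are computed along the rows, the stacked bar for each row has the same total length, which makes the composition of the rows directly comparable.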
Chunk 10
Draw five dot plots using postdoc, a table of dimension 8x5.
- Compute the proportions of each group along the rows (margin=1).
- Do not group the data in the plots.
- Label the x-axis.
- Abbreviate the strip text, with a minimum length of 10 characters (minlength=10).
dotplot(prop.table(postdoc, margin=1),                         #1
        groups=FALSE,                                          #2
        xlab="Proportion",                                     #3
        par.strip.text=list(abbreviate=TRUE, minlength=10))    #4
Chunk 11
Draw five dot plots using postdoc, a table of dimension 8x5.
- Compute the proportions of each group along the rows (margin=1).
- Do not group the data in the plots.
- Specify the display order of the panels with index.cond, here based on the median of x (the proportion).
- Label the x-axis.
- Arrange the panels in one column and five rows.
- Set the physical aspect ratio of the panels.
- Give each panel its own y-axis scale (relation="free") and keep the y-axis labels horizontal (rot=0).
- Within each panel, reorder the levels of y by x, both in the prepanel function (for the axis limits) and in the panel function (for the points).
dotplot(prop.table(postdoc, margin=1),                   #1
        groups=FALSE,                                    #2
        index.cond=function(x, y) median(x),             #3
        xlab="Proportion",                               #4
        layout=c(1, 5),                                  #5
        aspect=0.6,                                      #6
        scales=list(y=list(relation="free", rot=0)),     #7
        prepanel=function(x, y) {                        #8
          list(ylim=levels(reorder(y, x)))
        },
        panel=function(x, y, ...) {                      #8
          panel.dotplot(x, reorder(y, x), ...)
        })
Chunk 13
Create a contingency table of gcsescore by gender from the Chem97 data, and name it gcsescore.tab.
Chunk 14
Turn gcsescore.tab into a data frame and name it gcsescore.df.
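The code for Chunks 13 and 14 is not shown in the rendered text. A minimal sketch of the two steps follows. It assumes that Chem97 is the data set distributed with the mlmRev package (which must be installed) and that xtabs is used for the cross-tabulation; table() would serve equally well.

```r
# Assumption: Chem97 is the data set shipped with mlmRev
data(Chem97, package = "mlmRev")

# Cross-tabulate: one row per distinct gcsescore value, one column per gender
gcsescore.tab <- xtabs(~ gcsescore + gender, data = Chem97)

# Long format: one row per (gcsescore, gender) cell, with the count in Freq
gcsescore.df <- as.data.frame(gcsescore.tab)
str(gcsescore.df)
```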
Chunk 15
Calling class(gcsescore.df$gcsescore) shows that gcsescore, a variable in gcsescore.df, is a factor. A factor stores its levels internally as consecutive integer codes (1, 2, …), so converting it directly to numeric returns those codes rather than the original scores. Therefore, we first convert gcsescore.df$gcsescore to character and then convert the result to numeric.
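The difference between the two conversions can be seen on a small factor. The values below are hypothetical and chosen only for illustration; the names f, codes and values are not part of the exercise.

```r
# A toy factor whose labels look like numbers (hypothetical values)
f <- factor(c("6.5", "5.25", "7", "5.25"))

# Direct conversion returns the internal integer codes, not the labels
codes <- as.numeric(f)
codes   # 2 1 3 1

# Converting to character first recovers the numbers the labels represent
values <- as.numeric(as.character(f))
values  # 6.50 5.25 7.00 5.25

# Applied to the exercise's data frame, the same idea would read:
# gcsescore.df$gcsescore <- as.numeric(as.character(gcsescore.df$gcsescore))
```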
Chunk 16
Draw two panels showing the distribution of gcsescore, one per gender, using vertical lines (type="h"). Use layout=c(1, 2) to arrange the panels in one column and two rows.
xyplot(Freq ~ gcsescore | gender,
       data=gcsescore.df,
       type="h",
       layout=c(1, 2),
       xlab="Average GCSE Score")
Chunk 17
Create a contingency table of score by gender from the Chem97 data, and name it score.tab.
Chunk 18
Turn score.tab into a data frame and name it score.df.
Chunk 19
Draw a bar chart to present the frequency of each score, with one panel per gender. Set the origin of the bars to zero.
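The code for Chunks 17 to 19 is not shown in the rendered text. A minimal sketch of the three steps follows. It assumes that Chem97 is the data set distributed with the mlmRev package (which must be installed); the formula Freq ~ score | gender, which gives one panel per gender, and the object name p are assumptions made for illustration.

```r
library(lattice)

# Assumption: Chem97 is the data set shipped with mlmRev
data(Chem97, package = "mlmRev")

# Chunk 17: contingency table of score by gender
score.tab <- xtabs(~ score + gender, data = Chem97)

# Chunk 18: long-format data frame with the counts in column Freq
score.df <- as.data.frame(score.tab)

# Chunk 19: bars start at zero (origin = 0), one panel per gender
p <- barchart(Freq ~ score | gender, data = score.df, origin = 0)
print(p)
```

Here score can stay a factor after as.data.frame, because a bar chart treats the scores as categories and no numeric conversion is needed.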