This Code Through introduces a package that is very useful for producing advanced graphics. The lattice package provides a comprehensive system for creating high-level graphics for multivariate data sets. What distinguishes this package is how easily it produces trellis graphs. A trellis graph shows the distribution of a variable, or the relationship between variables, separately for each level of one or more other variables. In general, we can say that:
The lattice package offers a clearer display of the relationships between variables when they are conditioned on other variables.
Within this Code Through, we will work with the iris data set that is built into R.
A simple format for using lattice is:
graph_type(formula, data) where:
graph_type represents the type of graph to draw
formula specifies the variables or conditioning variables
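As a minimal sketch of this format, the calls below show the three most common formula shapes on the iris data. The object names p1, p2 and p3 are only illustrative; the functions themselves (densityplot() and xyplot()) come from the lattice package.

```r
library(lattice)

# One variable: the formula has only a right-hand side (~ x)
p1 <- densityplot(~ Sepal.Length, data = iris)

# Two variables: y ~ x
p2 <- xyplot(Sepal.Length ~ Sepal.Width, data = iris)

# Two variables, one panel per level of a conditioning variable: y ~ x | g
p3 <- xyplot(Sepal.Length ~ Sepal.Width | Species, data = iris)

# Lattice functions return "trellis" objects; printing one draws it
print(p3)
```

Storing the plot in an object before printing is optional at the console, but it is needed inside loops and functions, where nothing is drawn until print() is called.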
Figure: R Lattice (source: FAO, United Nations)
The lattice package can produce a wide array of graphs, including dot plots, kernel density plots, histograms, bar charts, box plots, scatter plots, strip plots, parallel box plots, 3D plots, and scatter plot matrices.
Graphs in Lattice Package
Example number 1
Let us consider the following question: how does the width of petals in the iris data set vary by species?
library(lattice)
histogram(~Petal.Width | Species, data = iris,
          main = "Distribution of petal width by species",
          xlab = "Width (cm)")
So, here petal width is the variable being plotted, species is the conditioning variable, and the lattice package creates one histogram for each of the three iris species in the data set. The graph shows that setosa petals are the narrowest and virginica petals are the widest, with versicolor in between.
One of the most powerful features of lattice graphs is the ability to work with conditioning variables. Conditioning variables are usually factors. For continuous variables, the lattice package provides functions, such as equal.count(), that transform the variable into a data structure called a shingle, which is a set of possibly overlapping intervals. Once a continuous variable is converted to a shingle, we can use it as a conditioning variable.
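To make the idea of a shingle concrete, here is a minimal sketch that converts Sepal.Length into one with equal.count(). The name sepal_length_shingle and the settings number = 3 and overlap = 0 are illustrative assumptions, not values taken from this Code Through.

```r
library(lattice)

# Split Sepal.Length into 3 intervals that hold roughly equal numbers of flowers.
# number = 3 and overlap = 0 are illustrative choices.
sepal_length_shingle <- equal.count(iris$Sepal.Length, number = 3, overlap = 0)

# A shingle keeps the original values together with the interval boundaries
class(sepal_length_shingle)
levels(sepal_length_shingle)
```

Printing the levels shows the lower and upper bound of each interval, which is the range that the strip of the corresponding panel later highlights.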
Example number 2
What does the relationship between sepal width and petal width look like when conditioned on sepal length? Here we plot sepal width against petal width, with the panels conditioned on Sepal.Length, so that we can easily compare the groups. Notice the strip at the top of each panel: the darker color indicates the range of values of the conditioning variable covered by that panel.
xyplot(Sepal.Width ~ Petal.Width | Sepal.Length, data = iris,
       main = "Sepal Width vs. Petal Width",
       xlab = "Petal width", ylab = "Sepal width",
       layout = c(3, 1), aspect = 1.5)
Here we have incorporated a third variable into the analysis by producing a plot with separate panels for each of several subgroups of the observations, as determined by that variable.
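When a numeric variable such as Sepal.Length is used directly after the | symbol, lattice builds the panels from its distinct values, which can produce many panels. The sketch below is one possible variant, not the original author's code: it first bins Sepal.Length into three intervals with equal.count() so that the layout of three panels holds the whole plot. The name sepal_length_bins and the values number = 3 and overlap = 0.1 are assumptions chosen for illustration.

```r
library(lattice)

# Hypothetical binning: three slightly overlapping intervals of Sepal.Length
sepal_length_bins <- equal.count(iris$Sepal.Length, number = 3, overlap = 0.1)

# One panel per interval; the strip above each panel shades the interval's range
p <- xyplot(Sepal.Width ~ Petal.Width | sepal_length_bins, data = iris,
            main = "Sepal width vs. petal width by sepal length interval",
            xlab = "Petal width", ylab = "Sepal width",
            layout = c(3, 1), aspect = 1.5)
print(p)
```

With overlapping intervals, a flower whose sepal length falls in the overlap appears in two neighboring panels, which is expected behavior for shingles.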
Example number 3
Suppose we want to graph the relationship between petal width and sepal width (considered as a continuous variable), conditioned on the type of flower. We do the following:
xyplot(Petal.Width ~ Sepal.Width | Species, data = iris,
       scales = list(cex = .8, col = "red"),
       xlab = "Sepal width", ylab = "Petal width",
       main = "Sepal width vs. petal width by species")
The first thing to note from the above graph is that the x-axis range is the same in all panels, which makes the panels easier to compare.
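The shared axis range is lattice's default behavior. As a hedged illustration of the alternative, the sketch below uses the relation component of the scales argument to let each panel pick its own x-axis limits. The object names p_same and p_free are only illustrative.

```r
library(lattice)

# Default: relation = "same", so every panel shares the same axis limits
p_same <- xyplot(Petal.Width ~ Sepal.Width | Species, data = iris)

# Variant: each panel chooses its own x-axis limits
p_free <- xyplot(Petal.Width ~ Sepal.Width | Species, data = iris,
                 scales = list(x = list(relation = "free")))
print(p_free)
```

Free scales show more detail inside each panel, but positions can no longer be compared across panels by eye, which is why the shared default suits comparisons between species.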
Example number 4
Let's say that we want to display the distribution of sepal length for the virginica species using a kernel density plot. We can do this as follows:
# Keep only the virginica flowers; the built-in iris data frame is left unchanged
virginica <- subset(iris, Species == "virginica")
densityplot(~Sepal.Length, data = virginica,
            main = "Sepal length for the virginica species",
            xlab = "Sepal length")
The same graph follows with some color modifications. Note that lattice plots are trellis objects that can be printed with different themes and colors.
line_color <- "blue"   # color of the density curve and the data points
line_type <- 1         # 1 = solid line
point_symbol <- 16     # 16 = filled circle
# Keep only the virginica flowers; the built-in iris data frame is left unchanged
virginica <- subset(iris, Species == "virginica")
densityplot(~Sepal.Length, data = virginica,
            main = "Sepal length for the virginica species",
            xlab = "Sepal length",
            pch = point_symbol, lty = line_type, col = line_color,
            lwd = 2)
The lattice package can also create 3D scatter plots for comparisons.
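The code for the 3D scatter plot is not shown in this Code Through, so the following is only a sketch of how such a plot could be drawn with lattice's cloud() function. The choice of variables, the grouping by Species, and the labels are assumptions made for illustration.

```r
library(lattice)

# 3D scatter plot: the formula has the form z ~ x * y
# groups = Species colors the points by species (an illustrative choice)
p <- cloud(Sepal.Length ~ Sepal.Width * Petal.Width, data = iris,
           groups = Species,
           auto.key = list(columns = 3),
           main = "Sepal length vs. sepal width and petal width",
           xlab = "Sepal width", ylab = "Petal width", zlab = "Sepal length")
print(p)
```

The auto.key argument adds a legend that maps each color to a species, which helps because a 3D view has no panels to separate the groups.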
The comparative box-and-whisker plot allows comparison between an arbitrary number of samples, and the plots can be displayed in a slightly different layout to emphasize a more subtle effect in the data.
Here we can notice that the median sepal length does not uniformly increase from left to right, as one might expect. Note that the decreasing lengths of the boxes and whiskers suggest decreasing variance, and the large number of outliers on one side indicates heavier left tails.
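The figure that this paragraph describes is not included here, so the variables it used are unknown. As a minimal sketch of a comparative box-and-whisker plot in lattice, the code below uses bwplot() with Species as the grouping variable. That grouping is an assumption, so the pattern it shows may differ from the one described above.

```r
library(lattice)

# One box per species; Species as the grouping variable is an assumption
p <- bwplot(Sepal.Length ~ Species, data = iris,
            main = "Sepal length by species",
            xlab = "Species", ylab = "Sepal length (cm)")
print(p)

# Numeric check of the medians marked inside the boxes
medians <- tapply(iris$Sepal.Length, iris$Species, median)
print(medians)
```

Comparing the printed medians with the plot is a quick way to confirm that the boxes are being read correctly.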
The lattice package can also produce the grid-type graph below, which avoids overplotting by drawing a smoothed density of the points instead of each individual point. This kind of grid provides an easy way to analyze and compare data sets.
xyplot(Sepal.Length ~ Sepal.Width | Petal.Width, iris,
       grid = TRUE,
       scales = list(x = list(log = 10, equispaced.log = FALSE)),
       panel = panel.smoothScatter)
## (loaded the KernSmooth namespace)
Click Here For additional information and details about other visualization packages in R.