August, 2014
library(mosaicData)
data(SwimRecords)
View(SwimRecords)
help(SwimRecords)
Research Question: How have 100m records varied over the years?
Variables involved:
Their types:
With numerical variables we have used:
densityplot()
bwplot()
histogram()
favstats()
Try this:
histogram(~time|year,data=SwimRecords)
We need a new graphical method to study the relationship between two numerical variables.
Use the xyplot() function:
xyplot(time~year,data=SwimRecords,
main="100m Swim Records",
xlab="year",
ylab="time (seconds)",
pch=19)
Basic Form:
\[xyplot(response \sim explanatory, data = \ldots)\]
Rising cloud: the taller the father, the taller the son!
Falling cloud: bigger x's go with smaller y's!
There's a relationship, but it's not linear!
See the cloud "leveling off"?
As you move to the right, cloud neither rises nor falls.
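The cloud shapes described above can be reproduced with a short sketch. The data here are simulated purely for illustration (they are not the data behind the original slides), and the names `patterns` and `shapes` are made up for this example.

```r
library(lattice)

set.seed(1)                      # simulated data, for illustration only
n <- 200
x <- runif(n, 0, 10)
noise <- rnorm(n)

shapes <- c("rising", "falling", "leveling off", "no relationship")
patterns <- data.frame(
  x = rep(x, times = 4),
  y = c(2 + 0.8 * x + noise,             # rising cloud
        10 - 0.8 * x + noise,            # falling cloud
        8 * (1 - exp(-x / 2)) + noise,   # related, but not linear
        5 + noise),                      # neither rises nor falls
  shape = factor(rep(shapes, each = n), levels = shapes)
)

# One panel per shape, read left to right, top to bottom
p <- xyplot(y ~ x | shape, data = patterns,
            pch = 19, layout = c(2, 2), as.table = TRUE,
            xlab = "x (explanatory)", ylab = "y (response)")
print(p)
```

Comparing the four panels side by side makes it easier to put a name to the pattern in a real scatter plot.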
data(stumps)
View(stumps)
help(stumps)
Research Question: Are there more larvae clusters in plots where there are more cottonwood stumps?
Which is explanatory?
Which is the response?
Make the scatter plot.
Describe the relationship between number of larvae and number of stumps.
data(fuel)
View(fuel)
help(fuel)
Research Question: How does the speed at which the Ford Escort is driven affect its fuel efficiency?
Which is explanatory? Which is the response?
Make the scatter plot.
Describe the relationship between speed and fuel efficiency.
Units for efficiency are "liters per 100 km". Does a high number for efficiency represent good or poor fuel efficiency?
Use a groups argument, with a key:
xyplot(time~year,data=SwimRecords,
main="100m Swim Records",
xlab="year",
ylab="time (seconds)",
pch=19,
groups=sex,
auto.key=TRUE)
data(TenMileRace)
View(TenMileRace)
help(TenMileRace)
Let's look at the relationship between age (explanatory) and net time (response), for both men and women.
xyplot(net~age,data=TenMileRace,
groups=sex,
auto.key=TRUE)
Make a scatter plot for each sex:
xyplot(net~age|sex,data=TenMileRace,
main="Age and Race Time, by Sex",
xlab="age (years)",
ylab="net time (seconds)",
layout=c(2,1))
The guys appear to run faster. Hard to tell if age makes a difference.
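One way to help judge whether age matters is to overlay a smooth trend curve on each panel. The sketch below uses the `type=c("p","smooth")` option of lattice's `xyplot()`, which adds a loess curve to the points. This is a suggestion, not something from the original slides, and the `alpha`, `col.line` and `lwd` settings are just illustrative choices.

```r
library(lattice)
library(mosaicData)
data(TenMileRace)

# Points plus a loess trend curve in each panel.
# alpha (partial transparency) keeps the dense cloud from hiding the curve.
p <- xyplot(net ~ age | sex, data = TenMileRace,
            type = c("p", "smooth"),
            alpha = 0.3, col.line = "black", lwd = 2,
            main = "Age and Race Time, by Sex",
            xlab = "age (years)",
            ylab = "net time (seconds)",
            layout = c(2, 1))
print(p)
```

If the curve stays roughly flat across ages, age makes little difference. A curve that bends upward suggests that older runners tend to take longer.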
help(verlander)
Research Question: How are px and pz related?
xyplot(pz~px,data=verlander,
main="Verlander over the Plate",
pch=19)
Over-plotting! (15,307 points sitting on top of each other.)
xyplot(pz~px,data=verlander,
main="Verlander over the Plate",
pch=19,
alpha=0.10)
With alpha set to 0.10, each point is only 10% opaque, so it takes roughly 10 or more stacked points to approach full color!
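As a rough check on how quickly stacked points darken, here is a small calculation. It assumes the graphics device blends overlapping points with standard "over" compositing, so that n stacked points have a combined opacity of 1 - (1 - alpha)^n. That assumption is not from the slides, and devices can differ.

```r
# Assumption: standard "over" alpha compositing, where each new point
# covers a fraction alpha of whatever is still showing through.
alpha <- 0.10
n_stacked <- c(1, 5, 10, 20, 50)
opacity <- 1 - (1 - alpha)^n_stacked

# Combined opacity for each stack size
data.frame(n_stacked, opacity = round(opacity, 3))
```

Under that assumption, ten stacked points reach about 65% opacity, and it takes around 50 to look essentially solid. Either way, the darkest regions of the plot mark the places where the most pitches landed.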
Let's group by type of pitch:
xyplot(pz~px,data=verlander,
main="Verlander over the Plate",
pch=19,
alpha=0.10,
groups=pitch_type,
auto.key=list(space="right"))
Hard to tell!
Let's plot in separate panels for each pitch type:
xyplot(pz~px|pitch_type,data=verlander,
main="Verlander over the Plate",
pch=19,
alpha=0.10)
For which types of pitch is there a relationship between px and pz?
How would you describe the relationships you see (if any)?