An intermediate type is ordinal; give a Likert scale as an example.
QUESTION: Pain Scale.
Sometimes our impressions correspond to two or more variables. Example from Leland Wilkinson, “The Grammar of Graphics”, 1999.
One person from each group introduces another person from that group.
mosaic package
fetchData to read data files
Log in to RStudio and make sure the “mosaic” box on the “Packages” tab is checked. You'll probably need to do this only once for the semester, but if you happen to restart R, you'll need to do it again.
Arithmetic functions: + - * / ^ ( )
2 + 7
## [1] 9
2 * 7
## [1] 14
2^7
## [1] 128
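The parentheses in the list above control the order of evaluation. A small sketch of how R's precedence rules play out (the numbers are chosen only for illustration):

```r
# Multiplication binds tighter than addition.
a = 2 + 7 * 3      # evaluated as 2 + 21
b = (2 + 7) * 3    # parentheses force the sum first: 9 * 3

# Exponentiation binds tighter than a leading minus sign.
c1 = -2^2          # evaluated as -(2^2)
c2 = (-2)^2

print(c(a, b, c1, c2))
## [1] 23 27 -4  4
```

When in doubt, add parentheses; they cost nothing and make the intent clear.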
Functions: name( argument1, argument2, ... )
sqrt(4)
## [1] 2
sin(2)
## [1] 0.9093
Named arguments: When there is more than one argument, it's convenient to be able to refer to them by their names rather than by their positions.
seq(from = 1, to = 10, by = 2)
## [1] 1 3 5 7 9
seq(from = 1, to = 10, length = 5)
## [1] 1.00 3.25 5.50 7.75 10.00
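A side note on the second call, offered as a sketch: in base R the full name of that argument is length.out. Writing length = works because R accepts an unambiguous abbreviation of an argument name (partial matching). The variable names short and full are just for this illustration.

```r
short = seq(from = 1, to = 10, length = 5)      # abbreviated argument name
full  = seq(from = 1, to = 10, length.out = 5)  # full argument name
identical(short, full)
## [1] TRUE
```

Spelling out the full name is a little more typing but leaves no doubt about which argument is meant.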
Sometimes you'll mix named and unnamed arguments. The typical pattern is that you give the first one or two arguments by position and refer to the remaining ones by name. The basic principle is that things should be unambiguous.
seq(1, 10, by = 2)
## [1] 1 3 5 7 9
seq(10, from = 1, length = 2)
## [1] 1 10
But it's better not to do something like this:
seq(from = 1, to = 10, 2)
## [1] 1 3 5 7 9
Is the third argument by= or length=? You don't really know until you try. So use the name!
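One way to check how R resolves the unnamed third argument, as a sketch: look at the order of the argument names in the default seq() method, then compare the ambiguous call with a fully named one. The variable names unnamed and named are only for this illustration.

```r
# The argument names of the default seq() method, in order.
names(formals(seq.default))

# from= and to= are matched by name, so the unnamed 2 goes to the
# next unfilled argument in that list.
unnamed = seq(from = 1, to = 10, 2)
named   = seq(from = 1, to = 10, by = 2)
identical(unnamed, named)
## [1] TRUE
```

The result is well defined, but a reader has to know the argument order to predict it, which is the reason to prefer the name.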
You can use the output of one computation as an input to the next calculation:
sqrt((sqrt(9))^2)
## [1] 3
QUESTIONS:
You can store data and results for later use. You refer to the stored object by a name. The name should begin with a letter and not have any punctuation other than . or _. It's best to keep your names short, but not so short that you can't remember what's stored where.
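If you are unsure whether a name follows these rules, base R's make.names() can serve as a rough check: it returns its input unchanged when the input is already a valid name. A minimal sketch, with candidate names invented for illustration:

```r
# Candidate names; the last two break the rules described above.
candidates = c("legx", "leg.x", "leg_x", "2legs", "leg-x")

# make.names() repairs invalid names, so a name is valid when it comes back unchanged.
is_valid = make.names(candidates) == candidates
is_valid
## [1]  TRUE  TRUE  TRUE FALSE FALSE
```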
Here's a pure math example (since we haven't learned any statistics yet!)
legx = 7
legy = 9
hypot = sqrt(legx^2 + legy^2)
legx/hypot # Is this a sine or a cosine?
## [1] 0.6139
ang = atan2(legx, legy) # the angle in radians
sin(ang)
## [1] 0.6139
cos(ang)
## [1] 0.7894
For those interested, this technique of writing statements with quantities referred to by name is called abstraction. Just by changing the values of legx and legy, we can repeat the calculation for a different right triangle. So the calculations can be thought of as representing the general properties of a right triangle, rather than a particular calculation for one triangle.
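One further step in the same direction, for those interested, is to wrap the named calculation in a function so it can be reused directly. This is a sketch only: right_triangle() is a hypothetical helper name, not part of R or mosaic.

```r
# right_triangle() is a made-up name for illustration.
right_triangle = function(legx, legy) {
  hypot = sqrt(legx^2 + legy^2)
  ang = atan2(legx, legy)   # angle opposite legx, in radians
  c(hypot = hypot, sine = legx / hypot, angle = ang)
}

right_triangle(7, 9)   # the triangle from the example above
right_triangle(3, 4)   # a different triangle, same calculation
```

The body of the function is the same set of statements as before; only the values of legx and legy change from call to call.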
Typically you'll use the function fetchData(). It takes as an argument the name of the dataset you want to read.
kids = fetchData("KidsFeet")
## Data KidsFeet found in package.
These data sets come from a web site specifically set up for Project MOSAIC. This allows us to access textbook data with a very concise command.
You can also read in your own data. Store it as a CSV file on your computer, upload it to RStudio, and then use the command
mydata = fetchData() # no argument, but you still need parens.
The object that is created, which I've stored under the name mydata, is generically called a “data frame.”
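To see what a data frame looks like without relying on fetchData(), here is a minimal sketch that uses base R's read.csv() instead (a swapped-in reader, not the course's command). The file is written to a temporary location, and the rows are made-up placeholder values, not real measurements; the names csv_path and toy are invented for this illustration.

```r
# Write a tiny CSV to a temporary file so the example is self-contained.
# The values below are placeholders, not real data.
csv_path = tempfile(fileext = ".csv")
writeLines(c("id,height,group",
             "1,150.5,a",
             "2,162.0,b",
             "3,158.2,a"), csv_path)

toy = read.csv(csv_path)   # base R reader; returns a data frame
is.data.frame(toy)         # confirms the kind of object
names(toy)                 # the variable (column) names
nrow(toy)                  # the number of cases (rows)
```

Each column is a variable and each row is a case, which is the same layout as the data read by fetchData().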
names(kids)
## [1] "name" "birthmonth" "birthyear" "length" "width"
## [6] "sex" "biggerfoot" "domhand"
nrow(kids)
## [1] 39
mean(length, data = kids) # Note: only one variable is named
## [1] 24.72
We'll use the “tilde” punctuation throughout the semester. It can be interpreted in several different but related ways:
Examples:
mean(length ~ sex, data = kids)
## B G
## 25.11 24.32
median(length ~ sex, data = kids)
## B G
## 24.95 24.20
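The "broken down by group" reading of the tilde in the two commands above is not specific to mosaic. As a sketch, the same idea in base R, using the iris data set that ships with R rather than KidsFeet (so the numbers are unrelated to the output above); by_formula and by_tapply are names invented for this illustration.

```r
# iris ships with R; Sepal.Length is numeric, Species is a grouping variable.
# Read the formula as "Sepal.Length broken down by Species".
by_formula = aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
by_formula

# The same grouped means computed without a formula.
by_tapply = tapply(iris$Sepal.Length, iris$Species, mean)
by_tapply
```

Both calls give one mean per group; the formula version states the relationship between the variables in a single expression.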
xyplot(width ~ length, data = kids)
Going a little bit further than we need to, just for fun:
xyplot(width ~ length | sex, data = kids)
xyplot(width ~ length, groups = sex, data = kids)
And some statistical graphics:
histogram(~width, data = kids)
densityplot(~width, groups = sex, data = kids)
bwplot(~width, data = kids)
bwplot(width ~ sex, data = kids)