February 6, 2013 Class Notes

Where you should be getting to

apple = fetchData("M155/Stocks/aapl.csv")
## Retrieving from http://www.mosaic-web.org/go/datasets/M155/Stocks/aapl.csv
mean(Open, data = apple)
## [1] 89.33
sd(Open, data = apple)
## [1] 124.2
CPS85 = fetchData("CPS85")
## Data CPS85 found in package.
tally(~sex, data = CPS85)
## 
##     F     M Total 
##   245   289   534
tally(~sector, data = CPS85)
## 
## clerical    const    manag    manuf    other     prof    sales  service 
##       97       20       55       68       68      105       38       83 
##    Total 
##      534
tally(~sex & sector, data = CPS85)
##        sector
## sex     clerical const manag manuf other prof sales service Total
##   F           76     0    21    24     6   52    17      49   245
##   M           21    20    34    44    62   53    21      34   289
##   Total       97    20    55    68    68  105    38      83   534
tally(~sector & sex, data = CPS85)
##           sex
## sector       F   M Total
##   clerical  76  21    97
##   const      0  20    20
##   manag     21  34    55
##   manuf     24  44    68
##   other      6  62    68
##   prof      52  53   105
##   sales     17  21    38
##   service   49  34    83
##   Total    245 289   534
tally(~sex | sector, data = CPS85)
##        sector
## sex     clerical   const   manag   manuf   other    prof   sales service
##   F      0.78351 0.00000 0.38182 0.35294 0.08824 0.49524 0.44737 0.59036
##   M      0.21649 1.00000 0.61818 0.64706 0.91176 0.50476 0.55263 0.40964
##   Total  1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000
tally(~sector | sex, data = CPS85)
##           sex
## sector           F       M
##   clerical 0.31020 0.07266
##   const    0.00000 0.06920
##   manag    0.08571 0.11765
##   manuf    0.09796 0.15225
##   other    0.02449 0.21453
##   prof     0.21224 0.18339
##   sales    0.06939 0.07266
##   service  0.20000 0.11765
##   Total    1.00000 1.00000

Distributions and Standard Deviation

Example: Osteopenia.

Here's a quote from Wikipedia:

Osteopenia is a condition where bone mineral density is lower than normal. It is considered by many doctors to be a precursor to osteoporosis. However, not every person diagnosed with osteopenia will develop osteoporosis. More specifically, osteopenia is defined as a bone mineral density T-score between -1.0 and -2.5.

The article goes on:

The definition has been controversial. Steven R. Cummings, of the University of California, San Francisco, said in 2003 that “There is no basis, no biological, social, economic or treatment basis, no basis whatsoever” for using one standard deviation. Cummings added that “As a consequence, though, more than half of the population is told arbitrarily that they have a condition they need to worry about.” (Quoted from an article by Gina Kolata.)

See also: bone density and osteopenia, from surgeongeneral.gov.

Bone density falls with age. The T-score is really just a z-score, but one that compares a person to the distribution of bone density among young people rather than among people of the same age.

The cutoff sits one standard deviation below the young-adult mean. If bone density is roughly normally distributed, about 1/6 of young people fall below that cutoff, and a much larger fraction of older people do.
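The 1/6 figure can be checked against the normal distribution. The sketch below assumes T-scores among young people are roughly standard normal. That modeling assumption, the helper name `t_score`, and the numbers in the example call are illustrative and are not taken from the sources quoted above.

```r
# T-score: a z-score computed against the young-adult reference distribution.
# young_mean and young_sd are assumed inputs describing that reference group.
t_score = function(bmd, young_mean, young_sd) {
  (bmd - young_mean) / young_sd
}

# Assumption: T-scores of young people are approximately standard normal.
below_cutoff = pnorm(-1)                # fraction with T-score below -1.0
in_range     = pnorm(-1) - pnorm(-2.5)  # fraction with T-score between -2.5 and -1.0

below_cutoff
in_range

# Made-up numbers, only to show the arithmetic: one SD below the reference mean.
t_score(bmd = 0.88, young_mean = 1.00, young_sd = 0.12)
```

`pnorm(-1)` is about 0.159, which is close to 1/6. Restricting to the range between -2.5 and -1.0 lowers the fraction only slightly, because very few values lie below -2.5.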

Example: One-Day Changes in Stock Prices

The stock market goes up and down every day. By how much?

apple = fetchData("M155/Stocks/aapl.csv")
## Retrieving from http://www.mosaic-web.org/go/datasets/M155/Stocks/aapl.csv
netflix = fetchData("M155/Stocks/nflx.csv")
## Retrieving from http://www.mosaic-web.org/go/datasets/M155/Stocks/nflx.csv
facebook = fetchData("M155/Stocks/fb.csv")
## Retrieving from http://www.mosaic-web.org/go/datasets/M155/Stocks/fb.csv
apple = transform(apple, change = Close/Open)
netflix = transform(netflix, change = Close/Open)
facebook = transform(facebook, change = Close/Open)

What's the distribution? What's a typical value? Do the stocks have different typical values? What does the standard deviation tell you here?
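One way to start on these questions is sketched below in base R. Here `change` is the ratio Close/Open, so a value of 1 means the stock closed where it opened. The helper name `describe_change` and the tiny data frame are hypothetical stand-ins so that the block runs on its own; in class you would pass in `apple`, `netflix`, or `facebook` instead.

```r
# Summarize one-day changes for a stock data frame that has Open and Close columns.
describe_change = function(stock) {
  change = stock$Close / stock$Open   # same ratio as in the transform() calls above
  c(mean   = mean(change),
    median = median(change),
    sd     = sd(change))
}

# Hypothetical prices, made up for illustration only (not real quotes).
toy = data.frame(Open  = c(100, 102, 101, 99),
                 Close = c(102, 101, 101, 100))

describe_change(toy)
```

The mean and median describe a typical day, and the standard deviation describes how far a day's change usually strays from that typical value. With the real data, `hist(apple$change)` shows the shape of the distribution.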

Aside:

To plot the values against time, you first need to convert the Date variable into a format that R can work with. Here's how:

fetchData("getDJIAdata.R")
## Retrieving from http://www.mosaic-web.org/go/datasets/getDJIAdata.R
## [1] TRUE
apple = fixStockDate(apple)
netflix = fixStockDate(netflix)
facebook = fixStockDate(facebook)
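For reference, base R can do a similar conversion with `as.Date()`. This is a sketch of the general idea, not the contents of `fixStockDate()`, which are not shown in these notes. The date strings and the `%Y-%m-%d` format below are assumptions; check how the Date column is actually written in the CSV files.

```r
# Hypothetical date strings; the real files may use a different layout.
stock = data.frame(Date  = c("2013-02-04", "2013-02-05", "2013-02-06"),
                   Close = c(10, 11, 12),          # made-up values
                   stringsAsFactors = FALSE)

# Convert the text into R's Date class so it can go on a time axis.
stock$Date = as.Date(stock$Date, format = "%Y-%m-%d")

class(stock$Date)
diff(stock$Date)   # Dates support arithmetic: gaps between rows, in days
```

After the conversion, `plot(Close ~ Date, data = stock, type = "l")` draws the values against time.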

Activity: Sketch Some Distributions

You might want to start with an estimate of the mean or median and the 95% coverage interval.
Also, think about the tails of the distribution.
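As a reminder, the 95% coverage interval is the range that holds the middle 95% of the values, running from the 2.5th to the 97.5th percentile. A minimal base R sketch, where the function name `coverage_interval` is made up for this example:

```r
# Middle `level` fraction of the values in x, returned as a pair of quantiles.
coverage_interval = function(x, level = 0.95) {
  tail_prob = (1 - level) / 2
  quantile(x, probs = c(tail_prob, 1 - tail_prob))
}

# A simple check on the whole numbers 1 to 1001.
x = 1:1001
median(x)
coverage_interval(x)
```

For the numbers 1 to 1001 the interval runs from about 26 to about 976, leaving roughly 2.5% of the values in each tail. The tails are whatever lies outside that interval.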

Spend 3 minutes using the web to see if you can get any confirmation from the Internet for your estimates.

Measurement and measurement bias

Sampling and sampling bias

Random sampling

In-Class Activity

fetchData("simulate.r")
## Retrieving from http://www.mosaic-web.org/go/datasets/simulate.r
## [1] TRUE

Instructor's write-up

Descriptions of Distributions

Main points