The math attainment page has a dataset and a script of R code chunks. Generate a markdown file from the script to push the output in HTML for posting to course Moodle site. first R session using math attainment data set
## 'data.frame': 39 obs. of 3 variables:
## $ math2: int 28 56 51 13 39 41 30 13 17 32 ...
## $ math1: int 18 22 44 8 20 12 16 5 9 18 ...
## $ cc : num 328 406 387 167 328 ...
## math2 math1 cc
## 28.76923 15.35897 188.83667
## math2 math1 cc
## 10.720029 7.744224 84.842513
## math2 math1 cc
## math2 1.0000000 0.7443604 0.6570098
## math1 0.7443604 1.0000000 0.5956771
## cc 0.6570098 0.5956771 1.0000000
##
## Call:
## lm(formula = math2 ~ math1, data = dta)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.430 -5.521 -0.369 4.253 20.388
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.944 2.607 4.965 1.57e-05 ***
## math1 1.030 0.152 6.780 5.57e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.255 on 37 degrees of freedom
## Multiple R-squared: 0.5541, Adjusted R-squared: 0.542
## F-statistic: 45.97 on 1 and 37 DF, p-value: 5.571e-08
# specify square plot region
par(pty="s")
# scatter plot of math2 by math1
plot(math2 ~ math1, data=dta, xlim=c(0, 60), ylim=c(0, 60),
xlab="Math score at Year 1", ylab="Math score at Year 2")
# add grid lines
grid()
# add regression line
abline(dta.lm, lty=2)
# add plot title
title("Mathematics Attainment")# specify maximum plot region
par(pty="m")
#
plot(scale(resid(dta.lm)) ~ fitted(dta.lm),
ylim=c(-3.5, 3.5), type="n",
xlab="Fitted values", ylab="Standardized residuals")
#
text(fitted(dta.lm), scale(resid(dta.lm)), labels=rownames(dta), cex=0.5)
#
grid()
# add a horizontal red dash line
abline(h=0, lty=2, col="red")The notation, women{datasets}, indicates that a data object by the name women is in the datasets package. This package is preloaded when R is invoked. Explain the difference between c(women) and c(as.matrix(women)) using the women{datasets}.
## $height
## [1] 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
##
## $weight
## [1] 115 117 120 123 126 129 132 135 139 142 146 150 154 159 164
## [1] 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 115 117
## [18] 120 123 126 129 132 135 139 142 146 150 154 159 164
## [1] "list"
## [1] "double"
## [1] "list"
## [1] "numeric"
Use help to examine the coding scheme for the mother’s race variable in the birthwt{MASS} dataset. The MASS comes with the base R installation but is not automatically loaded when R is invoked.
## Black Other White
## 26 67 96
How many black mothers are there in this data frame? Ans. 26 What does the following R command do? ’> c(“White”, “Black”, “Other”)[birthwt$race] birthwt\(race == 1 -> "White" birthwt\)race == 2 -> “Black” birthwt$race == 3 -> “Other”
Regarding UCBAdmissions{datasets} data object, what does the output ‘> UCBAdmissions[,1,]’> UCBAdmissions[,1,1] ’> UCBAdmissions[1,1,] of each of the above R statements mean, respectively?
## , , Dept = A
##
## Gender
## Admit Male Female
## Admitted 512 89
## Rejected 313 19
##
## , , Dept = B
##
## Gender
## Admit Male Female
## Admitted 353 17
## Rejected 207 8
##
## , , Dept = C
##
## Gender
## Admit Male Female
## Admitted 120 202
## Rejected 205 391
##
## , , Dept = D
##
## Gender
## Admit Male Female
## Admitted 138 131
## Rejected 279 244
##
## , , Dept = E
##
## Gender
## Admit Male Female
## Admitted 53 94
## Rejected 138 299
##
## , , Dept = F
##
## Gender
## Admit Male Female
## Admitted 22 24
## Rejected 351 317
[Admit,Gender,Dept]
## Dept
## Admit A B C D E F
## Admitted 512 353 120 138 53 22
## Rejected 313 207 205 279 138 351
在 Gender = Male 時 Dept 與 Admit 的列表。
## Admitted Rejected
## 512 313
在 Gender = Male, Dept = A 時 Admit 的值。
## A B C D E F
## 512 353 120 138 53 22
每一 Dept 在 Admit = Admitted, Gender = Male 的值。
Concerning the chickwts{datasets}, explain the difference between ‘> chickwts[,2]’> chickwts[“feed”]
## [1] horsebean horsebean horsebean horsebean horsebean horsebean horsebean
## [8] horsebean horsebean horsebean linseed linseed linseed linseed
## [15] linseed linseed linseed linseed linseed linseed linseed
## [22] linseed soybean soybean soybean soybean soybean soybean
## [29] soybean soybean soybean soybean soybean soybean soybean
## [36] soybean sunflower sunflower sunflower sunflower sunflower sunflower
## [43] sunflower sunflower sunflower sunflower sunflower sunflower meatmeal
## [50] meatmeal meatmeal meatmeal meatmeal meatmeal meatmeal meatmeal
## [57] meatmeal meatmeal meatmeal casein casein casein casein
## [64] casein casein casein casein casein casein casein
## [71] casein
## Levels: casein horsebean linseed meatmeal soybean sunflower
## [1] "factor"
## [1] "integer"
## [1] "data.frame"
## [1] "list"
What happens when the following command is entered? ’> help(ls(“package:MASS”)[92]) Use the observation to find out how many items there are in the package MASS.
infomation about menarche {MASS} pop up
## [1] 165
165 items in the package MASS.
A built-in demo for the data object ToothGrowth{datasets} is run by issuing the following command: ’> example(ToothGrowth)
Replicate the analysis in the style of the headache study to investigate if type of vitamin supplement has an effect on teeth growth in guinea pigs. For this exercise, we shall ignore the dose variable.
Find all possible sums from rolling three dice using R. If possible, construct a histogram for the sum of three dice. Is this a probability histogram or an empirical histogram?
d1 <- sample(1:6, size=1000, replace = TRUE)
d2 <- sample(1:6, size=1000, replace = TRUE)
d3 <- sample(1:6, size=1000, replace = TRUE)
hist(d1+d2+d3,xlab = "total",main = "three dices summary")The IQ and behavior problem page has a dataset and a script of R code chunks. Generate a markdown file from the script to push the output in HTML for posting to course Moodle site. Explain what the code chunks do with your comments in the markdown file. https://rpubs.com/PL_W/c3e3
The usBirths2015.txt is a dataset of monthly births in the US in 2015. Summarize the number of births by season.
dta <- read.table("usBirths2015.txt",header = T)
dta$season <- factor(
ifelse(dta$month %in% c("March","April","May"), "Spring",
ifelse(dta$month %in% c("June","July","August"), "Summer",
ifelse(dta$month %in% c("September","October","November"), "Autumn", "Winter"))))Ten subjects read a paragraph consisting of seven sentences. The reading time (in seconds) for each sentence was the outcome measure. The predictors are the serial position of the sentence (Sp), the number of words in the sentences (Wrds), and the number of new arguments in the sentence (New). (a) Rank subjects by their reading speeed (b) Estimate, on average, how long does it take to read a word.
## [1] 9 3 4 2 1 10 8 7 6 5
## S01 S02 S03 S04 S05 S06 S07
## 0.3569259 0.3324568 0.3261728 0.3266049 0.4856790 0.4641852 0.4518395
## S08 S09 S10
## 0.4330864 0.2368765 0.3883457