Nathan Byers
April 21, 2014
We'll be opening up an R session through IUanyWARE:
10 + 5
10 - 5
10 * 5
10 / 5
The assignment operator in R is <-
x <- 10
y <- 5
x + y
[1] 15
(the top panel shows what you should run in your script and the bottom panel shows the output)
Use c( ) as a container for vectors
x <- c(1, 2, 3, 4, 5)
x
[1] 1 2 3 4 5
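
As a small illustration of why vectors are useful, the sketch below applies arithmetic to every element at once and picks out a single element. The names doubled, total, and third are made up for this example and do not appear in the original slides.

```r
# Build the same vector as above
x <- c(1, 2, 3, 4, 5)

# Arithmetic is applied element by element
doubled <- x * 2      # 2 4 6 8 10

# Functions such as sum() take the whole vector
total <- sum(x)       # 15

# Square brackets select elements; R counts from 1
third <- x[3]         # 3
```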
Create a data frame with the data.frame(variable1, variable2, ...) function
price <- c(1000, 4000, 2000, 5000, 500)
carat <- c(0.4, 0.55, 0.45, 0.65, 0.2)
color <- c("G", "H", "D", "E", "G")
diamonds <- data.frame(price, carat, color)
diamonds
  price carat color
1  1000  0.40     G
2  4000  0.55     H
3  2000  0.45     D
4  5000  0.65     E
5   500  0.20     G
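
Once the data frame exists, its columns and rows can be pulled out again. The following is a minimal sketch using standard base R tools (the $ operator, nrow(), and logical row selection); the names mean_price, n_rows, and big are assumptions introduced only for this example.

```r
# Rebuild the example data frame
price <- c(1000, 4000, 2000, 5000, 500)
carat <- c(0.4, 0.55, 0.45, 0.65, 0.2)
color <- c("G", "H", "D", "E", "G")
diamonds <- data.frame(price, carat, color)

# The $ operator returns one column as a vector
mean_price <- mean(diamonds$price)        # 2500

# nrow() counts the rows (one row per diamond)
n_rows <- nrow(diamonds)                  # 5

# A logical condition inside [ , ] keeps only matching rows
big <- diamonds[diamonds$carat > 0.5, ]   # rows 2 and 4
```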
plot(x = carat, y = price)
We use the lm(y ~ x) function in R to fit
a linear regression model and the summary( )
function to see the results
fit <- lm(price ~ carat)
summary(fit)
Call:
lm(formula = price ~ carat)
Residuals:
   1    2    3    4    5
-967  435 -500  370  663
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    -2294       1129   -2.03    0.135
carat          10652       2378    4.48    0.021 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 806 on 3 degrees of freedom
Multiple R-squared: 0.87, Adjusted R-squared: 0.827
F-statistic: 20.1 on 1 and 3 DF, p-value: 0.0207
Let's see what this regression looks like as a line on our plot
abline(fit)
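
The line drawn by abline(fit) is just intercept + slope * carat. As a rough sketch of that idea, the code below extracts the two coefficients with coef() and checks that a hand-computed price matches what predict() returns. The 0.5 carat input and the names b, manual, and auto are assumptions chosen for illustration.

```r
# Refit the simple model from the slides
price <- c(1000, 4000, 2000, 5000, 500)
carat <- c(0.4, 0.55, 0.45, 0.65, 0.2)
fit <- lm(price ~ carat)

# coef() returns a named vector: "(Intercept)" and "carat"
b <- coef(fit)

# Price on the fitted line at 0.5 carat, computed by hand
manual <- b[["(Intercept)"]] + b[["carat"]] * 0.5

# The same value from predict(); newdata must use the name 'carat'
auto <- predict(fit, data.frame(carat = 0.5))
```

Both manual and auto should agree, since predict() evaluates the same equation.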
Add a squared carat variable to our regression equation
carat2 <- carat^2
fit2 <- lm(price ~ carat + carat2)
summary(fit2)
Call:
lm(formula = price ~ carat + carat2)
Residuals:
     1      2      3      4      5
-437.4  596.0   20.2 -280.4  101.6
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)     1169       1877    0.62     0.60
carat          -8375       9502   -0.88     0.47
carat2         22616      11120    2.03     0.18
Residual standard error: 564 on 2 degrees of freedom
Multiple R-squared: 0.958, Adjusted R-squared: 0.915
F-statistic: 22.6 on 2 and 2 DF, p-value: 0.0424
carat.values <- seq(0.2, 0.7, 0.001)
curve.price <- predict(fit2, list(carat = carat.values, carat2 = carat.values^2))
plot(x = carat, y = price)
lines(carat.values, curve.price)
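
An alternative to creating carat2 by hand is to square carat inside the formula with I(). This is not the approach used in the slides, only a sketch of an equivalent one: with I(carat^2), predict() needs just the carat column and computes the square itself. The names fit_i, new_carats, and curve_i are assumptions for this example.

```r
# Example data from the slides
price <- c(1000, 4000, 2000, 5000, 500)
carat <- c(0.4, 0.55, 0.45, 0.65, 0.2)

# The slides' approach: an explicit squared variable
carat2 <- carat^2
fit2 <- lm(price ~ carat + carat2)
carat.values <- seq(0.2, 0.7, 0.001)
curve.price <- predict(fit2, list(carat = carat.values,
                                  carat2 = carat.values^2))

# Alternative sketch: I() protects ^ so it means "square" in the formula
fit_i <- lm(price ~ carat + I(carat^2))
new_carats <- data.frame(carat = carat.values)
curve_i <- predict(fit_i, new_carats)   # only 'carat' is needed here
```

Both models describe the same curve, so curve_i and curve.price should match up to rounding error.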
Use the predict() function to get the fitted price for a 0.5 carat diamond
prediction <- predict(fit2, list(carat = 0.5, carat2 = 0.5^2))
prediction
1
2635
points(x = 0.5, y = prediction, col = "red")
For this example we use the XML package
install.packages("XML")
library(XML)
url <- "http://en.wikipedia.org/wiki/Healthcare_system"
table <- readHTMLTable(url)[[1]]
View(table)
    Country Life expectancy Infant mortality
1 Australia            81.4             4.49
2    Canada            81.4             4.78
3    France            81.0             3.34
4   Germany            79.8             3.48
5     Italy            80.5             3.33
A package must first be installed (the install.packages() function), then loaded before using it (the library() function)
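
Installing only needs to happen once per machine, while loading happens in every session. One possible way to express that in a script is sketched below. The helper name ensure_package is hypothetical, and the example calls it on "stats" (which ships with R) so that it runs without a network connection; in the slides' example the argument would be "XML".

```r
# Hypothetical helper: install a package only if it is missing, then load it
ensure_package <- function(pkg) {
  # requireNamespace() returns FALSE when the package is not installed
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # character.only = TRUE lets library() accept the name as a string
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

# "stats" is always available, so nothing is downloaded here
ensure_package("stats")
```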