Your R-Journal should be a list of functions and techniques that you've used to complete labs and in-class assignments. It is useful to include worked examples in code.
setwd("path") #use this to set working directory of input files.
read.csv("filename") #to import a csv file. There is also read.table etc.
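A minimal worked example of importing a CSV file. The file and its contents are made up for illustration: it is written to a temporary location so the snippet runs without depending on your working directory.

```r
# Hypothetical example data written to a temporary file
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,value", "1,2.5", "2,3.5", "3,4.0"), tmp)

dat <- read.csv(tmp)   # header = TRUE is the default for read.csv
str(dat)               # check column names and types after importing
mean(dat$value)        # columns are then available by name
```

With your own data, replace `tmp` with the file name, relative to the directory set by `setwd()`.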
Use LaTeX syntax to write out equations nicely:
$$\bar{x} = \frac{\sum_{i=1}^N x_i}{N}$$
\[ \bar{x} = \frac{\sum_{i=1}^N x_i}{N} \]
mystd <- function(data) {
  # population standard deviation: divides by N, not N - 1
  d <- (data - mean(data))^2
  return(sqrt(mean(d)))
}
mystd(somedata) #to execute the function on your data
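A short worked example of the function above, using made-up numbers for `somedata`. It also compares the result with R's built-in `sd()`, which divides by N - 1 rather than N, so the two values differ slightly.

```r
mystd <- function(data) {
  d <- (data - mean(data))^2
  sqrt(mean(d))
}

somedata <- c(2, 4, 4, 4, 5, 5, 7, 9)  # made-up values, mean is 5

mystd(somedata)   # 2
sd(somedata)      # about 2.138, because sd() divides by N - 1

# The two agree once the N - 1 denominator is rescaled to N
n <- length(somedata)
all.equal(mystd(somedata), sd(somedata) * sqrt((n - 1) / n))  # TRUE
```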
# Use the chunk header {r fig.width=7, fig.height=6} to set the figure size
plot(x, y)
or
plot(y ~ x)
plot(y = MN$YCoord, x = MN$XCoord, col = MN$HD, pch = 16, cex = 0.5, main = "Correlation Example r= .8")
#'col' changes the color of dots depending upon the value in the 'HD' column
#'pch' sets the symbol to a solid dot
#'cex' scales the dot to 0.5 times the normal size
# main gives a title
plot(linearModel) # this will plot 4 graphs to help assess model fit visually. precede with par(mfrow=c(2,2)) to see all at once.
plot(lm1, which = 1:2) # use the 'which' argument to choose which diagnostic plots to draw, here the first two
layout(matrix(c(1, 2), 1, 2)) #similar to subplot in matlab. set up matrix with 1 row, 2 col for plots.
par(mfrow = c(1, 2))
dev.off() # closes the current graphics device, which resets the layout so the next plot is not split into panels
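A small sketch of a side-by-side layout, using made-up `x` and `y`. Instead of `dev.off()`, it saves the previous settings returned by `par()` and restores them afterwards, which is one alternative way to get back to a single-panel layout.

```r
x <- 1:10
y <- x^2                      # made-up data

old <- par(mfrow = c(1, 2))   # 1 row, 2 columns; par() returns the old settings
plot(x, y, main = "plot(x, y)")
plot(y ~ x, main = "plot(y ~ x)")   # formula interface, same picture
par(old)                      # restore the previous single-panel layout
```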
summary() #min max mean etc
names() #header names
object[, 2] # all rows, 2nd column
object$col2 # all rows, 2nd column by header name
c(1, 2, 3, 4) # combine to make a vector. Does not distinguish between row and column vectors
matrix(c(1, 2, 3, 4, 5, 6), 3, 2) # 3 rows, 2 columns
head(obj) # displays the first 6 rows of an object by default; use head(obj, n) for the first n rows
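A brief worked example of the indexing functions above. The matrix `m` and data frame `df` are made up for illustration.

```r
m <- matrix(c(1, 2, 3, 4, 5, 6), 3, 2)  # 3 rows, 2 columns, filled column by column
m[, 2]        # all rows, 2nd column: 4 5 6
m[2, ]        # 2nd row, all columns: 2 5

df <- data.frame(col1 = c(10, 20, 30), col2 = c("a", "b", "c"))
names(df)     # "col1" "col2"
df[, 2]       # 2nd column by position
df$col2       # the same column by header name
head(df, 2)   # first 2 rows only
```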
## The line below selects all rows of MN whose Block value appears in our
## vector of blocks, and saves the result as a new object, HDB.
HDB <- MN[MN$Block %in% blocks, ]
## Arithmetic on conditionally selected parts of an object:
## divide AssessTot by BldgArea, using only rows where BldgArea > 0
HDB_in_sqft <- HDB_in[HDB_in$BldgArea > 0, "AssessTot"] /
  HDB_in[HDB_in$BldgArea > 0, "BldgArea"]
# comparison operators: <  >  ==  !=  (also <= and >=)
inHD <- MN[MN$HD == 1, ] # places all rows of MN with a code of 1 in the HD column into a new object, inHD
is.na(MN[1:100, "HistDist"]) # returns TRUE wherever the first 100 rows of MN$HistDist contain NA (missing values, including NaN)
ifelse(is.na(MN[1:100, "HistDist"]), 0, 1) # gives 0 where the value is NA and 1 where it is not. Usually assign to a new object
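The subsetting and missing-value idioms above can be tried on a small stand-in data frame. The `MN` object below is hypothetical, with made-up values; only the column names are borrowed from the lines above.

```r
# Hypothetical stand-in for MN; all values are made up
MN <- data.frame(
  Block    = c(101, 102, 103, 104),
  HD       = c(1, 0, 1, 0),
  HistDist = c("A", NA, "B", NA)
)
blocks <- c(101, 104)

HDB  <- MN[MN$Block %in% blocks, ]   # keeps rows 1 and 4
inHD <- MN[MN$HD == 1, ]             # keeps rows 1 and 3

is.na(MN$HistDist)                             # FALSE TRUE FALSE TRUE
MN$HDflag <- ifelse(is.na(MN$HistDist), 0, 1)  # 1 0 1 0, stored as a new column
```

Note the trailing comma inside the brackets: the condition selects rows, and the empty slot after the comma keeps all columns.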
as.factor(obj) # tells R to treat obj as a categorical factor rather than numeric data; summary() then reports counts per level
mean()
sd() # sample standard deviation (divides by N - 1); base R has no std()
sqrt()
shapiro.test() #test for normality
wilcox.test() # Wilcoxon rank-sum test to compare distributions
t.test(x = inHD$AssessTot, y = outHD$AssessTot) # two-sample t test, two-sided; unequal variances (Welch) by default
cor(x, y) # correlation coefficient
cor.test(x, y) #correlations with more info
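A sketch of the tests above on simulated data. The vectors `x` and `y` are generated here purely for illustration, and the exact statistics depend on the random seed, so no expected values are shown.

```r
set.seed(1)                          # make the simulated data reproducible
x <- rnorm(50)                       # simulated sample
y <- 0.8 * x + rnorm(50, sd = 0.6)   # second sample, correlated with x

shapiro.test(x)      # a small p-value would suggest departure from normality
t.test(x, y)         # two-sample, two-sided t test
wilcox.test(x, y)    # rank-based alternative to the t test

r  <- cor(x, y)      # correlation coefficient only
ct <- cor.test(x, y) # adds a test statistic, p-value, and confidence interval
ct$estimate          # same value as r
ct$p.value
ct$conf.int
```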
lm(y ~ x + I(x^n)) #linear regression. Use I(x^n) for higher order polynomials
anova(lm1, lm2) # compares two nested regression models. anova(lm1) alone gives the ANOVA table for lm1's own terms (for a single predictor, a test against the mean-only model)
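A minimal sketch of fitting and comparing two nested models, on simulated data with a known quadratic trend. The data and the coefficients used to generate it are assumptions made for illustration.

```r
set.seed(42)                                    # reproducible simulated data
x <- seq(0, 10, length.out = 40)
y <- 1 + 2 * x + 0.5 * x^2 + rnorm(40)          # quadratic trend plus noise

lm1 <- lm(y ~ x)                # straight-line fit
lm2 <- lm(y ~ x + I(x^2))       # adds a quadratic term; I() protects the ^

summary(lm2)                    # coefficients, R-squared, etc.
cmp <- anova(lm1, lm2)          # F test: does the quadratic term improve the fit?
cmp
```

The comparison only makes sense when the models are nested, as here, where `lm1` is `lm2` without the `I(x^2)` term.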