For R help, type a ? before any command, e.g. ?plot
Loading files
On a Mac: co2 <- read.csv(file.choose(), header = TRUE) #The last bit (header = TRUE) tells R that the first row of data has column names
On a PC: co2 <- read.csv(choose.files(),header = TRUE)
When you run this line, you'll be prompted to navigate to the file you want to open.
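If a script has to run without anyone clicking through a file dialog, the path can be passed to read.csv() directly. A minimal sketch, assuming a tiny made-up file written to a temporary location (the column names and values are invented for illustration and are not the real co2 data):

```r
# Write a small made-up CSV to a temporary file so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(c("year,ppm,site", "2000,369.5,0", "2001,371.1,1"), csv_path)

# header = TRUE: the first row holds the column names
co2 <- read.csv(csv_path, header = TRUE)

names(co2)  # "year" "ppm" "site"
nrow(co2)   # 2
```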
You can also set a working directory if you want (although I haven't figured out why that's necessary):
setwd("/Users/caitlin/Dropbox/CAITLINS DOCUMENTS/CU Boulder/Courses/GEOG 5023 Quant methods - Spielman")
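One practical use of a working directory: once it is set, files in that folder can be read and written by name alone, without the full path. A small sketch, using a temporary folder as a stand-in for a real project folder (the file name example.csv is hypothetical):

```r
old_wd <- getwd()          # remember the current working directory
project_dir <- tempdir()   # stand-in for a real project folder
setwd(project_dir)

# With the working directory set, a bare file name is enough
write.csv(data.frame(x = 1:3), "example.csv", row.names = FALSE)
dat <- read.csv("example.csv", header = TRUE)

nrow(dat)                  # 3
setwd(old_wd)              # switch back to the original directory
```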
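Getting to know your data
dataName #Typing the name of the data set returns the entire data set in the console (don't do this for large data sets!)
summary() #Gives summary statistics (min, quartiles, mean, max) for each column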
names() #Gives names of columns
head() #Gives the first 6 rows of data
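A quick way to try these inspection functions is on a small made-up data frame. The sketch below assumes a hypothetical dataName with two invented columns, id and value:

```r
# Made-up data frame with 10 rows, for illustration only
dataName <- data.frame(id = 1:10, value = seq(0.5, 5, by = 0.5))

names(dataName)          # "id" "value"
nrow(head(dataName))     # 6, because head() shows the first 6 rows by default
head(dataName, 3)        # ask for a different number of rows
summary(dataName$value)  # min, quartiles, mean and max of one column
```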
plot(x,y) #plots the data
plot(x, y, main="title", sub="subtitle", xlab="x-axis label", ylab="y-axis label", xlim=c(xmin, xmax), ylim=c(ymin, ymax))
length() #Tells you how many observations or values are in a certain object, row, etc.
sum(is.na(data$Shots)) #Counts how many NAs are in the Shots column
sum(!is.na(data$Shots)) #We can also ask how many "not NAs" are in the Shots column (! means not)
sum(data$Shots == 3) #This is how you'd get the total number of 3s in the Shots column (add na.rm = TRUE if the column contains NAs)
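The counting tricks above can be checked on a small example. This sketch assumes a made-up data frame whose Shots column contains two missing values; it also shows why na.rm = TRUE matters when counting matches:

```r
# Made-up data; 'Shots' mirrors the column name used in these notes
data <- data.frame(Shots = c(3, NA, 1, 3, NA, 2))

length(data$Shots)                  # 6 values, NAs included
sum(is.na(data$Shots))              # 2 missing values
sum(!is.na(data$Shots))             # 4 non-missing values
sum(data$Shots == 3)                # NA, because comparing with NA gives NA
sum(data$Shots == 3, na.rm = TRUE)  # 2
```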
hist(dataName$columnName, xlab="x label", ylab="y label")
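qqnorm(dataName$columnName) #Quantile-quantile (Q-Q) plot: sample quantiles plotted against the quantiles expected under a normal distribution
qqline(dataName$columnName) #Adds a reference line; points close to the line suggest the data are roughly normal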
Shapiro-Wilk test of normality, where H0 is that the data are normally distributed:
shapiro.test(dataName$columnName)
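To get a feel for reading the Shapiro-Wilk result, it can help to run it on simulated data where the answer is known. A sketch, assuming two made-up samples (one drawn from a normal distribution, one from a skewed exponential distribution); the exact p-values depend on the random draw:

```r
set.seed(1)                        # make the simulated data reproducible
normal_sample <- rnorm(50, mean = 10, sd = 2)
skewed_sample <- rexp(50, rate = 1)

res_normal <- shapiro.test(normal_sample)
res_skewed <- shapiro.test(skewed_sample)

# A small p-value (commonly < 0.05) is evidence against H0 (normality)
res_normal$p.value
res_skewed$p.value
```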
Reconfiguring your data
Subset a data set based on values in a column: 2 ways
(Seth's preference) mauna <- dataName[dataName$column3 == 0,] #Creates an object containing all rows of the data set where column3 equals 0. The blank after the comma tells R to keep ALL the columns.
(Petra's preferred way, a bit more intuitive) mauna1 <- subset(co2, site == 0)
jolla1 <- subset(co2, site == 1)
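Both approaches should select the same rows. A small check, assuming a made-up stand-in for the co2 data with an invented ppm column and a site column coded 0 and 1:

```r
# Made-up stand-in for the co2 data
co2 <- data.frame(ppm  = c(369.5, 370.2, 371.1, 372.0),
                  site = c(0, 1, 0, 1))

mauna  <- co2[co2$site == 0, ]    # bracket style: rows where site is 0, all columns
mauna1 <- subset(co2, site == 0)  # subset() style: same rows

all.equal(mauna, mauna1)          # TRUE
nrow(mauna)                       # 2
```

One difference to keep in mind: if the site column contains NAs, the bracket style returns extra rows filled with NA, while subset() drops them.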
Statistical tests
t-test and Wilcoxon test: used to test hypotheses when two groups are compared with respect to a continuous variable like time. The t-test is a parametric approach (for normally distributed data); the Wilcoxon test is a non-parametric approach (for data that are not normally distributed). The Wilcoxon test works on the ranks of the data rather than the raw values; wilcox.test() does the ranking for you.
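Both tests can be run with one line each in R. A sketch, assuming two simulated groups of a continuous measurement (the group names and values are made up); the p-values depend on the random draw:

```r
set.seed(42)
# Simulated continuous measurements for two groups
group_a <- rnorm(30, mean = 10, sd = 2)
group_b <- rnorm(30, mean = 12, sd = 2)

t_res <- t.test(group_a, group_b)       # parametric; Welch's version by default
w_res <- wilcox.test(group_a, group_b)  # non-parametric; ranks are computed internally

t_res$p.value
w_res$p.value
```

In both cases H0 is that the two groups do not differ, so a small p-value suggests a difference between the groups.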
Chi-square tests are for comparing categorical data (like male, female), not continuous data.
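A chi-square test in R takes a table of counts. A sketch, assuming a made-up 2 x 2 table of counts (sex by a yes/no answer; all numbers invented):

```r
# Made-up counts; matrix() fills column by column
counts <- matrix(c(20, 15, 30, 35), nrow = 2,
                 dimnames = list(sex = c("male", "female"),
                                 answer = c("yes", "no")))

chi_res <- chisq.test(counts)  # for 2 x 2 tables a continuity correction is applied by default

chi_res$expected               # expected counts under independence: 17.5 (yes) and 32.5 (no) in each row
chi_res$p.value
```

With raw data instead of counts, table(dataName$column1, dataName$column2) builds the table of counts first (column1 and column2 are placeholder names).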