The run_analysis.R script was written to fulfill the requirements of the project in the Getting and Cleaning Data course offered by Coursera. The course was initially undertaken in August 2015.
There is no need to download the files to your hard drive or unzip them, as the script takes care of that.
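The download-and-unzip step can be sketched roughly as follows. This is a minimal illustration and not the code in run_analysis.R: the helper name fetch_if_missing and the local file name dataset.zip are hypothetical, and dataFile refers to the URL variable set in the script.

```r
# Hypothetical helper: download a file only if it is not already present.
fetch_if_missing <- function(url, dest) {
  if (!file.exists(dest)) {
    # mode = "wb" keeps binary files such as .zip archives intact on Windows
    download.file(url, destfile = dest, mode = "wb", quiet = TRUE)
  }
  invisible(dest)
}

# Example usage (commented out so that nothing is downloaded by accident):
# zip_path <- fetch_if_missing(dataFile, "dataset.zip")  # dataFile as set in the script
# unzip(zip_path, exdir = ".")                           # extract into the working directory
```

Skipping the download when the file already exists means that sourcing the script again does not fetch the archive a second time.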
The script creates a tidy data set from the combined data, according to the principles outlined by Hadley Wickham (Journal of Statistical Software, “Tidy Data”, Vol. 59, Issue 10, Sep 2014, http://www.jstatsoft.org/v59/i10), downloadable in PDF format from http://vita.had.co.nz/papers/tidy-data.pdf.
The script writes the tidy data set to a text file, average.txt, in the working directory.
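As a rough sketch of this step, the example below averages a toy data frame by activity and subject and writes the result with write.table(). It assumes, as the file name average.txt suggests, that the tidy set holds the mean of each measurement for every activity and subject. The toy values, the object names combined and tidy_toy, and the output file name average_toy.txt are made up for illustration and are not taken from run_analysis.R.

```r
# Toy stand-in for the combined data (values are made up, not from the real data set)
combined <- data.frame(
  Activities = c("Walking", "Walking", "Sitting", "Sitting"),
  SubjectId  = c(1, 1, 1, 1),
  TBodyAccelerationMeanX = c(0.20, 0.30, 0.25, 0.27)
)

# One row per activity/subject pair, each measurement column replaced by its mean
tidy_toy <- aggregate(. ~ Activities + SubjectId, data = combined, FUN = mean)

# Write to a temporary file so the example does not touch the working directory
out_file <- file.path(tempdir(), "average_toy.txt")
write.table(tidy_toy, out_file)
```

The toy input has two activities for one subject, so tidy_toy has two rows.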
The following code can be used to read the file back into R:
tidy <- read.table("./average.txt", quote = "", header = TRUE, fill = TRUE, check.names = FALSE)
The data set dimensions are:
dim(tidy)
[1] 180 68
The expected output for a portion of the data set is:
"Activities" "SubjectId" "TBodyAccelerationMeanX"
1 "Laying" 1 0.2215982
2 "Sitting" 1 0.2612376
3 "Standing" 1 0.2789176
4 "Walking" 1 0.2773308
5 "WalkingDownstairs" 1 0.2891883
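A quick sanity check on the dimensions could look like the sketch below. The helper name check_tidy_dims is hypothetical and is not part of run_analysis.R. It only compares the dimensions with the 180 x 68 reported above.

```r
# Hypothetical helper: stop with a message if a data frame is not 180 x 68.
check_tidy_dims <- function(df, expected = c(180L, 68L)) {
  if (!identical(dim(df), expected)) {
    stop("unexpected dimensions: ", paste(dim(df), collapse = " x "))
  }
  invisible(TRUE)
}

# Example usage, assuming tidy was read in with the read.table() call above:
# check_tidy_dims(tidy)
```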
Set the working directory in the script by changing the path in line 2 of the code, e.g.
> setwd("/Users/tina/desktop/GaCD_Project")
The path must be given as the argument to setwd(), within quotation marks.
If the link to the .zip file has changed from the one given in line 14 of the code, i.e.
> dataFile <- "https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip"
replace it with the updated https address.
Save the script and source it.