Using R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button (in the toolbar), a document will be generated that includes both the content and the output of any embedded R code chunks within the document (the parts in gray). You can embed an R code chunk like this:

print('hello world!')
## [1] "hello world!"

If you click the Knit button, you will get a rendering of this whole file showing both the code chunks and their output. Try it!

You will have tasks, which are things you need to code, and questions, which require answers. Each question pertains to the preceding task.

Task 1

Task 1.1

In the next code chunk, you will create two vectors, one numbering the days of the week and one holding your best guess of each day’s high temperature, then combine those vectors into a data frame (re-watch the edureka! tutorial if you need a refresher). Afterwards, we’ll subset the data frame we’ve created.

# First we create a vector:
dayOfWeek = c(1,2,3,4,5)

# Second, *YOU* create a vector called "guessDailyHigh" replacing my numbers with the five weekday high temperatures from last week (complete guesses about the temperature are all you need - just put in your own numbers)
guessDailyHigh = c(0,0,0,0,0)

# Third, we will create a data.frame called "temperatureData" to hold our two vectors:
temperatureData = data.frame(dayOfWeek, guessDailyHigh)

# Finally, let's output the data.frame we've created:
print(temperatureData)
##   dayOfWeek guessDailyHigh
## 1         1              0
## 2         2              0
## 3         3              0
## 4         4              0
## 5         5              0

Question 1.1

(a) How many rows does the data.frame have?

– 5

(b) How many columns does the data.frame have? (Do not count the row index that R prints)

– 2
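
The counts above can also be read off in code rather than by eye. This is a minimal sketch, assuming the same placeholder values used in Task 1.1; the object name exampleData is illustrative and not part of the assignment:

```r
# Rebuild a data frame like the one in Task 1.1 (placeholder values)
exampleData <- data.frame(dayOfWeek = c(1, 2, 3, 4, 5),
                          guessDailyHigh = c(0, 0, 0, 0, 0))

nrow(exampleData)   # number of rows
ncol(exampleData)   # number of columns
dim(exampleData)    # rows first, then columns
```

The row index that R prints on the left is not a column, so ncol() does not count it.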

Task 1.2

Next, we want to subset the data.frame. We do this by using the square brackets: [,]

If we put brackets after a data.frame, we are telling R that we want to subset by rows, columns, or both. We use a comma to separate the row index from the column index; rows come first. The R command temperatureData[2,] will return the 2nd row of our data.frame with both columns.

Similarly, temperatureData[,1] will return just the first column of our data.frame.

We can ask for more than one row by using 1:4 or c(1,2,3,4) to get rows 1-4.
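
The indexing rules described above can be tried on a small stand-in. This is only a sketch: the data frame scores and its columns are hypothetical and chosen for illustration, so the exercise below is still yours to complete.

```r
# Hypothetical data frame, used only for illustration
scores <- data.frame(id = 1:6, points = c(10, 20, 30, 40, 50, 60))

secondRow <- scores[2, ]              # row 2, all columns (a one-row data.frame)
firstCol  <- scores[, 1]              # column 1, all rows (a plain vector)
firstFour <- scores[1:4, ]            # rows 1 through 4, using a sequence
sameRows  <- scores[c(1, 2, 3, 4), ]  # the same rows, spelled out with c()

print(firstFour)
```

Note that asking for a single column returns a vector rather than a data.frame, which is why functions such as mean() can be applied to it directly.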

In the code chunk below, subset the data.frame so that we see just the 3rd through 5th row.

# note that we can get the sequence 3,4,5 by using:
3:5
## [1] 3 4 5
# So, on the line below, fill in the [,] with the correct numbers to return the 3rd through 5th row:
temperatureData[,]
##   dayOfWeek guessDailyHigh
## 1         1              0
## 2         2              0
## 3         3              0
## 4         4              0
## 5         5              0
# Note that we did not replace temperatureData with the subset - there was no "=" or "<-" in the code

Question 1.2

(a) How many rows were returned when you subset temperatureData?

– 3 (with temperatureData[3:5,1:2])

(b) Which columns were returned when you left the column index empty?

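– Both columns (dayOfWeek and guessDailyHigh) are returned by temperatureData[3:5,]. Leaving out the comma, as in temperatureData[3:5], asks for columns 3 through 5 instead and fails with: Error in [.data.frame(temperatureData, 3:5) : undefined columns selected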
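
The difference between an empty column index and a missing comma can be seen side by side. This is a minimal sketch; the data frame weekData and its values are hypothetical, and tryCatch() is used here only so the error message can be printed without stopping the script.

```r
# Hypothetical two-column data frame, used only for illustration
weekData <- data.frame(day = 1:5, high = c(20, 22, 24, 24, 25))

# With the comma: rows 3 to 5, column index left empty, so all columns are kept
withComma <- weekData[3:5, ]
names(withComma)

# Without the comma: 3:5 is treated as a column index, and a two-column
# data frame has no columns 3 to 5, so R raises an error
noComma <- tryCatch(weekData[3:5], error = function(e) conditionMessage(e))
print(noComma)
```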

Task 1.3

The function mean() takes the argument you give it, and returns the mean of that vector (as long as they are numbers!). Let’s see the mean temperature in your data for the first three days of the week (1st - 3rd) and the last two days of the week (4th - 5th).

To do this, we need to specify both the rows and the column of data we wish to use for our mean calculation.

# I'll do the first one for you:
earlyMean = mean(temperatureData[1:3, 2])

# on the next line, create an object lateMean=mean(temperatureData[??,??]) and fill in the index that will give us the mean temperature for the 4th and 5th day of the week:
lateMean = mean(temperatureData[4:5, 2])

# Now, let's output the earlyMean:
print(earlyMean)
## [1] 0
# And you add a similar line to output the lateMean object you created

Question 1.3

(a) Which was colder - the beginning of the week, or the end (according to your output)?

– earlyMean = 22, lateMean = 24.5. Therefore, the beginning of the week was colder, by 2.5.
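
The comparison can also be made in code. The sketch below uses hypothetical temperatures, picked only because they give means of 22 and 24.5 like those in the answer above; they are not the values actually entered for the assignment, and the object names are illustrative.

```r
# Hypothetical highs chosen for illustration; substitute your own guesses
exampleHighs <- data.frame(dayOfWeek = 1:5,
                           guessDailyHigh = c(20, 22, 24, 24, 25))

earlyExample <- mean(exampleHighs[1:3, 2])  # days 1 to 3
lateExample  <- mean(exampleHighs[4:5, 2])  # days 4 and 5

# A positive difference means the end of the week was warmer
lateExample - earlyExample
```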

(b) What was the mean for the early part of the week?

OK, I’m going to write an answer for you here so that I can show you how to refer back to R objects you’ve created. If I make a little in-line R code-chunk like this: 0 then R will automatically put the answer in for you since you created an object called “earlyMean” in the code-chunk above. You’ll see the actual number when you click Knit. This is very useful for making dynamic reports that update as data is updated.

Task 1.4

Last one - we’re going to do a little plotting. First, a histogram. Then, a scatterplot. We have to tell R which column to make a histogram for. There are three ways:

  • We can use the index number 2, like we did in Task 1.3: hist(temperatureData[,2])

  • We can use the name of the column in ‘quotes’: hist(temperatureData[,'guessDailyHigh'])

  • Or, we can use a special character, the $ symbol like this: temperatureData$guessDailyHigh

    • Note: in R Markdown text, the $ does something else. To write just a plain $, we have to put a backslash in front of it.

    • In code chunks, we don’t have to use the backslash.

# So this works:
hist(temperatureData[,'guessDailyHigh'])

# Also works:
hist(temperatureData$guessDailyHigh)
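
All three ways of naming the column hand the same vector to hist(). A quick check, sketched on a hypothetical data frame (plotData and its values are assumptions made for illustration):

```r
# Hypothetical data frame with the same column names as the task
plotData <- data.frame(dayOfWeek = 1:5,
                       guessDailyHigh = c(20, 22, 24, 24, 25))

byIndex  <- plotData[, 2]                  # by position
byName   <- plotData[, "guessDailyHigh"]   # by quoted name
byDollar <- plotData$guessDailyHigh        # with the $ symbol

identical(byIndex, byName)
identical(byName, byDollar)
```

Selecting by name or with $ keeps working if the columns are later reordered, whereas the index number 2 would then point at a different column.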

Question 1.4

(a) Do you see the plots above?

– Yes

Task 1.5

Finally, the scatterplots. We need pairs of data (each row is an x-y combination), so we have to give R two columns separately. The first “argument” to the function is the x’s, the second (after the comma) is the y’s:

plot(temperatureData[,'dayOfWeek'], temperatureData[,'guessDailyHigh'])

plot(temperatureData$dayOfWeek, temperatureData$guessDailyHigh, col='blue')
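
By default, plot() labels the axes with the expressions passed to it. A minimal sketch of setting readable labels with the standard xlab, ylab, and main arguments follows; the data frame labelData, its values, and the label text are hypothetical.

```r
# Hypothetical data, used only for illustration
labelData <- data.frame(dayOfWeek = 1:5,
                        guessDailyHigh = c(20, 22, 24, 24, 25))

# x values first, y values second, then optional styling and labels
plot(labelData$dayOfWeek, labelData$guessDailyHigh,
     col = "blue",
     xlab = "Day of week",
     ylab = "Guessed daily high",
     main = "Guessed highs by weekday")
```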

Question 1.5

(a) Now that you’ve seen a few things we can plot in R, write one thing you think would be interesting or useful to learn to plot (e.g. a piechart or….?).

– Since I am new to coding in general, this whole process will be an interesting endeavor.





## END

Click Knit to render an HTML file. Then open the .html file and print it to a .pdf to upload as your final work (with all coding tasks completed and all answers filled in as appropriate).