This is an example of an R Markdown document. You can also generate an example of an R Markdown document by selecting R Markdown under New File in the File menu. Make sure to select HTML as the Default Output Format. Note that in an R Markdown document, surrounding text with double asterisks will produce bold text and surrounding text with single asterisks will produce italicised text.
In this particular R Markdown document, we will answer Question 2 from the Topic 01 Tutorial. We can then render this R Markdown document by pressing the Knit button to create a nice HTML report that includes written text, R code and R output. As you write an R Markdown document, it is a good idea to regularly Knit your document to check that your document is rendering properly.
For part (a), we need to calculate the sample correlation coefficient
between marketing expenditure (X) and the number of sales (Y). To do
this, we first need to create the data in R, which we do in the code
chunk below. To insert a code chunk, select Insert
Chunk in the Code menu. It is good practice to
give every code chunk a unique label (using only alphanumeric characters
and dashes, with no spaces), which will be the first term after the
letter r in the curly braces. For example,
xy-samp-cor would be an appropriate label for the first
code chunk below. Note that surrounding text with backticks will format
the text as inline code.
Within the code chunk, notice that all sections of R code have comments describing what the code is doing. To ensure readability and interpretability of your code, it is very important to always include comments that describe what your code is doing.
# Creating the data for X (marketing expenditure) and Y (number of sales).
x <- c(4150,3000,2500,10600,12000,8000,1500,6850)
y <- c(778,779,4200,250,300,6000,1500,500)
# Calculating the sample correlation between X and Y.
cor(x,y)
## [1] -0.1871415
So we see that the sample correlation coefficient between X and Y is rXY = -0.1871. Because this value is negative, the data do not support the manager’s claim that there is a positive correlation between marketing expenditure and number of sales. Note that surrounding text with tildes will format the text as subscript.
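As a cross-check, the value returned by cor() can be reproduced from the definition of the sample correlation coefficient. The following is a minimal sketch; the names dx, dy and r_manual are introduced here for illustration and are not part of the tutorial.

```r
# Data from part (a): marketing expenditure (X) and number of sales (Y).
x <- c(4150, 3000, 2500, 10600, 12000, 8000, 1500, 6850)
y <- c(778, 779, 4200, 250, 300, 6000, 1500, 500)

# Deviations of each observation from its sample mean
# (dx and dy are illustrative names, not from the tutorial).
dx <- x - mean(x)
dy <- y - mean(y)

# Sample correlation: sum of cross-products divided by the square root of
# the product of the sums of squared deviations.
r_manual <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

# Check that the manual value agrees with cor() up to floating-point error.
all.equal(r_manual, cor(x, y))
```

The common factor of n - 1 in the sample covariance and the sample variances cancels, which is why it does not appear in the expression above.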
Rather than writing out the number -0.1871 in the text, we can
include inline R code in the text by prefacing the R code with the
letter r and surrounding everything in backticks, e.g.,
rXY=-0.1871.
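The rendered report only shows the result of the inline code, not the code itself. As a sketch of what could go after the letter r inside the backticks (the choice of round() with four decimal places is an assumption made here to match the value quoted in the text):

```r
# Data from part (a).
x <- c(4150, 3000, 2500, 10600, 12000, 8000, 1500, 6850)
y <- c(778, 779, 4200, 250, 300, 6000, 1500, 500)

# Expression that could be placed inline: the sample correlation
# rounded to four decimal places (the rounding is an assumption).
round(cor(x, y), 4)
```

Writing the value this way means the number in the text updates automatically if the data change.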
If we wanted to, we could also create a plot of Y against X to visualise the relationship between the two variables. Notice how we have given the plot an informative title and also appropriately labelled the x and y axes.
# Creating a plot of Y against X.
plot(x = x, y = y, xlab = "Marketing Expenditure", ylab = "Number of Sales",
     main = "Scatter Plot of Number of Sales vs Marketing Expenditure")
For part (b), we now want to calculate the sample correlation coefficient between the number of rainy days per month (Z) and the number of sales (Y). All we have to do is create the data for the number of rainy days per month in R, which we do in the code chunk below.
# Creating the data for Z.
z <- c(6,10,25,2,7,20,18,9)
# Calculating the sample correlation between Z and Y.
cor(z,y)
## [1] 0.8249823
Therefore, the sample correlation coefficient between Z and Y is
rZY=0.825. For those who are familiar with LaTeX, LaTeX
syntax can be used to write nice-looking mathematical expressions and
formulas. Inline LaTeX expressions can be produced by encapsulating the
LaTeX syntax between the symbols \( and \),
e.g., \(r_{ZY}=0.825\).
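LaTeX is also convenient for longer formulas. As an illustration, the sample correlation coefficient between Z and Y could be typeset as a display formula; the display delimiters and the index notation below are choices made for this sketch rather than something taken from the tutorial.

```latex
\[
r_{ZY} = \frac{\sum_{i=1}^{n} (z_i - \bar{z})(y_i - \bar{y})}
              {\sqrt{\sum_{i=1}^{n} (z_i - \bar{z})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}
\]
```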
For practice, create a plot of Y against Z in your HTML report by
inserting a code chunk below with the label y-vs-z-plot.
Make sure to comment your R code and give the plot an appropriate title
and label both axes.
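One possible version of this chunk is sketched below. The axis labels and title are suggestions only, and the data are repeated so that the snippet runs on its own.

```r
# Data from parts (a) and (b): number of sales (Y) and rainy days per month (Z).
y <- c(778, 779, 4200, 250, 300, 6000, 1500, 500)
z <- c(6, 10, 25, 2, 7, 20, 18, 9)

# Creating a plot of Y against Z (labels and title are suggestions).
plot(x = z, y = y, xlab = "Number of Rainy Days per Month", ylab = "Number of Sales",
     main = "Scatter Plot of Number of Sales vs Rainy Days per Month")
```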
From the result in part (b), we see that there is a strong positive correlation between Y and Z. One possible explanation is that when there is a lot of rain, people tend to play more indoor sports, which, in the country from which the data came, means table tennis. More people playing table tennis would then likely result in higher sales of ping pong balls.
Notice that for this R Markdown document, the rendered HTML report includes all the R code and R output. However, all the answers (including numerical answers), explanations, conclusions, etc., were properly written in the text of the R Markdown document. This is very important as it ensures that the rendered HTML report is easy to read with the results of the analysis clear to see. Note that comments in code chunks are not appropriate places to write answers, explanations, conclusions, etc. Comments should be reserved for only briefly describing what your code is doing.