R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button, a document will be generated that includes both the content and the output of any embedded R code chunks within the document. You can embed an R code chunk like this:

summary(cars)
##      speed           dist       
##  Min.   : 4.0   Min.   :  2.00  
##  1st Qu.:12.0   1st Qu.: 26.00  
##  Median :15.0   Median : 36.00  
##  Mean   :15.4   Mean   : 42.98  
##  3rd Qu.:19.0   3rd Qu.: 56.00  
##  Max.   :25.0   Max.   :120.00

Including Plots

You can also embed plots, for example:

Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.

I chose the women data set, which contains the height and weight of women. First we load the data set into R:

data(women)
attach(women)

First, make a plot of the data:

plot(height, weight)

There appears to be a linear relationship between the two. Let's change the axis labels and add a title to make the graph look nicer:

plot(height, weight, ylab = "Women's Weight", xlab = "Women's Height", main = "Women's Height and Weight Data")

That looks a lot better. Now we run a regression so we can draw the prediction line on the plot:

model <- lm(weight ~ height)
model
plot(height, weight, ylab = "Women's Weight", xlab = "Women's Height", main = "Women's Height and Weight Data")
abline(-87.52, 3.45)

Now let's pick a data point, here the 4th row, and see how close the predicted value is to the actual value. Row 4 has a height of 61 and a weight of 123, and the residual is the actual weight minus the predicted weight:

women[4, ]
-87.52 + 61 * 3.45
123 - 122.93

We assume the errors are normally distributed, so let's make a histogram of the residuals:

resid <- model$residuals
hist(resid)

This model is decent. There are 8 negative errors and 7 positive errors, so the residuals are centered fairly closely around 0, but the right side of the histogram is more dispersed than the left side.

Let's use a couple more graphs to check the residuals:

qqnorm(resid)
qqline(resid)
plot(model$residuals ~ height)
abline(0, 0)

The first plot shows that the residuals are fairly small. The second one, however, reveals a problem: there is a pattern in the residuals. Residuals are supposed to be independent of each other, so this pattern contradicts our assumption. Lastly, we want to see the summary statistics of our model:

summary(model)

Now we can see all the important statistics of our model.
We can see that our r^2 value is high and our error is low, so the model appears to be a good fit and the variables appear to be linearly related. The interpretation is that for every one-inch increase in height, weight increases by about 3.45 pounds.
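
The hand calculations in this walkthrough can also be checked against the fitted model itself. The following is a minimal sketch, assuming the built-in women data set; the variable names (coefs, by_hand, by_predict, row4_resid, and so on) are illustrative and not part of the original analysis. It passes the data frame through the data argument of lm() rather than relying on attach(), and reads the coefficients from the model instead of typing in rounded values.

```r
# Fit the same regression, passing the data frame explicitly
# instead of relying on attach().
data(women)
model <- lm(weight ~ height, data = women)

# Unrounded intercept and slope (the text rounds these to -87.52 and 3.45).
coefs <- coef(model)

# Prediction for row 4, computed by hand from the coefficients ...
row4 <- women[4, ]
by_hand <- unname(coefs["(Intercept)"] + coefs["height"] * row4$height)

# ... and with predict(), which should agree with the hand calculation.
by_predict <- unname(predict(model, newdata = row4))

# Residual = actual weight - predicted weight.
row4_resid <- row4$weight - by_predict

# Count the negative and positive residuals, as read off the histogram.
n_negative <- sum(residuals(model) < 0)
n_positive <- sum(residuals(model) > 0)

# R-squared as reported by summary(model).
r_squared <- summary(model)$r.squared

print(c(by_hand = by_hand, by_predict = by_predict, residual = row4_resid))
print(c(negative = n_negative, positive = n_positive))
print(r_squared)
```

Because the coefficients come from coef(model), the same lines keep working if the data or the model formula changes. For the plot, abline(model) would likewise draw the fitted line without hard-coding the intercept and slope.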