Directions

During ANLY 512 we will be studying the theory and practice of data visualization. We will be using R and the packages within R to assemble data and construct many different types of visualizations. We begin by studying some of the theoretical aspects of visualization. To do that we must appreciate the basic steps in the process of making a visualization.

The objective of this assignment is to introduce you to R Markdown and to have you create and explain basic plots before moving on to more complicated ways to graph data.

The final product of your homework (this file) should include a short summary of each graphic.

To submit this homework you will create the document in RStudio, using the knitr package (button included in RStudio), and then publish the document to your RPubs account. Once it is uploaded, you will submit the link to that document on Moodle. Please make sure that this link is hyperlinked and that I can see the visualization and the code required to create it.

Questions

Find the mtcars data in R. This is the dataset that you will use to create your graphics.

  1. Create a pie chart showing the proportion of cars from the mtcars data set that have different carb values and write a brief summary.
# Load the data set.
# mtcars is one of R's built-in data sets, so there is no need to download or read it;
# it can be loaded directly into the workspace.
data(mtcars)


library(ggplot2)

# Build an initial stacked bar plot as preparation for the pie chart.
# geom_bar() counts the cars for each carb value, so no y aesthetic is needed.
p1 <- ggplot(mtcars, aes(x = "", fill = factor(carb))) +
  geom_bar(width = 1)

# Transform the bar plot into a pie chart
p2 <- p1 + coord_polar("y", start = 0) + ggtitle("Proportion of cars by carb value") + labs(x = NULL, y = "Number of cars", fill = "carb")
p2

# Check the visual by computing the percentage of cars for each carb value
a1 <- 100 * prop.table(table(mtcars$carb))
a1
## 
##      1      2      3      4      6      8 
## 21.875 31.250  9.375 31.250  3.125  3.125

Comment From the pie chart alone we can see that carb values 2 and 4 account for the highest proportion of cars, and carb values 6 and 8 for the lowest. Computing the percentage of cars for each carb value confirms these observations.

  2. Create a bar graph that shows the number of each gear type in mtcars and write a brief summary.
ggplot(mtcars, aes(x=as.factor(gear)))+geom_bar(width = 0.5, fill="navy")+ggtitle("Count of each Gear Type in mtcars data")+xlab("gear")

Comments Cars with 3 gears are the most numerous, and cars with 5 gears the least. This may be because three-gear models were the most popular or the most economical, though there may be other reasons as well.
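The counts behind the bar graph can also be checked numerically, in the same spirit as the percentage check for the pie chart. The following is a minimal sketch using only base R and the built-in mtcars data; the name gear_counts is introduced here purely for illustration.

```r
# Tabulate the number of cars for each gear value in the built-in mtcars data.
data(mtcars)
gear_counts <- table(mtcars$gear)  # illustrative variable name
print(gear_counts)                 # 3 gears: 15, 4 gears: 12, 5 gears: 5

# Express the same counts as percentages of all cars in the data set.
print(round(100 * prop.table(gear_counts), 1))
```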

  3. Next show a stacked bar graph of the number of each gear type and how they are further divided out by cyl and write a brief summary.
ggplot(mtcars, aes(x=as.factor(gear), fill=as.factor(cyl)))+geom_bar(width = 0.5)+ggtitle("Count of each gear type, further divided by number of cylinders, in mtcars data")+xlab("gear")+labs(fill="cyl")

Comments The stacked bar graph shows that the mix of cylinder counts (cyl) differs across gear types. Cars with 3 gears include the largest number of 8-cylinder models, while cars with 4 gears include the largest number of 4-cylinder models. Similarly, 3-gear cars have the fewest 4-cylinder models, and 5-gear cars have the fewest 6-cylinder models.
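The segment heights in the stacked bars correspond to a two-way table of gear against cyl. As a rough check on the reading above, the sketch below builds that table with base R; the name gear_by_cyl is an assumption made for this example.

```r
# Cross-tabulate gear (rows) against cyl (columns) for the built-in mtcars data.
data(mtcars)
gear_by_cyl <- table(gear = mtcars$gear, cyl = mtcars$cyl)  # illustrative name
print(gear_by_cyl)

# Row-wise proportions: the share of each cylinder count within each gear type.
print(round(prop.table(gear_by_cyl, margin = 1), 2))
```

Each row of the first table should match the segment heights of one bar in the stacked bar graph.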

  4. Draw a scatter plot showing the relationship between wt and mpg and write a brief summary.
ggplot(mtcars,aes(x=wt, y=mpg))+geom_point()+ggtitle("scatter plot showing the relationship between wt and mpg")+xlab("wt")+ylab("mpg")

Comments The scatter plot shows a clear trend: the higher the weight (wt) of a model, the lower its mileage (mpg). This makes sense, since a heavier car needs more energy to cover the same distance.
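The strength of this trend can be quantified as well as seen. The sketch below computes the Pearson correlation between wt and mpg and fits a simple straight-line model; treating the relationship as linear is an assumption made here for illustration, and the names wt_mpg_cor and fit are not part of the original analysis.

```r
# Quantify the wt-mpg relationship in the built-in mtcars data.
data(mtcars)

# Pearson correlation: a value near -1 indicates a strong negative association.
wt_mpg_cor <- cor(mtcars$wt, mtcars$mpg)  # illustrative name
print(wt_mpg_cor)

# Simple linear fit (assumes a straight-line relationship).
# The wt coefficient estimates the change in mpg per unit increase in wt.
fit <- lm(mpg ~ wt, data = mtcars)
print(coef(fit))
```

A negative correlation and a negative wt coefficient are both consistent with the downward trend in the scatter plot.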

  5. Design a visualization of your choice using the data and write a brief summary about why you chose that visualization.
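# qplot() is deprecated as of ggplot2 3.4.0, so the equivalent ggplot() call is used here
ggplot(mtcars, aes(x=as.factor(gear), y=mpg, fill=as.factor(gear)))+geom_boxplot()+
    ggtitle("Mileage by Gear Number")+xlab("Number of gears")+ylab("Mileage: Miles per Gallon")+
    labs(fill="gear")+theme(plot.title=element_text(hjust=0.5))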

Comment The three groups of cars, with 3, 4, and 5 gears respectively, show clearly different mpg distributions. Mileage (mpg) appears best for 4-gear cars and worst for 3-gear cars, judging by both the median and the IQR. The distribution of mpg also appears most skewed for 3-gear cars, as the median line is very close to the first quartile. By the same reasoning, the 5-gear mpg values appear least skewed, with the median line close to the center of the IQR.
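The medians and quartiles read off the box plot can be computed directly. The following is a minimal sketch with base R; the name mpg_quartiles is an assumption for this example, and quantile() is used with its default settings.

```r
# Compute the first quartile, median, and third quartile of mpg for each gear value.
data(mtcars)
mpg_quartiles <- sapply(
  split(mtcars$mpg, mtcars$gear),   # one vector of mpg values per gear value
  quantile,
  probs = c(0.25, 0.5, 0.75)
)                                   # rows: 25%, 50%, 75%; columns: gear values
print(mpg_quartiles)

# Interquartile range per gear value (third quartile minus first quartile).
print(mpg_quartiles["75%", ] - mpg_quartiles["25%", ])
```

Comparing the median row with the midpoint of the other two rows gives a rough numeric view of the skew discussed above.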