mtcarscyl <- table(mtcars$cyl) # Count cars per number of cylinders
mtcars$cyl <- factor(mtcars$cyl) # Create a categorical variable
percent_cyl <- round(100*mtcarscyl/sum(mtcarscyl), 1)
pielabels <- paste(percent_cyl, "%", sep="")
pie(mtcarscyl, col = rainbow(length(mtcarscyl)), labels = pielabels, main = 'Number of Cylinders', cex = 0.8)
legend("right", names(mtcarscyl), cex = 0.6, fill = rainbow(length(mtcarscyl)))
Before creating the graphs, I ran head() and summary() to better understand the dataset so that I could generate meaningful graphs. The first graph above shows the percentage of cars in each cylinder category within this dataset. I first created a table of the counts and converted the variable (cyl) into a factor to make the data easier to manipulate and plot. Second, I created another variable holding the percentage for each number of cylinders. Finally, I used those percentages as labels on the counts to produce this pie chart.
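The exploration and percentage steps described above can be sketched as follows. This is only an illustration: prop.table() is a base R alternative to dividing by sum(), not the call used in the chunk above, and the names cyl_counts, cyl_percent, and cyl_labels are hypothetical.

```r
# Inspect the data before plotting
head(mtcars)
summary(mtcars)

# Count cars per number of cylinders, then convert the counts to percentages
cyl_counts <- table(mtcars$cyl)
cyl_percent <- round(100 * prop.table(cyl_counts), 1)

# Build one label per slice by appending a percent sign
cyl_labels <- paste0(cyl_percent, "%")
cyl_labels
```

Because the labels are derived from the table itself, they stay in the same order as the slices that pie() draws.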
library(ggplot2) # Provides ggplot(), geom_bar(), and the other plotting functions used below
mtcars$carb <- factor(mtcars$carb)
bar <- ggplot(data=mtcars, aes(carb)) +
geom_bar(color = "grey", width = 0.3) +
ggtitle("Carburetors Type Count")+
theme_bw()
bar
The second graph is a bar chart showing the count of one categorical variable, the number of carburetors. I used ggplot2 and added a title and a theme to make the chart clearer.
mtcars$gear <- factor(mtcars$gear)
gear_cyl <- ggplot(data=mtcars, aes(gear, fill = cyl))+
geom_bar(position = "stack")+
ggtitle("Gear Type by Number of Cylinders")+
labs(fill = "Number of Cylinders")+ # cyl is mapped to fill, so the legend title is set through fill
theme_bw()
gear_cyl
The third chart is also a bar chart, this time showing the number of cars for each number of gears, with each bar stacked to show how those cars divide by number of cylinders.
scatter <- ggplot(mtcars, aes(wt, mpg)) +
geom_point(fill = "red",colour = "red", alpha=0.9, shape = 1, size = 3) +
xlab('Weight (x 1000lbs)') + ylab('Miles per Gallon') +
geom_smooth() +
ggtitle("Relationship between Weight (1000 lbs) and Miles per gallon")+
theme_bw()
scatter
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
In the fourth graph, I created a scatter plot with weight as the x variable and miles per gallon as the y variable. I also added geom_smooth() to further show the relationship between the two variables.
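The message printed above reports that geom_smooth() chose a loess smoother by default. As a minimal sketch, the same choice can be written out explicitly, which documents the method and should avoid the message. The object name scatter_explicit is hypothetical, and the styling is trimmed to the essentials.

```r
library(ggplot2)

# Same scatter plot, with the smoother's method and formula spelled out
scatter_explicit <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point(colour = "red", shape = 1, size = 3) +
  geom_smooth(method = "loess", formula = y ~ x) +
  xlab("Weight (x 1000lbs)") + ylab("Miles per Gallon") +
  theme_bw()
scatter_explicit
```

Swapping method = "loess" for method = "lm" would draw a straight-line fit instead, if a linear summary of the relationship is preferred.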
mpg_gear <- qplot(gear, mpg, data=mtcars, geom=c("boxplot", "jitter"),
fill=gear, main="Mileage by Gear Number",
xlab="", ylab="Miles per Gallon")
mpg_gear
For the last graph, I created boxplots of mpg by number of gears to see how mileage is distributed across the different gear counts. Please note that the points, or observations, are overlaid and jittered.
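qplot() is deprecated in recent ggplot2 releases, so the same boxplot can also be built with ggplot() directly. The following is a sketch of an equivalent call, not the code used above. The names cars and mpg_gear_gg are hypothetical, and the data is copied so the example runs on its own.

```r
library(ggplot2)

# Work on a copy so the built-in dataset is left untouched
cars <- mtcars
cars$gear <- factor(cars$gear)

# Boxplots of mpg per gear count, with the jittered observations drawn on top
mpg_gear_gg <- ggplot(cars, aes(gear, mpg, fill = gear)) +
  geom_boxplot() +
  geom_jitter() +
  labs(title = "Mileage by Gear Number", x = "", y = "Miles per Gallon")
mpg_gear_gg
```

By default geom_jitter() moves points both horizontally and vertically. Passing height = 0 would keep each point at its true mpg value while still spreading the points sideways.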