Load the mtcars dataset in R. This is the dataset that you will use to create your graphics. Using that data, draw the graphics for the next 4 questions by hand.
# Data
data("mtcars")
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
summary(mtcars)
##       mpg             cyl             disp             hp
##  Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0
##  1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5
##  Median :19.20   Median :6.000   Median :196.3   Median :123.0
##  Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7
##  3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0
##  Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0
##       drat             wt             qsec             vs
##  Min.   :2.760   Min.   :1.513   Min.   :14.50   Min.   :0.0000
##  1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89   1st Qu.:0.0000
##  Median :3.695   Median :3.325   Median :17.71   Median :0.0000
##  Mean   :3.597   Mean   :3.217   Mean   :17.85   Mean   :0.4375
##  3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90   3rd Qu.:1.0000
##  Max.   :4.930   Max.   :5.424   Max.   :22.90   Max.   :1.0000
##        am              gear            carb
##  Min.   :0.0000   Min.   :3.000   Min.   :1.000
##  1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:2.000
##  Median :0.0000   Median :4.000   Median :2.000
##  Mean   :0.4062   Mean   :3.688   Mean   :2.812
##  3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:4.000
##  Max.   :1.0000   Max.   :5.000   Max.   :8.000
1. Draw a pie chart showing the proportion of cars from the mtcars data set that have different carb values.
# Calculations to draw pie chart
# calculate the frequency of each carb value in the mtcars dataset
t_cars <- table(mtcars$carb)
# calculate the percentage of cars with each carb value
total <- sum(as.vector(t_cars))
percentage <- as.vector(t_cars) * 100 / total
# calculate the slice angle in degrees for the pie chart
degrees <- as.vector(t_cars) * 360 / total
t_cars_new <- rbind(as.numeric(names(t_cars)), t_cars)
rownames(t_cars_new) <- c("Carb", "Count")
# add the percentage and degrees rows to the same table
t_cars_new <- as.data.frame(t_cars_new)
t_cars_new <- rbind(t_cars_new,percentage)
t_cars_new <- rbind(t_cars_new,degrees)
rownames(t_cars_new) <- c("Carb", "Count","Percentage", "Degree")
t_cars_new
##                 1      2      3      4      6      8
## Carb        1.000   2.00  3.000   4.00  6.000  8.000
## Count       7.000  10.00  3.000  10.00  1.000  1.000
## Percentage 21.875  31.25  9.375  31.25  3.125  3.125
## Degree     78.750 112.50 33.750 112.50 11.250 11.250
knitr::include_graphics("E:\\Harrisburg courses\\sem 6-Fall 2017\\Data Visualization\\lecture1\\Question_1-edit.jpg")
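The hand-drawn pie chart can be cross-checked against one drawn by R. The following is a minimal sketch, assuming base R graphics; the names `carb_counts`, `carb_pct`, and `slice_labels` are illustrative and not part of the original answer.

```r
# Count the cars for each carb value and convert the counts to percentages.
carb_counts <- table(mtcars$carb)
carb_pct <- round(100 * prop.table(carb_counts), 1)

# Label each slice with its carb value and its percentage share.
slice_labels <- paste0(names(carb_counts), " carb (", carb_pct, "%)")

# Draw the pie chart; the slice angles should match the Degree row above.
pie(carb_counts, labels = slice_labels, main = "Share of cars by carb value")
```

The slice sizes follow the same counts as the `Count` row, so any disagreement with the drawing points to an arithmetic or protractor slip.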
2. Draw a bar graph that shows the number of each gear type in mtcars.
# Calculate the number of cars of each gear type by tabulating the gear column
g_cars <- table(mtcars$gear)
g_cars_new <- rbind(as.numeric(names(g_cars)), g_cars)
rownames(g_cars_new) <- c("Gear", "Count")
g_cars_new
##        3  4  5
## Gear   3  4  5
## Count 15 12  5
knitr::include_graphics("E:\\Harrisburg courses\\sem 6-Fall 2017\\Data Visualization\\lecture1\\Question_2-edit.jpg")
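As a check on the hand-drawn bar graph, the same counts can be plotted directly. This is a sketch assuming base R's `barplot()`; the name `gear_counts` and the axis labels are illustrative choices.

```r
# Tabulate the number of cars for each gear value.
gear_counts <- table(mtcars$gear)

# One bar per gear value; bar height is the number of cars.
barplot(gear_counts,
        xlab = "Number of forward gears",
        ylab = "Number of cars",
        main = "Cars by gear type")
```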
3. Next, show a stacked bar graph of the number of each gear type and how they are further divided by cyl.
# Calculate the number of cars for each combination of gear and cyl with a two-way table
gc_cars <- table(mtcars$gear,mtcars$cyl)
names(dimnames(gc_cars)) <- c("gear", "cyl")
gc_cars
##     cyl
## gear  4  6  8
##    3  1  2 12
##    4  8  4  0
##    5  2  1  2
knitr::include_graphics("E:\\Harrisburg courses\\sem 6-Fall 2017\\Data Visualization\\lecture1\\Question_3-edit.jpg")
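The stacked bar graph can also be reproduced in code for comparison. In this sketch, which assumes base R's `barplot()`, the table is built with cyl as rows and gear as columns, because `barplot()` draws one bar per column of a matrix and stacks the rows within it. The name `cyl_by_gear` is illustrative.

```r
# Rows are cyl values, columns are gear values.
cyl_by_gear <- table(cyl = mtcars$cyl, gear = mtcars$gear)

# One bar per gear value, with segments stacked by cyl.
barplot(cyl_by_gear,
        legend.text = paste(rownames(cyl_by_gear), "cyl"),
        xlab = "Number of forward gears",
        ylab = "Number of cars",
        main = "Cars by gear type, split by cyl")
```

The total height of each bar equals the corresponding count from question 2.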
4. Draw a scatter plot showing the relationship between wt and mpg.
# Extract only wt and mpg columns from dataset to draw a scatter plot
wm_cars <- subset(mtcars, select=c("wt", "mpg"))
wm_cars
##                        wt  mpg
## Mazda RX4           2.620 21.0
## Mazda RX4 Wag       2.875 21.0
## Datsun 710          2.320 22.8
## Hornet 4 Drive      3.215 21.4
## Hornet Sportabout   3.440 18.7
## Valiant             3.460 18.1
## Duster 360          3.570 14.3
## Merc 240D           3.190 24.4
## Merc 230            3.150 22.8
## Merc 280            3.440 19.2
## Merc 280C           3.440 17.8
## Merc 450SE          4.070 16.4
## Merc 450SL          3.730 17.3
## Merc 450SLC         3.780 15.2
## Cadillac Fleetwood  5.250 10.4
## Lincoln Continental 5.424 10.4
## Chrysler Imperial   5.345 14.7
## Fiat 128            2.200 32.4
## Honda Civic         1.615 30.4
## Toyota Corolla      1.835 33.9
## Toyota Corona       2.465 21.5
## Dodge Challenger    3.520 15.5
## AMC Javelin         3.435 15.2
## Camaro Z28          3.840 13.3
## Pontiac Firebird    3.845 19.2
## Fiat X1-9           1.935 27.3
## Porsche 914-2       2.140 26.0
## Lotus Europa        1.513 30.4
## Ford Pantera L      3.170 15.8
## Ferrari Dino        2.770 19.7
## Maserati Bora       3.570 15.0
## Volvo 142E          2.780 21.4
knitr::include_graphics("E:\\Harrisburg courses\\sem 6-Fall 2017\\Data Visualization\\lecture1\\Question_4-edit.jpg")
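To verify the plotted points, the scatter plot can be drawn in R from the same two columns. This is a sketch assuming base R's `plot()`; the correlation in `wt_mpg_cor` is an added summary, not part of the original answer, and the axis labels follow the units given in the mtcars documentation.

```r
# Weight on the x-axis, fuel economy on the y-axis.
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)",
     ylab = "Miles per gallon",
     main = "mpg versus wt")

# Pearson correlation as a numeric summary of the relationship.
wt_mpg_cor <- cor(mtcars$wt, mtcars$mpg)
wt_mpg_cor
```

A negative value of `wt_mpg_cor` agrees with the downward trend in the plot: heavier cars tend to have lower mpg.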
5. Design a visualization of your choice using the data.
# Histogram
# disp ranges from 71.1 to 472. A histogram is used to visualize disp, with a bin width of 50.
# Divide the range 0 to 500 into bins of width 50.
Range_disp <- cut(mtcars$disp, breaks=seq(0,500, by=50))
# Calculate the count/frequency for each bin
as.data.frame(table(Range_disp))
##    Range_disp Freq
## 1      (0,50]    0
## 2    (50,100]    5
## 3   (100,150]    7
## 4   (150,200]    4
## 5   (200,250]    1
## 6   (250,300]    4
## 7   (300,350]    4
## 8   (350,400]    4
## 9   (400,450]    1
## 10  (450,500]    2
knitr::include_graphics("E:\\Harrisburg courses\\sem 6-Fall 2017\\Data Visualization\\lecture1\\Question_5-edit.jpg")
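The histogram can be drawn directly with the same breaks to confirm the bin counts. This is a sketch assuming base R's `hist()`, whose default right-closed bins match the intervals produced by `cut()`; the names `disp_breaks` and `disp_hist` are illustrative.

```r
# Same bin edges as the cut() call: 0, 50, ..., 500.
disp_breaks <- seq(0, 500, by = 50)

# Draw the histogram and keep the returned object to inspect the counts.
disp_hist <- hist(mtcars$disp,
                  breaks = disp_breaks,
                  xlab = "Displacement (cu. in.)",
                  main = "Distribution of disp")

# Bin counts, in the same order as the Freq column of the table.
disp_hist$counts
```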