Questions

Find the mtcars data in R. This is the dataset that you will use to create your graphics. Use that data to draw graphics by hand for the next 4 questions.

# Data
data("mtcars")
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
summary(mtcars)
##       mpg             cyl             disp             hp       
##  Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0  
##  1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5  
##  Median :19.20   Median :6.000   Median :196.3   Median :123.0  
##  Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7  
##  3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0  
##  Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0  
##       drat             wt             qsec             vs        
##  Min.   :2.760   Min.   :1.513   Min.   :14.50   Min.   :0.0000  
##  1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89   1st Qu.:0.0000  
##  Median :3.695   Median :3.325   Median :17.71   Median :0.0000  
##  Mean   :3.597   Mean   :3.217   Mean   :17.85   Mean   :0.4375  
##  3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90   3rd Qu.:1.0000  
##  Max.   :4.930   Max.   :5.424   Max.   :22.90   Max.   :1.0000  
##        am              gear            carb      
##  Min.   :0.0000   Min.   :3.000   Min.   :1.000  
##  1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:2.000  
##  Median :0.0000   Median :4.000   Median :2.000  
##  Mean   :0.4062   Mean   :3.688   Mean   :2.812  
##  3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:4.000  
##  Max.   :1.0000   Max.   :5.000   Max.   :8.000

1. Draw a pie chart showing the proportion of cars from the mtcars data set that have different carb values.

# Calculations to draw pie chart

# calculate frequency of different carb values for mtcars dataset
t_cars<-table(mtcars$carb)

# calculate the percentage of cars for each carb value
total <- sum(as.vector(t_cars))
percentage <- as.vector(t_cars) * 100 / total

# calculate the slice angle in degrees for each carb value (a full pie is 360 degrees)
degrees <- as.vector(t_cars) * 360 / total

t_cars_new <- rbind(as.numeric(names(t_cars)), t_cars)
rownames(t_cars_new) <- c("Carb", "Count")

# add percentage and degree rows to the same table
t_cars_new <- as.data.frame(t_cars_new)
t_cars_new <- rbind(t_cars_new,percentage)
t_cars_new <- rbind(t_cars_new,degrees)
rownames(t_cars_new) <- c("Carb", "Count","Percentage", "Degree")
t_cars_new
##                 1      2      3      4      6      8
## Carb        1.000   2.00  3.000   4.00  6.000  8.000
## Count       7.000  10.00  3.000  10.00  1.000  1.000
## Percentage 21.875  31.25  9.375  31.25  3.125  3.125
## Degree     78.750 112.50 33.750 112.50 11.250 11.250
knitr::include_graphics("E:\\Harrisburg courses\\sem 6-Fall 2017\\Data Visualization\\lecture1\\Question_1-edit.jpg")
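
One way to cross-check the hand-drawn pie chart is to let base R draw it from the same counts. The following is a minimal sketch, not part of the original answer; the names carb_counts, carb_pct, carb_deg and slice_labels are chosen here for illustration.

```r
# Sketch: recompute the slice angles and draw the pie with base R graphics.
data("mtcars")

# Count the cars for each carb value
carb_counts <- table(mtcars$carb)

# Share of each carb value as a percentage and as a slice angle in degrees
carb_pct <- as.vector(prop.table(carb_counts)) * 100
carb_deg <- as.vector(prop.table(carb_counts)) * 360

# The slice angles of a complete pie must add up to 360 degrees
stopifnot(isTRUE(all.equal(sum(carb_deg), 360)))

# Label each slice with its carb value and percentage, then draw the chart
slice_labels <- paste0(names(carb_counts), " carb (", round(carb_pct, 1), "%)")
pie(carb_counts, labels = slice_labels, main = "Cars by number of carburetors")
```

The angles in carb_deg should agree with the Degree row of the table above, which is what a protractor would be set to when drawing each slice by hand.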

2. Draw a bar graph that shows the number of each gear type in mtcars.

# Count the cars of each gear type with a one-way table on gear
g_cars <- table(mtcars$gear)
g_cars_new <- rbind(as.numeric(names(g_cars)), g_cars)
rownames(g_cars_new) <- c("Gear", "Count")
g_cars_new
##        3  4 5
## Gear   3  4 5
## Count 15 12 5
knitr::include_graphics("E:\\Harrisburg courses\\sem 6-Fall 2017\\Data Visualization\\lecture1\\Question_2-edit.jpg")
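
The hand-drawn bar graph can be compared against a plotted version. This is a small sketch using base R's barplot(); the variable names gear_counts and bar_mid and the axis labels are assumptions made for this example.

```r
# Sketch: bar graph of the number of cars per gear type.
data("mtcars")

# One-way frequency table of gear
gear_counts <- table(mtcars$gear)

# barplot() returns the x positions of the bar midpoints,
# which are reused to print each count above its bar
bar_mid <- barplot(gear_counts,
                   xlab = "Number of forward gears",
                   ylab = "Number of cars",
                   main = "Cars by gear type",
                   ylim = c(0, max(gear_counts) + 2))
text(bar_mid, as.vector(gear_counts), labels = as.vector(gear_counts), pos = 3)
```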

3. Next, show a stacked bar graph of the number of each gear type and how each is further divided by cyl.

# Count the cars for each combination of gear and cyl with a two-way table
gc_cars <- table(mtcars$gear, mtcars$cyl)
names(dimnames(gc_cars)) <- c("gear", "cyl")
gc_cars
##     cyl
## gear  4  6  8
##    3  1  2 12
##    4  8  4  0
##    5  2  1  2
knitr::include_graphics("E:\\Harrisburg courses\\sem 6-Fall 2017\\Data Visualization\\lecture1\\Question_3-edit.jpg")
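
A plotted version of the stacked bar graph may help verify the segment heights. In base R, barplot() draws one bar per matrix column and stacks the rows, so this sketch builds the table with cyl as rows and gear as columns (the transpose of the table above). The name cyl_by_gear and the labels are assumptions for illustration.

```r
# Sketch: stacked bar graph, one bar per gear type, segments by cyl.
data("mtcars")

# Rows (cyl) become the stacked segments, columns (gear) become the bars
cyl_by_gear <- table(cyl = mtcars$cyl, gear = mtcars$gear)

barplot(cyl_by_gear,
        xlab = "Number of forward gears",
        ylab = "Number of cars",
        main = "Cars by gear type, split by cylinders",
        legend.text = paste(rownames(cyl_by_gear), "cyl"),
        args.legend = list(x = "topright"))

# The total height of each bar equals the gear counts from question 2
colSums(cyl_by_gear)
```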

4. Draw a scatter plot showing the relationship between wt and mpg.

# Extract only wt and mpg columns from dataset to draw a scatter plot
wm_cars <- subset(mtcars, select=c("wt", "mpg"))
wm_cars
##                        wt  mpg
## Mazda RX4           2.620 21.0
## Mazda RX4 Wag       2.875 21.0
## Datsun 710          2.320 22.8
## Hornet 4 Drive      3.215 21.4
## Hornet Sportabout   3.440 18.7
## Valiant             3.460 18.1
## Duster 360          3.570 14.3
## Merc 240D           3.190 24.4
## Merc 230            3.150 22.8
## Merc 280            3.440 19.2
## Merc 280C           3.440 17.8
## Merc 450SE          4.070 16.4
## Merc 450SL          3.730 17.3
## Merc 450SLC         3.780 15.2
## Cadillac Fleetwood  5.250 10.4
## Lincoln Continental 5.424 10.4
## Chrysler Imperial   5.345 14.7
## Fiat 128            2.200 32.4
## Honda Civic         1.615 30.4
## Toyota Corolla      1.835 33.9
## Toyota Corona       2.465 21.5
## Dodge Challenger    3.520 15.5
## AMC Javelin         3.435 15.2
## Camaro Z28          3.840 13.3
## Pontiac Firebird    3.845 19.2
## Fiat X1-9           1.935 27.3
## Porsche 914-2       2.140 26.0
## Lotus Europa        1.513 30.4
## Ford Pantera L      3.170 15.8
## Ferrari Dino        2.770 19.7
## Maserati Bora       3.570 15.0
## Volvo 142E          2.780 21.4
knitr::include_graphics("E:\\Harrisburg courses\\sem 6-Fall 2017\\Data Visualization\\lecture1\\Question_4-edit.jpg")
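
To check the point positions on the hand-drawn scatter plot, the same two columns can be plotted directly. This is a sketch with base R's plot(); the axis labels follow the units given in the mtcars help page, and wt_mpg_cor is a name introduced here to summarize the direction of the relationship.

```r
# Sketch: scatter plot of weight against fuel economy.
data("mtcars")

plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)",
     ylab = "Miles per gallon",
     main = "Fuel economy versus weight",
     pch = 19)

# Pearson correlation; a negative value means heavier cars tend to have lower mpg
wt_mpg_cor <- cor(mtcars$wt, mtcars$mpg)
wt_mpg_cor
```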

5. Design a visualization of your choice using the data.

# Histogram
# disp ranges from 71.1 to 472, so it is binned into intervals of width 50 from 0 to 500.
# Assign each disp value to its bin.
Range_disp <- cut(mtcars$disp, breaks = seq(0, 500, by = 50))
# Calculate the count (frequency) for each bin
as.data.frame(table(Range_disp))
##    Range_disp Freq
## 1      (0,50]    0
## 2    (50,100]    5
## 3   (100,150]    7
## 4   (150,200]    4
## 5   (200,250]    1
## 6   (250,300]    4
## 7   (300,350]    4
## 8   (350,400]    4
## 9   (400,450]    1
## 10  (450,500]    2
knitr::include_graphics("E:\\Harrisburg courses\\sem 6-Fall 2017\\Data Visualization\\lecture1\\Question_5-edit.jpg")
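
The binned counts above can also be obtained from hist(), which draws the histogram at the same time. This is a minimal sketch assuming the same breaks as the cut() call; disp_breaks and disp_hist are illustrative names. With its default settings hist() uses right-closed intervals like cut(), so the counts should line up with the frequency table.

```r
# Sketch: histogram of disp with bins of width 50.
data("mtcars")

# Same break points as used for cut() above
disp_breaks <- seq(0, 500, by = 50)

# hist() draws the plot and returns the bin counts
disp_hist <- hist(mtcars$disp,
                  breaks = disp_breaks,
                  xlab = "Displacement (cu. in.)",
                  main = "Distribution of engine displacement")

# Bin counts, in the same order as the frequency table
disp_hist$counts
```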