Mutate Function

mutate() is used to create new variables of modify existing ones within a data frame. It is used with other dplyr functions, like filter(), select(). Arguments: - data: The data frame you’re working with. - new_variable: The name of the new or modified variable. - expression: A mathematical or logical expression to create the variable.

hers.dat <- read_excel("C:/Users/MAVOURIN/Desktop/hersdata.xls")
data("mtcars") 
str(mtcars)
## 'data.frame':    32 obs. of  11 variables:
##  $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
##  $ disp: num  160 160 108 258 360 ...
##  $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
##  $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##  $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
##  $ qsec: num  16.5 17 18.6 19.4 17 ...
##  $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
##  $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
##  $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
##  $ carb: num  4 4 1 1 2 1 4 2 2 4 ...
mtcars <- mutate(mtcars,kpl=mpg * 0.43)
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb   kpl
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4 9.030
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4 9.030
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1 9.804
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1 9.202
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2 8.041
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1 7.783

Mutating Multiple Variables at Once

Context: Add two new columns — one for weight in kilograms (assuming wt is in tons) and another for horsepower per liter (hppl).

data("mtcars") 
mutate(mtcars,wt_kg = wt*454,hppl=hp/disp)
##                      mpg cyl  disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
## Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
## Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
## Merc 240D           24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
## Merc 230            22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
## Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
## Merc 280C           17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
## Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
## Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
## Merc 450SLC         15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
## Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
## Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
## Chrysler Imperial   14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
## Fiat 128            32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
## Honda Civic         30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
## Toyota Corolla      33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
## Toyota Corona       21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
## Dodge Challenger    15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
## AMC Javelin         15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
## Camaro Z28          13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
## Pontiac Firebird    19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
## Fiat X1-9           27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
## Porsche 914-2       26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
## Lotus Europa        30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
## Ford Pantera L      15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
## Ferrari Dino        19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
## Maserati Bora       15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
## Volvo 142E          21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2
##                        wt_kg      hppl
## Mazda RX4           1189.480 0.6875000
## Mazda RX4 Wag       1305.250 0.6875000
## Datsun 710          1053.280 0.8611111
## Hornet 4 Drive      1459.610 0.4263566
## Hornet Sportabout   1561.760 0.4861111
## Valiant             1570.840 0.4666667
## Duster 360          1620.780 0.6805556
## Merc 240D           1448.260 0.4226312
## Merc 230            1430.100 0.6747159
## Merc 280            1561.760 0.7338902
## Merc 280C           1561.760 0.7338902
## Merc 450SE          1847.780 0.6526468
## Merc 450SL          1693.420 0.6526468
## Merc 450SLC         1716.120 0.6526468
## Cadillac Fleetwood  2383.500 0.4343220
## Lincoln Continental 2462.496 0.4673913
## Chrysler Imperial   2426.630 0.5227273
## Fiat 128             998.800 0.8386277
## Honda Civic          733.210 0.6869221
## Toyota Corolla       833.090 0.9142053
## Toyota Corona       1119.110 0.8076603
## Dodge Challenger    1598.080 0.4716981
## AMC Javelin         1559.490 0.4934211
## Camaro Z28          1743.360 0.7000000
## Pontiac Firebird    1745.630 0.4375000
## Fiat X1-9            878.490 0.8354430
## Porsche 914-2        971.560 0.7564422
## Lotus Europa         686.902 1.1882229
## Ford Pantera L      1439.180 0.7521368
## Ferrari Dino        1257.580 1.2068966
## Maserati Bora       1620.780 1.1129568
## Volvo 142E          1262.120 0.9008264
mtcars_new <- mutate(mtcars,wt_kg = wt*454,hppl=hp/disp)

Modifying Existing Variables

Suppose you want to adjust the mpg column by adding a constant factor (e.g., adding 5 to each value).

mutate(mtcars,mpg = mpg + 5)
##                      mpg cyl  disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4           26.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag       26.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710          27.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive      26.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout   23.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
## Valiant             23.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
## Duster 360          19.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
## Merc 240D           29.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
## Merc 230            27.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
## Merc 280            24.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
## Merc 280C           22.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
## Merc 450SE          21.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
## Merc 450SL          22.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
## Merc 450SLC         20.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
## Cadillac Fleetwood  15.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
## Lincoln Continental 15.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
## Chrysler Imperial   19.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
## Fiat 128            37.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
## Honda Civic         35.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
## Toyota Corolla      38.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
## Toyota Corona       26.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
## Dodge Challenger    20.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
## AMC Javelin         20.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
## Camaro Z28          18.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
## Pontiac Firebird    24.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
## Fiat X1-9           32.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
## Porsche 914-2       31.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
## Lotus Europa        35.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
## Ford Pantera L      20.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
## Ferrari Dino        24.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
## Maserati Bora       20.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
## Volvo 142E          26.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2

Conditional Mutations with if_else()

Create a column efficient that categorizes cars as “High Efficiency” if they have a kpl greater than 10, otherwise as “Low Efficiency”.

mtcars_new <- mutate(mtcars,kpl=mpg *0.43,efficient=if_else(kpl>10,"high eff","low eff"))

Exercise

Using mutate() with Other dplyr Functions

Ssuppose we only want to see high efficient cars weight and Cylinder

mtcars_new <-select(filter(mutate(mtcars,kpl =mpg *0.43,efficient=if_else(kpl>10,"high eff","low eff")),efficient == "high eff"),wt,cyl)

More Graphical Summaries

pie(table(hers.dat$physact))

We want to put the percentage on each activity. To plot a pie chart using ggplot2 in R, you need to use a combination of geom_bar() to create a bar plot and coord_polar() to transform it into a pie chart.

physact_n =  mutate(count(hers.dat,physact),prop=n/sum(n))
ggplot(physact_n,aes(x="",y=prop,fill=physact)) + 
  geom_bar(stat="identity",width=1) + 
  coord_polar("y",start=0) +theme_void() +
  geom_text(aes(label = paste0(round(prop, 3))), position = position_stack(vjust = 0.5))

### Exercise You are working at the Hyundai Market Research Executive Office. Your supervisor wants you to present some insights into the car market. He is interested in the proportion of competitors by cylinder count. He particularly prefers visual presentations, and pie charts are his go-to for understanding data.

More graphical Summaries

The most common graphical summary in analytic profession is a side by side boxplot. For example, if we wanted to compare the LDL cholesterol levels within each of the physical activity groups using the HERS data, we could use either boxplot or qplot.

with(hers.dat,boxplot(LDL~physact)) 
barplot(table(hers.dat$physact))

You can also use qplot

qplot(x = cyl,y=wt,data=mtcars,geom="boxplot")
qplot(x = as.factor(cyl), y = wt, data = mtcars, geom = "boxplot")

## Two quantitative variable - correlation cor() - regression will be covered in week 10

We expect see a strong positive correlation between cyl and wt and a negative correlation between mpg and wt.

with(mtcars,cor(cyl,wt))
## [1] 0.7824958
with(mtcars,cor(mpg,wt))
## [1] -0.8676594
with(mtcars,cor(carb,am))
## [1] 0.05753435
plot(mtcars$mpg,mtcars$wt)

qplot(x=mpg,y=wt,data=mtcars,geom="point")
qplot(x=mpg,y=wt,data=mtcars,geom="line")

You can add a trend line

ggplot(mtcars,aes(x=mpg,y=wt)) + geom_point() + stat_smooth()

You can present the trend for different categories

ggplot(mtcars,aes(x=mpg,y=wt,colour=as.factor(am))) + geom_point() + stat_smooth()