mutate() is used to create new variables of modify existing ones within a data frame. It is used with other dplyr functions, like filter(), select(). Arguments: - data: The data frame you’re working with. - new_variable: The name of the new or modified variable. - expression: A mathematical or logical expression to create the variable.
hers.dat <- read_excel("C:/Users/MAVOURIN/Desktop/hersdata.xls")
data("mtcars")
str(mtcars)
## 'data.frame': 32 obs. of 11 variables:
## $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
## $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
## $ disp: num 160 160 108 258 360 ...
## $ hp : num 110 110 93 110 175 105 245 62 95 123 ...
## $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
## $ wt : num 2.62 2.88 2.32 3.21 3.44 ...
## $ qsec: num 16.5 17 18.6 19.4 17 ...
## $ vs : num 0 0 1 1 0 1 0 1 1 1 ...
## $ am : num 1 1 1 0 0 0 0 0 0 0 ...
## $ gear: num 4 4 4 3 3 3 3 4 4 4 ...
## $ carb: num 4 4 1 1 2 1 4 2 2 4 ...
mtcars <- mutate(mtcars,kpl=mpg * 0.43)
head(mtcars)
## mpg cyl disp hp drat wt qsec vs am gear carb kpl
## Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 9.030
## Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 9.030
## Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 9.804
## Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 9.202
## Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 8.041
## Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 7.783
data("mtcars")
mutate(mtcars,wt_kg = wt*454,hppl=hp/disp)
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
## Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
## Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
## Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
## Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
## Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
## Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
## Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
## Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
## Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
## Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
## Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
## Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
## Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
## Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
## Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
## Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
## Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
## Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
## Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
## Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
## Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
## AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
## Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
## Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
## Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
## Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
## Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
## Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
## Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
## Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
## Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
## wt_kg hppl
## Mazda RX4 1189.480 0.6875000
## Mazda RX4 Wag 1305.250 0.6875000
## Datsun 710 1053.280 0.8611111
## Hornet 4 Drive 1459.610 0.4263566
## Hornet Sportabout 1561.760 0.4861111
## Valiant 1570.840 0.4666667
## Duster 360 1620.780 0.6805556
## Merc 240D 1448.260 0.4226312
## Merc 230 1430.100 0.6747159
## Merc 280 1561.760 0.7338902
## Merc 280C 1561.760 0.7338902
## Merc 450SE 1847.780 0.6526468
## Merc 450SL 1693.420 0.6526468
## Merc 450SLC 1716.120 0.6526468
## Cadillac Fleetwood 2383.500 0.4343220
## Lincoln Continental 2462.496 0.4673913
## Chrysler Imperial 2426.630 0.5227273
## Fiat 128 998.800 0.8386277
## Honda Civic 733.210 0.6869221
## Toyota Corolla 833.090 0.9142053
## Toyota Corona 1119.110 0.8076603
## Dodge Challenger 1598.080 0.4716981
## AMC Javelin 1559.490 0.4934211
## Camaro Z28 1743.360 0.7000000
## Pontiac Firebird 1745.630 0.4375000
## Fiat X1-9 878.490 0.8354430
## Porsche 914-2 971.560 0.7564422
## Lotus Europa 686.902 1.1882229
## Ford Pantera L 1439.180 0.7521368
## Ferrari Dino 1257.580 1.2068966
## Maserati Bora 1620.780 1.1129568
## Volvo 142E 1262.120 0.9008264
mtcars_new <- mutate(mtcars,wt_kg = wt*454,hppl=hp/disp)
Suppose you want to adjust the mpg column by adding a constant factor (e.g., adding 5 to each value).
mutate(mtcars,mpg = mpg + 5)
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 26.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
## Mazda RX4 Wag 26.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
## Datsun 710 27.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
## Hornet 4 Drive 26.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
## Hornet Sportabout 23.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
## Valiant 23.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
## Duster 360 19.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
## Merc 240D 29.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
## Merc 230 27.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
## Merc 280 24.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
## Merc 280C 22.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
## Merc 450SE 21.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
## Merc 450SL 22.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
## Merc 450SLC 20.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
## Cadillac Fleetwood 15.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
## Lincoln Continental 15.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
## Chrysler Imperial 19.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
## Fiat 128 37.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
## Honda Civic 35.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
## Toyota Corolla 38.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
## Toyota Corona 26.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
## Dodge Challenger 20.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
## AMC Javelin 20.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
## Camaro Z28 18.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
## Pontiac Firebird 24.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
## Fiat X1-9 32.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
## Porsche 914-2 31.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
## Lotus Europa 35.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
## Ford Pantera L 20.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
## Ferrari Dino 24.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
## Maserati Bora 20.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
## Volvo 142E 26.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
Create a column efficient that categorizes cars as “High Efficiency” if they have a kpl greater than 10, otherwise as “Low Efficiency”.
mtcars_new <- mutate(mtcars,kpl=mpg *0.43,efficient=if_else(kpl>10,"high eff","low eff"))
Ssuppose we only want to see high efficient cars weight and Cylinder
mtcars_new <-select(filter(mutate(mtcars,kpl =mpg *0.43,efficient=if_else(kpl>10,"high eff","low eff")),efficient == "high eff"),wt,cyl)
pie(table(hers.dat$physact))
We want to put the percentage on each activity. To plot a pie chart using ggplot2 in R, you need to use a combination of geom_bar() to create a bar plot and coord_polar() to transform it into a pie chart.
physact_n = mutate(count(hers.dat,physact),prop=n/sum(n))
ggplot(physact_n,aes(x="",y=prop,fill=physact)) +
geom_bar(stat="identity",width=1) +
coord_polar("y",start=0) +theme_void() +
geom_text(aes(label = paste0(round(prop, 3))), position = position_stack(vjust = 0.5))
### Exercise You are working at the Hyundai Market Research Executive
Office. Your supervisor wants you to present some insights into the car
market. He is interested in the proportion of competitors by cylinder
count. He particularly prefers visual presentations, and pie charts are
his go-to for understanding data.
The most common graphical summary in analytic profession is a side by side boxplot. For example, if we wanted to compare the LDL cholesterol levels within each of the physical activity groups using the HERS data, we could use either boxplot or qplot.
with(hers.dat,boxplot(LDL~physact))
barplot(table(hers.dat$physact))
You can also use qplot
qplot(x = cyl,y=wt,data=mtcars,geom="boxplot")
qplot(x = as.factor(cyl), y = wt, data = mtcars, geom = "boxplot")
## Two quantitative variable - correlation cor() - regression will be
covered in week 10
We expect see a strong positive correlation between cyl and wt and a negative correlation between mpg and wt.
with(mtcars,cor(cyl,wt))
## [1] 0.7824958
with(mtcars,cor(mpg,wt))
## [1] -0.8676594
with(mtcars,cor(carb,am))
## [1] 0.05753435
plot(mtcars$mpg,mtcars$wt)
qplot(x=mpg,y=wt,data=mtcars,geom="point")
qplot(x=mpg,y=wt,data=mtcars,geom="line")
You can add a trend line
ggplot(mtcars,aes(x=mpg,y=wt)) + geom_point() + stat_smooth()
You can present the trend for different categories
ggplot(mtcars,aes(x=mpg,y=wt,colour=as.factor(am))) + geom_point() + stat_smooth()