#pfizer dataset
data=read.csv("Pfizer.csv")
data
## Org_Indiv First_Plus First_Name Last_Name
## 1 3-D Medical Services Llc Steven Bruce Steven Deitelzweig
## 2 Aa Doctors, Inc. Aakash Mohan Aakash Ahuja
## 3 Abbo, Lilian Margarita Lilian Margarita Lilian Abbo
## 4 Abbo, Lilian Margarita Lilian Margarita Lilian Abbo
## 5 Abbo, Lilian Margarita Lilian Margarita Lilian Abbo
## 6 Abdullah Raffee Md Pc Abdullah Abdullah Raffee
## 7 Abebe, Sheila Y Sheila Y Sheila Abebe
## 8 Abebe, Sheila Y Sheila Y Sheila Abebe
## 9 Abilene Family Foot Center Galen Chris Galen Albritton
## 10 Abolnik, Igor Z Igor Z Igor Abolnik
## 11 Abolnik, Igor Z Igor Z Igor Abolnik
## City State Category Cash Other Total
## 1 New Orleans LA Professional Advising 2625 0 2625
## 2 Paso Robles CA Expert-Led Forums 1000 0 1000
## 3 Miami FL Business Related Travel 0 448 448
## 4 Miami FL Meals 0 119 119
## 5 Miami FL Professional Advising 1800 0 1800
## 6 Flint MI Expert-Led Forums 750 0 750
## 7 Indianapolis IN Educational Items 0 47 47
## 8 Indianapolis IN Expert-Led Forums 825 0 825
## 9 Abilene TX Professional Advising 3000 0 3000
## 10 Provo UT Business Related Travel 0 396 396
## 11 Provo Ut Expert-Led Forums 1750 0 1750
‘pfizer’ is a sample data.
View the structure of the data and perform conversion operations & Provide a statistical summary of the given data.
#a. and b. structure and summary of the data
str(data)#structure
## 'data.frame': 11 obs. of 10 variables:
## $ Org_Indiv : chr "3-D Medical Services Llc" "Aa Doctors, Inc." "Abbo, Lilian Margarita" "Abbo, Lilian Margarita" ...
## $ First_Plus: chr "Steven Bruce" "Aakash Mohan" "Lilian Margarita" "Lilian Margarita" ...
## $ First_Name: chr "Steven" "Aakash" "Lilian" "Lilian" ...
## $ Last_Name : chr "Deitelzweig" "Ahuja" "Abbo" "Abbo" ...
## $ City : chr "New Orleans" "Paso Robles" "Miami" "Miami" ...
## $ State : chr "LA" "CA" "FL" "FL" ...
## $ Category : chr "Professional Advising" "Expert-Led Forums" "Business Related Travel" "Meals" ...
## $ Cash : int 2625 1000 0 0 1800 750 0 825 3000 0 ...
## $ Other : int 0 0 448 119 0 0 47 0 0 396 ...
## $ Total : int 2625 1000 448 119 1800 750 47 825 3000 396 ...
summary(data)#summary
## Org_Indiv First_Plus First_Name Last_Name
## Length:11 Length:11 Length:11 Length:11
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## City State Category Cash
## Length:11 Length:11 Length:11 Min. : 0
## Class :character Class :character Class :character 1st Qu.: 0
## Mode :character Mode :character Mode :character Median : 825
## Mean :1068
## 3rd Qu.:1775
## Max. :3000
## Other Total
## Min. : 0.00 Min. : 47
## 1st Qu.: 0.00 1st Qu.: 422
## Median : 0.00 Median : 825
## Mean : 91.82 Mean :1160
## 3rd Qu.: 83.00 3rd Qu.:1775
## Max. :448.00 Max. :3000
#conversion functions
library(dplyr)
## Warning: package 'dplyr' was built under R version 4.2.3
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
# Example 1: Convert a column to a different data type (e.g., from character to numeric)
my_data <- as.numeric(data$Total)
my_data
## [1] 2625 1000 448 119 1800 750 47 825 3000 396 1750
# Example 2: Convert a column to a factor
my_data <- as.factor(data$Category)
my_data
## [1] Professional Advising Expert-Led Forums Business Related Travel
## [4] Meals Professional Advising Expert-Led Forums
## [7] Educational Items Expert-Led Forums Professional Advising
## [10] Business Related Travel Expert-Led Forums
## 5 Levels: Business Related Travel Educational Items ... Professional Advising
# Example 3: Perform some data transformations using dplyr (e.g., calculate a new column)
my_data <- data %>%
mutate(TotalSquared = Total^2)
my_data
## Org_Indiv First_Plus First_Name Last_Name
## 1 3-D Medical Services Llc Steven Bruce Steven Deitelzweig
## 2 Aa Doctors, Inc. Aakash Mohan Aakash Ahuja
## 3 Abbo, Lilian Margarita Lilian Margarita Lilian Abbo
## 4 Abbo, Lilian Margarita Lilian Margarita Lilian Abbo
## 5 Abbo, Lilian Margarita Lilian Margarita Lilian Abbo
## 6 Abdullah Raffee Md Pc Abdullah Abdullah Raffee
## 7 Abebe, Sheila Y Sheila Y Sheila Abebe
## 8 Abebe, Sheila Y Sheila Y Sheila Abebe
## 9 Abilene Family Foot Center Galen Chris Galen Albritton
## 10 Abolnik, Igor Z Igor Z Igor Abolnik
## 11 Abolnik, Igor Z Igor Z Igor Abolnik
## City State Category Cash Other Total TotalSquared
## 1 New Orleans LA Professional Advising 2625 0 2625 6890625
## 2 Paso Robles CA Expert-Led Forums 1000 0 1000 1000000
## 3 Miami FL Business Related Travel 0 448 448 200704
## 4 Miami FL Meals 0 119 119 14161
## 5 Miami FL Professional Advising 1800 0 1800 3240000
## 6 Flint MI Expert-Led Forums 750 0 750 562500
## 7 Indianapolis IN Educational Items 0 47 47 2209
## 8 Indianapolis IN Expert-Led Forums 825 0 825 680625
## 9 Abilene TX Professional Advising 3000 0 3000 9000000
## 10 Provo UT Business Related Travel 0 396 396 156816
## 11 Provo Ut Expert-Led Forums 1750 0 1750 3062500
# After performing the desired conversion and transformation operations, you can view the modified data frame:
str(my_data)
## 'data.frame': 11 obs. of 11 variables:
## $ Org_Indiv : chr "3-D Medical Services Llc" "Aa Doctors, Inc." "Abbo, Lilian Margarita" "Abbo, Lilian Margarita" ...
## $ First_Plus : chr "Steven Bruce" "Aakash Mohan" "Lilian Margarita" "Lilian Margarita" ...
## $ First_Name : chr "Steven" "Aakash" "Lilian" "Lilian" ...
## $ Last_Name : chr "Deitelzweig" "Ahuja" "Abbo" "Abbo" ...
## $ City : chr "New Orleans" "Paso Robles" "Miami" "Miami" ...
## $ State : chr "LA" "CA" "FL" "FL" ...
## $ Category : chr "Professional Advising" "Expert-Led Forums" "Business Related Travel" "Meals" ...
## $ Cash : int 2625 1000 0 0 1800 750 0 825 3000 0 ...
## $ Other : int 0 0 448 119 0 0 47 0 0 396 ...
## $ Total : int 2625 1000 448 119 1800 750 47 825 3000 396 ...
## $ TotalSquared: num 6890625 1000000 200704 14161 3240000 ...
dplyr is an R package for efficient data manipulation.
It offers easy-to-use functions like filtering, sorting, and summarizing
data, following tidy data principles for analysis.
#c.head of the data
head(data)
## Org_Indiv First_Plus First_Name Last_Name City
## 1 3-D Medical Services Llc Steven Bruce Steven Deitelzweig New Orleans
## 2 Aa Doctors, Inc. Aakash Mohan Aakash Ahuja Paso Robles
## 3 Abbo, Lilian Margarita Lilian Margarita Lilian Abbo Miami
## 4 Abbo, Lilian Margarita Lilian Margarita Lilian Abbo Miami
## 5 Abbo, Lilian Margarita Lilian Margarita Lilian Abbo Miami
## 6 Abdullah Raffee Md Pc Abdullah Abdullah Raffee Flint
## State Category Cash Other Total
## 1 LA Professional Advising 2625 0 2625
## 2 CA Expert-Led Forums 1000 0 1000
## 3 FL Business Related Travel 0 448 448
## 4 FL Meals 0 119 119
## 5 FL Professional Advising 1800 0 1800
## 6 MI Expert-Led Forums 750 0 750
# d) Filter the data for payments related to 'Expert-Led Forums' or 'Professional Advising' and arrange alphabetically by doctor
filtered_data <- data %>%
filter(Category %in% c("Expert-Led Forums", "Professional Advising")) %>%
arrange(Last_Name, First_Name)
# Print filtered data
print("Filtered Data:")
## [1] "Filtered Data:"
print(filtered_data)
## Org_Indiv First_Plus First_Name Last_Name
## 1 Abbo, Lilian Margarita Lilian Margarita Lilian Abbo
## 2 Abebe, Sheila Y Sheila Y Sheila Abebe
## 3 Abolnik, Igor Z Igor Z Igor Abolnik
## 4 Aa Doctors, Inc. Aakash Mohan Aakash Ahuja
## 5 Abilene Family Foot Center Galen Chris Galen Albritton
## 6 3-D Medical Services Llc Steven Bruce Steven Deitelzweig
## 7 Abdullah Raffee Md Pc Abdullah Abdullah Raffee
## City State Category Cash Other Total
## 1 Miami FL Professional Advising 1800 0 1800
## 2 Indianapolis IN Expert-Led Forums 825 0 825
## 3 Provo Ut Expert-Led Forums 1750 0 1750
## 4 Paso Robles CA Expert-Led Forums 1000 0 1000
## 5 Abilene TX Professional Advising 3000 0 3000
## 6 New Orleans LA Professional Advising 2625 0 2625
## 7 Flint MI Expert-Led Forums 750 0 750
# Load the stringr package to use str_detect
library(stringr)
## Warning: package 'stringr' was built under R version 4.2.3
# e) Use pattern matching to filter text
pattern_matched_data <- data %>%
filter(str_detect(Category, "Professional Advising"))
# Print pattern matched data
print("Pattern Matched Data:")
## [1] "Pattern Matched Data:"
print(pattern_matched_data)
## Org_Indiv First_Plus First_Name Last_Name
## 1 3-D Medical Services Llc Steven Bruce Steven Deitelzweig
## 2 Abbo, Lilian Margarita Lilian Margarita Lilian Abbo
## 3 Abilene Family Foot Center Galen Chris Galen Albritton
## City State Category Cash Other Total
## 1 New Orleans LA Professional Advising 2625 0 2625
## 2 Miami FL Professional Advising 1800 0 1800
## 3 Abilene TX Professional Advising 3000 0 3000
The stringr package is an R package that provides a set
of functions for working with strings, making string manipulation and
pattern matching more convenient and intuitive.
# f) Append one data frame to another
# Assuming df2 is another dataframe
df2 <- data.frame(Org_Indiv = c("Sample Org"),
First_Plus = c("John Doe"),
First_Name = c("John"),
Last_Name = c("Doe"),
City = c("Sample City"),
State = c("XX"),
Category = c("Professional Advising"),
Cash = c(1500),
Other = c(200),
Total = c(1700))
appended_data <- rbind(data, df2)
# Print appended data
print("Appended Data:")
## [1] "Appended Data:"
print(appended_data)
## Org_Indiv First_Plus First_Name Last_Name
## 1 3-D Medical Services Llc Steven Bruce Steven Deitelzweig
## 2 Aa Doctors, Inc. Aakash Mohan Aakash Ahuja
## 3 Abbo, Lilian Margarita Lilian Margarita Lilian Abbo
## 4 Abbo, Lilian Margarita Lilian Margarita Lilian Abbo
## 5 Abbo, Lilian Margarita Lilian Margarita Lilian Abbo
## 6 Abdullah Raffee Md Pc Abdullah Abdullah Raffee
## 7 Abebe, Sheila Y Sheila Y Sheila Abebe
## 8 Abebe, Sheila Y Sheila Y Sheila Abebe
## 9 Abilene Family Foot Center Galen Chris Galen Albritton
## 10 Abolnik, Igor Z Igor Z Igor Abolnik
## 11 Abolnik, Igor Z Igor Z Igor Abolnik
## 12 Sample Org John Doe John Doe
## City State Category Cash Other Total
## 1 New Orleans LA Professional Advising 2625 0 2625
## 2 Paso Robles CA Expert-Led Forums 1000 0 1000
## 3 Miami FL Business Related Travel 0 448 448
## 4 Miami FL Meals 0 119 119
## 5 Miami FL Professional Advising 1800 0 1800
## 6 Flint MI Expert-Led Forums 750 0 750
## 7 Indianapolis IN Educational Items 0 47 47
## 8 Indianapolis IN Expert-Led Forums 825 0 825
## 9 Abilene TX Professional Advising 3000 0 3000
## 10 Provo UT Business Related Travel 0 396 396
## 11 Provo Ut Expert-Led Forums 1750 0 1750
## 12 Sample City XX Professional Advising 1500 200 1700
# g) Write data to a CSV file
write.csv(data, file = "data.csv", row.names = FALSE)
# h) View the structure and summary of the given data
str(data)
## 'data.frame': 11 obs. of 10 variables:
## $ Org_Indiv : chr "3-D Medical Services Llc" "Aa Doctors, Inc." "Abbo, Lilian Margarita" "Abbo, Lilian Margarita" ...
## $ First_Plus: chr "Steven Bruce" "Aakash Mohan" "Lilian Margarita" "Lilian Margarita" ...
## $ First_Name: chr "Steven" "Aakash" "Lilian" "Lilian" ...
## $ Last_Name : chr "Deitelzweig" "Ahuja" "Abbo" "Abbo" ...
## $ City : chr "New Orleans" "Paso Robles" "Miami" "Miami" ...
## $ State : chr "LA" "CA" "FL" "FL" ...
## $ Category : chr "Professional Advising" "Expert-Led Forums" "Business Related Travel" "Meals" ...
## $ Cash : int 2625 1000 0 0 1800 750 0 825 3000 0 ...
## $ Other : int 0 0 448 119 0 0 47 0 0 396 ...
## $ Total : int 2625 1000 448 119 1800 750 47 825 3000 396 ...
summary(data)
## Org_Indiv First_Plus First_Name Last_Name
## Length:11 Length:11 Length:11 Length:11
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## City State Category Cash
## Length:11 Length:11 Length:11 Min. : 0
## Class :character Class :character Class :character 1st Qu.: 0
## Mode :character Mode :character Mode :character Median : 825
## Mean :1068
## 3rd Qu.:1775
## Max. :3000
## Other Total
## Min. : 0.00 Min. : 47
## 1st Qu.: 0.00 1st Qu.: 422
## Median : 0.00 Median : 825
## Mean : 91.82 Mean :1160
## 3rd Qu.: 83.00 3rd Qu.:1775
## Max. :448.00 Max. :3000
# i) Convert the 'Total' column to a numeric variable
data$Total <- as.numeric(data$Total)
data$Total
## [1] 2625 1000 448 119 1800 750 47 825 3000 396 1750
#mtcars dataset
data("mtcars")
mtcars
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
## Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
## Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
## Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
## Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
## Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
## Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
## Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
## Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
## Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
## Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
## Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
## Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
## Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
## Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
## Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
## Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
## Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
## Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
## Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
## Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
## Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
## AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
## Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
## Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
## Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
## Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
## Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
## Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
## Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
## Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
## Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
summary(mtcars)
## mpg cyl disp hp
## Min. :10.40 Min. :4.000 Min. : 71.1 Min. : 52.0
## 1st Qu.:15.43 1st Qu.:4.000 1st Qu.:120.8 1st Qu.: 96.5
## Median :19.20 Median :6.000 Median :196.3 Median :123.0
## Mean :20.09 Mean :6.188 Mean :230.7 Mean :146.7
## 3rd Qu.:22.80 3rd Qu.:8.000 3rd Qu.:326.0 3rd Qu.:180.0
## Max. :33.90 Max. :8.000 Max. :472.0 Max. :335.0
## drat wt qsec vs
## Min. :2.760 Min. :1.513 Min. :14.50 Min. :0.0000
## 1st Qu.:3.080 1st Qu.:2.581 1st Qu.:16.89 1st Qu.:0.0000
## Median :3.695 Median :3.325 Median :17.71 Median :0.0000
## Mean :3.597 Mean :3.217 Mean :17.85 Mean :0.4375
## 3rd Qu.:3.920 3rd Qu.:3.610 3rd Qu.:18.90 3rd Qu.:1.0000
## Max. :4.930 Max. :5.424 Max. :22.90 Max. :1.0000
## am gear carb
## Min. :0.0000 Min. :3.000 Min. :1.000
## 1st Qu.:0.0000 1st Qu.:3.000 1st Qu.:2.000
## Median :0.0000 Median :4.000 Median :2.000
## Mean :0.4062 Mean :3.688 Mean :2.812
## 3rd Qu.:1.0000 3rd Qu.:4.000 3rd Qu.:4.000
## Max. :1.0000 Max. :5.000 Max. :8.000
str(mtcars)
## 'data.frame': 32 obs. of 11 variables:
## $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
## $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
## $ disp: num 160 160 108 258 360 ...
## $ hp : num 110 110 93 110 175 105 245 62 95 123 ...
## $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
## $ wt : num 2.62 2.88 2.32 3.21 3.44 ...
## $ qsec: num 16.5 17 18.6 19.4 17 ...
## $ vs : num 0 0 1 1 0 1 0 1 1 1 ...
## $ am : num 1 1 1 0 0 0 0 0 0 0 ...
## $ gear: num 4 4 4 3 3 3 3 4 4 4 ...
## $ carb: num 4 4 1 1 2 1 4 2 2 4 ...
is.null(mtcars)
## [1] FALSE
mtcars is an in-built dataset of rstudio. no null values are there in this dataset.
selecting specific columns using select function:
# Select the mpg and cyl columns
mtcars_select <- select(mtcars, mpg, cyl)
mtcars_select
## mpg cyl
## Mazda RX4 21.0 6
## Mazda RX4 Wag 21.0 6
## Datsun 710 22.8 4
## Hornet 4 Drive 21.4 6
## Hornet Sportabout 18.7 8
## Valiant 18.1 6
## Duster 360 14.3 8
## Merc 240D 24.4 4
## Merc 230 22.8 4
## Merc 280 19.2 6
## Merc 280C 17.8 6
## Merc 450SE 16.4 8
## Merc 450SL 17.3 8
## Merc 450SLC 15.2 8
## Cadillac Fleetwood 10.4 8
## Lincoln Continental 10.4 8
## Chrysler Imperial 14.7 8
## Fiat 128 32.4 4
## Honda Civic 30.4 4
## Toyota Corolla 33.9 4
## Toyota Corona 21.5 4
## Dodge Challenger 15.5 8
## AMC Javelin 15.2 8
## Camaro Z28 13.3 8
## Pontiac Firebird 19.2 8
## Fiat X1-9 27.3 4
## Porsche 914-2 26.0 4
## Lotus Europa 30.4 4
## Ford Pantera L 15.8 8
## Ferrari Dino 19.7 6
## Maserati Bora 15.0 8
## Volvo 142E 21.4 4
filtering the data through filter function using a condition:
# Filter for cars with more than 6 cylinders
mtcars_filter <- filter(mtcars, cyl > 6)
mtcars_filter
## mpg cyl disp hp drat wt qsec vs am gear carb
## Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
## Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
## Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
## Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
## Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
## Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
## Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
## Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
## Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
## AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
## Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
## Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
## Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
## Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
sorting using arrange function:
# Arrange the dataset by descending mpg
mtcars_arrange <- arrange(mtcars, desc(mpg))
mtcars_arrange
## mpg cyl disp hp drat wt qsec vs am gear carb
## Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
## Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
## Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
## Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
## Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
## Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
## Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
## Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
## Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
## Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
## Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
## Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
## Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
## Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
## Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
## Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
## Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
## Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
## Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
## Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
## Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
## Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
## Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
## Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
## Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
## AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
## Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
## Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
## Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
## Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
## Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
## Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
desc function sorts data in descending order.
renaming a column using rename function:
# Rename the mpg column to miles_per_gallon
mtcars_rename <- rename(mtcars, miles_per_gallon = mpg)
mtcars_rename
## miles_per_gallon cyl disp hp drat wt qsec vs am gear
## Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4
## Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4
## Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4
## Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3
## Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3
## Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3
## Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3
## Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4
## Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4
## Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4
## Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4
## Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3
## Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3
## Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3
## Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3
## Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3
## Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3
## Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4
## Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4
## Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4
## Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3
## Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3
## AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3
## Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3
## Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3
## Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4
## Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5
## Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5
## Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5
## Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5
## Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5
## Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4
## carb
## Mazda RX4 4
## Mazda RX4 Wag 4
## Datsun 710 1
## Hornet 4 Drive 1
## Hornet Sportabout 2
## Valiant 1
## Duster 360 4
## Merc 240D 2
## Merc 230 2
## Merc 280 4
## Merc 280C 4
## Merc 450SE 3
## Merc 450SL 3
## Merc 450SLC 3
## Cadillac Fleetwood 4
## Lincoln Continental 4
## Chrysler Imperial 4
## Fiat 128 1
## Honda Civic 2
## Toyota Corolla 1
## Toyota Corona 1
## Dodge Challenger 2
## AMC Javelin 2
## Camaro Z28 4
## Pontiac Firebird 2
## Fiat X1-9 1
## Porsche 914-2 2
## Lotus Europa 2
## Ford Pantera L 4
## Ferrari Dino 6
## Maserati Bora 8
## Volvo 142E 2
add a column using mutate function:
# Add a new column for horsepower per weight
mtcars_mutate <- mutate(mtcars, hp_per_wt = hp / wt)
mtcars_mutate
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
## Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
## Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
## Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
## Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
## Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
## Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
## Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
## Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
## Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
## Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
## Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
## Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
## Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
## Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
## Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
## Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
## Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
## Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
## Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
## Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
## Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
## AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
## Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
## Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
## Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
## Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
## Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
## Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
## Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
## Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
## Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
## hp_per_wt
## Mazda RX4 41.98473
## Mazda RX4 Wag 38.26087
## Datsun 710 40.08621
## Hornet 4 Drive 34.21462
## Hornet Sportabout 50.87209
## Valiant 30.34682
## Duster 360 68.62745
## Merc 240D 19.43574
## Merc 230 30.15873
## Merc 280 35.75581
## Merc 280C 35.75581
## Merc 450SE 44.22604
## Merc 450SL 48.25737
## Merc 450SLC 47.61905
## Cadillac Fleetwood 39.04762
## Lincoln Continental 39.63864
## Chrysler Imperial 43.03087
## Fiat 128 30.00000
## Honda Civic 32.19814
## Toyota Corolla 35.42234
## Toyota Corona 39.35091
## Dodge Challenger 42.61364
## AMC Javelin 43.66812
## Camaro Z28 63.80208
## Pontiac Firebird 45.51365
## Fiat X1-9 34.10853
## Porsche 914-2 42.52336
## Lotus Europa 74.68605
## Ford Pantera L 83.28076
## Ferrari Dino 63.17690
## Maserati Bora 93.83754
## Volvo 142E 39.20863
# Combine two datasets with the same columns
mtcars_combined <- bind_rows(mtcars, mtcars)
mtcars_combined
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4...1 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
## Mazda RX4 Wag...2 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
## Datsun 710...3 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
## Hornet 4 Drive...4 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
## Hornet Sportabout...5 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
## Valiant...6 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
## Duster 360...7 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
## Merc 240D...8 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
## Merc 230...9 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
## Merc 280...10 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
## Merc 280C...11 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
## Merc 450SE...12 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
## Merc 450SL...13 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
## Merc 450SLC...14 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
## Cadillac Fleetwood...15 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
## Lincoln Continental...16 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
## Chrysler Imperial...17 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
## Fiat 128...18 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
## Honda Civic...19 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
## Toyota Corolla...20 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
## Toyota Corona...21 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
## Dodge Challenger...22 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
## AMC Javelin...23 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
## Camaro Z28...24 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
## Pontiac Firebird...25 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
## Fiat X1-9...26 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
## Porsche 914-2...27 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
## Lotus Europa...28 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
## Ford Pantera L...29 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
## Ferrari Dino...30 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
## Maserati Bora...31 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
## Volvo 142E...32 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
## Mazda RX4...33 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
## Mazda RX4 Wag...34 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
## Datsun 710...35 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
## Hornet 4 Drive...36 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
## Hornet Sportabout...37 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
## Valiant...38 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
## Duster 360...39 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
## Merc 240D...40 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
## Merc 230...41 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
## Merc 280...42 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
## Merc 280C...43 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
## Merc 450SE...44 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
## Merc 450SL...45 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
## Merc 450SLC...46 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
## Cadillac Fleetwood...47 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
## Lincoln Continental...48 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
## Chrysler Imperial...49 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
## Fiat 128...50 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
## Honda Civic...51 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
## Toyota Corolla...52 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
## Toyota Corona...53 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
## Dodge Challenger...54 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
## AMC Javelin...55 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
## Camaro Z28...56 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
## Pontiac Firebird...57 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
## Fiat X1-9...58 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
## Porsche 914-2...59 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
## Lotus Europa...60 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
## Ford Pantera L...61 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
## Ferrari Dino...62 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
## Maserati Bora...63 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
## Volvo 142E...64 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
# Group the dataset by number of cylinders and summarize the mean mpg
mtcars_grouped <- group_by(mtcars, cyl)
mtcars_summarized <- summarize(mtcars_grouped, mean_mpg = mean(mpg))
mtcars_grouped
## # A tibble: 32 × 11
## # Groups: cyl [3]
## mpg cyl disp hp drat wt qsec vs am gear carb
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 21 6 160 110 3.9 2.62 16.5 0 1 4 4
## 2 21 6 160 110 3.9 2.88 17.0 0 1 4 4
## 3 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
## 4 21.4 6 258 110 3.08 3.22 19.4 1 0 3 1
## 5 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2
## 6 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1
## 7 14.3 8 360 245 3.21 3.57 15.8 0 0 3 4
## 8 24.4 4 147. 62 3.69 3.19 20 1 0 4 2
## 9 22.8 4 141. 95 3.92 3.15 22.9 1 0 4 2
## 10 19.2 6 168. 123 3.92 3.44 18.3 1 0 4 4
## # ℹ 22 more rows
mtcars_summarized
## # A tibble: 3 × 2
## cyl mean_mpg
## <dbl> <dbl>
## 1 4 26.7
## 2 6 19.7
## 3 8 15.1
the above functions are called data manipulation functions. these functions are available in dplyr package.