In this RStudio tutorial, we’ll work through common data manipulation tasks using the built-in mtcars dataset: generating and aggregating data, filtering and counting values, attaching data, changing data types, formatting dates, and merging data frames.
# Generating a Data Set
Begin by launching RStudio and creating a new R Script.
To generate a sample data set, use the following command:
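The command itself is not reproduced in this text. A minimal sketch, assuming the sample data is R's built-in mtcars data set that the rest of the tutorial uses:

```r
# Load the built-in mtcars data set (part of base R's datasets package)
data(mtcars)

# Inspect the first rows and the column types
head(mtcars)
str(mtcars)
```

The data set has 32 rows (one per car model) and 11 numeric columns.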
# Aggregating Data in R
Let’s start by understanding how to aggregate data. This process involves summarizing data by specific categories. For instance, to find the mean miles per gallon (mpg) grouped by transmission type (am), execute the code:
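The listing is missing from this text. One call that matches the `Group.1` / `x` column names in the output is base R's `aggregate` with a `by` list; the variable name `mpg_by_am` is only illustrative:

```r
# Mean mpg for each transmission type (am: 0 = automatic, 1 = manual)
mpg_by_am <- aggregate(mtcars$mpg, by = list(mtcars$am), FUN = mean)
print(mpg_by_am)
```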
## Group.1 x
## 1 0 17.14737
## 2 1 24.39231
# Filtering Data
Now, suppose you want to filter data points where the miles per gallon is greater than 20:
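The filter expression is not shown in this text. A minimal sketch using logical row indexing, which keeps the car names as row names as in the output; `high_mpg` is an illustrative name:

```r
# Keep only the rows where mpg exceeds 20; the empty column index keeps all columns
high_mpg <- mtcars[mtcars$mpg > 20, ]
print(high_mpg)
```

`subset(mtcars, mpg > 20)` would be an equivalent alternative.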
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
## Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
## Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
## Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
## Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
## Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
## Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
## Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
## Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
## Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
## Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
## Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
## Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
## Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
Here, we’ve isolated cars with mpg values greater than 20.
Let’s explore working with positive and negative values. Begin by generating a hypothetical data set with both types of values:
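The original input is not shown, so the sketch below assumes placeholder negatives (-5 and -3) in the positions that appear as 0 in the output. The name `value_data` is also hypothetical:

```r
# Hypothetical input: the original negative numbers are unknown, -5 and -3 are placeholders
value_data <- data.frame(values = c(10, -5, 8, -3, 6))

# Replace every negative value with 0 and keep the rest unchanged
value_data$values <- ifelse(value_data$values < 0, 0, value_data$values)
print(value_data)
```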
## values
## 1 10
## 2 0
## 3 8
## 4 0
## 5 6
In this code, negative values are converted to zero using the ifelse function.
To count occurrences of specific data points, such as transmission types, employ the table function:
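The call is not reproduced here. A minimal sketch, assuming the counts are taken over the `am` column of mtcars:

```r
# Count how many cars fall into each transmission type
am_counts <- table(mtcars$am)
print(am_counts)
```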
##
## 0 1
## 19 13
The result counts the cars of each transmission type: 19 with am = 0 (automatic) and 13 with am = 1 (manual).
# Attaching Data in R
The attach function adds a data frame to R’s search path, so that its columns can be used by name:
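The original listing is missing. The sketch below rebuilds `additional_data` from the values printed in the output; `avg_speed` is an illustrative name:

```r
# Data frame rebuilt from the printed output
additional_data <- data.frame(speed = c(100, 120, 80, 110, 95))

# Put the data frame on the search path
attach(additional_data)
print(additional_data)

# speed now resolves without the additional_data$ prefix
avg_speed <- mean(speed)
```

Calling `detach(additional_data)` once you are done is a common precaution, because attached columns can mask other objects that share their names.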
## speed
## 1 100
## 2 120
## 3 80
## 4 110
## 5 95
After attaching additional_data, its columns (here, speed) can be referenced by name without the additional_data$ prefix.
# Changing Data Types
To alter data types, let’s convert the cyl column to a factor:
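The code is not shown in this text. The two lines of output are consistent with calling `str` before and after the conversion, which is what this sketch assumes:

```r
# Start from a fresh copy so that cyl is numeric
data(mtcars)
str(mtcars$cyl)

# Convert the numeric column to a categorical factor
mtcars$cyl <- as.factor(mtcars$cyl)
str(mtcars$cyl)
```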
## num [1:32] 6 6 4 6 8 6 8 4 4 6 ...
## Factor w/ 3 levels "4","6","8": 2 2 1 2 3 2 3 1 1 2 ...
These lines demonstrate converting a numerical column to a categorical factor.
# Changing Date Format
Dates often arrive as text in one format and need to be displayed in another. Create a hypothetical data frame date_data and change the date format:
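The input format is not shown, so this sketch assumes the dates start out as ISO-style strings (YYYY-MM-DD). They are parsed with `as.Date` and rendered as MM/DD/YYYY with `format`:

```r
# Hypothetical input: ISO-style date strings (the original input format is an assumption)
date_data <- data.frame(dates = c("2023-08-01", "2023-08-15", "2023-09-05"))

# Parse the strings as Date objects, then format them as month/day/year text
date_data$dates <- format(as.Date(date_data$dates), "%m/%d/%Y")
print(date_data)
```

Note that `format` returns character strings, so the column is no longer a Date after this step.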
## dates
## 1 08/01/2023
## 2 08/15/2023
## 3 09/05/2023
This code converts date strings to a different format using the format function.
# Combining Data Frames and Sets
Merging data frames is a common task. Let’s merge based on the car name:
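The merge code and the color table are not included in this text. In the sketch below, `cars`, `color_data`, and the color assignments are hypothetical, so the colors will not match the output that follows. Only the shape of the result (a `name` key plus a `color` column, sorted by name) is the same:

```r
# Move the car names from the row names into a regular column to use as the merge key
cars <- data.frame(name = rownames(mtcars), mtcars, row.names = NULL)

# Hypothetical lookup table: colors simply cycle through three placeholder values
color_data <- data.frame(
  name = rownames(mtcars),
  color = rep(c("red", "green", "blue"), length.out = nrow(mtcars))
)

# Join the two data frames on the shared name column
merged_data <- merge(cars, color_data, by = "name")
print(merged_data)
```

`merge` sorts the result by the key column by default, which is why the rows come out in alphabetical order by name.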
## name mpg cyl disp hp drat wt qsec vs am gear carb
## 1 AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
## 2 Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
## 3 Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
## 4 Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
## 5 Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
## 6 Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
## 7 Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
## 8 Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
## 9 Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
## 10 Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
## 11 Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
## 12 Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
## 13 Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
## 14 Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
## 15 Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
## 16 Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
## 17 Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
## 18 Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
## 19 Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
## 20 Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
## 21 Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
## 22 Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
## 23 Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
## 24 Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
## 25 Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
## 26 Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
## 27 Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
## 28 Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
## 29 Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
## 30 Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
## 31 Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
## 32 Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
## color
## 1 blue
## 2 green
## 3 green
## 4 blue
## 5 green
## 6 red
## 7 red
## 8 green
## 9 green
## 10 blue
## 11 blue
## 12 red
## 13 red
## 14 blue
## 15 red
## 16 red
## 17 red
## 18 red
## 19 blue
## 20 green
## 21 blue
## 22 red
## 23 blue
## 24 green
## 25 red
## 26 blue
## 27 red
## 28 green
## 29 blue
## 30 green
## 31 green
## 32 blue
This code snippet demonstrates merging based on the car name.
Additionally, integrate sales data into the car data set:
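The sales table is not shown either. This sketch uses hypothetical figures and names (`sales_data`, `cars_with_sales`). It passes `all.x = TRUE` so that a car with no sales record would still be kept, with `NA` in the sales column:

```r
# Car data with the name as a regular column
cars <- data.frame(name = rownames(mtcars), mtcars, row.names = NULL)

# Hypothetical sales figures, one row per car (placeholder values)
sales_data <- data.frame(
  name = rownames(mtcars),
  sales = rep(c(100, 80, 50), length.out = nrow(mtcars))
)

# Left join: keep every car, attach sales where a matching name exists
cars_with_sales <- merge(cars, sales_data, by = "name", all.x = TRUE)
print(cars_with_sales)
```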
## name mpg cyl disp hp drat wt qsec vs am gear carb
## 1 AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
## 2 Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
## 3 Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
## 4 Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
## 5 Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
## 6 Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
## 7 Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
## 8 Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
## 9 Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
## 10 Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
## 11 Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
## 12 Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
## 13 Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
## 14 Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
## 15 Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
## 16 Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
## 17 Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
## 18 Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
## 19 Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
## 20 Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
## 21 Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
## 22 Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
## 23 Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
## 24 Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
## 25 Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
## 26 Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
## 27 Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
## 28 Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
## 29 Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
## 30 Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
## 31 Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
## 32 Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
## sales
## 1 50
## 2 80
## 3 80
## 4 50
## 5 80
## 6 100
## 7 100
## 8 80
## 9 80
## 10 50
## 11 50
## 12 100
## 13 100
## 14 50
## 15 100
## 16 100
## 17 100
## 18 100
## 19 50
## 20 80
## 21 50
## 22 100
## 23 50
## 24 80
## 25 100
## 26 50
## 27 100
## 28 80
## 29 50
## 30 80
## 31 80
## 32 50
You’ve now worked through the core data manipulation and aggregation techniques in R. Experiment with your own data sets to build on these skills. For the code, visit: https://www.data03.online/2023/08/generate-aggregate-count-attach-change.html