mtcars
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
## Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
## Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
## Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
## Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
## Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
## Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
## Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
## Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
## Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
## Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
## Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
## Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
## Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
## Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
## Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
## Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
## Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
## Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
## Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
## Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
## Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
## AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
## Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
## Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
## Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
## Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
## Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
## Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
## Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
## Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
## Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
class(mtcars)
[1] "data.frame"

The mtcars data describe 32 vehicles (observations) from the 1974 Motor Trend US magazine using 11 variables. The data set's documentation also includes examples of how to use these data for summary statistics and visualizations. Two dummy-coded variables would make sense to call categorical: engine (vs) and transmission (am). In these cases, the numbers 0 and 1 represent types of engines or transmissions.

# Create the scatterplot
plot(mtcars$hp, mtcars$mpg,
     xlab = "Horsepower", ylab = "Miles per Gallon",
     main = "Scatterplot of Horsepower vs. MPG")
# Add the linear regression line
lm_fit <- lm(mpg ~ hp, data = mtcars)  # Fit linear regression model
abline(lm_fit, col = "red")            # Add the regression line to the plot

# Load the required packages
library(ggplot2)
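The line drawn by abline() comes from the lm_fit object, which can also be inspected numerically. The following is a minimal sketch using only base R; the names coefs, slope, and r_squared are illustrative and not part of the original analysis.

```r
# Refit the same model that was used for the regression line
lm_fit <- lm(mpg ~ hp, data = mtcars)

# Intercept and slope of the fitted line
coefs <- coef(lm_fit)
slope <- unname(coefs["hp"])  # change in mpg per additional unit of horsepower

# Share of the variance in mpg explained by horsepower
r_squared <- summary(lm_fit)$r.squared

print(round(coefs, 4))
print(round(r_squared, 4))
```

A negative slope here would agree with the downward trend visible in the scatterplot, and r_squared gives a rough sense of how tightly the points follow the line.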
# Create the scatterplot with linear regression line
ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point() +                            # Add the scatterplot
  geom_smooth(method = "lm", se = FALSE) +  # Add the linear regression line
  labs(x = "Horsepower", y = "Miles per Gallon",
       title = "Scatterplot of Horsepower vs. MPG with Linear Regression Line")
`geom_smooth()` using formula = 'y ~ x'

# Load the required packages
library(ggplot2)
# Create the conditional violin plot with superimposed boxplots
ggplot(mtcars, aes(x = am, y = mpg, fill = factor(am))) +
  geom_violin(trim = FALSE) +
  geom_boxplot(width = 0.2, fill = "white", color = "black", outlier.shape = NA) +
  labs(x = "Transmission Type", y = "Miles per Gallon",
       title = "Conditional Violin Plot with Superimposed Boxplots",
       fill = "Transmission Type") +
  scale_fill_manual(values = c("#E69F00", "#56B4E9"), labels = c("Automatic", "Manual")) +
  theme_bw()
Warning message:
Continuous x aesthetic
ℹ did you forget `aes(group = ...)`?

# Second attempt: treat am as a factor so each transmission type gets its own violin
ggplot(mtcars, aes(x = factor(am), y = mpg, fill = factor(am), group = factor(am))) +
  geom_violin(trim = FALSE) +
  geom_boxplot(width = 0.2, fill = "white", color = "black", outlier.shape = NA) +
  labs(x = "Transmission Type", y = "Miles per Gallon",
       title = "Conditional Violin Plot with Superimposed Boxplots",
       fill = "Transmission Type") +
  scale_fill_manual(values = c("#E69F00", "#56B4E9"), labels = c("Automatic", "Manual")) +
  theme_bw()
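The warning above appears because am is stored as a number, so ggplot2 treats the x axis as continuous. One way to avoid repeating factor(am) is to create a labeled factor once and then summarize mpg by group. This is a sketch in base R; the copy mt_labeled and the column am_label are hypothetical names, and the labels follow the Automatic/Manual coding used in scale_fill_manual() above (0 = automatic, 1 = manual).

```r
# Work on a copy so the built-in data set stays untouched
mt_labeled <- mtcars

# Label the 0/1 transmission codes (0 = automatic, 1 = manual)
mt_labeled$am_label <- factor(mt_labeled$am, levels = c(0, 1),
                              labels = c("Automatic", "Manual"))

# Number of vehicles per transmission type
counts <- table(mt_labeled$am_label)

# Median and interquartile range of mpg within each group
mpg_median <- tapply(mt_labeled$mpg, mt_labeled$am_label, median)
mpg_iqr    <- tapply(mt_labeled$mpg, mt_labeled$am_label, IQR)

print(counts)
print(mpg_median)
print(mpg_iqr)
```

Comparing the medians alongside the interquartile ranges gives a numeric counterpart to reading the center and the width of each violin.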
### The first thing I interpret is that manual transmissions get better gas mileage. Then I noticed that although the median for manual transmissions is higher than for automatic transmissions, it's not very far off. This reminds me to look at the spread of the data and the shape of the violins (in other words, don't be biased because I like manual transmission cars). I see that there is a heavy concentration of automatic transmission vehicles with MPGs between 15 and 20, whereas the manual transmission vehicles are much more evenly spread. This means I should be careful about assuming a manual transmission vehicle is going to have better MPG than an automatic transmission vehicle. ###

mtcars$gear_factor <- factor(mtcars$gear)
mtcars$cyl_factor <- factor(mtcars$cyl)
ggplot(mtcars, aes(x = gear_factor, fill = cyl_factor)) +
  geom_bar(position = "fill") +
  labs(x = "Gear", y = "Proportion", fill = "Cylinders") +
  scale_y_continuous(labels = scales::percent_format()) +
  theme_bw()

library(ggplot2)
ggplot(mtcars, aes(x = gear_factor, fill = cyl_factor)) +
  geom_bar(position = "dodge") +
  labs(x = "Gear", y = "Count", fill = "Cylinders") +
  theme_bw()
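The proportions and counts shown in the two bar charts can be cross-checked with a contingency table. A minimal base R sketch follows; the object names gear_by_cyl, row_share, and gear_cyl_cor are illustrative.

```r
# Cross-tabulate gears (rows) against cylinders (columns)
gear_by_cyl <- table(gear = mtcars$gear, cyl = mtcars$cyl)

# Row-wise proportions: the numbers behind the position = "fill" chart
row_share <- prop.table(gear_by_cyl, margin = 1)

# Pearson correlation between the two numeric codes
gear_cyl_cor <- cor(mtcars$gear, mtcars$cyl)

print(gear_by_cyl)
print(round(row_share, 2))
print(round(gear_cyl_cor, 2))
```

Because both variables take only three distinct values, the correlation is a rough summary at best; the table itself shows which gear and cylinder combinations actually occur.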
### I don't know a ton about vehicles, but what I see from these charts is that possibly the more gears you have, the more flexibility there is in how many cylinders you can have. Another possibility is that there is a slightly negative correlation between gears and cylinders. I see that the majority of 3-gear vehicles have 8 cylinders, 4-gear vehicles only have 4 or 6 cylinders, and the 5-gear vehicles have a slight tendency to have 4 or 6 cylinders but can have 8 cylinders. It may be that the number of gears is a clue to the type or purpose of the vehicle, which would require more or fewer cylinders. ###

data <- read.delim("/Users/nicoleborunda/Downloads/ICPSR_37938 13/DS0005/37938-0005-Data.tsv", header = TRUE)
transpop <- 37938-0005-Data.tsv
Error: object 'Data.tsv' not found
transpop <- data
class(transpop)
[1] "data.frame"
install.packages("haven")
also installing the dependencies 'bit', 'prettyunits', 'bit64', 'progress', 'clipr', 'crayon', 'vroom', 'tzdb', 'forcats', 'hms', 'readr', 'tidyselect', 'cpp11'
The downloaded binary packages are in
	/var/folders/p5/01x6s31x4pncvzwg_3c6_mf80000gn/T//RtmpfkMVlg/downloaded_packages
library(haven)
data <- read.csv("/Users/nicoleborunda/Desktop/transpopmulti.sav", header = TRUE)
Warning messages:
1: In read.table(file = file, header = header, sep = sep, quote = quote, :
  line 1 appears to contain embedded nulls
2: In read.table(file = file, header = header, sep = sep, quote = quote, :
  line 2 appears to contain embedded nulls
3: In read.table(file = file, header = header, sep = sep, quote = quote, :
  line 3 appears to contain embedded nulls
4: In read.table(file = file, header = header, sep = sep, quote = quote, :
  line 4 appears to contain embedded nulls
5: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
  embedded nul(s) found in input
small_df <- large_df[, c("GENDER_IDENTITY", "GMILESAWAY", "LIFESAT")]
Error: object 'large_df' not found
large_df <- transpop
small_df <- large_df[, c("GENDER_IDENTITY", "GMILESAWAY", "LIFESAT")]
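The "embedded nulls" warnings above are what read.csv() reports when it is pointed at a binary file such as an SPSS .sav file. The haven package, which was installed above, provides read_sav() for that format. The sketch below does not use the author's file: it writes a small made-up data frame to a temporary .sav file and reads it back, and it is guarded so that the haven calls are skipped if the package is not installed. The toy values and the names toy, tmp_path, round_trip, and needed are assumptions for illustration only.

```r
# Made-up rows that only borrow two column names from the analysis above
toy <- data.frame(GMILESAWAY = c(2, 10, 25),
                  LIFESAT    = c(5, 3, 4))

round_trip <- NULL
if (requireNamespace("haven", quietly = TRUE)) {
  tmp_path <- tempfile(fileext = ".sav")
  haven::write_sav(toy, tmp_path)          # write an SPSS file
  round_trip <- haven::read_sav(tmp_path)  # read it back as a tibble
  unlink(tmp_path)                         # remove the temporary file
}

# Confirm the columns needed for a plot exist before plotting
needed <- c("GMILESAWAY", "LIFESAT")
if (!is.null(round_trip)) {
  print(all(needed %in% names(round_trip)))
}
```

With a real file, the same read_sav() call would take the path to the .sav file in place of tmp_path. Checking names() on whichever data frame is passed to ggplot() is a cheap way to catch missing-column errors early.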
library(ggplot2)
ggplot(data, aes(x = GMILESAWAY, y = LIFESAT, color = GENDER_IDENTITY)) +
  geom_point() +
  scale_color_manual(values = c("Male" = "blue", "Female" = "pink", "Non-binary" = "purple")) +
  labs(x = "Distance from Health Center (GMILESAWAY)",
       y = "Life Satisfaction (LIFESAT)",
       color = "Gender Identity (GENDER_IDENTITY)") +
  ggtitle("Scatter Plot of Gender Identity, Distance, and Life Satisfaction")
Error in `geom_point()`:
! Problem while computing aesthetics.
ℹ Error occurred in the 1st layer.
Caused by error:
! object 'GMILESAWAY' not found
Run `rlang::last_trace()` to see where the error occurred.
head(transpop)
  STUDYID WEIGHT_CISGENDER_TRANSPOP WEIGHT_CISGENDER WEIGHT_TRANSPOP GMETHOD_TYPE
1 151768927 0.02203922 NA 0.9861429
  SURVEYCOMPLETED GRESPONDENT_DATE GCENREG RACE RACE_RECODE RACE_RECODE_CAT5 SEXUALID
1 0 26-APR-2016 1 6 1 1 1
  SEXMINID HINC HINC_I PINC PINC_I GEDUC1