In the following code chunk, import your data.
# Install and load the required package
install.packages("readr") # Install readr first if it is not already installed
## Installing package into '/usr/local/lib/R/site-library'
## (as 'lib' is unspecified)
## Warning in download.file(url, destfile, method, mode =
## "wb", ...): URL 'https://rspm-sync.rstudio.com/bin/4.0-focal/
## 0a4ab44717de6ecfd713443d11cf7f7f4abe0ba0023f524d008ed63ab8c3469e.tar.gz': status
## was 'SSL connect error'
## Error in download.file(url, destfile, method, mode = "wb", ...) :
## cannot open URL 'https://packagemanager.rstudio.com/all/__linux__/focal/291/src/contrib/readr_1.3.1.tar.gz'
## Warning in download.packages(pkgs, destdir = tmpd, available = available, :
## download of package 'readr' failed
library(readr)
# Use read_csv() to read the CSV file into a tibble
dat <- read_csv("Olympics_2016_Rio_Athletes.csv")
## Rows: 11538 Columns: 11
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (5): name, nationality, sex, dob, sport
## dbl (6): id, height, weight, gold, silver, bronze
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
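The message above suggests either specifying the column types or setting `show_col_types = FALSE`. The following is a minimal sketch of the first option. It reads a two-row inline sample instead of the real file so that it runs on its own; the sample values and the column order are made up for illustration, and only the column names and types come from the specification printed above. Passing literal data with `I()` assumes readr 2.0 or later.

```r
library(readr)

# Tiny inline sample with the same 11 column names as the real file.
# The values and the column order are invented for illustration only.
sample_csv <- I(
"id,name,nationality,sex,dob,height,weight,sport,gold,silver,bronze
1,Athlete A,ESP,male,10/17/69,1.72,64,athletics,0,0,0
2,Athlete B,KOR,female,9/23/86,1.68,56,fencing,1,0,0
")

# Declare the types reported in the column specification up front,
# so read_csv() does not have to guess them or print the message.
olympics_types <- cols(
  id          = col_double(),
  name        = col_character(),
  nationality = col_character(),
  sex         = col_character(),
  dob         = col_character(),
  height      = col_double(),
  weight      = col_double(),
  sport       = col_character(),
  gold        = col_double(),
  silver      = col_double(),
  bronze      = col_double()
)

sample_dat <- read_csv(sample_csv, col_types = olympics_types)
sample_dat
```

For the real data, the same `olympics_types` object could be passed as `col_types` in the call to `read_csv("Olympics_2016_Rio_Athletes.csv", ...)`.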
In words, describe the first visualization you are going to make and which variables/characteristics of your data it will use:
Example: For my first figure, I am going to create a scatterplot that plots vehicle weight on the x axis and miles per gallon on the y axis. I will create a two-column tibble with these data.
In the code chunk below, show your work filtering the data and creating the subset of data you will display graphically.
library(dplyr)
# Select the nationality and sport columns from dat and store the result in fig_dat1
fig_dat1 <- dat %>%
  select(nationality, sport)
# Display the contents of fig_dat1
fig_dat1
## # A tibble: 11,538 × 2
## nationality sport
## <chr> <chr>
## 1 ESP athletics
## 2 KOR fencing
## 3 CAN athletics
## 4 MDA taekwondo
## 5 NZL cycling
## 6 AUS triathlon
## 7 USA volleyball
## 8 AUS aquatics
## 9 ESP athletics
## 10 ETH athletics
## # … with 11,528 more rows
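The report does not say which figure this subset is meant for. Assuming the goal is something like a bar chart of athletes per sport, one possible next step is to collapse the rows into counts before plotting. The sketch below uses a hypothetical stand-in tibble, `toy_fig_dat1`, built from the ten rows printed above, so it runs without the CSV file.

```r
library(dplyr)

# Hypothetical stand-in for fig_dat1: the ten rows shown in the output above
toy_fig_dat1 <- tibble(
  nationality = c("ESP", "KOR", "CAN", "MDA", "NZL",
                  "AUS", "USA", "AUS", "ESP", "ETH"),
  sport = c("athletics", "fencing", "athletics", "taekwondo", "cycling",
            "triathlon", "volleyball", "aquatics", "athletics", "athletics")
)

# Count the number of rows (athletes) per sport, largest group first
sport_counts <- toy_fig_dat1 %>%
  count(sport, name = "athletes", sort = TRUE)

sport_counts
```

The same `count()` call applied to the real `fig_dat1` would give one row per sport, which is usually easier to plot than 11,538 individual rows.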
In words, describe the second visualization you are going to make and which variables/characteristics of your data it will use:
Example: For my second figure, I am going to create a figure containing three boxplots, one each for miles per gallon, horsepower, and weight.
In the code chunk below, show your work filtering the data and creating the subset of data you will display graphically.
fig_dat2 <- dat %>%
  select(nationality, sport, gold)
# Make sure you call the data so it will display in your report
fig_dat2
## # A tibble: 11,538 × 3
## nationality sport gold
## <chr> <chr> <dbl>
## 1 ESP athletics 0
## 2 KOR fencing 0
## 3 CAN athletics 0
## 4 MDA taekwondo 0
## 5 NZL cycling 0
## 6 AUS triathlon 0
## 7 USA volleyball 0
## 8 AUS aquatics 0
## 9 ESP athletics 0
## 10 ETH athletics 0
## # … with 11,528 more rows
In words, describe the third visualization you are going to make and which variables/characteristics of your data it will use:
Example: For the third figure, I will display a density plot of the quarter-mile time for six-cylinder cars.
# Keep athletes with exactly one gold medal (gold == 1), then keep only nationality
fig_dat3 <- dat %>%
  filter(gold == 1) %>%
  select(nationality)
# Make sure you call the data so it will display in your report
fig_dat3
## # A tibble: 584 × 1
## nationality
## <chr>
## 1 USA
## 2 RUS
## 3 GBR
## 4 ARG
## 5 JOR
## 6 RUS
## 7 GBR
## 8 RUS
## 9 GBR
## 10 GER
## # … with 574 more rows
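Note that `filter(gold == 1)` keeps only athletes with exactly one gold medal, so anyone with two or more is dropped. Whether that is intended depends on the planned figure, which the report does not state. The sketch below contrasts the two conditions on a small hypothetical tibble (`toy` and its values are invented for illustration).

```r
library(dplyr)

# Hypothetical toy data: one USA athlete has two golds, the RUS athlete has none
toy <- tibble(
  nationality = c("USA", "USA", "GBR", "RUS"),
  gold        = c(1, 2, 1, 0)
)

# Athletes with exactly one gold medal, counted per nationality
exactly_one <- toy %>%
  filter(gold == 1) %>%
  count(nationality, name = "athletes")

# Athletes with one or more gold medals, counted per nationality
at_least_one <- toy %>%
  filter(gold >= 1) %>%
  count(nationality, name = "athletes")

exactly_one
at_least_one
```

In this toy example the USA count differs between the two results, because the two-gold athlete only passes the `gold >= 1` condition.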