[https://media.ed.ac.uk/media/HealthyRA+Reshaping+Data+%28pivot%29/1_wmix0222]
Day 05 of HealthyR demo
Reshaping Data (pivot)
Honey bee (Apis mellifera) colony loss is a widespread phenomenon with important economic and biological implications, whose drivers are still an open matter of investigation.
Data showing percentage of bee colonies impacted by various stressors over time.
La pérdida de colonias de abejas (Apis mellifera) es un fenómeno generalizado con importantes implicaciones económicas y biológicas, cuyas causas siguen siendo un tema abierto de investigación.
Datos que muestran el porcentaje de colonias de abejas afectadas por diversos factores estresantes a lo largo del tiempo.
[https://www.nature.com/articles/s41598-022-24946-4]
[Apis mellifera.Crédito: John Quine (con licencia CC
BY-NC-SA 2.0)]
library(tidyverse)
# Load the datasets - Buscamos el datasets
# Datasets from Tidy Tuesday - desde Tidy Tuesday
colony <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-01-11/colony.csv')
colony
## # A tibble: 1,222 × 10
## year months state colony_n colony_max colony_lost colony_lost_pct
## <dbl> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 2015 January-March Alabama 7000 7000 1800 26
## 2 2015 January-March Arizona 35000 35000 4600 13
## 3 2015 January-March Arkansas 13000 14000 1500 11
## 4 2015 January-March Californ… 1440000 1690000 255000 15
## 5 2015 January-March Colorado 3500 12500 1500 12
## 6 2015 January-March Connecti… 3900 3900 870 22
## 7 2015 January-March Florida 305000 315000 42000 13
## 8 2015 January-March Georgia 104000 105000 14500 14
## 9 2015 January-March Hawaii 10500 10500 380 4
## 10 2015 January-March Idaho 81000 88000 3700 4
## # ℹ 1,212 more rows
## # ℹ 3 more variables: colony_added <dbl>, colony_reno <dbl>,
## # colony_reno_pct <dbl>
stressor <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-01-11/stressor.csv')
stressor
## # A tibble: 7,332 × 5
## year months state stressor stress_pct
## <dbl> <chr> <chr> <chr> <dbl>
## 1 2015 January-March Alabama Varroa mites 10
## 2 2015 January-March Alabama Other pests/parasites 5.4
## 3 2015 January-March Alabama Disesases NA
## 4 2015 January-March Alabama Pesticides 2.2
## 5 2015 January-March Alabama Other 9.1
## 6 2015 January-March Alabama Unknown 9.4
## 7 2015 January-March Arizona Varroa mites 26.9
## 8 2015 January-March Arizona Other pests/parasites 20.5
## 9 2015 January-March Arizona Disesases 0.1
## 10 2015 January-March Arizona Pesticides NA
## # ℹ 7,322 more rows
pivot_wider() is the opposite of pivot_longer(): it widens a data set by increasing the number of columns and decreasing the number of rows.
pivot_wider() es lo opuesto a pivot_longer(): amplía un conjunto de datos al aumentar el número de columnas y disminuir el número de filas.
# Start with wide dataset for this demo
# Data showing percentage of bee colonies impacted by various stressors over time - Datos que muestran el porcentaje de colonias de abejas afectadas por diversos factores estresantes a lo largo del tiempo.
wide_stressor = stressor %>%
pivot_wider(
names_from = months,
values_from = stress_pct)
wide_stressor
## # A tibble: 1,974 × 7
## year state stressor `January-March` `April-June` `July-September`
## <dbl> <chr> <chr> <dbl> <dbl> <dbl>
## 1 2015 Alabama Varroa mites 10 16.7 63.1
## 2 2015 Alabama Other pests/para… 5.4 42.5 70.6
## 3 2015 Alabama Disesases NA NA NA
## 4 2015 Alabama Pesticides 2.2 2.3 2.6
## 5 2015 Alabama Other 9.1 3.2 2.2
## 6 2015 Alabama Unknown 9.4 4.1 17.7
## 7 2015 Arizona Varroa mites 26.9 8.4 53.5
## 8 2015 Arizona Other pests/para… 20.5 32.1 24.8
## 9 2015 Arizona Disesases 0.1 0.5 5.4
## 10 2015 Arizona Pesticides NA 20.1 17.3
## # ℹ 1,964 more rows
## # ℹ 1 more variable: `October-December` <dbl>
Pivot_longer() is commonly needed to sort captured raw data sets.
Pivot_longer() se necesita comúnmente para ordenar los conjuntos de datos capturados en crudo.
long_stressor = wide_stressor %>%
pivot_longer(4:7,
names_to = "months",
values_to = "stress_pct")
long_stressor
## # A tibble: 7,896 × 5
## year state stressor months stress_pct
## <dbl> <chr> <chr> <chr> <dbl>
## 1 2015 Alabama Varroa mites January-March 10
## 2 2015 Alabama Varroa mites April-June 16.7
## 3 2015 Alabama Varroa mites July-September 63.1
## 4 2015 Alabama Varroa mites October-December 3.1
## 5 2015 Alabama Other pests/parasites January-March 5.4
## 6 2015 Alabama Other pests/parasites April-June 42.5
## 7 2015 Alabama Other pests/parasites July-September 70.6
## 8 2015 Alabama Other pests/parasites October-December 6.4
## 9 2015 Alabama Disesases January-March NA
## 10 2015 Alabama Disesases April-June NA
## # ℹ 7,886 more rows
long_stressor = wide_stressor %>%
pivot_longer(c("January-March", "April-June","July-September", "October-December"),
names_to = "months",
values_to = "stress_pct")
long_stressor
## # A tibble: 7,896 × 5
## year state stressor months stress_pct
## <dbl> <chr> <chr> <chr> <dbl>
## 1 2015 Alabama Varroa mites January-March 10
## 2 2015 Alabama Varroa mites April-June 16.7
## 3 2015 Alabama Varroa mites July-September 63.1
## 4 2015 Alabama Varroa mites October-December 3.1
## 5 2015 Alabama Other pests/parasites January-March 5.4
## 6 2015 Alabama Other pests/parasites April-June 42.5
## 7 2015 Alabama Other pests/parasites July-September 70.6
## 8 2015 Alabama Other pests/parasites October-December 6.4
## 9 2015 Alabama Disesases January-March NA
## 10 2015 Alabama Disesases April-June NA
## # ℹ 7,886 more rows
long_stressor = wide_stressor %>%
pivot_longer(all_of(c("January-March", "April-June","July-September","October-December")),
names_to = "months",
values_to = "stress_pct")
long_stressor
## # A tibble: 7,896 × 5
## year state stressor months stress_pct
## <dbl> <chr> <chr> <chr> <dbl>
## 1 2015 Alabama Varroa mites January-March 10
## 2 2015 Alabama Varroa mites April-June 16.7
## 3 2015 Alabama Varroa mites July-September 63.1
## 4 2015 Alabama Varroa mites October-December 3.1
## 5 2015 Alabama Other pests/parasites January-March 5.4
## 6 2015 Alabama Other pests/parasites April-June 42.5
## 7 2015 Alabama Other pests/parasites July-September 70.6
## 8 2015 Alabama Other pests/parasites October-December 6.4
## 9 2015 Alabama Disesases January-March NA
## 10 2015 Alabama Disesases April-June NA
## # ℹ 7,886 more rows
long_stressor = wide_stressor %>%
pivot_longer(contains("-"),
names_to = "months",
values_to = "stress_pct")
long_stressor
## # A tibble: 7,896 × 5
## year state stressor months stress_pct
## <dbl> <chr> <chr> <chr> <dbl>
## 1 2015 Alabama Varroa mites January-March 10
## 2 2015 Alabama Varroa mites April-June 16.7
## 3 2015 Alabama Varroa mites July-September 63.1
## 4 2015 Alabama Varroa mites October-December 3.1
## 5 2015 Alabama Other pests/parasites January-March 5.4
## 6 2015 Alabama Other pests/parasites April-June 42.5
## 7 2015 Alabama Other pests/parasites July-September 70.6
## 8 2015 Alabama Other pests/parasites October-December 6.4
## 9 2015 Alabama Disesases January-March NA
## 10 2015 Alabama Disesases April-June NA
## # ℹ 7,886 more rows
# For each stressor average percentage of columns affected per year
# # Para cada factor estresante, porcentaje promedio de columnas afectadas por año
year_stressor_pct = long_stressor %>%
group_by(year, stressor) %>%
summarise(pct_mean = mean(stress_pct, na.rm = TRUE))
## `summarise()` has grouped output by 'year'. You can override using the
## `.groups` argument.
year_stressor_pct
## # A tibble: 42 × 3
## # Groups: year [7]
## year stressor pct_mean
## <dbl> <chr> <dbl>
## 1 2015 Disesases 3.67
## 2 2015 Other 6.37
## 3 2015 Other pests/parasites 11.3
## 4 2015 Pesticides 7.69
## 5 2015 Unknown 4.25
## 6 2015 Varroa mites 27.2
## 7 2016 Disesases 4.24
## 8 2016 Other 5.81
## 9 2016 Other pests/parasites 10.4
## 10 2016 Pesticides 7.26
## # ℹ 32 more rows
year_stressor_wide = year_stressor_pct %>%
pivot_wider(
names_from = "stressor",
values_from = "pct_mean")
year_stressor_wide
## # A tibble: 7 × 7
## # Groups: year [7]
## year Disesases Other Other pests/parasite…¹ Pesticides Unknown `Varroa mites`
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2015 3.67 6.37 11.3 7.69 4.25 27.2
## 2 2016 4.24 5.81 10.4 7.26 4.47 29.4
## 3 2017 5.76 6.63 11.5 7.96 4.44 33.0
## 4 2018 4.28 7.78 14.7 8.77 4.78 34.9
## 5 2019 4.38 6.48 12.7 9.22 4.19 31.7
## 6 2020 3.72 5.13 10.4 6.29 4.63 28.9
## 7 2021 3.34 5.49 11.2 5.62 4.08 28.2
## # ℹ abbreviated name: ¹`Other pests/parasites`