Practice 1
1. Display only the Entity, Year, Potatoes, and Cassava columns.
2. Remove the Soybeans, Beans, and Peas columns from the table.
3. In which years was Indonesia's rice yield below 2 tonnes per hectare?
4. Which countries had a wheat yield above 5 tonnes per hectare from 2000 onward?
5. How can we display the data for Indonesia and Malaysia for 2015 only?
6. Which country had the lowest maize yield in 2020?
7. Sort Indonesia's data by potato yield, from highest to lowest.
8. Create a Rice_Status column containing “Tinggi” (high) if the rice yield is above 4 tonnes per hectare and “Rendah” (low) otherwise.
9. What is Indonesia's mean banana yield across all available years?
10. Show the maize data from 2010 onward, compute the standard deviation per country, and sort from the largest value.
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.5.3
## Warning: package 'ggplot2' was built under R version 4.5.3
## Warning: package 'tidyr' was built under R version 4.5.3
## Warning: package 'purrr' was built under R version 4.5.3
## Warning: package 'dplyr' was built under R version 4.5.3
## Warning: package 'stringr' was built under R version 4.5.3
## Warning: package 'forcats' was built under R version 4.5.3
## Warning: package 'lubridate' was built under R version 4.5.3
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.2.1 ✔ readr 2.2.0
## ✔ forcats 1.0.1 ✔ stringr 1.6.0
## ✔ ggplot2 4.0.3 ✔ tibble 3.3.1
## ✔ lubridate 1.9.5 ✔ tidyr 1.3.2
## ✔ purrr 1.2.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
url <- "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-09-01/key_crop_yields.csv"
# Read the data with read_csv()
df_crop <- read_csv(url)
## Rows: 13075 Columns: 14
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): Entity, Code
## dbl (12): Year, Wheat (tonnes per hectare), Rice (tonnes per hectare), Maize...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Inspect the data
glimpse(df_crop)
## Rows: 13,075
## Columns: 14
## $ Entity <chr> "Afghanistan", "Afghanistan", "Afgh…
## $ Code <chr> "AFG", "AFG", "AFG", "AFG", "AFG", …
## $ Year <dbl> 1961, 1962, 1963, 1964, 1965, 1966,…
## $ `Wheat (tonnes per hectare)` <dbl> 1.0220, 0.9735, 0.8317, 0.9510, 0.9…
## $ `Rice (tonnes per hectare)` <dbl> 1.5190, 1.5190, 1.5190, 1.7273, 1.7…
## $ `Maize (tonnes per hectare)` <dbl> 1.4000, 1.4000, 1.4260, 1.4257, 1.4…
## $ `Soybeans (tonnes per hectare)` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ `Potatoes (tonnes per hectare)` <dbl> 8.6667, 7.6667, 8.1333, 8.6000, 8.8…
## $ `Beans (tonnes per hectare)` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ `Peas (tonnes per hectare)` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ `Cassava (tonnes per hectare)` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ `Barley (tonnes per hectare)` <dbl> 1.0800, 1.0800, 1.0800, 1.0857, 1.0…
## $ `Cocoa beans (tonnes per hectare)` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ `Bananas (tonnes per hectare)` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA,…
df_crop$Entity <- as.factor(df_crop$Entity)
df_crop$Code <- as.factor(df_crop$Code)
# Check whether the column types changed
glimpse(df_crop)
## Rows: 13,075
## Columns: 14
## $ Entity <fct> "Afghanistan", "Afghanistan", "Afgh…
## $ Code <fct> AFG, AFG, AFG, AFG, AFG, AFG, AFG, …
## $ Year <dbl> 1961, 1962, 1963, 1964, 1965, 1966,…
## $ `Wheat (tonnes per hectare)` <dbl> 1.0220, 0.9735, 0.8317, 0.9510, 0.9…
## $ `Rice (tonnes per hectare)` <dbl> 1.5190, 1.5190, 1.5190, 1.7273, 1.7…
## $ `Maize (tonnes per hectare)` <dbl> 1.4000, 1.4000, 1.4260, 1.4257, 1.4…
## $ `Soybeans (tonnes per hectare)` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ `Potatoes (tonnes per hectare)` <dbl> 8.6667, 7.6667, 8.1333, 8.6000, 8.8…
## $ `Beans (tonnes per hectare)` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ `Peas (tonnes per hectare)` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ `Cassava (tonnes per hectare)` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ `Barley (tonnes per hectare)` <dbl> 1.0800, 1.0800, 1.0800, 1.0857, 1.0…
## $ `Cocoa beans (tonnes per hectare)` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ `Bananas (tonnes per hectare)` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA,…
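Every crop column carries the suffix " (tonnes per hectare)", which forces backticks in every query. As an optional convenience, the names could be shortened once up front. The sketch below uses a small toy tibble with invented values rather than df_crop, and the shortened names are an assumption, not something the rest of this report relies on.

```r
library(dplyr)

# Toy data with the same column-name pattern as df_crop (values are invented)
toy <- tibble(
  Entity = c("A", "B"),
  Year = c(2000, 2001),
  `Rice (tonnes per hectare)` = c(1.5, 4.2),
  `Cocoa beans (tonnes per hectare)` = c(NA, 0.4)
)

# Strip the unit suffix, then replace any remaining spaces with underscores
toy_clean <- toy %>%
  rename_with(~ sub(" (tonnes per hectare)", "", .x, fixed = TRUE)) %>%
  rename_with(~ gsub(" ", "_", .x, fixed = TRUE))

names(toy_clean)
```

After this step the columns can be referenced as plain names such as Rice or Cocoa_beans.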
1. Display only the Entity, Year, Potatoes, and Cassava columns.
data_select <- select(df_crop, Entity, Year, `Potatoes (tonnes per hectare)`, `Cassava (tonnes per hectare)`)
data_select
## # A tibble: 13,075 × 4
## Entity Year `Potatoes (tonnes per hectare)` Cassava (tonnes per hecta…¹
## <fct> <dbl> <dbl> <dbl>
## 1 Afghanistan 1961 8.67 NA
## 2 Afghanistan 1962 7.67 NA
## 3 Afghanistan 1963 8.13 NA
## 4 Afghanistan 1964 8.6 NA
## 5 Afghanistan 1965 8.8 NA
## 6 Afghanistan 1966 9.07 NA
## 7 Afghanistan 1967 9.8 NA
## 8 Afghanistan 1968 10 NA
## 9 Afghanistan 1969 10.2 NA
## 10 Afghanistan 1970 9.54 NA
## # ℹ 13,065 more rows
## # ℹ abbreviated name: ¹`Cassava (tonnes per hectare)`
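The same selection can be written without typing the full column names by using a tidyselect helper. This is a minimal sketch on a toy tibble with invented values; starts_with() matches on the beginning of each column name.

```r
library(dplyr)

# Toy data mimicking a few df_crop columns (values are invented)
toy <- tibble(
  Entity = "A",
  Code = "AAA",
  Year = 2000,
  `Potatoes (tonnes per hectare)` = 8.7,
  `Cassava (tonnes per hectare)` = NA_real_,
  `Rice (tonnes per hectare)` = 1.5
)

# Keep Entity and Year plus the columns whose names start with the crop name
picked <- toy %>%
  select(Entity, Year, starts_with("Potatoes"), starts_with("Cassava"))

names(picked)
```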
2. Remove the Soybeans, Beans, and Peas columns from the table.
select(df_crop, -c(`Soybeans (tonnes per hectare)`, `Beans (tonnes per hectare)`, `Peas (tonnes per hectare)`))
## # A tibble: 13,075 × 11
## Entity Code Year `Wheat (tonnes per hectare)` Rice (tonnes per hecta…¹
## <fct> <fct> <dbl> <dbl> <dbl>
## 1 Afghanistan AFG 1961 1.02 1.52
## 2 Afghanistan AFG 1962 0.974 1.52
## 3 Afghanistan AFG 1963 0.832 1.52
## 4 Afghanistan AFG 1964 0.951 1.73
## 5 Afghanistan AFG 1965 0.972 1.73
## 6 Afghanistan AFG 1966 0.867 1.52
## 7 Afghanistan AFG 1967 1.12 1.92
## 8 Afghanistan AFG 1968 1.16 1.95
## 9 Afghanistan AFG 1969 1.19 1.98
## 10 Afghanistan AFG 1970 0.956 1.81
## # ℹ 13,065 more rows
## # ℹ abbreviated name: ¹`Rice (tonnes per hectare)`
## # ℹ 6 more variables: `Maize (tonnes per hectare)` <dbl>,
## # `Potatoes (tonnes per hectare)` <dbl>,
## # `Cassava (tonnes per hectare)` <dbl>, `Barley (tonnes per hectare)` <dbl>,
## # `Cocoa beans (tonnes per hectare)` <dbl>,
## # `Bananas (tonnes per hectare)` <dbl>
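Columns can also be dropped by negating a tidyselect helper. The sketch below uses a toy tibble with invented values. It relies on starts_with() on purpose: a looser helper such as contains("beans") would also drop the Cocoa beans column, because the helpers ignore case by default.

```r
library(dplyr)

# Toy data mimicking a few df_crop columns (values are invented)
toy <- tibble(
  Entity = "A",
  Year = 2000,
  `Soybeans (tonnes per hectare)` = 1.1,
  `Beans (tonnes per hectare)` = 0.7,
  `Peas (tonnes per hectare)` = 0.9,
  `Cocoa beans (tonnes per hectare)` = 0.4
)

# Drop every column whose name starts with one of the three crop names
kept <- toy %>%
  select(!starts_with(c("Soybeans", "Beans", "Peas")))

names(kept)
```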
3. In which years was Indonesia's rice yield below 2 tonnes per hectare?
df_crop %>%
  select(Entity, Year, `Rice (tonnes per hectare)`) %>%
  filter(Entity == "Indonesia", `Rice (tonnes per hectare)` < 2)
## # A tibble: 7 × 3
## Entity Year `Rice (tonnes per hectare)`
## <fct> <dbl> <dbl>
## 1 Indonesia 1961 1.76
## 2 Indonesia 1962 1.79
## 3 Indonesia 1963 1.72
## 4 Indonesia 1964 1.76
## 5 Indonesia 1965 1.77
## 6 Indonesia 1966 1.77
## 7 Indonesia 1967 1.76
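Since the question asks only for the years, the result can be reduced to a plain vector with pull(). This is a sketch on a toy tibble; the values are illustrative and not taken from df_crop.

```r
library(dplyr)

# Toy data (illustrative values only)
toy <- tibble(
  Entity = c("Indonesia", "Indonesia", "Indonesia", "Malaysia"),
  Year = c(1966, 1967, 1968, 1966),
  `Rice (tonnes per hectare)` = c(1.77, 1.76, 2.10, 1.90)
)

# Filter as before, then extract the Year column as a vector
low_years <- toy %>%
  filter(Entity == "Indonesia", `Rice (tonnes per hectare)` < 2) %>%
  pull(Year)

low_years
```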
4. Which countries had a wheat yield above 5 tonnes per hectare from 2000 onward?
df_crop %>%
  select(Entity, Year, `Wheat (tonnes per hectare)`) %>%
  filter(Year >= 2000, `Wheat (tonnes per hectare)` > 5)
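The question asks for countries rather than country-year rows, so a natural follow-up is to collapse the filtered rows to one row per country with distinct(). The sketch below runs on a toy tibble with invented values.

```r
library(dplyr)

# Toy data (invented values)
toy <- tibble(
  Entity = c("A", "A", "B", "B", "C"),
  Year = c(1999, 2000, 2005, 2006, 2010),
  `Wheat (tonnes per hectare)` = c(6.0, 5.5, 7.1, 7.3, 2.0)
)

# Keep rows from 2000 onward with yield above 5, then list each country once
high_wheat <- toy %>%
  filter(Year >= 2000, `Wheat (tonnes per hectare)` > 5) %>%
  distinct(Entity)

high_wheat
```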
5. How can we display the data for Indonesia and Malaysia for 2015 only?
filter(df_crop, Entity == "Indonesia" | Entity == "Malaysia", Year == 2015)
## # A tibble: 2 × 14
## Entity Code Year `Wheat (tonnes per hectare)` `Rice (tonnes per hectare)`
## <fct> <fct> <dbl> <dbl> <dbl>
## 1 Indonesia IDN 2015 NA 5.34
## 2 Malaysia MYS 2015 NA 4.02
## # ℹ 9 more variables: `Maize (tonnes per hectare)` <dbl>,
## # `Soybeans (tonnes per hectare)` <dbl>,
## # `Potatoes (tonnes per hectare)` <dbl>, `Beans (tonnes per hectare)` <dbl>,
## # `Peas (tonnes per hectare)` <dbl>, `Cassava (tonnes per hectare)` <dbl>,
## # `Barley (tonnes per hectare)` <dbl>,
## # `Cocoa beans (tonnes per hectare)` <dbl>,
## # `Bananas (tonnes per hectare)` <dbl>
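When more than a couple of countries are involved, chaining `|` comparisons gets long. An equivalent form uses `%in%` with a vector of names. This is a sketch on a toy tibble with invented values.

```r
library(dplyr)

# Toy data (invented values)
toy <- tibble(
  Entity = c("Indonesia", "Malaysia", "Thailand", "Indonesia"),
  Year = c(2015, 2015, 2015, 2014),
  `Rice (tonnes per hectare)` = c(5.3, 4.0, 2.9, 5.1)
)

# Match against a vector of country names instead of chaining `|`
both_2015 <- toy %>%
  filter(Entity %in% c("Indonesia", "Malaysia"), Year == 2015)

both_2015
```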
6. Which country had the lowest maize yield in 2020?
df_crop %>%
  select(Entity, Year, `Maize (tonnes per hectare)`) %>%
  filter(Year == 2020, !is.na(`Maize (tonnes per hectare)`)) %>%
  arrange(`Maize (tonnes per hectare)`)
## # A tibble: 0 × 3
## # ℹ 3 variables: Entity <fct>, Year <dbl>, Maize (tonnes per hectare) <dbl>
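The query above returns zero rows, which means the dataset holds no non-missing maize value for 2020. One possible way to still give an answer is to fall back to the most recent year that does have maize data and take the minimum there with slice_min(). The sketch below shows the idea on a toy tibble with invented values; the fallback rule is an assumption, not part of the original task.

```r
library(dplyr)

# Toy data (invented values); the last year has no maize value
toy <- tibble(
  Entity = c("A", "B", "A", "B", "A"),
  Year = c(2017, 2017, 2018, 2018, 2019),
  `Maize (tonnes per hectare)` = c(3.1, 0.9, 3.3, 1.2, NA)
)

# Drop missing values, keep the latest remaining year, take the lowest yield
lowest_latest <- toy %>%
  filter(!is.na(`Maize (tonnes per hectare)`)) %>%
  filter(Year == max(Year)) %>%
  slice_min(`Maize (tonnes per hectare)`, n = 1)

lowest_latest
```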
7. Sort Indonesia's data by potato yield, from highest to lowest.
df_crop %>%
  select(Entity, Year, `Potatoes (tonnes per hectare)`) %>%
  filter(Entity == "Indonesia") %>%
  filter(!is.na(`Potatoes (tonnes per hectare)`)) %>%
  arrange(desc(`Potatoes (tonnes per hectare)`))
## # A tibble: 58 × 3
## Entity Year `Potatoes (tonnes per hectare)`
## <fct> <dbl> <dbl>
## 1 Indonesia 2018 18.7
## 2 Indonesia 2016 18.3
## 3 Indonesia 2015 18.2
## 4 Indonesia 2014 17.7
## 5 Indonesia 2006 16.9
## 6 Indonesia 2008 16.7
## 7 Indonesia 1995 16.6
## 8 Indonesia 2012 16.6
## 9 Indonesia 2009 16.5
## 10 Indonesia 2005 16.4
## # ℹ 48 more rows
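If only the top few years are of interest, slice_max() combines the sort and the cut in one step. This is a sketch on a toy tibble; the values are illustrative.

```r
library(dplyr)

# Toy data (illustrative values only)
toy <- tibble(
  Entity = "Indonesia",
  Year = 2014:2018,
  `Potatoes (tonnes per hectare)` = c(17.7, 18.2, 18.3, 15.0, 18.7)
)

# Keep the three highest-yield years, ordered from highest to lowest
top3 <- toy %>%
  slice_max(`Potatoes (tonnes per hectare)`, n = 3)

top3
```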
8. Create a Rice_Status column containing “Tinggi” (high) if the rice yield is above 4 tonnes per hectare and “Rendah” (low) otherwise.
df_crop %>%
  mutate(Rice_Status = ifelse(`Rice (tonnes per hectare)` > 4, "Tinggi", "Rendah")) %>%
  select(Code, Year, Rice_Status)
## # A tibble: 13,075 × 3
## Code Year Rice_Status
## <fct> <dbl> <chr>
## 1 AFG 1961 Rendah
## 2 AFG 1962 Rendah
## 3 AFG 1963 Rendah
## 4 AFG 1964 Rendah
## 5 AFG 1965 Rendah
## 6 AFG 1966 Rendah
## 7 AFG 1967 Rendah
## 8 AFG 1968 Rendah
## 9 AFG 1969 Rendah
## 10 AFG 1970 Rendah
## # ℹ 13,065 more rows
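With ifelse(), rows where the rice yield is missing end up with NA in Rice_Status. If that case should be visible, case_when() lets each branch be spelled out. The sketch below uses a toy tibble with invented values, and the third label "Unknown" is a hypothetical choice that the task does not ask for.

```r
library(dplyr)

# Toy data (invented values), including one missing yield
toy <- tibble(
  Code = c("AAA", "BBB", "CCC"),
  `Rice (tonnes per hectare)` = c(5.3, 1.8, NA)
)

# Conditions are checked in order; the first match wins
labelled <- toy %>%
  mutate(Rice_Status = case_when(
    is.na(`Rice (tonnes per hectare)`) ~ "Unknown",  # hypothetical label
    `Rice (tonnes per hectare)` > 4 ~ "Tinggi",
    TRUE ~ "Rendah"
  ))

labelled
```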
9. What is Indonesia's mean banana yield across all available years?
df_crop %>%
  select(Entity, `Bananas (tonnes per hectare)`) %>%
  filter(Entity == "Indonesia") %>%
  filter(!is.na(`Bananas (tonnes per hectare)`)) %>%
  summarise(`Mean Bananas (tonnes per hectare)` = mean(`Bananas (tonnes per hectare)`))
## # A tibble: 1 × 1
## `Mean Bananas (tonnes per hectare)`
## <dbl>
## 1 30.5
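The separate filter on missing values can be folded into the summary itself, since mean() accepts na.rm = TRUE. This is a sketch on a toy tibble with invented values; the column name mean_bananas is chosen only for this example.

```r
library(dplyr)

# Toy data (invented values), including one missing yield
toy <- tibble(
  Entity = c("Indonesia", "Indonesia", "Indonesia", "Malaysia"),
  `Bananas (tonnes per hectare)` = c(30, NA, 32, 10)
)

# na.rm = TRUE skips missing values inside mean()
avg <- toy %>%
  filter(Entity == "Indonesia") %>%
  summarise(mean_bananas = mean(`Bananas (tonnes per hectare)`, na.rm = TRUE))

avg
```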
10. Show the maize data from 2010 onward, compute the standard deviation per country, and sort from the largest value.
df_crop %>%
  select(Entity, Year, `Maize (tonnes per hectare)`) %>%
  filter(Year >= 2010) %>%
  filter(!is.na(`Maize (tonnes per hectare)`)) %>%
  group_by(Entity) %>%
  summarise(`SD Maize (tonnes per hectare)` = sd(`Maize (tonnes per hectare)`)) %>%
  filter(!is.na(`SD Maize (tonnes per hectare)`)) %>%
  arrange(desc(`SD Maize (tonnes per hectare)`))
## # A tibble: 202 × 2
## Entity `SD Maize (tonnes per hectare)`
## <fct> <dbl>
## 1 Kuwait 9.24
## 2 United Arab Emirates 9.19
## 3 Jordan 7.03
## 4 Israel 4.80
## 5 Saint Vincent and the Grenadines 2.89
## 6 Qatar 2.74
## 7 French Guiana 2.50
## 8 New Caledonia 2.29
## 9 Slovakia 1.68
## 10 Oman 1.61
## # ℹ 192 more rows
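sd() returns NA for a group with a single observation, which is why the pipeline drops NA results before sorting. Reporting the number of observations next to each standard deviation makes that explicit and helps when reading the ranking. The sketch below uses a toy tibble with invented values; the column names n_years and sd_maize are chosen only for this example.

```r
library(dplyr)

# Toy data (invented values); country B has only one year from 2010 onward
toy <- tibble(
  Entity = c("A", "A", "A", "B", "B", "C", "C", "C"),
  Year = c(2010, 2011, 2012, 2009, 2010, 2010, 2011, 2012),
  `Maize (tonnes per hectare)` = c(1, 2, 3, 4, 5, 2, 2, 2)
)

# Count observations per country and keep only groups with at least two
sd_by_country <- toy %>%
  filter(Year >= 2010, !is.na(`Maize (tonnes per hectare)`)) %>%
  group_by(Entity) %>%
  summarise(
    n_years = n(),
    sd_maize = sd(`Maize (tonnes per hectare)`)
  ) %>%
  filter(n_years >= 2) %>%
  arrange(desc(sd_maize))

sd_by_country
```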