Reading In Data

For this challenge, I will be reading in the animal_weight.csv file. The import message and the first six rows are displayed below.

library(tidyverse)  # provides read_csv() and pivot_longer()
animal_weights <- read_csv("../challenge_datasets/animal_weight.csv")
## Rows: 9 Columns: 17
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): IPCC Area
## dbl (16): Cattle - dairy, Cattle - non-dairy, Buffaloes, Swine - market, Swi...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(animal_weights)
## # A tibble: 6 × 17
##   `IPCC Area`   `Cattle - dairy` `Cattle - non-dairy` Buffaloes `Swine - market`
##   <chr>                    <dbl>                <dbl>     <dbl>            <dbl>
## 1 Indian Subco…              275                  110       295               28
## 2 Eastern Euro…              550                  391       380               50
## 3 Africa                     275                  173       380               28
## 4 Oceania                    500                  330       380               45
## 5 Western Euro…              600                  420       380               50
## 6 Latin America              400                  305       380               28
## # ℹ 12 more variables: `Swine - breeding` <dbl>, `Chicken - Broilers` <dbl>,
## #   `Chicken - Layers` <dbl>, Ducks <dbl>, Turkeys <dbl>, Sheep <dbl>,
## #   Goats <dbl>, Horses <dbl>, Asses <dbl>, Mules <dbl>, Camels <dbl>,
## #   Llamas <dbl>

Briefly Describing the Data

As the file name suggests, this dataset appears to record the average weight of different animal types in each IPCC area. The dataset is not tidy at the moment: each animal type has its own column, so the column headers hold values rather than variable names. I am going to pivot the table so that the animal types move into a single Animal column, each row represents a specific animal in a specific region, and the weights are stored in their own column.

Anticipation of End Result

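The end result should have three columns, Region (the existing IPCC Area column), Animal, and Weight, which will make the data much easier to work with. With 9 regions and 16 animal columns, the pivoted table should have 9 × 16 = 144 rows.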
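A quick way to sanity-check the anticipated shape before pivoting is to compute it from the table's dimensions. The sketch below is only an illustration: `toy_weights` is a hypothetical stand-in built from a few of the values printed above (three regions, three animal columns), not the full file.

```r
library(tibble)

# Hypothetical stand-in for animal_weights, using values from the head() output
# above and only three of the sixteen animal columns.
toy_weights <- tribble(
  ~`IPCC Area`,    ~`Cattle - dairy`, ~`Cattle - non-dairy`, ~Buffaloes,
  "Africa",        275,               173,                   380,
  "Oceania",       500,               330,                   380,
  "Latin America", 400,               305,                   380
)

# Every column except the region identifier will be pivoted.
n_pivoted_cols <- ncol(toy_weights) - 1

# One output row per (region, animal) pair.
expected_rows <- nrow(toy_weights) * n_pivoted_cols

# The identifier column plus the names_to and values_to columns.
expected_cols <- 1 + 2

expected_rows
expected_cols
```

Applying the same arithmetic to the real data (9 rows, 17 columns) gives 9 × 16 = 144 rows and 3 columns, which matches the 144 × 3 tibble in the pivot output.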

Pivot Data

# Pivot every column except `IPCC Area` into Animal/Weight pairs
animal_weights_tidy <- pivot_longer(animal_weights, 
                                    cols = -`IPCC Area`,
                                    names_to = "Animal", 
                                    values_to = "Weight")
animal_weights_tidy
## # A tibble: 144 × 3
##    `IPCC Area`         Animal             Weight
##    <chr>               <chr>               <dbl>
##  1 Indian Subcontinent Cattle - dairy      275  
##  2 Indian Subcontinent Cattle - non-dairy  110  
##  3 Indian Subcontinent Buffaloes           295  
##  4 Indian Subcontinent Swine - market       28  
##  5 Indian Subcontinent Swine - breeding     28  
##  6 Indian Subcontinent Chicken - Broilers    0.9
##  7 Indian Subcontinent Chicken - Layers      1.8
##  8 Indian Subcontinent Ducks                 2.7
##  9 Indian Subcontinent Turkeys               6.8
## 10 Indian Subcontinent Sheep                28  
## # ℹ 134 more rows

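As we can see, the table is much longer (144 rows instead of 9) but much easier to work with. Each row now shows one region, one animal type, and that animal's average weight in that region. Note that the region column keeps its original name, IPCC Area, rather than Region. I found the - symbol very handy in these use cases: cols = -`IPCC Area` selects every column except IPCC Area, so I did not have to list all 16 animal columns by name.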
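To illustrate the - selection, here is a minimal sketch on a small stand-in table. `toy_weights`, `by_exclusion`, `by_name`, and `animal_means` are hypothetical names, and the four values are copied from the output above. It compares excluding the identifier column with listing the animal columns explicitly, then shows one kind of summary that the long format makes simple.

```r
library(tibble)
library(tidyr)
library(dplyr)

# Hypothetical stand-in for animal_weights with two regions and two animals.
toy_weights <- tribble(
  ~`IPCC Area`, ~`Cattle - dairy`, ~Buffaloes,
  "Africa",     275,               380,
  "Oceania",    500,               380
)

# Negative selection: pivot every column except `IPCC Area`.
by_exclusion <- pivot_longer(toy_weights,
                             cols = -`IPCC Area`,
                             names_to = "Animal",
                             values_to = "Weight")

# The same columns, listed by name instead.
by_name <- pivot_longer(toy_weights,
                        cols = c(`Cattle - dairy`, Buffaloes),
                        names_to = "Animal",
                        values_to = "Weight")

# Both selections should give the same long table.
all.equal(by_exclusion, by_name)

# In long format, a per-animal average across regions is a single group_by().
animal_means <- by_exclusion |>
  group_by(Animal) |>
  summarise(mean_weight = mean(Weight))
animal_means
```

With only two animal columns the two selections are equally short, but with 16 animal columns the exclusion form is much less to type and keeps working if more animal columns are added.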