This challenge is about writing a reusable function for an analysis task.
We first load the libraries.
library(readr)
library(here)
## here() starts at C:/Users/SHAURYA/Desktop/Studies/Winter 2024 601/Challenges/challenge 9
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
The dataset is then loaded.
weight <- read_csv("animal_weight.csv", show_col_types = FALSE)
weight
## # A tibble: 9 × 17
## `IPCC Area` `Cattle - dairy` `Cattle - non-dairy` Buffaloes `Swine - market`
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Indian Subco… 275 110 295 28
## 2 Eastern Euro… 550 391 380 50
## 3 Africa 275 173 380 28
## 4 Oceania 500 330 380 45
## 5 Western Euro… 600 420 380 50
## 6 Latin America 400 305 380 28
## 7 Asia 350 391 380 50
## 8 Middle east 275 173 380 28
## 9 Northern Ame… 604 389 380 46
## # ℹ 12 more variables: `Swine - breeding` <dbl>, `Chicken - Broilers` <dbl>,
## # `Chicken - Layers` <dbl>, Ducks <dbl>, Turkeys <dbl>, Sheep <dbl>,
## # Goats <dbl>, Horses <dbl>, Asses <dbl>, Mules <dbl>, Camels <dbl>,
## # Llamas <dbl>
We see that the dataset has an area column (`IPCC Area`) followed by one weight column for each animal category.
The function below selects the area column and one animal category, cleans the values, and draws a bar chart of weight by area. The plot is the function's return value.
func <- function(data, category) {
  # Keep only the area column and the requested animal category
  cols <- c("IPCC Area", category)
  new_data <- data[, cols]
  # Clean the data: coerce the weights to numeric
  new_data[[category]] <- as.numeric(new_data[[category]])
  # Remove rows with missing values, if any
  new_data <- na.omit(new_data)
  # Bar chart of weight by area; .data[[category]] looks the column up by name
  ggplot(new_data, aes(x = `IPCC Area`, y = .data[[category]])) +
    geom_col(fill = "skyblue", color = "black", alpha = 0.7) +
    labs(title = paste("Weight of", category, "by Area"),
         x = "IPCC Area", y = category) +
    theme_minimal() +
    theme(axis.text.x = element_text(angle = 45, hjust = 1))
}
func(weight, "Cattle - non-dairy")
The function first subsets the data to the area column and the chosen animal category. It then converts the weights to numeric and removes any rows with missing values. Finally, it draws a bar chart, which shows that Western Europe has the heaviest non-dairy cattle (420), followed by Asia and Eastern Europe (391 each) and then Northern America (389). The Indian Subcontinent has the lightest non-dairy cattle (110).
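The ranking read off the chart can be double-checked numerically. The sketch below rebuilds only the `Cattle - non-dairy` column from the printed tibble and sorts it with `arrange()`. The object names `non_dairy` and `ranked` are made up for illustration, and the area names are spelled out in full as an assumption, since the printout truncates some of them.

```r
library(dplyr)

# Values copied from the `Cattle - non-dairy` column of the printed tibble.
# Full area names are an assumption for the rows the printout truncates.
non_dairy <- data.frame(
  `IPCC Area` = c("Indian Subcontinent", "Eastern Europe", "Africa",
                  "Oceania", "Western Europe", "Latin America",
                  "Asia", "Middle east", "Northern America"),
  `Cattle - non-dairy` = c(110, 391, 173, 330, 420, 305, 391, 173, 389),
  check.names = FALSE  # keep the column names with spaces and hyphens
)

# Sort from heaviest to lightest
ranked <- non_dairy %>%
  arrange(desc(`Cattle - non-dairy`))

print(ranked)
```

Sorting the table this way makes ties, such as Asia and Eastern Europe, easier to spot than on the chart.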
We created a function that performs several operations (subsetting, cleaning, and plotting) in a single call.
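Because the category is an argument, the same function can be reused for several columns in one go. The following is a minimal sketch, not the code used above: `plot_weight` is a hypothetical compact stand-in for the function in this report, and `weight_small` is a three-row excerpt of the printed tibble built only for illustration.

```r
library(ggplot2)

# Three rows copied from the printed tibble, for illustration only
weight_small <- data.frame(
  `IPCC Area` = c("Africa", "Oceania", "Asia"),
  `Cattle - dairy` = c(275, 500, 350),
  Buffaloes = c(380, 380, 380),
  check.names = FALSE  # keep the column names with spaces and hyphens
)

# Hypothetical compact version of the plotting function
plot_weight <- function(data, category) {
  ggplot(data, aes(x = .data[["IPCC Area"]], y = .data[[category]])) +
    geom_col(fill = "skyblue", color = "black", alpha = 0.7) +
    labs(title = paste("Weight of", category, "by Area"),
         x = "IPCC Area", y = category) +
    theme_minimal()
}

# Build one plot per category and keep them in a named list
categories <- c("Cattle - dairy", "Buffaloes")
plots <- lapply(categories, function(ctg) plot_weight(weight_small, ctg))
names(plots) <- categories

# Display one of the plots
plots[["Buffaloes"]]
```

Keeping the plots in a named list makes it easy to print or save each one later without calling the function again.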