What is (are) your main question(s)? What is your story? What does the final graphic show?
Research question: “Does drinking more alcohol lead to higher happiness?”
Studies suggest that moderate drinking, especially in social settings, may be linked to higher reported happiness or well-being. However, excessive drinking does not have the same effect and can often lead to negative outcomes that overshadow any short-term pleasure. We wanted to examine the correlation between alcohol consumption and happiness ourselves, using a country-level dataset.
We present several visualizations that explore the relationship between alcohol consumption and happiness across countries and regions. Europe shows generally higher alcohol consumption, while regions such as Africa and the Middle East have lower levels, influenced by cultural or religious norms. Although there is a positive correlation between alcohol consumption and happiness overall, regional comparisons suggest that this relationship is not consistent. In particular, economic factors such as GDP appear to have a stronger influence on happiness, especially in developing countries.
Explain where the data came from, what agency or company made it, how it is structured, what it shows, etc.
The data for this project is sourced from Kaggle, specifically the Happiness and Alcohol Consumption dataset created by Marcos Pessotto. It includes 9 different variables related to alcohol consumption and happiness across 122 countries.
The dataset appears to have been compiled from several sources, including the United Nations Development Programme (UNDP) and happiness survey results from 2016, and then cleaned so that the key variables are organized for analysis. Using this dataset, we want to determine the correlation between alcohol consumption and happiness, specifically whether countries that drink more alcohol report higher happiness.
The main variables we chose for our project are Country, Region, HappinessScore, GDP_PerCapita, Beer_PerCapita, Spirit_PerCapita, and Wine_PerCapita.
Country and Region: These are essential for understanding the geographical variation in alcohol consumption and happiness scores.
Happiness Score: This is the outcome variable that we compare against each country's alcohol consumption.
GDP per Capita: This lets us compare the influence of economic conditions on happiness with that of alcohol consumption.
Alcohol consumption by type (Beer, Spirits, Wine): These variables allow for an analysis of how alcohol consumption is distributed across different types of beverages in each country.
Describe and show how you cleaned and reshaped the data
First, we loaded our dataset.
library(tidyverse)  # provides %>%, dplyr, tidyr, and ggplot2, which the code below uses
hac <- read.csv("data/HappinessAlcoholConsumption.csv")
hac
## Country Region Hemisphere
## 1 Denmark Western Europe north
## 2 Switzerland Western Europe north
## 3 Iceland Western Europe north
## 4 Norway Western Europe north
## 5 Finland Western Europe north
## 6 Canada North America north
## 7 Netherlands Western Europe north
## 8 New Zealand Australia and New Zealand south
## 9 Australia Australia and New Zealand south
## 10 Sweden Western Europe north
## 11 Israel Middle East and Northern Africa north
## 12 Austria Western Europe north
## 13 United States North America north
## 14 Costa Rica Latin America and Caribbean north
## 15 Germany Western Europe north
## 16 Brazil Latin America and Caribbean both
## 17 Belgium Western Europe north
## 18 Ireland Western Europe north
## 19 Luxembourg Western Europe north
## 20 Mexico Latin America and Caribbean north
## 21 Singapore Southeastern Asia north
## 22 United Kingdom Western Europe north
## 23 Chile Latin America and Caribbean south
## 24 Panama Latin America and Caribbean north
## 25 Argentina Latin America and Caribbean south
## 26 Czech Republic Central and Eastern Europe north
## 27 United Arab Emirates Middle East and Northern Africa north
## 28 Uruguay Latin America and Caribbean south
## 29 Malta Western Europe north
## 30 Colombia Latin America and Caribbean both
## 31 France Western Europe north
## 32 Thailand Southeastern Asia north
## 33 Qatar Middle East and Northern Africa north
## 34 Spain Western Europe north
## 35 Guatemala Latin America and Caribbean north
## 36 Suriname Latin America and Caribbean north
## 37 Bahrain Middle East and Northern Africa north
## 38 Trinidad and Tobago Latin America and Caribbean north
## 39 Venezuela Latin America and Caribbean north
## 40 Slovakia Central and Eastern Europe north
## 41 El Salvador Latin America and Caribbean north
## 42 Nicaragua Latin America and Caribbean north
## 43 Uzbekistan Central and Eastern Europe north
## 44 Italy Western Europe north
## 45 Ecuador Latin America and Caribbean both
## 46 Belize Latin America and Caribbean north
## 47 Japan Eastern Asia noth
## 48 Kazakhstan Central and Eastern Europe north
## 49 Moldova Central and Eastern Europe north
## 50 Russian Federation Central and Eastern Europe north
## 51 Poland Central and Eastern Europe north
## 52 South Korea Eastern Asia noth
## 53 Bolivia Latin America and Caribbean south
## 54 Lithuania Central and Eastern Europe north
## 55 Belarus Central and Eastern Europe north
## 56 Slovenia Central and Eastern Europe north
## 57 Peru Latin America and Caribbean both
## 58 Turkmenistan Central and Eastern Europe north
## 59 Mauritius Sub-Saharan Africa north
## 60 Latvia Central and Eastern Europe north
## 61 Cyprus Western Europe north
## 62 Paraguay Latin America and Caribbean south
## 63 Romania Central and Eastern Europe north
## 64 Estonia Central and Eastern Europe north
## 65 Jamaica Latin America and Caribbean north
## 66 Croatia Central and Eastern Europe north
## 67 Turkey Middle East and Northern Africa north
## 68 Jordan Middle East and Northern Africa north
## 69 Azerbaijan Central and Eastern Europe north
## 70 Philippines Southeastern Asia north
## 71 China Eastern Asia noth
## 72 Kyrgyzstan Central and Eastern Europe north
## 73 Serbia Central and Eastern Europe north
## 74 Bosnia and Herzegovina Central and Eastern Europe north
## 75 Montenegro Central and Eastern Europe north
## 76 Dominican Republic Latin America and Caribbean north
## 77 Morocco Middle East and Northern Africa north
## 78 Hungary Central and Eastern Europe north
## 79 Lebanon Middle East and Northern Africa north
## 80 Portugal Western Europe north
## 81 Macedonia Central and Eastern Europe north
## 82 Vietnam Southeastern Asia north
## 83 Tunisia Middle East and Northern Africa north
## 84 Greece Western Europe north
## 85 Mongolia Eastern Asia noth
## 86 Nigeria Sub-Saharan Africa north
## 87 Honduras Latin America and Caribbean north
## 88 Zambia Sub-Saharan Africa south
## 89 Albania Central and Eastern Europe north
## 90 Sierra Leone Sub-Saharan Africa north
## 91 Namibia Sub-Saharan Africa south
## 92 Cameroon Sub-Saharan Africa south
## 93 South Africa Sub-Saharan Africa south
## 94 Egypt Middle East and Northern Africa north
## 95 Armenia Central and Eastern Europe north
## 96 Kenya Sub-Saharan Africa both
## 97 Ukraine Central and Eastern Europe north
## 98 Ghana Sub-Saharan Africa north
## 99 Dem. Rep. Congo Sub-Saharan Africa south
## 100 Georgia Central and Eastern Europe north
## 101 Rep. Congo Sub-Saharan Africa south
## 102 Senegal Sub-Saharan Africa north
## 103 Bulgaria Central and Eastern Europe north
## 104 Zimbabwe Sub-Saharan Africa south
## 105 Malawi Sub-Saharan Africa south
## 106 Gabon Sub-Saharan Africa south
## 107 Mali Sub-Saharan Africa north
## 108 Haiti Latin America and Caribbean north
## 109 Botswana Sub-Saharan Africa south
## 110 Comoros Sub-Saharan Africa south
## 111 Cote d'Ivoire Sub-Saharan Africa north
## 112 Cambodia Southeastern Asia north
## 113 Angola Sub-Saharan Africa south
## 114 Niger Sub-Saharan Africa north
## 115 Chad Sub-Saharan Africa north
## 116 Burkina Faso Sub-Saharan Africa north
## 117 Madagascar Sub-Saharan Africa south
## 118 Tanzania Sub-Saharan Africa south
## 119 Liberia Sub-Saharan Africa north
## 120 Benin Sub-Saharan Africa north
## 121 Togo Sub-Saharan Africa north
## 122 Syria Middle East and Northern Africa north
## HappinessScore HDI GDP_PerCapita Beer_PerCapita Spirit_PerCapita
## 1 7.526 928 53.579 224 81
## 2 7.509 943 79.866 185 100
## 3 7.501 933 60.530 233 61
## 4 7.498 951 70.890 169 71
## 5 7.413 918 43.433 263 133
## 6 7.404 922 42.349 240 122
## 7 7.339 928 45.638 251 88
## 8 7.334 915 40.332 203 79
## 9 7.313 938 49.897 261 72
## 10 7.291 932 51.845 152 60
## 11 7.267 902 37.181 63 69
## 12 7.119 906 44.731 279 75
## 13 7.104 922 57.589 249 158
## 14 7.087 791 11.733 149 87
## 15 6.994 934 42.233 346 117
## 16 6.952 758 8.639 245 145
## 17 6.929 915 41.261 295 84
## 18 6.907 934 64.100 313 118
## 19 6.871 904 100.739 236 133
## 20 6.778 772 8.444 238 68
## 21 6.739 930 55.243 60 12
## 22 6.725 920 40.412 219 126
## 23 6.705 842 13.961 130 124
## 24 6.701 785 14.333 285 104
## 25 6.650 822 12.654 193 25
## 26 6.596 885 18.484 361 170
## 27 6.573 862 38.518 16 135
## 28 6.545 802 15.298 115 35
## 29 6.488 875 24.771 149 100
## 30 6.481 747 5.757 159 76
## 31 6.478 899 36.870 127 151
## 32 6.474 748 5.979 99 258
## 33 6.375 855 59.324 1 42
## 34 6.361 889 26.617 284 157
## 35 6.324 649 4.141 53 69
## 36 6.269 719 5.871 128 178
## 37 6.218 846 22.561 42 63
## 38 6.168 785 16.352 197 156
## 39 6.084 766 15.692 333 100
## 40 6.078 853 16.530 196 293
## 41 6.068 679 3.769 52 69
## 42 5.992 657 2.144 78 118
## 43 5.987 703 2.106 25 101
## 44 5.977 878 30.669 85 42
## 45 5.976 749 6.019 162 74
## 46 5.956 709 4.960 263 114
## 47 5.921 907 38.972 77 202
## 48 5.919 797 7.715 124 246
## 49 5.897 697 1.913 109 226
## 50 5.856 815 8.748 247 326
## 51 5.835 860 12.415 343 215
## 52 5.835 900 27.105 140 16
## 53 5.822 689 3.117 167 41
## 54 5.813 855 14.913 343 244
## 55 5.802 805 5.023 142 373
## 56 5.768 894 21.650 270 51
## 57 5.743 748 6.031 163 160
## 58 5.658 705 6.389 19 71
## 59 5.648 788 9.682 98 31
## 60 5.560 844 14.070 281 216
## 61 5.546 867 23.541 192 154
## 62 5.538 702 4.078 213 117
## 63 5.528 807 9.532 297 122
## 64 5.517 868 17.737 224 194
## 65 5.510 732 4.879 82 97
## 66 5.488 828 12.299 230 87
## 67 5.389 787 10.863 51 22
## 68 5.303 735 4.088 6 21
## 69 5.291 757 3.881 21 46
## 70 5.279 696 2.951 71 186
## 71 5.245 748 8.117 79 192
## 72 5.185 669 1.220 31 97
## 73 5.177 785 5.426 283 131
## 74 5.163 766 4.809 76 173
## 75 5.161 810 7.029 31 114
## 76 5.155 733 6.794 193 147
## 77 5.151 662 2.893 12 6
## 78 5.145 835 12.820 234 215
## 79 5.129 753 8.257 20 55
## 80 5.123 845 19.872 194 67
## 81 5.121 756 4.834 106 27
## 82 5.061 689 2.171 111 2
## 83 5.045 732 3.689 51 3
## 84 5.033 868 17.882 133 112
## 85 4.907 743 3.694 77 189
## 86 4.875 530 2.176 42 5
## 87 4.871 614 2.375 69 98
## 88 4.795 586 1.263 32 19
## 89 4.655 782 4.132 89 132
## 90 4.635 413 481.000 25 3
## 91 4.574 645 4.561 376 3
## 92 4.513 553 1.375 147 1
## 93 4.459 696 5.280 225 76
## 94 4.362 694 3.548 6 4
## 95 4.360 749 3.606 21 179
## 96 4.356 585 1.463 58 22
## 97 4.324 746 2.186 206 237
## 98 4.276 588 1.517 31 3
## 99 4.272 452 1.712 32 3
## 100 4.252 776 3.866 52 100
## 101 4.236 612 498.000 76 1
## 102 4.219 499 953.000 9 1
## 103 4.217 810 7.469 231 252
## 104 4.193 532 1.029 64 18
## 105 4.156 474 300.000 8 11
## 106 4.121 698 7.079 347 98
## 107 4.073 421 780.000 5 1
## 108 4.028 496 735.000 1 326
## 109 3.974 712 6.954 173 35
## 110 3.956 502 775.000 1 3
## 111 3.916 486 1.535 37 1
## 112 3.907 576 1.270 57 65
## 113 3.866 577 3.309 217 57
## 114 3.856 351 368.000 3 2
## 115 3.763 405 651.000 15 1
## 116 3.739 420 614.000 25 7
## 117 3.695 517 402.000 26 15
## 118 3.666 533 878.000 36 6
## 119 3.622 432 455.000 19 152
## 120 3.484 512 789.000 34 4
## 121 3.303 500 577.000 36 2
## 122 3.069 536 2.058 5 35
## Wine_PerCapita
## 1 278
## 2 280
## 3 78
## 4 129
## 5 97
## 6 100
## 7 190
## 8 175
## 9 212
## 10 186
## 11 9
## 12 191
## 13 84
## 14 11
## 15 175
## 16 16
## 17 212
## 18 165
## 19 271
## 20 5
## 21 11
## 22 195
## 23 172
## 24 18
## 25 221
## 26 134
## 27 5
## 28 220
## 29 120
## 30 3
## 31 370
## 32 1
## 33 7
## 34 112
## 35 2
## 36 7
## 37 7
## 38 7
## 39 3
## 40 116
## 41 2
## 42 1
## 43 8
## 44 237
## 45 3
## 46 8
## 47 16
## 48 12
## 49 18
## 50 73
## 51 56
## 52 9
## 53 8
## 54 56
## 55 42
## 56 276
## 57 21
## 58 32
## 59 18
## 60 62
## 61 113
## 62 74
## 63 167
## 64 59
## 65 9
## 66 254
## 67 7
## 68 1
## 69 5
## 70 1
## 71 8
## 72 6
## 73 127
## 74 8
## 75 128
## 76 9
## 77 10
## 78 185
## 79 31
## 80 339
## 81 86
## 82 1
## 83 20
## 84 218
## 85 8
## 86 2
## 87 2
## 88 4
## 89 54
## 90 2
## 91 1
## 92 4
## 93 81
## 94 1
## 95 11
## 96 2
## 97 45
## 98 10
## 99 1
## 100 149
## 101 9
## 102 7
## 103 94
## 104 4
## 105 1
## 106 59
## 107 1
## 108 1
## 109 35
## 110 1
## 111 7
## 112 1
## 113 45
## 114 1
## 115 1
## 116 7
## 117 4
## 118 1
## 119 2
## 120 13
## 121 19
## 122 16
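Printing all 122 rows makes the structure hard to see at a glance. As a minimal sketch, assuming the same nine-column layout as the output above, the snippet below reads three rows copied from that output through an inline `text =` argument (a stand-in for the real CSV file) and inspects them more compactly. The names `sample_csv` and `hac_sample` are hypothetical and used only for this illustration.

```r
# Three rows copied from the printed output above; the inline text stands in
# for data/HappinessAlcoholConsumption.csv so that the sketch runs on its own.
sample_csv <- "Country,Region,Hemisphere,HappinessScore,HDI,GDP_PerCapita,Beer_PerCapita,Spirit_PerCapita,Wine_PerCapita
Denmark,Western Europe,north,7.526,928,53.579,224,81,278
Japan,Eastern Asia,noth,5.921,907,38.972,77,202,16
Sierra Leone,Sub-Saharan Africa,north,4.635,413,481.000,25,3,2"

hac_sample <- read.csv(text = sample_csv)

str(hac_sample)             # column names and types at a glance
colSums(is.na(hac_sample))  # number of missing values per column
```

On the full data, the same two calls applied to `hac` would give a summary that fits on one screen.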
Next, we renamed several variables and selected only those relevant to our analysis.
hac_clean <- hac %>%
  rename(Happiness_Score = HappinessScore,
         GDP = GDP_PerCapita,
         Beer = Beer_PerCapita,
         Spirit = Spirit_PerCapita,
         Wine = Wine_PerCapita) %>%
  select(Country, Region, Happiness_Score, GDP, Beer, Spirit, Wine)
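Upon examining the data, we noticed something unusual: African countries appeared to have the highest GDP per capita. The cause was inconsistent units in the GDP_PerCapita column. Most values use a period where a thousands separator (comma) was intended, so 53.579 stands for 53,579, while values under 1,000, such as 481.000, are recorded as plain numbers. Because the largest value of the first kind is 100.739, we corrected the column by multiplying every value below 101 by 1,000 and leaving the rest unchanged. In the same step we renamed some countries to match the names used in the map data, grouped the regions into broader regions, and computed total alcohol consumption.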
hac_clean <- hac_clean %>%
  mutate(
    # Values below 101 are in thousands; convert them to the same unit as the rest
    GDP = if_else(GDP < 101, GDP * 1000, GDP),
    # Rename countries so that they match the names used in the map data
    Country = case_when(
      Country == "United States" ~ "United States of America",
      Country == "Russian Federation" ~ "Russia",
      Country == "Dominican Republic" ~ "Dominican Rep.",
      Country == "Iran" ~ "Iran (Islamic Republic of)",
      Country == "Czech Republic" ~ "Czechia",
      Country == "Brunei" ~ "Brunei Darussalam",
      Country == "Macedonia" ~ "North Macedonia",
      Country == "Cote d'Ivoire" ~ "Côte d'Ivoire",
      Country == "Rep. Congo" ~ "Congo",
      Country == "Bosnia and Herzegovina" ~ "Bosnia and Herz.",
      TRUE ~ Country
    ),
    # Group the detailed regions into five broader regions
    BigRegion = case_when(
      Region %in% c("Central and Eastern Europe", "Western Europe") ~ "Europe",
      Region %in% c("Eastern Asia", "Southeastern Asia") ~ "Asia",
      Region %in% c("Middle East and Northern Africa", "Sub-Saharan Africa") ~ "Africa",
      Region %in% c("Latin America and Caribbean", "North America") ~ "America",
      Region == "Australia and New Zealand" ~ "Oceania",
      TRUE ~ Region
    ),
    # Total consumption is the sum of the three beverage types
    Total_Alcohol_Consumption = Beer + Spirit + Wine
  )
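To make the GDP correction concrete, here is a small base R sketch that applies the same rule to four GDP values taken from the printed output above. The names `gdp_raw` and `gdp_fixed` are illustrative, and `ifelse()` is used as the base R counterpart of the `if_else()` call in our code.

```r
# GDP values as printed in the raw data
gdp_raw <- c(Denmark = 53.579, Luxembourg = 100.739,
             `Sierra Leone` = 481, Senegal = 953)

# Values below 101 are in thousands; the others are left unchanged
gdp_fixed <- ifelse(gdp_raw < 101, gdp_raw * 1000, gdp_raw)

gdp_fixed

# Before the fix Senegal has the largest value; after it, Luxembourg does
names(which.max(gdp_raw))
names(which.max(gdp_fixed))
```

The threshold of 101 only works because no country in this dataset has a thousands-style value above 100.739; a different dataset would need a different check.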
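Finally, we reshaped the data into long format, so that each alcohol type appears in its own row instead of in a separate column. This version of the data is used for the plots that compare alcohol types.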
hac_clean_melted <- hac_clean %>%
  pivot_longer(cols = c(Beer, Spirit, Wine),
               names_to = "Alcohol_Type",
               values_to = "Consumption")
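The effect of the reshape can be checked on a tiny example. The sketch below builds the long format by hand in base R for two countries from the data, as a dependency-free stand-in for `pivot_longer()`; the names `wide`, `types`, and `long` are hypothetical. The row order differs from what `pivot_longer()` produces (it keeps each country's rows together), but the content is the same.

```r
# Two countries in wide format: one column per alcohol type
wide <- data.frame(
  Country = c("Denmark", "Japan"),
  Beer    = c(224, 77),
  Spirit  = c(81, 202),
  Wine    = c(278, 16)
)

types <- c("Beer", "Spirit", "Wine")

# Long format: one row per country and alcohol type
long <- data.frame(
  Country      = rep(wide$Country, times = length(types)),
  Alcohol_Type = rep(types, each = nrow(wide)),
  Consumption  = unlist(wide[types], use.names = FALSE)
)

long
nrow(long) == length(types) * nrow(wide)  # three rows per country
```

The same check on the project data would compare `nrow(hac_clean_melted)` with `3 * nrow(hac_clean)`.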
Describe and show how you created the first figure. Why did you choose this figure type?
Our first chart is a world map of alcohol consumption by country (total alcohol per capita). We chose a map because it gives a clear view of consumption levels across countries. Countries with higher alcohol consumption are shown in darker blue, while those with lower consumption are shown in lighter blue. Countries appearing in white have alcohol consumption below 100, and those without alcohol consumption data are shown in gray.
# Prepare the map data
library(sf)             # needed for geom_sf()
library(rnaturalearth)  # provides ne_countries()
world <- ne_countries(scale = "medium", returnclass = "sf")
map_data <- world %>%
  left_join(hac_clean, by = c("name" = "Country"))
# Draw the map
p <- ggplot(data = map_data) +
  geom_sf(aes(fill = Total_Alcohol_Consumption), color = "white", size = 0.2) +
  scale_fill_gradient(
    name = "Total Alcohol Consumption",
    low = "lightblue", high = "darkblue",
    na.value = "gray80"
  ) +
  labs(
    title = "Global Alcohol Consumption",
    subtitle = "Total Alcohol Per Capita by Country"
  ) +
  theme_minimal() +
  theme(
    panel.background = element_rect(fill = "aliceblue"),
    legend.position = "bottom"
  )
ggsave("images/First Chart 1.jpeg", plot = p, width = 15, height = 12)
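Because the map is joined by country name, a country whose spelling differs between the two sources is silently drawn in gray. One way to catch this, sketched below in base R, is to list the dataset names that have no match among the map names. Here `map_names` and `data_names` are hypothetical stand-ins for `world$name` and `hac_clean$Country`, and their contents are chosen only for illustration.

```r
# Hypothetical stand-ins: in the project these would be world$name and
# hac_clean$Country
map_names  <- c("Denmark", "Czechia", "Russia", "Dem. Rep. Congo")
data_names <- c("Denmark", "Czech Republic", "Russia", "Dem. Rep. Congo")

# Dataset names without a counterpart on the map would end up gray
unmatched <- setdiff(data_names, map_names)
unmatched
```

In the project, `setdiff(hac_clean$Country, world$name)` would ideally return an empty vector once the country names have been harmonized.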
This is a set of box plots comparing alcohol consumption by region. Each panel shows the distribution of beer, spirit, and wine consumption among the countries of one region. Each box marks the median and the interquartile range, with whiskers for the remaining spread, and outlier points are hidden. In the chart, Western Europe stands out for its wine consumption and Eastern Asia for its spirit consumption, while the Middle East and Northern Africa show generally low alcohol consumption, which religious norms likely influence. Beer is consumed fairly evenly across most regions, whereas spirits show higher consumption in specific areas, such as Eastern and Southeastern Asia. The chart allows a clear comparison of alcohol consumption patterns across regions and reflects the influence of national characteristics.
p <- ggplot(hac_clean_melted, aes(x = Alcohol_Type, y = Consumption, fill = Alcohol_Type)) +
  geom_boxplot(outlier.shape = NA) +
  scale_fill_brewer(palette = "Set2") +
  labs(
    title = "Region-wise Alcohol Consumption Comparison",
    subtitle = "Comparison by Beer, Spirit, and Wine",
    x = "Alcohol Type",
    y = "Per Capita Consumption",
    caption = "Source: HappinessAlcoholConsumption.csv") +
  facet_wrap(~ Region, ncol = 3) +
  theme_minimal(base_size = 15) +
  theme(
    plot.title = element_text(face = "bold", size = 18, hjust = 0.5, color = "darkblue"),
    plot.subtitle = element_text(hjust = 0.5, size = 14, face = "italic"),
    axis.text.x = element_text(angle = 45, hjust = 1, size = 12),
    legend.position = "top",
    legend.title = element_blank(),
    panel.grid.major = element_line(color = "gray", linetype = "dashed"),
    panel.grid.minor = element_blank())
ggsave("images/First Chart 2.jpeg", plot = p, width = 15, height = 12)
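To show what a single box summarizes, the sketch below computes the median and quartiles for spirit consumption in the four Eastern Asia countries, using the values from the printed data. It uses `quantile()` with its default settings, which we assume match what the box plot draws; `spirit_east_asia` is an illustrative name.

```r
# Spirit consumption for the four Eastern Asia countries in the data
spirit_east_asia <- c(Japan = 202, `South Korea` = 16,
                      China = 192, Mongolia = 189)

# The line inside the box is the median
median(spirit_east_asia)

# The box edges are the first and third quartiles
quantile(spirit_east_asia, probs = c(0.25, 0.75))

# The height of the box is the interquartile range
IQR(spirit_east_asia)
```

With only four countries in this region, one low value (South Korea) stretches the lower part of the box considerably, so boxes for small regions should be read with care.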
This is a horizontal bar chart of alcohol consumption and happiness. Each bar shows one country's happiness score, with countries ordered by score. The fill color shows "Total Alcohol Consumption", the sum of beer, spirit, and wine consumption for each country: darker blue indicates higher consumption and lighter blue indicates lower consumption. As observed, there is a positive correlation between alcohol consumption and happiness scores: countries with higher alcohol consumption tend to have higher happiness scores, and countries with lower consumption tend to have lower scores.
p <- ggplot(hac_clean, aes(x = reorder(Country, Happiness_Score), y = Happiness_Score, fill = Total_Alcohol_Consumption)) +
  geom_bar(stat = "identity", position = "dodge", show.legend = TRUE) +
  labs(
    x = NULL,
    y = "Happiness Score",
    title = "Alcohol Consumption and the Happiness Score") +
  coord_flip() +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 30),
        axis.text.y = element_text(size = 20),
        strip.text = element_text(size = 30),
        plot.title = element_text(size = 50, face = "bold"),
        panel.spacing = unit(1, "lines"),
        axis.title = element_text(size = 40),
        plot.margin = margin(30, 30, 30, 30),
        legend.position = "right",
        legend.key.size = unit(2, "cm"),
        legend.text = element_text(size = 25),
        legend.title = element_text(size = 30)) +
  scale_fill_gradient(low = "lightblue", high = "darkblue")
ggsave("images/Second Chart 1.jpeg", plot = p, width = 35, height = 30)
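The positive correlation read from the chart can also be expressed as a number. The sketch below computes the Pearson correlation with base R's `cor()` for five countries taken from the printed data. The subset and the name `subset_hac` are for illustration only, so the result is not the correlation for the full dataset.

```r
# Five countries taken from the printed data, for illustration only
subset_hac <- data.frame(
  Country         = c("Denmark", "Germany", "Japan", "Egypt", "Syria"),
  Happiness_Score = c(7.526, 6.994, 5.921, 4.362, 3.069),
  Beer            = c(224, 346, 77, 6, 5),
  Spirit          = c(81, 117, 202, 4, 35),
  Wine            = c(278, 175, 16, 1, 16)
)

# Same definition of total consumption as in the cleaning step
subset_hac$Total_Alcohol_Consumption <- with(subset_hac, Beer + Spirit + Wine)

# Pearson correlation between total consumption and happiness
r <- cor(subset_hac$Total_Alcohol_Consumption, subset_hac$Happiness_Score)
r
```

On the full data the equivalent call would be `cor(hac_clean$Total_Alcohol_Consumption, hac_clean$Happiness_Score)`. A positive value describes an association between the two variables; on its own it does not show that drinking more leads to higher happiness.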