Animal Crossing is a Nintendo video game series created by Katsuya Eguchi and Hisashi Nogami (Knezevic, 2020) in which players create a human avatar who moves into a village and builds community with adorable little animal villagers.
It was first released in 2001, and since then there have been several iterations of the game, including:
Animal Crossing: Wild World (2005)
Animal Crossing: City Folk (2008)
Animal Crossing: New Leaf (2013)
Animal Crossing: New Horizons (2020)
I played Animal Crossing: City Folk on the Wii when I was a kid and enjoy playing New Horizons on my Switch in my free time.
I encountered the Animal Crossing New Horizons Nookplaza dataset during the Google Data Analytics Certificate when I was first starting out on my data analysis journey a few years ago. I remember thinking it was so cool that people had made these spreadsheets and it made me excited to learn more about all things data.
The dataset includes 30 CSV files covering various aspects of the latest game, New Horizons, from villager attributes to the furniture and accessories available in the game.
The data was originally hosted on Google Sheets, but the separate CSV files are also available to download as a zip file on Kaggle:
Animal Crossing New Horizons Dataset
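If you want to look at more than one of the CSV files, a small helper can read a whole folder at once. This is only a sketch: `read_csv_folder` is a name I made up for illustration, and the "archive" folder in the usage comments is a placeholder for wherever you unzipped the Kaggle download.

```r
# Read every CSV in a folder into a named list of data frames.
# `read_csv_folder` is a hypothetical helper, not part of the dataset or any package.
read_csv_folder <- function(data_dir) {
  # Full paths of all files ending in .csv
  csv_paths <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)

  # One data frame per file
  tables <- lapply(csv_paths, read.csv)

  # Name each data frame after its file, without the .csv extension
  names(tables) <- tools::file_path_sans_ext(basename(csv_paths))
  tables
}

# Example usage (the folder name is an assumption, adjust it to your machine):
# nookplaza <- read_csv_folder("archive")
# names(nookplaza)           # one entry per CSV file
# head(nookplaza$villagers)  # the villagers table
```

Keeping the tables in a named list means you can pull out one file by name without retyping paths.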
The premise of Animal Crossing New Horizons is that players move to a deserted island inhabited by a charming entrepreneurial raccoon family after purchasing a destination home package. The player is then selected to lead the development of the island’s residential community as villagers visit the island to explore and eventually move in.
The villagers file is a great place to start because it serves as an introduction to the characters in Animal Crossing New Horizons.
villagers <- read.csv("C:/Users/crjur/Downloads/archive/villagers.csv")
head(villagers)
## Name Species Gender Personality Hobby Birthday Catchphrase
## 1 Admiral Bird Male Cranky Nature 27-Jan aye aye
## 2 Agent S Squirrel Female Peppy Fitness 2-Jul sidekick
## 3 Agnes Pig Female Big Sister Play 21-Apr snuffle
## 4 Al Gorilla Male Lazy Fitness 18-Oct ayyyeee
## 5 Alfonso Alligator Male Lazy Play 9-Jun it'sa me
## 6 Alice Koala Female Normal Education 19-Aug guvnor
## Favorite.Song Style.1 Style.2 Color.1 Color.2 Wallpaper
## 1 Steep Hill Cool Cool Black Blue dirt-clod wall
## 2 Go K.K. Rider Active Simple Blue Black concrete wall
## 3 K.K. House Simple Elegant Pink White gray molded-panel wall
## 4 Go K.K. Rider Active Active Red White concrete wall
## 5 Forest Life Simple Simple Red Blue yellow playroom wall
## 6 Aloha K.K. Cute Cute Red Pink white botanical-tile wall
## Flooring
## 1 tatami
## 2 colorful tile flooring
## 3 arabesque flooring
## 4 green rubber flooring
## 5 green honeycomb tile
## 6 light parquet flooring
## Furniture.List Filename
## 1 717;1849;7047;2736;787;5970;3449;3622;3802;4106;3438;4029 brd06
## 2 7845;7150;3468;4080;290;3971;3449;1708;4756;2560;4753;7323;7789 squ05
## 3 4129;7236;7235;7802;896;3428;4027;7325;3958;7136;3951;3773;7801 pig17
## 4 1452;4078;4013;833;4116;3697;7845;3307;3946;3960;7800 gor08
## 5 4763;3205;3701;1557;3623;85;3208;3584;4761;1217;3702 crd00
## 6 4052;4054;4129;4053;4049;3615;4051;7353;4130;3428;4046;4338 kal01
## Unique.Entry.ID
## 1 B3RyfNEqwGmcccRC3
## 2 SGMdki6dzpDZyXAw5
## 3 jzWCiDPm9MqtCfecP
## 4 LBifxETQJGEaLhBjC
## 5 REpd8KxB8p9aGBRSE
## 6 wkPJDHMMMTK24eqzC
There are 391 different villagers that might visit your island in Animal Crossing New Horizons. They appear via random selection, meaning that players do not have control over which villagers visit their island at any given time. However, players do have control over who builds their new home on the island and can even choose the location of the villager’s home on the island map.
There are multiple personality types that influence how the player can interact with villagers in the game. I get distracted by the cutesy animals and fun colors, but I’ve always wondered if there is potential gender bias within these personality types.
library(dplyr)  # provides %>%, group_by(), summarize(), mutate(), and select()

villagers %>%
  group_by(Gender, Personality) %>%
  summarize(n = n())
## `summarise()` has grouped output by 'Gender'. You can override using the
## `.groups` argument.
## # A tibble: 8 × 3
## # Groups: Gender [2]
## Gender Personality n
## <chr> <chr> <int>
## 1 Female Big Sister 24
## 2 Female Normal 59
## 3 Female Peppy 49
## 4 Female Snooty 55
## 5 Male Cranky 55
## 6 Male Jock 55
## 7 Male Lazy 60
## 8 Male Smug 34
Looking at the table, each personality type is assigned to only one gender: no type appears for both male and female villagers. Female villagers have personalities that are stereotypically associated with women in media, such as “snooty” or “big sister”. Similarly, male villagers are assigned personality types that are associated with men in popular culture, such as “jock”, “lazy”, or “smug”.
Interestingly, male villagers seem to be assigned negative personality types more often. We can test this observation using a chi-square test of independence, which compares the counts observed in a contingency table (here, gender by personality rating) with the counts we would expect if the two variables were unrelated.
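One way to double-check the split is to spread the counts into a personality-by-gender cross-tab and confirm that every row has exactly one non-zero cell. The sketch below retypes the counts from the output above so it runs without the CSV; `counts`, `crosstab`, and `genders_per_type` are names I chose for illustration.

```r
# Counts copied from the grouped summary above.
counts <- data.frame(
  Gender = c("Female", "Female", "Female", "Female",
             "Male", "Male", "Male", "Male"),
  Personality = c("Big Sister", "Normal", "Peppy", "Snooty",
                  "Cranky", "Jock", "Lazy", "Smug"),
  n = c(24, 59, 49, 55, 55, 55, 60, 34)
)

# Cross-tabulate: one row per personality type, one column per gender.
crosstab <- xtabs(n ~ Personality + Gender, data = counts)
print(crosstab)

# For each personality type, count how many genders have at least one villager.
genders_per_type <- rowSums(crosstab > 0)

# TRUE when every personality type belongs to exactly one gender.
print(all(genders_per_type == 1))

# Share of each personality type within each gender (columns sum to 1).
print(round(prop.table(crosstab, margin = 2), 2))
```

With the full data loaded, `table(villagers$Personality, villagers$Gender)` should give the same cross-tab directly.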
First, let’s define which personality types are positive, negative, and neutral for the purposes of our analysis.
villagers <- villagers %>%
mutate(personality.rating = case_when(Personality == "Peppy" |
Personality == "Jock"
~ "positive",
Personality == "Snooty" |
Personality == "Cranky" |
Personality == "Lazy" |
Personality == "Smug"
~ "negative",
Personality == "Normal" |
Personality == "Big Sister"
~ "neutral"))
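The same mapping could also be written as a named lookup vector, which keeps all eight assignments in one place and makes it easy to spot a personality type that was left out. This is just an alternative sketch using base R, not a replacement for the code above; `rating_lookup` and `example_personalities` are made-up names, and the short example vector stands in for `villagers$Personality`.

```r
# Named lookup vector: personality type -> rating (same assignments as above).
rating_lookup <- c(
  "Peppy"      = "positive",
  "Jock"       = "positive",
  "Snooty"     = "negative",
  "Cranky"     = "negative",
  "Lazy"       = "negative",
  "Smug"       = "negative",
  "Normal"     = "neutral",
  "Big Sister" = "neutral"
)

# A few example personalities standing in for villagers$Personality.
example_personalities <- c("Cranky", "Peppy", "Big Sister", "Lazy")

# Indexing the lookup by name translates each personality into its rating.
example_ratings <- unname(rating_lookup[example_personalities])
print(example_ratings)

# A personality missing from the lookup would come back as NA,
# so this check should print FALSE when the mapping is complete.
print(anyNA(example_ratings))
```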
Then, we can follow the steps to conduct a chi-square test that are beautifully outlined by Arunn Thevapalan in the following resource: Chi-Square Test in R: A Complete Guide
# Step 1: Prepare data in a contingency table format.
selected_data <- villagers %>%
  select(Gender, personality.rating)
# Use the full column name instead of relying on partial matching with `$`.
contingency_table <- table(selected_data$Gender, selected_data$personality.rating)
print(contingency_table)
##
## negative neutral positive
## Female 55 83 49
## Male 149 0 55
# Step 2: Apply chi-square test
chi_square_test <- chisq.test(contingency_table)
print(chi_square_test)
##
## Pearson's Chi-squared test
##
## data: contingency_table
## X-squared = 126.16, df = 2, p-value < 2.2e-16
To interpret these results, we establish two hypotheses.
Null Hypothesis (H0): There is no association between villager gender and personality rating.
Alternative Hypothesis (HA): There is a significant association between villager gender and personality rating.
The chi-square statistic is 126.16. It summarizes the discrepancy between the observed count in each cell of the contingency table and the count we would expect if gender and personality rating were independent.
The degrees of freedom (df) is 2, calculated as (rows - 1) × (columns - 1) = (2 - 1) × (3 - 1) for our 2 × 3 table.
The p-value is < 2.2e-16, which is far below the conventional significance level of 0.05, so we reject the null hypothesis.
This indicates a statistically significant association between villager gender and personality rating (positive, negative, neutral) in Animal Crossing New Horizons.
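To see where these numbers come from, the test can be reproduced by hand from the contingency table. The sketch below retypes the observed counts from the output above so it runs on its own; the variable names are mine, and it is meant as a check on `chisq.test()` rather than a replacement for it.

```r
# Observed counts, copied from the contingency table above.
observed <- matrix(
  c(55, 83, 49,
    149, 0, 55),
  nrow = 2, byrow = TRUE,
  dimnames = list(Gender = c("Female", "Male"),
                  Rating = c("negative", "neutral", "positive"))
)

# Expected count per cell if gender and rating were independent:
# row total * column total / grand total.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
print(round(expected, 1))

# Pearson chi-square statistic: sum over cells of (observed - expected)^2 / expected.
statistic <- sum((observed - expected)^2 / expected)
print(round(statistic, 2))

# Degrees of freedom for an r x c table: (r - 1) * (c - 1).
df <- (nrow(observed) - 1) * (ncol(observed) - 1)
print(df)

# p-value: upper-tail probability of the chi-square distribution.
p_value <- pchisq(statistic, df = df, lower.tail = FALSE)
print(p_value < 0.05)
```

The expected counts also show what drives the result: the "neutral" column holds 83 female and 0 male villagers, far from what independence would predict, while the "positive" column is very close to its expected values.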
I think this silly dataset would be really fun to use when teaching yourself to use data management/exploration packages like dplyr or other tools in R.
I think this dataset is also great for refreshing your memory on non-parametric hypothesis testing on categorical data, like chi-square testing.
Li, Jessica. (2021). Animal Crossing New Horizons Catalog [Data set]. https://www.kaggle.com/datasets/jessicali9530/animal-crossing-new-horizons-nookplaza-dataset/data
Knezevic, Kevin. (2020, April 6). “How Animal Crossing Was Born From One Of Nintendo’s Biggest Flops”. GameSpot. Archived from the original on August 17, 2024. Retrieved April 28, 2025. https://www.gamespot.com/articles/how-animal-crossing-was-born-from-one-of-nintendos-biggest-flops/1100-6475342/
Thevapalan, Arunn. (2024, August 29). “Chi-Square Test in R: A Complete Guide”. DataCamp. Retrieved April 30, 2025. https://www.datacamp.com/tutorial/chi-square-test-r
Wikipedia contributors. (2025, April 7). Animal Crossing. In Wikipedia, The Free Encyclopedia. Retrieved April 29, 2025. https://en.wikipedia.org/w/index.php?title=Animal_Crossing&oldid=1284419559