Introduction

My goal in this analysis is to explore the growth in the number of yuri manga titles (series or one-shots) over time.

Readers not familiar with the R statistical software and the additional Tidyverse software I use to manipulate and plot data can check out the various ways to learn more about the Tidyverse.

Setup

I load the following R libraries, for the purposes listed:

library("tidyverse")  # read, manipulate, and plot the data
library("tools")      # compute MD5 checksums with md5sum()

Preparing the data

Obtaining the yuri manga data

I use a local copy of a data file containing the number of yuri manga titles released in each year through 2022. See the “References” section below for more information.

I check the MD5 hash value of the file, and stop if the contents are not what is expected.

stopifnot(md5sum("yuri-manga-per-year.csv") == "bcf4da8233ec0bc93e3e5e0e8bfdf443")
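
As a minimal sketch of why this check is useful, the following writes a small stand-in file (not the real data file) and shows that its MD5 hash changes when the contents change. The file contents and the variable names are assumptions made purely for illustration.

```r
library(tools)

# Write a small stand-in CSV to a temporary file (not the real data file).
tmp <- tempfile(fileext = ".csv")
writeLines(c("Year,Num_Manga", "1990,3"), tmp)

# md5sum() returns a named character vector; unname() drops the file name.
hash_before <- unname(md5sum(tmp))

# Changing even one value in the file yields a different hash.
writeLines(c("Year,Num_Manga", "1990,4"), tmp)
hash_after <- unname(md5sum(tmp))

hash_changed <- hash_before != hash_after

# Clean up the temporary file.
unlink(tmp)
```

Comparing against a hash recorded in advance, as the check above does, therefore catches any accidental edit to the local copy of the data.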

Loading the data

I load the data from the CSV file (filtering out any rows with invalid data) and sort the data in ascending order by year.

mpy_tb <- read_csv("yuri-manga-per-year.csv", col_types = "ii") %>%
  filter(!is.na(Year) & !is.na(Num_Manga)) %>%
  arrange(Year)
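
To illustrate what the filtering and sorting steps do, here is a rough base R equivalent applied to a few made-up rows. It uses read.csv() rather than read_csv() only so that the sketch runs without the Tidyverse; the inline data and the names raw and clean are hypothetical.

```r
# Made-up rows: one is missing a count, one is missing a year,
# and the rows are deliberately out of order.
csv_text <- "Year,Num_Manga
1992,5
1990,
1991,2
,7"

# Read both columns as integers; blank fields become NA.
raw <- read.csv(text = csv_text, colClasses = "integer")

# Keep only rows in which both values are present.
clean <- raw[!is.na(raw$Year) & !is.na(raw$Num_Manga), ]

# Sort the remaining rows in ascending order by year.
clean <- clean[order(clean$Year), ]
```

Only the rows for 1991 and 1992 survive, in that order, since the other two rows each lack one of the two values.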

Analysis

Preliminary analysis

I do some basic exploratory data analysis, starting with the total number of yuri manga in the dataset, the earliest year covered, and the maximum number of manga released in any year.

total_manga <- sum(mpy_tb$Num_Manga)

earliest_year <- min(mpy_tb$Year)

max_in_year <- max(mpy_tb$Num_Manga)

year_of_max <- mpy_tb$Year[mpy_tb$Num_Manga == max_in_year]

There are a total of 1,704 yuri manga in the dataset. The earliest year for which there is data is 1957. The most yuri manga released in any year was 192, in 2018.

Number of manga released per year

Now that I have my dataset of interest, I continue my analysis by plotting the number of yuri manga released in each year.

mpy_tb %>%
  ggplot(mapping = aes(x = Year, y = Num_Manga)) +
  geom_point() +
  geom_line() +
  scale_x_continuous(breaks = c(1960, 1970, 1980, 1990, 2000, 2010, 2020)) +
  scale_y_continuous(labels = scales::label_comma()) +
  xlab("Year") +
  ylab("New Yuri Manga") +
  labs(
    title = "New Yuri Manga Titles by Year (All Time)",
    subtitle = "Number of Series or One-Shots First Released in Year",
    caption = "Data source: Manga tagged GL on Anime Planet site"
  ) +
  theme_gray() +
  theme(axis.title.x = element_text(margin = margin(t = 5))) +
  theme(axis.title.y = element_text(margin = margin(r = 10))) +
  theme(plot.caption = element_text(margin = margin(t = 15), hjust = 0))

Since very few yuri manga were published until the 1990s, I re-do the plot to focus on the time period from 1990 on.

mpy_tb %>%
  filter(Year >= 1990) %>%
  ggplot(mapping = aes(x = Year, y = Num_Manga)) +
  geom_point() +
  geom_line() +
  scale_x_continuous(breaks = c(1990, 1995, 2000, 2005, 2010, 2015, 2020)) +
  scale_y_continuous(labels = scales::label_comma()) +
  xlab("Year") +
  ylab("New Yuri Manga") +
  labs(
    title = "New Yuri Manga Titles by Year (Since 1990)",
    subtitle = "Number of Series or One-Shots First Released in Year",
    caption = "Data source: Manga tagged GL on Anime Planet site"
  ) +
  theme_gray() +
  theme(axis.title.x = element_text(margin = margin(t = 5))) +
  theme(axis.title.y = element_text(margin = margin(r = 10))) +
  theme(plot.caption = element_text(margin = margin(t = 15), hjust = 0))

Cumulative number of yuri manga released by year

Another interesting statistic to look at is the cumulative number of yuri manga released by year.

mpy_tb <- mpy_tb %>%
  mutate(Cumul_Manga = cumsum(Num_Manga))

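# cumul_NN is the last year through which at most (100 - NN)% of all
# titles had been released, so that about NN% were released after it.
# Each is the first year whose running total exceeds the threshold, minus 1.
cumul_90 <- min(mpy_tb$Year[mpy_tb$Cumul_Manga > 0.1 * total_manga]) - 1
cumul_75 <- min(mpy_tb$Year[mpy_tb$Cumul_Manga > 0.25 * total_manga]) - 1
cumul_50 <- min(mpy_tb$Year[mpy_tb$Cumul_Manga > 0.5 * total_manga]) - 1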
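
The threshold calculation can be checked by hand on a small example. The sketch below uses hypothetical toy data (ten consecutive years, 100 titles in total) and base R only; the names toy, toy_total, toy_cumul_90, and share_after are illustrative and do not appear in the analysis itself.

```r
# Hypothetical toy data: ten consecutive years, 100 titles in total.
toy <- data.frame(
  Year = 2001:2010,
  Num_Manga = c(2, 3, 5, 5, 10, 10, 15, 15, 15, 20)
)

# Running total of titles released in or before each year.
toy$Cumul_Manga <- cumsum(toy$Num_Manga)
toy_total <- sum(toy$Num_Manga)

# First year in which the running total exceeds 10% of all titles,
# minus one: at most 10% were released through that earlier year.
toy_cumul_90 <- min(toy$Year[toy$Cumul_Manga > 0.1 * toy_total]) - 1

# Share of all titles released after that year.
share_after <- sum(toy$Num_Manga[toy$Year > toy_cumul_90]) / toy_total
```

In this toy data the running total reaches exactly 10 titles in 2003 and first exceeds 10 in 2004, so toy_cumul_90 is 2003 and 90 of the 100 titles fall after it. Subtracting one relies on the years being consecutive, which is an assumption of this sketch.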

I plot this statistic, again looking at the period from 1990 on.

mpy_tb %>%
  filter(Year >= 1990) %>%
  ggplot(mapping = aes(x = Year, y = Cumul_Manga)) +
  geom_line() +
  geom_point() +
  scale_x_continuous(breaks = c(1990, 1995, 2000, 2005, 2010, 2015, 2020)) +
  scale_y_continuous(labels = scales::label_comma()) +
  xlab("Year") +
  ylab("Cumulative Yuri Manga") +
  labs(
    title = "Cumulative Yuri Manga Titles by Year",
    subtitle = "Number of Series or One-Shots First Released in or before Year",
    caption = "Data source: Manga tagged GL on Anime Planet site"
  ) +
  theme_gray() +
  theme(axis.title.x = element_text(margin = margin(t = 5))) +
  theme(axis.title.y = element_text(margin = margin(r = 10))) +
  theme(plot.caption = element_text(margin = margin(t = 15), hjust = 0))

More than 90% of yuri manga titles were published after 2005, more than three-quarters after 2009, and more than half after 2015.

Appendix

Caveats

The accuracy of the dataset, and thus of the analysis, depends on the following factors:

  • whether the source of the data has entries for all manga published during the timeframe in question (post-World War 2);
  • whether the search properly excluded all works that are not manga in the strict definition of the term (i.e., comics created by Japanese artists and writers and published in Japan in the traditional manga format), and did not exclude any actual manga by mistake; and
  • whether the search captured all manga with significant yuri content, and excluded manga lacking yuri content or with only incidental yuri content.

In particular, the source of the data uses a scheme devised by Western fans in which the tag “yuri” is used to mark works more sexual in nature and the tag “shoujo-ai” is used to mark works focused more on emotional relationships. The tag “GL” should include both of these categories, but I have not confirmed this.

References

Data on the number of yuri manga titles released per year were obtained from the Anime Planet web site using the following procedure:

  • Go to the page https://www.anime-planet.com/manga/all
  • Click on the “Tags” tab.
  • Click “+” on the “GL” tag.
  • Click “-” on the “Light Novels”, “Manhua”, “Manhwa”, “OEL”, “Web Novels”, and “Webtoons” tags.
  • Click on the “APPLY FILTERS” button. (At the time of writing, this generated 49 pages of results.)
  • Click on the “YEAR” column heading to sort the listing by year.
  • Count the number of manga released in each year.
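
The final counting step could be done in R once the release years have been transcribed from the listing. The following is only a sketch under that assumption: the release_years vector is made-up data, and the procedure above does not say how the counts were actually tallied.

```r
# Hypothetical release years, as they might be read off the sorted listing.
release_years <- c(2018, 2016, 2018, 2017, 2018, 2016)

# table() counts occurrences of each distinct year, in ascending order.
counts <- table(release_years)

# Reshape into the two columns used by the data file.
per_year <- data.frame(
  Year = as.integer(names(counts)),
  Num_Manga = as.integer(counts)
)
```

The resulting per_year data frame has one row per year, in the same Year and Num_Manga layout that the analysis loads from the CSV file.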

Environment

I used the following R environment in doing the analysis above:

sessionInfo()
## R version 4.2.1 (2022-06-23)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur ... 10.16
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] tools     stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
## [1] forcats_0.5.2   stringr_1.5.0   dplyr_1.0.10    purrr_1.0.1    
## [5] readr_2.1.3     tidyr_1.2.1     tibble_3.1.8    ggplot2_3.4.0  
## [9] tidyverse_1.3.2
## 
## loaded via a namespace (and not attached):
##  [1] lubridate_1.9.0     assertthat_0.2.1    digest_0.6.31      
##  [4] utf8_1.2.2          R6_2.5.1            cellranger_1.1.0   
##  [7] backports_1.4.1     reprex_2.0.2        evaluate_0.19      
## [10] highr_0.10          httr_1.4.4          pillar_1.8.1       
## [13] rlang_1.0.6         googlesheets4_1.0.1 readxl_1.4.1       
## [16] rstudioapi_0.14     jquerylib_0.1.4     rmarkdown_2.19     
## [19] labeling_0.4.2      googledrive_2.0.0   bit_4.0.5          
## [22] munsell_0.5.0       broom_1.0.2         compiler_4.2.1     
## [25] modelr_0.1.10       xfun_0.36           pkgconfig_2.0.3    
## [28] htmltools_0.5.4     tidyselect_1.2.0    fansi_1.0.3        
## [31] crayon_1.5.2        tzdb_0.3.0          dbplyr_2.2.1       
## [34] withr_2.5.0         grid_4.2.1          jsonlite_1.8.4     
## [37] gtable_0.3.1        lifecycle_1.0.3     DBI_1.1.3          
## [40] magrittr_2.0.3      scales_1.2.1        cli_3.6.0          
## [43] stringi_1.7.12      vroom_1.6.0         cachem_1.0.6       
## [46] farver_2.1.1        fs_1.5.2            xml2_1.3.3         
## [49] bslib_0.4.2         ellipsis_0.3.2      generics_0.1.3     
## [52] vctrs_0.5.1         bit64_4.0.5         glue_1.6.2         
## [55] hms_1.1.2           parallel_4.2.1      fastmap_1.1.0      
## [58] yaml_2.3.6          timechange_0.2.0    colorspace_2.0-3   
## [61] gargle_1.2.1        rvest_1.0.3         knitr_1.41         
## [64] haven_2.5.1         sass_0.4.4

Source code

The source code for this analysis can be found in the public code repository https://gitlab.com/frankhecker/misc-analysis in the manga subdirectory.

This document and its source code are available for unrestricted use, distribution and modification under the terms of the Creative Commons CC0 1.0 Universal (CC0 1.0) Public Domain Dedication. Stated more simply, you’re free to do whatever you’d like with it.