The Broadening Picture of Happily Ever After

Author

Annet Isa

Introduction

Genre book publishing in the United States can respond to changes in customers’ tastes and preferences much faster than movies or TV shows. Genre books - romance, science fiction, fantasy, horror, and mystery - far outsell classic literary fiction. In 2007, classic literary fiction had a $466 million share of the book market; romance fiction had a $1.375 billion share, roughly three times as much (1). Romance readers will gladly give publishers their money as long as publishers give them the happily-ever-afters they desire. As such, examining the covers of romance novels over the last 13 years can serve as a window into the type of stories publishers believe readers want to read.

This report examines the “What Does A Happily Ever After Look Like” data set from Alice Liang. Liang’s data set is featured in the 2023.11.29 edition of Data Is Plural (2). Liang’s data set examines over 1,400 romance books and quantifies the covers based on the type of art (photorealistic/illustration), raunchiness (partial nudity of male and/or female models), and vibrancy (whether a person of color (POC) is featured on the cover).

The romance genre is incredibly broad. According to the Romance Writers of America, all a romance novel needs is “a central love story” and “an emotionally satisfying and optimistic ending” (3). For readers, there is joy in seeing the path a writer takes to get there.

I believe that as time passes, the covers of romance novels from traditional publishers will become less photorealistic and more diverse to reflect the diversity of consumers.

(1) https://en.wikipedia.org/wiki/Genre_fiction
(2) https://www.data-is-plural.com/archive/2023-11-29-edition/
(3) https://www.rwa.org/Online/Advocacy/About_Romance_Fiction/Online/Romance_Genre/About_Romance_Genre.aspx

Setup

#load relevant library
library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

#load csv file
happily <- read_csv("happily_covers.csv")

New names:
Rows: 1435 Columns: 21
── Column specification
──────────────────────────────────────────────────────── Delimiter: "," chr
(13): season, title, author, publisher, date, description, Github permal... dbl
(3): year, season_order, ISBN lgl (5): Github thumbnail, Cover, Man partially
unclothed, Woman partially ...
ℹ Use `spec()` to retrieve the full column specification for this data. ℹ
Specify the column types or set `show_col_types = FALSE` to quiet this message.
• `` -> `...21`

# Here, we explore the data. A snapshot of the first six rows....
head(happily)

# A tibble: 6 × 21
   year season season_order title     author publisher date  description    ISBN
  <dbl> <chr>         <dbl> <chr>     <chr>  <chr>     <chr> <chr>         <dbl>
1  2011 Spring            1 Too Rich… Mona … Waterbro… 5/1/… <NA>        9.78e12
2  2011 Spring            1 Breaking… Suzan… Ballanti… 4/1/… <NA>        9.78e12
3  2011 Spring            1 The Secr… Mary … Delacorte 7/3/… <NA>        9.78e12
4  2011 Spring            1 Born of … Sherr… Grand Ce… 5/1/… <NA>        9.78e12
5  2011 Spring            1 All Up i… Lutis… Kensingt… 3/1/… <NA>        9.78e12
6  2011 Spring            1 His, Une… Susan… Kensingt… 2/1/… <NA>        9.78e12
# ℹ 12 more variables: `Github permalink` <chr>, google_thumbnail <chr>,
#   google_smallThumbnail <chr>, `Cover URL override` <chr>, cover_url <chr>,
#   `Github thumbnail` <lgl>, Cover <lgl>, Style <chr>,
#   `Man partially unclothed` <lgl>, `Woman partially unclothed` <lgl>,
#   `Has POC` <lgl>, ...21 <chr>

# ... and the number of authors and publishers in the dataset
num_unique_authors <- length(unique(happily$author))
print(num_unique_authors)

[1] 936

num_unique_publishers <- length(unique(happily$publisher))
print(num_unique_publishers)

[1] 209

sum(happily$year == 2022)

[1] 119

# We now tidy up the data (standardizing column names & spacing)...

names(happily) <- tolower(names(happily))
names(happily) <- gsub(" ", "_", names(happily))
head(happily)

# A tibble: 6 × 21
   year season season_order title     author publisher date  description    isbn
  <dbl> <chr>         <dbl> <chr>     <chr>  <chr>     <chr> <chr>         <dbl>
1  2011 Spring            1 Too Rich… Mona … Waterbro… 5/1/… <NA>        9.78e12
2  2011 Spring            1 Breaking… Suzan… Ballanti… 4/1/… <NA>        9.78e12
3  2011 Spring            1 The Secr… Mary … Delacorte 7/3/… <NA>        9.78e12
4  2011 Spring            1 Born of … Sherr… Grand Ce… 5/1/… <NA>        9.78e12
5  2011 Spring            1 All Up i… Lutis… Kensingt… 3/1/… <NA>        9.78e12
6  2011 Spring            1 His, Une… Susan… Kensingt… 2/1/… <NA>        9.78e12
# ℹ 12 more variables: github_permalink <chr>, google_thumbnail <chr>,
#   google_smallthumbnail <chr>, cover_url_override <chr>, cover_url <chr>,
#   github_thumbnail <lgl>, cover <lgl>, style <chr>,
#   man_partially_unclothed <lgl>, woman_partially_unclothed <lgl>,
#   has_poc <lgl>, ...21 <chr>

# ...refining the columns to the ones that will be used for data analysis...

happily_ever <- select(happily, year, season_order, title, author, publisher, style, man_partially_unclothed, woman_partially_unclothed, has_poc)
head(happily_ever)

# A tibble: 6 × 9
   year season_order title         author publisher style man_partially_unclot…¹
  <dbl>        <dbl> <chr>         <chr>  <chr>     <chr> <lgl>                 
1  2011            1 Too Rich for… Mona … Waterbro… Phot… FALSE                 
2  2011            1 Breaking the… Suzan… Ballanti… Phot… FALSE                 
3  2011            1 The Secret M… Mary … Delacorte Phot… FALSE                 
4  2011            1 Born of Shad… Sherr… Grand Ce… Phot… FALSE                 
5  2011            1 All Up in My… Lutis… Kensingt… Phot… FALSE                 
6  2011            1 His, Unexpec… Susan… Kensingt… Phot… TRUE                  
# ℹ abbreviated name: ¹man_partially_unclothed
# ℹ 2 more variables: woman_partially_unclothed <lgl>, has_poc <lgl>

# ...and confirming there are no NA's
sum(is.na(happily_ever))

[1] 0
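A single total of zero confirms that nothing is missing, but if the total were ever non-zero it would not say where the gaps are. A minimal sketch of a per-column check follows, using a small made-up data frame (`toy_covers` and its values are hypothetical stand-ins, not rows from Liang's data set):

```r
# Toy stand-in for happily_ever (hypothetical values, not from the data set)
toy_covers <- data.frame(
  year    = c(2011, 2011, 2012),
  style   = c("Photorealistic", NA, "Illustrated"),
  has_poc = c(FALSE, TRUE, NA)
)

# is.na() on a data frame returns a logical matrix;
# colSums() then counts the missing values in each column
na_by_column <- colSums(is.na(toy_covers))
print(na_by_column)
```

The same call on the real tibble, `colSums(is.na(happily_ever))`, should return a zero for every column given the result above.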

Data Preparation for Data Analysis

## Categorical values are converted to per-year counts
happily_ever_after <- happily_ever %>%
  group_by(year) %>%
  summarise(
    num_photo_style = sum(style == "Photorealistic"),
    num_ill_style = sum(style == "Illustrated"),
    num_m_partial = sum(man_partially_unclothed),
    num_w_partial = sum(woman_partially_unclothed),
    num_poc = sum(has_poc)
  )
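The number of covers in the data set differs from year to year (for example, 119 books for 2022), so raw counts partly reflect how many books were sampled in a given year. One possible complement, sketched here under the assumption that per-year shares are of interest, is to summarise proportions alongside counts. The data frame `toy_covers` and its rows are hypothetical and only illustrate the technique:

```r
library(dplyr)

# Toy stand-in for happily_ever (hypothetical rows, not from the data set)
toy_covers <- data.frame(
  year    = c(2011, 2011, 2011, 2011, 2012, 2012),
  style   = c("Photorealistic", "Photorealistic", "Photorealistic",
              "Illustrated", "Photorealistic", "Illustrated"),
  has_poc = c(FALSE, FALSE, FALSE, TRUE, TRUE, FALSE)
)

# The mean of a logical vector is the share of TRUE values,
# so mean() gives a per-year proportion where sum() gives a count
toy_shares <- toy_covers %>%
  group_by(year) %>%
  summarise(
    n_covers  = n(),
    share_ill = mean(style == "Illustrated"),
    share_poc = mean(has_poc)
  )
print(toy_shares)
```

Shares make years with different sample sizes directly comparable, at the cost of hiding how many covers each share is based on, which is why `n_covers` is kept in the summary.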

A Data Analysis Interlude - Linear Regression

cor(happily_ever_after$num_poc, happily_ever_after$num_m_partial)

[1] -0.5756839

happy_5 <- lm(num_m_partial ~ num_poc, data = happily_ever_after)
summary(happy_5)


Call:
lm(formula = num_m_partial ~ num_poc, data = happily_ever_after)

Residuals:
    Min      1Q  Median      3Q     Max 
-10.064  -6.799  -1.932   4.419  19.936 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  22.5343     3.5291   6.385 5.19e-05 ***
num_poc      -0.4338     0.1858  -2.335   0.0395 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 8.601 on 11 degrees of freedom
Multiple R-squared:  0.3314,    Adjusted R-squared:  0.2706 
F-statistic: 5.453 on 1 and 11 DF,  p-value: 0.03951

The equation for my model is

num_m_partial = -0.4338(num_poc) + 22.5343

In words, the intercept suggests that in a year with no POC covers, about 22 romance novels would feature a partially unclothed man on the cover. The slope suggests that each additional cover featuring a person of color is associated with about 0.43 fewer covers featuring a partially unclothed man. Put differently, publishers are adding covers featuring people of color (both male and female) faster than they are clothing the male models.

The correlation between covers featuring people of color and covers featuring partially unclothed men is moderate and negative at -0.58. With only 13 yearly observations, the model explains about 33% of the variation in num_m_partial (27% after adjusting for the number of predictors), so most of the year-to-year variation is left unexplained.
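The figures in the regression output can be cross-checked by hand. The sketch below copies the coefficients and the correlation from the output above and relies on two standard facts: in a regression with one predictor, R-squared equals the squared correlation, and adjusted R-squared rescales it by the degrees of freedom. The helper name `predict_m_partial` is made up for illustration.

```r
# Coefficients copied from the summary(happy_5) output
intercept <- 22.5343
slope     <- -0.4338

# Predicted number of partially-unclothed-male covers
# for a given number of POC covers in a year
predict_m_partial <- function(num_poc) intercept + slope * num_poc

predictions <- predict_m_partial(c(0, 10, 20))
print(predictions)
# 22.5343 18.1963 13.8583

# With one predictor, R-squared is the squared correlation
r         <- -0.5756839
r_squared <- r^2                      # about 0.3314

# 11 residual degrees of freedom + 2 estimated coefficients = 13 years
n <- 13
adj_r_squared <- 1 - (1 - r_squared) * (n - 1) / (n - 2)   # about 0.2706
print(c(r_squared, adj_r_squared))
```

Both values agree with the "Multiple R-squared" and "Adjusted R-squared" lines of the summary, which is a quick way to confirm which of the two figures a percentage in the text refers to.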

Another Data Analysis Interlude - Diagnostic Plots

# Running diagnostic plots to further analyze the model.

library(GGally)

Registered S3 method overwritten by 'GGally':
  method from   
  +.gg   ggplot2

ggpairs(happily_ever_after)

The bottom 5 plots of the first column are intriguing! As the years progress, romance covers are becoming less racy, more diverse, and much less photorealistic. Consumers judging a romance novel by its cover would expect to find a more diverse set of characters in romance novels with illustrated covers.

Data Manipulation in Furtherance of…

# An alluvial plot is a visualization that can depict changes in counts over time. To use it, the tibble has to be reshaped from a wide data format to a long data format.

happy6 <- happily_ever_after %>%
  pivot_longer(
    cols = 2:6,
    names_to = "cover_type",
    values_to = "cover_count")
head(happy6)

# A tibble: 6 × 3
   year cover_type      cover_count
  <dbl> <chr>                 <int>
1  2011 num_photo_style          53
2  2011 num_ill_style             4
3  2011 num_m_partial            14
4  2011 num_w_partial             3
5  2011 num_poc                   3
6  2012 num_photo_style         116

…The Visualization

library(alluvial)
library(ggalluvial)


happy_cover <- ggplot(happy6, aes(x = year, y = cover_count, alluvium = cover_type)) +
  theme_classic() +
  geom_alluvium(aes(fill = cover_type),
                color = 'black',
                width = .1,
                alpha = .8) +
  scale_fill_brewer(palette = "PRGn", labels = c("Illustrated", "Partial Unclothed\nMale", "Photorealistic", "POC", "Partial Unclothed\nFemale")) +
  scale_x_continuous(limits = c(2011, 2023)) +
  labs(title = "Changes in Romance Covers \nFrom 2011-2023", 
       y = "Romance Novel Covers",
       x = "Year",
       fill = "Cover Features",
       caption = "Source: Alice Liang, 'What does a happily ever after look like?'\nwww.aliceyliang.com")

happy_cover

Epilogue

I standardized the column headings (all lowercase, with spaces replaced by underscores) to minimize syntax errors when coding. I also confirmed there were no missing values in the columns I processed.

The visualization above shows that the nature of “happily ever after” is changing among traditional publishers whose books are featured in Publishers Weekly. Only a fraction of traditionally published books are featured in Publishers Weekly. Indie authors - authors who self-publish or work with small publishing houses - are not reflected in the plot’s data. In 2022, thousands of romance novels were published, yet only 119 were included in this data set. Nonetheless, the trend in traditional romance novel covers is pleasing.

I wish I had had time to review each of Ms. Liang’s classifications, as I disagreed with a few of them. I also wish additional cover types had been included, such as LGBTQ covers and polyromantic covers.