HW1: Does Science Fiction Mirror the Economy?

Author

Katie Hosey

Introduction

For the analysis, I decided to look at science fiction movie scripts across decades.

Science fiction – in both film and writing – often reflects an era’s sentiments and fears about technological and economic change. As Buckup (2016) points out, the hopeful themes and bright visuals in Star Trek (TV show) contrast with the more threatening content and dark scenes of Star Wars, perhaps reflecting changes in economic outlook from the 1966 Star Trek release to the 1977 Star Wars premiere. Sci-fi themes from the Cold War era often reflect hopes and anxieties about the space race and nuclear war; post-2000 sci-fi often features pandemics and robot takeovers (Berlatsky 2017). Each decade comes with advances in technology and newfound sources of hope and anxiety; science fiction mirrors these.

This analysis examines three of the top-grossing science fiction scripts available from each decade since 1970. It looks at the most common words and general sentiments of each decade’s scripts to determine whether changes in topic or sentiment among popular science fiction films reflect the technological and economic realities of each decade. Below is a chart of macroeconomic indicators across the decades, context on major historical and technological developments, and the three scripts from each period that are included.

Annual rates of inflation, real GDP, and unemployment 1970-2023

1970s: Vietnam War; Cold War; space race; Watergate and Nixon resignation; oil embargo; stagflation; nuclear reactor malfunction at Three Mile Island; proliferation of nuclear stockpiles; first supercomputer invented. Movies:

1980s: Reagan presidency; Cold War; first space shuttle launched; Challenger explosion; HIV/AIDS epidemic; falling inflation, initial rise in unemployment then drop, growth fluctuations; INF treaty; decrease in nuclear stockpiles; Exxon Valdez oil spill; personal computers hit the market; genetic engineering invented. Movies:

1990s: Soviet Union collapse; Persian Gulf War; Bush & Clinton presidencies; NAFTA signed; World Trade Center bombing; Waco, TX Branch Davidian standoff; Oklahoma City bombing; Clinton impeachment; Columbine shooting; drug wars; Internet launched; early 1990s recession, followed by cooling unemployment and inflation; cell phones popularized. Movies:

2000s: W. Bush presidency; 9/11; invasions of Iraq & Afghanistan; Hurricane Katrina, among other deadly hurricanes; Virginia Tech shooting; 2008 financial crisis; Swine Flu outbreak; economic growth until late-2000s recession and housing crisis; rising unemployment; rise of social media and eCommerce. Movies:

2010s-present: Continuation of Middle East wars on terror; Gulf oil spill; Obama, Trump, and Biden presidencies; recovery from 2009 recession until Coronavirus pandemic and recession; rise in global terrorism and climate crises; rise of AI, 3D printing, Bitcoin. Movies:

*For franchises, only the first movie was used, provided it was among the highest grossing of its decade.

Scripts were copied and pasted directly from their IMSDb pages (The Internet Movie Script Database n.d.). Additional information for this section was drawn from Owens (n.d.), PBS (n.d.), Wikipedia (n.d.a), NYC (2024), Martin (2017), and Wikipedia (n.d.b).

Frequency Charts by Decade

Most common words used across the scripts in each decade

Each decade has its own common words, many of them proper nouns from individual scripts. This is especially visible in the 2020s graph, which is saturated with characters and themes from Dune Part One. The graphs also hint at themes from each era: “death” appears among the top words only in the 1970s, and “computer” first appears frequently in the 2000s, which lags the real-world releases of supercomputers and PCs. Themes of space and extraterrestrial travel are prominent in every era.

TF-IDF Analysis by Decade

Top 12 words with highest TF-IDF across the scripts in each decade

Comparing raw frequency by decade with TF-IDF highlights how uniform the genre’s themes and topics are. The word frequency chart includes general words such as “ship,” “star,” “alien,” and “space.” The TF-IDF chart does not include such words, because they are used frequently in scripts across decades. What makes decades distinct is the characters (even though prominent ones have been removed), classes of characters, or settings of the movies (particularly the longer movies of a decade, which carry more weight), e.g., “stormtroopers,” “wormhole,” “houston,” and “fremen.”
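To make the contrast between raw frequency and TF-IDF concrete, here is a minimal Python sketch of the calculation. It is not the code used for the charts above: it assumes the common definition (term frequency as a share of a document's tokens, inverse document frequency as the natural log of documents over documents containing the term), treats each decade as one document, and uses toy token lists invented for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return {label: {term: tf-idf}} for a dict of label -> list of tokens.

    Assumed definition: tf = count / total tokens in the document,
    idf = ln(number of documents / number of documents containing the term).
    """
    counts = {label: Counter(tokens) for label, tokens in docs.items()}
    n_docs = len(docs)
    # Iterating a Counter yields each distinct term once, so this counts
    # the number of documents in which each term appears.
    doc_freq = Counter(term for c in counts.values() for term in c)
    scores = {}
    for label, c in counts.items():
        total = sum(c.values())
        scores[label] = {
            term: (n / total) * math.log(n_docs / doc_freq[term])
            for term, n in c.items()
        }
    return scores


# Toy token lists, invented for illustration; not taken from the real scripts.
toy = {
    "1970s": ["ship", "space", "stormtroopers", "ship"],
    "1990s": ["ship", "space", "houston"],
    "2020s": ["ship", "space", "fremen", "fremen"],
}
scores = tf_idf(toy)
for decade, terms in scores.items():
    top = max(terms, key=terms.get)
    print(decade, top, round(terms[top], 3))
```

Because “ship” and “space” occur in every toy document, their inverse document frequency is ln(1) = 0 and they drop out, while a term confined to one decade keeps a positive score. That is the same mechanism that pushes general genre words off the TF-IDF chart.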

Word Clouds: Pre- and Post-2000

Most common words in science fiction scripts 1970-1999

Most common words in science fiction scripts 2000-present

Some words appear commonly in scripts in both time periods, such as “time,” “ship,” and “space.” These point to the popularity of space movies. Other words in both clouds may point to common visuals and stage direction within the genre, like “eyes” and “hand.” Interestingly, there is some evolution of technology that reflects real life: post-2000 films feature the word “station” (the International Space Station was launched in 1998), which is not among the common words pre-2000.

Word Clouds with Bing Sentiment Analysis: Which feeling-inducing words are most common?

Most common negative words (darker gray) vs. Most common positive words (lighter gray) across the whole genre

Common negative words in the science fiction genre describe darkness, an inability to see, weapons, and death. The positive words describe light, attractiveness, and friendly gestures.

Sentiment Heatmap Using NRC: Have emotions changed over time?

Heat map of average emotion prevalence across scripts in each decade

Fear and trust are the most prominent emotions across sci-fi scripts since the 1970s, which speaks to the common plot of banding together to overcome a fearful conflict. Disgust has become a less popular emotion in scripts since the 1970s, while joy has become more common, on average. Surprise is among the least common emotions in science fiction scripts.

Sentiment Analysis Using AFINN: Does sentiment map to economic recessions?

Total AFINN Sentiment Scores and Macroeconomics by Decade
Decade | Total Sentiment Score | Avg Inflation (%) | Avg GDP Growth (%) | Avg Unemployment (%)
1970s | -1711 | 7.09 | 3.24 | 6.38
1980s | -956 | 5.54 | 3.12 | 7.21
1990s | -1639 | 3.00 | 3.22 | 5.71
2000s | -974 | 2.57 | 1.92 | 5.82
2010s | -646 | 1.77 | 2.40 | 5.93
2020s | -301 | 4.50 | 2.33 | 4.45

Average movie sentiment scores and U.S. recessions over time

The chart does not show a clear relationship between sci-fi script sentiment scores and the occurrence of recessions, though the sample size is small. Among the most negative scores, Star Wars (1977) and Terminator (1984) were each released 2 years after the end of a recession; however, Independence Day (1996) and Armageddon (1998), which also have low sentiment scores, do not closely follow a recession, though they were released in the same decade as one. All scripts in this sample have an overwhelmingly negative sentiment score, but on average, scores have become less negative since the turn of the century. With more scripts, some patterns, either across decades or in correlation with economic outlook, could be elucidated.
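One rough way to put a number on the decade-level relationship is a Pearson correlation between the total AFINN score and each macroeconomic average. The Python sketch below is an illustration rather than part of the original analysis: the values are copied from the table above, the `pearson` helper and variable names are my own, and with only six decades (one of them incomplete) any coefficient should be read as descriptive, not as evidence of a link.

```python
import math

# Decade-level values copied from the AFINN/macroeconomics table above.
decades = ["1970s", "1980s", "1990s", "2000s", "2010s", "2020s"]
sentiment = [-1711, -956, -1639, -974, -646, -301]
indicators = {
    "inflation": [7.09, 5.54, 3.00, 2.57, 1.77, 4.50],
    "gdp_growth": [3.24, 3.12, 3.22, 1.92, 2.40, 2.33],
    "unemployment": [6.38, 7.21, 5.71, 5.82, 5.93, 4.45],
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Correlate total sentiment with each indicator across the six decades.
correlations = {
    name: pearson(sentiment, values) for name, values in indicators.items()
}
for name, r in correlations.items():
    print(f"{name}: r = {r:.2f}")
```

Total scores also depend on script length, so a per-word average sentiment would likely be a fairer input than the totals used here.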

Limitations

There are a few limitations to the data and analysis. First, some of these scripts are in draft form, so the translation to the screen may not be 1:1. Additionally, not all of the most popular movies from each decade are available online. I tried to use scripts that span a decade as much as possible, but there were availability constraints and some difficulty balancing release year with popularity. Sci-fi has also become much more prevalent since 1970, so later decades offered more movies, and more popular ones, to choose from. Thus, fully capturing a decade of this genre with only 3 scripts is difficult. I think this analysis would be much more interesting if 20 scripts from each decade were included.

Another significant challenge, which stems from my general unfamiliarity with screenwriting, is that scripts (and different directors/screenwriters) have distinct styles and methods for denoting scene and camera direction, so the cleaning process was challenging. Some words - such as “light” and “door” - are prevalent in these scripts and could feasibly be important to both stage direction (which isn’t entirely relevant to my analysis) and script/plot content (which is relevant). I think if I had more familiarity with scripts and more time, I would’ve gone through these more thoroughly to determine if I could pull out patterns more accurately.

LLM Use

I used ChatGPT to help with the regex section of the data cleaning. I am not very confident in my regular expression skills, and I found the DataCamp notes I had to be a bit unhelpful. I provided ChatGPT with a few lines from the first script in my dataframe and said: “I want to do the following using tidyverse: 1) remove stage directions, which are between parentheses; 2) remove directions that are between brackets; 3) remove scene heads (start with “INT” and “EXT”); 4) remove camera directions (“CUT TO,” “FADE IN,” “FADE OUT,” “CLOSER ANGLE”); 5) remove lines that are all caps.” It provided me with a function (which was unnecessary), but I used the regex from it in the str_detect() function to do the data cleaning.
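For readers who want to see what the five cleaning rules look like in practice, here is a minimal Python sketch. It is not the regex ChatGPT returned or the tidyverse code used in this analysis, neither of which is reproduced here. The function name, patterns, and sample lines are assumptions made for illustration, and the sketch only handles directions that open and close on the same line.

```python
import re

# Hypothetical patterns written for this sketch, one per rule in the prompt.
BRACKETED = re.compile(r"\([^)]*\)|\[[^\]]*\]")   # rules 1 and 2
SCENE_HEAD = re.compile(r"^(INT|EXT)\b")           # rule 3
CAMERA = re.compile(r"\b(CUT TO|FADE IN|FADE OUT|CLOSER ANGLE)\b")  # rule 4


def clean_script(text):
    """Return the lines of a script that survive the five cleaning rules."""
    kept = []
    for line in text.splitlines():
        # Rules 1-2: drop same-line text in parentheses or brackets.
        line = BRACKETED.sub("", line)
        # Collapse the whitespace left behind by the removals.
        line = re.sub(r"\s+", " ", line).strip()
        if not line:
            continue
        # Rule 3: scene headings that start with INT or EXT.
        if SCENE_HEAD.match(line):
            continue
        # Rule 4: lines containing a camera direction.
        if CAMERA.search(line):
            continue
        # Rule 5: lines whose letters are all uppercase (e.g. character cues).
        if line.isupper():
            continue
        kept.append(line)
    return kept


# Invented sample lines, not taken from any script in the dataset.
sample = """INT. COCKPIT - DAY
PILOT
(quietly) We should turn back.
CUT TO:
The ship [camera pans] drifts toward the station."""

cleaned = clean_script(sample)
print(cleaned)
```

The all-caps rule is the bluntest of the five: it removes character cues and headings, but it would also drop any dialogue that a script happens to write in capitals.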

References

Berlatsky, N. 2017. “If Science Fiction Reflects Our Innermost Fears, How Do We See Ourselves Today?” Document Journal. https://www.documentjournal.com/2017/10/noah-berlasky-science-fiction-cultural-fears/.
Buckup, S. 2016. “The Surprising Link Between Science Fiction and Economic History.” World Economic Forum. https://www.weforum.org/stories/2016/06/the-poetry-of-progress/.
Martin, F. 2017. “Why Does Economic Growth Keep Slowing Down?” Federal Reserve Bank of St. Louis. https://www.stlouisfed.org/on-the-economy/2017/february/why-economic-growth-slowing-down.
NYC, Abracadabra. 2024. “Top Science Fiction Movies by Decade.” https://blog.abracadabranyc.com/top-science-fiction-movies-by-decade/.
Owens. n.d. “US Historical Events from 1900 to Present.” Baylor School. http://www.infoplease.com/ipa/A0005971.html.
PBS. n.d. “Technology Timeline (1752-1990).” https://www.pbs.org/wgbh/americanexperience/features/telephone-technology-timeline/.
“The Internet Movie Script Database.” n.d. https://imsdb.com/genre/Sci-Fi.
Wikipedia. n.d.a. “List of Highest-Grossing Science Fiction Films.” https://en.wikipedia.org/wiki/List_of_highest-grossing_science_fiction_films.
———. n.d.b. “List of Recessions in the United States.” https://en.wikipedia.org/wiki/List_of_recessions_in_the_United_States.