This report is ETC5521 Assignment 2 by Team Sawfish, comprising Zoljargal Batsaikhan (30392756), Prajyot Nagrale (31132324), Yin Shan Ho (31201474), and Pranali Angne (32355068).
The Bechdel test is a technique for evaluating the engagement of female characters in films. It reveals how movies conceal gender bias by analysing whether the female protagonist has a serious role on screen in comparison to the male protagonist.
Test rules: a film has to (1) have at least two named women in it (2) who talk to each other (3) about something besides a man. This test is done to show the positive representation of women in films.
The Bechdel test originated, almost by accident, with Alison Bechdel in 1985 and became well known through her popular comic strip Dykes to Watch Out For. It began as a basic test to discover whether a film had any female characters in it. In some films, the two female characters are named but speak solely about men, so the third criterion is not met. Some people were concerned when the test was used as the only criterion to assess a film’s merit. For example, the test was criticised as ridiculously regressive when it was discovered that Twilight had passed it but Gravity, a brilliant film, had failed. This topic piqued our curiosity, so after some study and deliberation our team chose it for further investigation.
Two data files, movies.csv and raw_bechdel.csv, were downloaded from the Bechdel test dataset in the Tidy Tuesday repository.
The raw_bechdel.csv file contains 8839 rows and 5 columns. It was originally downloaded from a website on which moviegoers analyze and determine whether certain movies pass or fail the test.
The movies.csv file contains 1794 rows and 34 columns. This dataset comes from the https://fivethirtyeight.com article The Dollar-And-Cents Case Against Hollywood’s Exclusion of Women. The authors used the Bechdel test dataset mentioned above and added financial information from The-Numbers.com, a leading site for box office and budget data that inventories financial information for roughly 4,500 films. They used the intersection of The-Numbers and BechdelTest, which was a set of 1,615 films released between 1990 and 2013. They adjusted all financial figures for inflation, using 2013 dollars. As mentioned in the article, they analyzed the movies in the sample using the Bechdel test’s three criteria.
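The dimensions quoted for the two files can be checked with a few lines of code. The sketch below is illustrative only: the helper name `csv_shape`, the column names, and the two sample rows are assumptions made for the example, not taken from the report.

```python
import csv
import io

def csv_shape(handle):
    """Return (n_rows, n_cols) for a CSV file object that has a header row."""
    reader = csv.reader(handle)
    header = next(reader)             # first line holds the column names
    n_rows = sum(1 for _ in reader)   # remaining lines are data rows
    return n_rows, len(header)

# Made-up stand-in for raw_bechdel.csv (hypothetical column names and rows).
sample = io.StringIO(
    "year,id,imdb_id,title,rating\n"
    "1999,1,0000001,Example A,3\n"
    "2005,2,0000002,Example B,1\n"
)
shape = csv_shape(sample)
print(shape)  # (2, 5)
```

With the real files, the same helper could be called on `open("raw_bechdel.csv", newline="")` and the result compared with the row and column counts stated above.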
The dataset is not tidy and contains a large number of N/A values. It also contains some undesired variables whose values are all N/A, as well as some variables that are not useful for this research. Data wrangling is therefore required before further analysis.
The raw data summary of the Bechdel test is shown below.
Summary of raw_bechdel data
In the data wrangling and cleaning section, the data types were changed and the variables that are unnecessary for the analysis were removed. Since the movies.csv data has more depth, such as financial figures and ratings, only movies.csv is used for further analysis. Moreover, some of the variables were adjusted: for example, the genre, director, language, country, and binary variables were separated and converted to factor type. The following plot summarizes the tidy data.
Summary of movies_clean data after data wrangling
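One way to picture the separation step is splitting a multi-valued field into three numbered columns. This is a minimal sketch, assuming the raw genre, director, language, and country fields are comma-separated strings and that missing values appear as "N/A"; the function name `split_top3` is hypothetical and the report's actual wrangling code may differ.

```python
def split_top3(value, sep=","):
    """Split a delimited string into exactly three slots, padding with None."""
    if value is None or value.strip() in ("", "N/A"):
        parts = []                      # treat missing values as no entries
    else:
        parts = [p.strip() for p in value.split(sep) if p.strip()]
    parts = parts[:3]                   # keep at most the first three entries
    return parts + [None] * (3 - len(parts))

# Illustrative input, not a row from the dataset.
genre1, genre2, genre3 = split_top3("Action, Adventure, Sci-Fi")
print(genre1, genre2, genre3)
```

Padding with None keeps every movie at three slots, which matches the genre1 to genre3 style of columns described in the table below.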
The description and types of each column are provided below:
| Variables | Description | Type |
|---|---|---|
| year | The year movie was released | Integer |
| title | The title of the movie. | Character |
| clean_test | Result of the Bechdel test with 5 levels | Factor |
| binary | Binary result of the Bechdel test | Factor |
| budget_2013 | Inflation adjusted budget of the movie in 2013 dollars | Numeric |
| domgross_2013 | Inflation adjusted domestic gross of the movie in 2013 dollars | Numeric |
| intgross_2013 | Inflation adjusted international gross of the movie in 2013 dollars | Numeric |
| imdb_id | Unique ID for each movie | Character |
| language1 | The first language of the movie | Factor |
| language2 | The second language of the movie | Factor |
| language3 | The third language of the movie | Factor |
| country1 | The first country of the movie | Factor |
| country2 | The second country of the movie | Factor |
| country3 | The third country of the movie | Factor |
| metascore | Metascore of the movie | Numeric |
| imdb_rating | IMDb rating of the movie | Numeric |
| director1 | The first director of the movie | Factor |
| director2 | The second director of the movie | Factor |
| director3 | The third director of the movie | Factor |
| genre1 | The first genre of the movie | Factor |
| genre2 | The second genre of the movie | Factor |
| genre3 | The third genre of the movie | Factor |
| runtime | The runtime of the movie in minutes | Numeric |
| poster | The http address of the poster of the movie | Character |
| imdb_votes | The number of IMDb votes of the movie | Numeric |
“clean_test” refers to the result of the Bechdel test with 5 levels; detailed explanations are as follows:
What is the trend of the movies passing or failing the Bechdel test from 1970 to 2013?
Using the Bechdel test, it is easier to assess the distribution of gender bias in films from 1970 to 2013. Based on the test, 56% of the 1,794 films released between 1970 and 2013 passed the Bechdel test. A film passes if it features two women talking to each other about something other than men.
The graph above shows the percentage of movies that passed and failed the Bechdel test in each five-year period. The passing rate was below 50% in most periods. However, the rate has increased over the past 3 decades, with a peak around the year 2000 at slightly above 50%. Sadly, the passing rate fell in the most recent decade. One possible reason for the fall is that the dataset only includes movies until 2013. It could be a different story if the dataset were extended to 2021.
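The five-year grouping behind such a graph can be sketched as follows. This is an illustrative outline, assuming each movie is reduced to a (year, binary) pair with binary equal to "PASS" or "FAIL"; the function name and the toy records are made up for the example.

```python
from collections import defaultdict

def pass_rate_by_period(movies, width=5):
    """Map each period's start year to the share of its movies labelled PASS."""
    counts = defaultdict(lambda: [0, 0])    # period -> [passed, total]
    for year, binary in movies:
        period = year - year % width        # e.g. 1992 falls in the 1990 bin
        counts[period][1] += 1
        if binary == "PASS":
            counts[period][0] += 1
    return {p: passed / total for p, (passed, total) in sorted(counts.items())}

# Toy records, not real data.
toy = [(1990, "FAIL"), (1992, "PASS"), (1994, "FAIL"),
       (1995, "PASS"), (1999, "PASS")]
rates = pass_rate_by_period(toy)
print(rates)
```

The failing share for each period is one minus the passing share, so a single pass over the data is enough to draw both sets of bars.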
To examine the reason for failing the test, the results of Bechdel test were further decomposed into the following five groups:
[NOTE: The passing numbers are shown in green whereas the failures are shown in reds.]
Takeaway 1: Although female representation in the movie industry is improving over time, only about 50% of movies pass the simple Bechdel test.
How do budget and return on investment compare across Bechdel test results, and is there any correlation between IMDb rating and ROI?
The reason for asking this question is that money can have a significant impact on how well a film does. The graphs below were therefore constructed to see whether the Bechdel test result is related to the budget or return on investment of the films. The budget and income figures have been adjusted for inflation using 2013 dollars. Return on investment is then calculated by dividing gross income by budget.
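The ratio described above can be written as a small function. This is a sketch under stated assumptions: the report does not say which gross column was used or how missing budgets were handled, so the choice to return None for missing or non-positive budgets is an assumption, as are the example figures.

```python
def roi(gross, budget):
    """Return gross divided by budget, or None when it cannot be computed."""
    if gross is None or budget is None or budget <= 0:
        return None          # avoid dividing by zero or by a missing budget
    return gross / budget

# Illustrative figures in 2013 dollars, not taken from the dataset.
print(roi(150_000_000, 50_000_000))  # 3.0
```

Under this definition, a value above 1 means the film grossed more than its budget.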
The bar chart above depicts the average budget in USD by Bechdel test result. The average budget for films that passed the Bechdel test is almost $20,000,000 less than for films that failed. This suggests that movie industry investors put more money into movies with a gender bias, in which women are less represented.
Since more money was invested in the movies that failed the test, it is necessary to see whether investing more money converts into a higher return on investment.
In Figure 6, the return on investment is plotted by Bechdel test result over the last 3 decades. The return has been log transformed because the original data was skewed. It is fascinating to see that the movies that passed the test slightly outperformed the movies that failed, even though the difference is small.
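A log transform compresses the long right tail of a skewed ratio so that groups can be compared on one scale. The sketch below is hypothetical: the base of the logarithm (base 10 here) and the use of the median as the group summary are assumptions, since the report does not state either.

```python
import math
from statistics import median

def median_log_roi(rows):
    """Median of log10(ROI) per Bechdel outcome; rows are (binary, roi) pairs."""
    groups = {}
    for binary, value in rows:
        if value is not None and value > 0:    # log is undefined for values <= 0
            groups.setdefault(binary, []).append(math.log10(value))
    return {k: median(v) for k, v in groups.items()}

# Toy values, not real data; the zero ROI is skipped rather than transformed.
toy = [("PASS", 10.0), ("PASS", 1.0), ("PASS", 100.0),
       ("FAIL", 1.0), ("FAIL", 0.0)]
result = median_log_roi(toy)
print(result)
```

On the log scale, a value of 0 corresponds to breaking even, and each unit step corresponds to a tenfold change in return.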
Correlation between IMDb rating and ROI
The table above shows the association between IMDb rating and Metascore. With a correlation of 0.73, IMDb rating and Metascore have a fairly strong positive relationship.
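A correlation coefficient of this kind can be computed directly from paired values. The sketch below assumes a Pearson correlation and drops pairs with a missing value; the report does not state which coefficient or which missing-value rule produced the 0.73, so both are assumptions, and the input values are illustrative.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists, skipping pairs with None."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)

# Made-up ratings and scores; the pair with a missing score is ignored.
r = pearson([6.0, 7.0, 8.0, 9.0], [50, 60, None, 80])
print(round(r, 3))
```

The coefficient ranges from -1 to 1, so a value such as 0.73 indicates that the two scores tend to rise together without being interchangeable.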
Takeaway 2: Movie industry investors invested more in movies that are male dominant. Nevertheless, the movies that passed the test slightly outperformed the movies that failed.
What are the top regions and genres with the most gender bias based on Bechdel test?
From the original dataset, the genre and country variables were selected to further investigate the Bechdel test result. Where a movie has multiple genres or countries, only the first of each was used.
First, the number of movies was counted by genre and by region. Then, the top 10 genres and regions were selected to plot the passing rate of their movies.
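The count-then-filter procedure can be sketched in a few lines. This is an illustrative outline, assuming each movie is reduced to a (group, binary) pair where group is the first genre or first country and binary is "PASS" or "FAIL"; the function name and the toy rows are made up.

```python
from collections import Counter

def top_group_pass_rates(rows, n=10):
    """Return [(group, pass_rate)] for the n groups with the most movies."""
    totals = Counter(group for group, _ in rows)
    passed = Counter(group for group, binary in rows if binary == "PASS")
    top = [group for group, _ in totals.most_common(n)]   # largest groups first
    return [(g, passed[g] / totals[g]) for g in top]

# Toy rows, not real data.
toy = [("Horror", "PASS"), ("Horror", "PASS"), ("Horror", "FAIL"),
       ("Action", "FAIL"), ("Action", "PASS"), ("Western", "FAIL")]
rates = top_group_pass_rates(toy, n=2)
print(rates)
```

Filtering to the largest groups first keeps rates based on only a handful of movies out of the comparison.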
The bar chart above depicts the percentage of movies that passed the Bechdel test in each genre. Horror, comedy, and drama were the three genres with a passing rate of more than 50%. Horror had the highest passing rate, at 63.4 percent.
The bar graph above depicts the Bechdel test passing rate in different regions. Spain had the highest passing rate, whereas Hong Kong SAR and China had a relatively low percentage of movies that passed the Bechdel test.
Takeaway 3: Horror, comedy, and drama are the genres with relatively higher passing rates in the Bechdel test. Moreover, women are relatively better represented in Spanish and German movies.
Does a higher rating mean a higher passing rate in the Bechdel test?
The bubble chart above plots the movies in the top and bottom 30 by IMDb rating. The size of each bubble represents the budget, while the color indicates the Bechdel test result: red means failure and green means passing. Surprisingly, highly rated movies with higher budgets, like The Dark Knight and Inception, failed the Bechdel test, whereas low-rated movies with smaller budgets, like The Fog and Crossroads, passed the test.
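Selecting the highest and lowest rated movies for such a chart amounts to a sort followed by two slices. The sketch below is hypothetical: the dictionary layout, the `imdb_rating` key, and the example titles are assumptions, and movies without a rating are simply left out.

```python
def top_and_bottom(movies, key, n=30):
    """Return the n highest and n lowest movies by `key`, ignoring missing values."""
    rated = [m for m in movies if m.get(key) is not None]
    ranked = sorted(rated, key=lambda m: m[key], reverse=True)
    return ranked[:n], ranked[-n:]

# Made-up records, not rows from the dataset.
toy = [
    {"title": "Example A", "imdb_rating": 9.0},
    {"title": "Example B", "imdb_rating": 3.1},
    {"title": "Example C", "imdb_rating": 7.5},
    {"title": "Example D", "imdb_rating": None},
]
top, bottom = top_and_bottom(toy, "imdb_rating", n=1)
print(top[0]["title"], bottom[0]["title"])
```

The same function could be reused with a runtime key to pick the longest and shortest movies.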
Takeaway 4: Movies with higher ratings and budgets are more likely to be male dominant, whereas women play more important roles in movies with lower ratings and budgets.
Does the duration of the movies have any effect on the Bechdel test?
Bechdel test results for movies by runtime
Figure 11 shows the 10 longest and the 10 shortest movies in the data. It indicates that the run time of a movie is unlikely to influence the result of the Bechdel test: as the plot above shows, movies with longer and shorter run times have almost the same results.
Takeaway 5: The passing rate of the Bechdel test is not likely to be affected by the run time of the movies.
Based on the analysis, some conclusions are drawn as follows:
Firstly, an increasing trend is observed in which women are playing more important roles in movies. However, only about half of the movies had passed the Bechdel test by 2013. This indicates that women still have lower status in the movie industry.
Moreover, investors tend to put more money into movies that are male dominant, even though the movies that passed the test slightly outperformed those that failed, with a comparatively higher return on investment.
On the other hand, some genres, such as horror, comedy, and drama, have higher passing rates, and movies from some regions, such as Spain and Germany, provide comparatively more opportunities to actresses.
Surprisingly, most of the top-rated movies with more investment failed the test, while the movies with the lowest ratings and less investment tend to have a higher passing rate. This indicates that the movie industry and audience expectations are more favorable to male-dominant movies.
Lastly, the length of a movie is unlikely to affect the result of the Bechdel test, which means that women do not get more opportunities even when movies are longer.
Based on these conclusions, gender inequality is observed in the movie industry, as women have lower status and play less important roles in movies.
However, all of the conclusions drawn above are based on data up to 2013, around 8 years ago. It could be a different story if the data were updated to include recent years.