This report presents an exploratory data analysis of FiveThirtyEight’s ‘elements-by-episode’ dataset. The dataset tracks 67 different elements found in the Bob Ross paintings featured in the TV show “The Joy of Painting”. For more information about this dataset and a statistical analysis of the work of Bob Ross, please see this article.
To better understand the dataset, we can call dim() on our data frame, named data, which returns a vector giving the number of rows and columns:
dim(data)
## [1] 403 69
To view summary statistics for the first six columns of the data frame, we run the following:
summary(data[1:6])
## EPISODE TITLE APPLE_FRAME AURORA_BOREALIS
## Length:403 Length:403 Min. :0.000000 Min. :0.000000
## Class :character Class :character 1st Qu.:0.000000 1st Qu.:0.000000
## Mode :character Mode :character Median :0.000000 Median :0.000000
## Mean :0.002481 Mean :0.004963
## 3rd Qu.:0.000000 3rd Qu.:0.000000
## Max. :1.000000 Max. :1.000000
## BARN BEACH
## Min. :0.00000 Min. :0.000
## 1st Qu.:0.00000 1st Qu.:0.000
## Median :0.00000 Median :0.000
## Mean :0.04218 Mean :0.067
## 3rd Qu.:0.00000 3rd Qu.:0.000
## Max. :1.00000 Max. :1.000
The columns that describe the paintings are binary: a value of 1 means the element is present in the painting, and a value of 0 means it is absent. With 69 columns in total, it is more manageable to focus on a particular element, such as whether the paintings are framed.
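Because each element column holds only 0s and 1s, a column's sum is the number of episodes featuring that element and its mean is the share of episodes. This is why, for example, the BARN mean of 0.04218 in the summary output can be read as roughly 4.2% of episodes. A minimal sketch on a small made-up data frame (the object `toy` and its values are hypothetical, not taken from the dataset):

```r
# Hypothetical toy data with the same 0/1 layout as the element columns
toy <- data.frame(
  EPISODE = c("S01E01", "S01E02", "S02E01", "S02E02"),
  BARN    = c(0, 1, 0, 1),
  BEACH   = c(0, 0, 0, 1)
)

# Keep only the numeric indicator columns
elements <- toy[sapply(toy, is.numeric)]

# Number of episodes in which each element appears
element_counts <- colSums(elements)

# Share of episodes in which each element appears, as a percentage
element_pct <- colMeans(elements) * 100
```

The same two calls could plausibly be applied to the numeric columns of the full data frame to rank all 67 elements at once.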
Currently, the data are at the episode level, meaning each row records the elements present in the painting featured in a given episode. For this report, we explore which paintings are framed. To do so, we add a column that records the season in which each painting was featured and then keep only the frame-related columns:
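# dplyr provides mutate(), select(), and contains()
library(dplyr)

# creating a new column titled 'SEASON' from characters 2-3 of EPISODE
data_v2 <- data |>
  mutate(SEASON = substr(EPISODE, 2, 3))

# selecting columns whose names contain the term FRAME (this includes FRAMED)
data_v3 <- data_v2 |>
  select(EPISODE, TITLE, SEASON, contains("FRAME"))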
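The `substr()` call relies on episode codes having a fixed-width layout such as `S04E10`. That layout is an assumption here, inferred from the character positions used. Under that assumption, characters 2 through 3 hold the zero-padded season number as text. The sketch below uses hypothetical episode codes and also shows an optional conversion to integer, which makes sorting and comparisons numeric:

```r
# Hypothetical episode codes in the assumed "SxxEyy" format
episode <- c("S01E01", "S04E10", "S31E13")

# Characters 2 through 3 hold the zero-padded season number
season_chr <- substr(episode, 2, 3)

# Optional: convert the season to an integer for numeric sorting
season_int <- as.integer(season_chr)
```

Keeping the season as zero-padded text still sorts correctly, so the conversion is a matter of preference.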
To calculate the percentage of episodes that include an unframed painting and the percentage that include a framed painting, we run the following:
# calculating percentage of episodes with an unframed painting (FRAMED == 0)
# nrow(data_v3) is the total number of episodes
sum(data_v3$FRAMED == 0) / nrow(data_v3) * 100
## [1] 86.84864
# calculating percentage of episodes with a framed painting (FRAMED == 1)
sum(data_v3$FRAMED == 1) / nrow(data_v3) * 100
## [1] 13.15136
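As a compact alternative, the mean of a logical vector in R is the proportion of TRUE values, so the same percentages can be obtained without writing the denominator explicitly. The sketch below uses a short hypothetical 0/1 vector named `framed` rather than the real column:

```r
# Hypothetical 0/1 indicator: 1 = framed, 0 = unframed
framed <- c(0, 0, 1, 0, 0, 0, 1, 0)

# mean() of a logical vector is the proportion of TRUE values
pct_framed   <- mean(framed == 1) * 100
pct_unframed <- mean(framed == 0) * 100

# The same split as a table of percentages
pct_table <- prop.table(table(framed)) * 100
```

If the column could contain missing values, passing `na.rm = TRUE` to `mean()` would be one way to exclude them.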
Bob Ross did not typically frame his paintings: frames appeared in only about 13% of episodes (53 of 403). The first framed painting did not appear until Season 4, where he used a circle frame. When he did use a frame, the oval frame was his most common choice.
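Findings of this kind, the first season with a frame and the most common frame type, could be derived along the following lines. This is a sketch on hypothetical toy data, not the code used for the report; the column names `CIRCLE_FRAME` and `OVAL_FRAME` are assumed to follow the same naming pattern as `APPLE_FRAME`, and the values are illustrative only:

```r
# Hypothetical toy data; values are illustrative, not from the dataset
toy <- data.frame(
  SEASON       = c("01", "02", "02", "03", "03", "03"),
  FRAMED       = c(0, 1, 0, 1, 1, 0),
  CIRCLE_FRAME = c(0, 1, 0, 0, 0, 0),
  OVAL_FRAME   = c(0, 0, 0, 1, 1, 0)
)

# Earliest season with at least one framed painting
# (zero-padded season strings sort in the same order as the numbers)
first_framed_season <- min(toy$SEASON[toy$FRAMED == 1])

# Count of paintings per frame type, most common first
frame_cols   <- c("CIRCLE_FRAME", "OVAL_FRAME")
frame_counts <- sort(colSums(toy[frame_cols]), decreasing = TRUE)

# Name of the most frequently used frame type
most_common_frame <- names(frame_counts)[1]
```

On the full data, `frame_cols` would need to list the specific frame-type columns and leave out the overall `FRAMED` indicator, since that column counts every framed painting regardless of type.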