Let’s start with the Hawaii Farmer’s Markets data, because this will take a lot of work to tidy.
library(stringr)  # str_* functions are used throughout the cleanup below
hawaii.url <- "https://raw.githubusercontent.com/EyeDen/data607/master/Project%202/Hawaii_Farmer_s_Markets.csv"
download.file(hawaii.url, "farmers_markets.csv")
hi.farmers.markets <- read.csv("farmers_markets.csv", na.strings = c("", " "))
To begin with, there is a huge section of data that was entered incorrectly. As near as I can tell, the original CSV had a separate section for each island's farmer's markets, as evidenced by the multiple headers within the file. At some point, however, someone lumped all the farmer's markets together, which is why there is an Island column.
Moreover, in the incorrectly entered rows, the farmer's market name contains multiple lines of information; whoever entered them used newline characters rather than commas, or simply copy-pasted from another source. Fortunately, those incorrect entries are all duplicates; simply searching through the CSV in Excel was enough to confirm this. That means we can delete them outright.
hi.farmers.markets <- hi.farmers.markets[-c(34:146), ]
Conveniently, this also removes the duplicate headers scattered throughout the dataset.
There are also special characters within our data that we don't need: â???T shows up in multiple names (most likely a curly apostrophe mangled in encoding). Rather than guess at a replacement, I am opting to remove it. Likewise, the occasional \n shows up throughout our data.
hi.farmers.markets$FARMER.S.MARKET <- iconv(hi.farmers.markets$FARMER.S.MARKET, to = "ASCII//TRANSLIT")
hi.farmers.markets[6, 1]
## [1] "Hawaiian Homestead Farmersa?T Market at Kuhio Hale Building"
All those weird characters are now converted to a?T, so we can remove them, along with the \n, using regex.
hi.farmers.markets$FARMER.S.MARKET <- str_replace_all(hi.farmers.markets$FARMER.S.MARKET, "a\\?T", "")
hi.farmers.markets$FARMER.S.MARKET <- str_replace_all(hi.farmers.markets$FARMER.S.MARKET, "\n", "")
hi.farmers.markets$Location.1 <- str_replace_all(hi.farmers.markets$Location.1, "\n", " ")
Before we move on, let’s decide what we’d like to examine. There are a lot of missing values in this dataset. Had we more time, it might be possible to fill the gaps, but for now let’s focus on the general schedule for each farmer’s market, and which island they serve.
We do have a slight problem where some island values are incorrectly filled with websites, so we’ll also take care of that now.
hi.farmers.markets <- hi.farmers.markets[, -c(10, 11, 13, 14, 15, 16)]
hi.farmers.markets$Island <- str_replace_all(hi.farmers.markets$Island, "www\\.co\\.honolulu\\.hi\\.us\\/parks\\/programs\\/pom\\s?|\nhttp\\:\\/\\/alamoanafarmersmarket\\.com\\/ala\\-moana\\.html", "")
head(hi.farmers.markets, 10)
## FARMER.S.MARKET
## 1 Country Market & Craft Fair at Waimanalo Homestead Community Center
## 2 Halawa District Park (People's Open Market)
## 3 Kahaluu Farmers' Market at Windward Baptist Church
## 4 Kailua Elementary School
## 5 Fern Forest Town Market
## 6 Hawaiian Homestead Farmers Market at Kuhio Hale Building
## 7 Hilo Coffee Mill "Market @ The Mill"
## 8 Hilos Farmers Market
## 9 Honokaa Farmers Market at Old Botelho Bldg., Downtown Honokaa
## 10 Ho'oulu Community Farmers Market at Keauhou Beach Resort Royal Gardens
## Sunday Monday Tuesday Wednesday Thursday Friday Saturday
## 1 X <NA> <NA> <NA> <NA> <NA> <NA>
## 2 <NA> <NA> <NA> <NA> <NA> X <NA>
## 3 <NA> <NA> <NA> <NA> <NA> X <NA>
## 4 <NA> <NA> <NA> <NA> <NA> <NA> X
## 5 <NA> <NA> <NA> <NA> <NA> <NA> X*
## 6 <NA> <NA> <NA> <NA> <NA> <NA> X
## 7 <NA> <NA> <NA> <NA> <NA> <NA> X
## 8 <NA> <NA> <NA> X <NA> <NA> X
## 9 <NA> <NA> <NA> <NA> <NA> <NA> X
## 10 <NA> <NA> <NA> X <NA> <NA> <NA>
## Time Island
## 1 9:00 AM to 4:00PM Oahu
## 2 7:00 AM to 8:00 AM Oahu
## 3 3:00 PM to 7:00 PM Oahu
## 4 4:00 PM to 8:00 PM Oahu
## 5 9:00 AM to 1:00 PM Hawaii
## 6 7:00 AM to 12:00 NOON Hawaii
## 7 8:00 AM - 1:00 PM Hawaii
## 8 6:00 AM to 4:00 PM Hawaii
## 9 7:30 AM to 2:00 PM Hawaii
## 10 12:30 PM to 5:30 PM Hawaii
Now let’s try working on the hours of operations!
hi.farmers.markets$Open <- ""
hi.farmers.markets$Close <- ""
hi.farmers.markets$Diff <- ""
My plan is to split the hours into Open and Close columns for those markets with actual hours of operation. After all, how would one quantify a farmer's market listed with only a single time, or the one market that is "All Day"? We'll also add a column for how long each market operates.
split.time <- function(x){
  for(i in 1:length(x$Time)){
    # Match both "H:MM AM to H:MM PM" and "H:MM AM - H:MM PM" style ranges
    hours <- str_extract(x$Time[i], "\\d+\\:\\d+\\s?[:upper:]{2} to \\d+\\:\\d+\\s?[:alpha:]{2,}|\\d+\\:\\d+\\s?[:alpha:]{2} \\- \\d+\\:\\d+\\s?[:alpha:]{2,}")
    if(!is.na(hours)){
      open.hour <- str_extract(hours, "\\d+\\:\\d+\\s?[:upper:]{2}")
      close.hour <- str_extract(hours, "\\s\\d+\\:\\d+\\s?[:alpha:]{2,}")
      # Normalize "9:00AM" to "9:00 AM"
      if(!str_detect(open.hour, "\\s")){
        l <- str_extract(open.hour, "\\d+\\:\\d+")
        r <- str_extract(open.hour, "[:upper:]{2}")
        open.hour <- paste(l, r, sep = " ")
      }
      # "12:00 NOON" and its variants are really PM
      if(str_detect(close.hour, "NOON|Noon|noon")){
        close.hour <- str_replace(close.hour, "NOON|Noon|noon", "PM")
      }
      close.hour <- str_replace(close.hour, "\\s", "")  # drop the leading space
      # Normalize "1:00PM" to "1:00 PM", taking the AM/PM marker from the closing time itself
      if(!str_detect(close.hour, "\\s")){
        l <- str_extract(close.hour, "\\d+\\:\\d+")
        r <- toupper(str_extract(close.hour, "[:alpha:]{2}"))
        close.hour <- paste(l, r, sep = " ")
      }
      # %I (12-hour clock) is required for %p to register AM/PM; %H would silently ignore it
      o <- strptime(open.hour, format = "%I:%M %p")
      c <- strptime(close.hour, format = "%I:%M %p")
      d <- abs(difftime(o, c, units = "hours"))
      x$Open[i] <- open.hour
      x$Close[i] <- close.hour
      x$Diff[i] <- d
    }
  }
  return(x)
}
hi.farmers.markets <- split.time(hi.farmers.markets)
hi.farmers.markets <- hi.farmers.markets[, -9]  # drop the original Time column
We have two more things to settle before we can get to any analysis. First, we should remove duplicate farmer's markets, if there are any. Then we should reformat the days: we don't need the * for our purposes, but we can keep the X.
hi.farmers.markets <- hi.farmers.markets[!duplicated(hi.farmers.markets$FARMER.S.MARKET), ]
# Normalize every day-of-week column in one pass instead of repeating the same two lines
days <- c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")
for(day in days){
  hi.farmers.markets[[day]][!is.na(hi.farmers.markets[[day]])] <- "X"
  hi.farmers.markets[[day]] <- factor(hi.farmers.markets[[day]])
}
colnames(hi.farmers.markets)[1] <- "Markets"
# write.csv(hi.farmers.markets, "clean_farmers_markets.csv", row.names = FALSE)
We see that the most popular time for a farmer’s market to open is 7:00 AM, followed by 9:00 AM, 8:00 AM, and 10:00 AM. There are also quite a few afternoon openings at 3:00 PM and 4:00 PM.
We see that the most popular closing time is 1:00 PM, followed by 2:00 PM, 11:00 AM, 12:00 PM, and 5:00 PM.
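The plots behind these observations aren't reproduced here, but the same counts can be tallied directly from the new columns; a quick sketch:
sort(table(hi.farmers.markets$Open[hi.farmers.markets$Open != ""]), decreasing = TRUE)
sort(table(hi.farmers.markets$Close[hi.farmers.markets$Close != ""]), decreasing = TRUE)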
Interestingly, most farmer's markets are open for only 1 hour, while others are open for 3 to 5 hours. A surprising number are open for only 45 minutes.
It isn't much of a surprise that Oahu has the lion's share of farmer's markets, but it is surprising that it hosts twice as many as second place. Still, the Big Island has a respectable showing with 25 farmer's markets.
Lastly, we see that most of the farmer's markets are open only one day a week. Though some are open 7 days a week, none are open 6. The mean number of days open is about 1.7.
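Both of these last summaries can be checked straight from the cleaned frame; a quick sketch (days.open is my name, not the original's):
# Markets per island
sort(table(hi.farmers.markets$Island), decreasing = TRUE)
# Days open per market: count the non-NA day columns in each row
days.open <- rowSums(!is.na(hi.farmers.markets[, days]))
table(days.open)
mean(days.open)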
metro.url <- "https://raw.githubusercontent.com/EyeDen/data607/master/Project%202/Fare_Card_History_for_Metropolitan_Transportation_Authority__MTA___Beginning_2010.csv"
download.file(metro.url, "2010_mta.csv")
metrocard <- read.csv("2010_mta.csv")
There is a lot of information in this dataset, which records Metrocard usage per week from 2010 to 2018. For now, I am only interested in Astoria's stations: that is where I live, two stations there are currently shut down for platform work, and more will be shut down later in the year. So let's filter this big dataset down.
By peeking at the levels of the Station column, we see that almost all stations are identified by their cross streets. I am only interested in the Astoria stations on the N/W line. So, for example, if I wanted the information for 30 Av, I would need to look up 30TH AVENUE & 31ST STREET. But there is a quirk: the levels show two separate entries, "30TH AVENUE & 31ST STREET" and "30TH AVENUE & 31ST STREET " (note the trailing space), with a different number of records under each. For now, let's extract both and tidy them afterward.
astoria.stations <- c("30TH AVENUE & 31ST STREET", "30TH AVENUE & 31ST STREET ", "36TH AVENUE & 31ST STREET", "36TH AVENUE & 31ST STREET ", "39TH AVENUE & 31ST STREET", "39TH AVENUE & 31ST STREET ", "ASTORIA BLVD & 31ST STREET", "ASTORIA BLVD & 31ST STREET ", "BROADWAY & 31ST STREET", "BROADWAY & 31ST STREET ", "DITMARS BLVD & 31ST STREET ", "DITMARS BLVD & 31ST STREET", "QUEENSBORO PLAZA", "QUEENSBORO PLAZA ")
astoria <- metrocard[metrocard$Station %in% astoria.stations, ]
head(astoria, 10)
## From.Date To.Date Remote.Station.ID Station
## 79 2/13/2016 2/19/2016 R090 39TH AVENUE & 31ST STREET
## 80 2/13/2016 2/19/2016 R091 36TH AVENUE & 31ST STREET
## 81 2/13/2016 2/19/2016 R092 BROADWAY & 31ST STREET
## 82 2/13/2016 2/19/2016 R093 30TH AVENUE & 31ST STREET
## 83 2/13/2016 2/19/2016 R094 ASTORIA BLVD & 31ST STREET
## 84 2/13/2016 2/19/2016 R095 DITMARS BLVD & 31ST STREET
## 110 2/13/2016 2/19/2016 R121 QUEENSBORO PLAZA
## 545 1/30/2016 2/5/2016 R090 39TH AVENUE & 31ST STREET
## 546 1/30/2016 2/5/2016 R091 36TH AVENUE & 31ST STREET
## 547 1/30/2016 2/5/2016 R092 BROADWAY & 31ST STREET
## Full.Fare Senior.Citizen...Disabled
## 79 5498 310
## 80 12391 688
## 81 20706 1386
## 82 23772 1247
## 83 21134 894
## 84 29178 1510
## 110 19525 601
## 545 5863 340
## 546 14497 843
## 547 24436 1590
## X7.Day.ADA.Farecard.Access.System.Unlimited
## 79 58
## 80 60
## 81 134
## 82 154
## 83 115
## 84 151
## 110 122
## 545 57
## 546 67
## 547 152
## X30.Day.ADA.Farecard.Access.System.Unlimited Joint.Rail.Road.Ticket
## 79 104 2
## 80 155 12
## 81 325 8
## 82 337 13
## 83 264 12
## 84 308 13
## 110 314 83
## 545 134 6
## 546 223 11
## 547 366 15
## X7.Day.Unlimited X30.Day.Unlimited
## 79 4524 3834
## 80 7505 14468
## 81 10911 29449
## 82 13389 31086
## 83 11629 24597
## 84 13895 35371
## 110 14633 19267
## 545 4678 4781
## 546 8873 17065
## 547 12121 34025
## X14.Day.Reduced.Fare.Media.Unlimited X1.Day.Unlimited
## 79 0 0
## 80 0 0
## 81 0 0
## 82 0 0
## 83 0 0
## 84 0 0
## 110 0 0
## 545 0 0
## 546 0 0
## 547 0 0
## X14.Day.Unlimited X7.Day.Express.Bus.Pass Transit.Check.Metrocard
## 79 0 0 46
## 80 0 0 135
## 81 0 0 279
## 82 0 0 196
## 83 0 0 282
## 84 0 0 434
## 110 0 0 167
## 545 0 0 49
## 546 0 0 155
## 547 0 0 361
## LIB.Special.Senior Rail.Road.Unlimited.No.Trade
## 79 122 1
## 80 239 13
## 81 455 10
## 82 511 12
## 83 272 24
## 84 712 8
## 110 253 108
## 545 153 1
## 546 310 18
## 547 473 17
## Transit.Check.Metrocard.Annual.Metrocard Mail.and.Ride.EZPass.Express
## 79 317 56
## 80 1014 170
## 81 2548 406
## 82 2714 383
## 83 2025 351
## 84 2881 547
## 110 2143 392
## 545 363 81
## 546 1150 171
## 547 3049 486
## Mail.and.Ride.Unlimited Path.2.Trip Airtran.Full.Fare Airtran.30.Day
## 79 57 0 141 0
## 80 270 0 131 0
## 81 650 0 252 0
## 82 537 0 247 0
## 83 371 0 210 0
## 84 502 0 269 0
## 110 350 0 349 0
## 545 86 0 112 0
## 546 326 0 143 0
## 547 720 0 270 0
## Airtran.10.Trip Airtran.Monthly
## 79 0 0
## 80 0 0
## 81 0 0
## 82 0 0
## 83 0 0
## 84 0 0
## 110 0 0
## 545 0 0
## 546 0 0
## 547 0 0
Looking at the data again, it seems that every station has a corresponding Remote.Station.ID. Rather than fixing the station names themselves, we can replace each Remote.Station.ID with its corresponding station name, then remove the Station column completely, along with any other columns we won't be needing.
# str_replace_all accepts a named vector of pattern = replacement pairs
station.names <- c("R090" = "39th Ave", "R091" = "36th Ave", "R092" = "Broadway",
                   "R093" = "30th Ave", "R094" = "Astoria Blvd",
                   "R095" = "Ditmars Blvd", "R121" = "Queensboro Plaza")
astoria$Remote.Station.ID <- str_replace_all(astoria$Remote.Station.ID, station.names)
astoria$Remote.Station.ID <- factor(astoria$Remote.Station.ID)
# str_extract (not str_extract_all) returns a character vector rather than a list
astoria$From.Date <- str_extract(astoria$From.Date, "\\d{4}")
astoria <- astoria[, c(1, 3, 5, 6, 10, 11)]
colnames(astoria) <- c("Year", "Station", "Full.Fare", "SCD", "x7.Day.Un", "x30.Day.Un")
astoria <- astoria[order(as.numeric(astoria$Year)), ]
astoria$Year <- as.numeric(astoria$Year)
annual.astoria <- aggregate(. ~ Year + Station, data = astoria, sum)
head(annual.astoria, 10)
## Year Station Full.Fare SCD x7.Day.Un x30.Day.Un
## 1 2010 30th Ave 890466 37916 300552 1158843
## 2 2011 30th Ave 1505589 64101 532517 1767177
## 3 2012 30th Ave 1612274 71517 585371 1860489
## 4 2013 30th Ave 1585521 75747 670494 1825441
## 5 2014 30th Ave 1479235 75287 727827 1736341
## 6 2015 30th Ave 1449250 69118 486062 1501683
## 7 2016 30th Ave 1345898 78681 782095 1667235
## 8 2017 30th Ave 963811 55998 551091 1151983
## 9 2018 30th Ave 1 0 0 1
## 10 2010 36th Ave 482943 20246 175349 580075
Now that we've got our data, let's see the effect of the 30th Avenue and 36th Avenue closures that happened in late 2017.
As expected, there were some major declines in 2017 for both 30th Ave and 36th Ave. Did the other stations pick up the slack?
Ridership holds roughly steady in 2017 for each neighboring station, but without the marked increase we would expect; in most instances the numbers stay the same, or rise or fall slightly. Perhaps we are being too granular? What if we took the total ridership across Full.Fare, SCD, x7.Day.Un, and x30.Day.Un?
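That total isn't computed in the code shown above; here is a minimal sketch of the aggregation, with astoria.totals as my own (assumed) name:
astoria.totals <- annual.astoria
astoria.totals$Total <- with(astoria.totals, Full.Fare + SCD + x7.Day.Un + x30.Day.Un)
head(astoria.totals[, c("Year", "Station", "Total")])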
Indeed, ridership mostly held steady through late 2017 at the open stations. Admittedly, this would need to be revisited once 2018 is through, when we can see the full effect the (planned) seven-month shutdown had on Astoria. Still, I would have expected the riders lost at the closed stations to show up at the open ones! The drop at the closed stations is especially dramatic when one considers they were closed for only one full month in 2017.
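The ridership.diff table below compares 2016 and 2017 totals per station; a sketch of one way to build it, reusing the assumed astoria.totals frame from the previous sketch:
t16 <- subset(astoria.totals, Year == 2016)
t17 <- subset(astoria.totals, Year == 2017)
m <- merge(t16, t17, by = "Station", suffixes = c(".16", ".17"))
ridership.diff <- data.frame(Station = m$Station,
                             Percent.Change = (m$Total.17 - m$Total.16) / m$Total.16)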
ridership.diff
## Station Percent.Change
## 1 Astoria Blvd 0.01936947
## 2 30th Ave -0.29712262
## 3 Broadway -0.03966321
## 4 36th Ave -0.31777958
## 5 39th Ave -0.03849471
## 6 Queensboro Plaza 0.08487820
We can easily calculate the drop between years: for both 30th Ave and 36th Ave, ridership dropped by about 30%, yet we don't see a matching increase at the neighboring stations. As the graphs illustrate, most of those stations actually see minor decreases. Astoria Blvd sees a negligible two percent increase, and Queensboro Plaza, one of the main choke points into Manhattan, sees only an 8% increase.
CitiBike also came to Astoria around this time. Perhaps people affected by the subway closures coped by taking other modes of transit, such as CitiBike, the bus, or taxis and ride shares.
movie.url <- "https://raw.githubusercontent.com/EyeDen/data607/master/Project%202/movie_metadata.csv"
download.file(movie.url, "imdb_data.csv")
imdb <- read.csv("imdb_data.csv", na.strings = c("", " ", "NA"))
There are a lot of columns here. For now, let’s concern ourselves with the power of social media. Is there a positive correlation between Facebook likes for the movie/cast and IMDB score? Does this also translate to a better gross?
We’ll start by removing all the columns we don’t need. We will also rearrange the columns into something more readable. Lastly, we’ll only concern ourselves with films categorized as USA.
imdb <- imdb[, c(12, 21, 24, 2, 11, 7, 15, 9, 23, 13, 14, 28, 19, 26)]
imdb <- subset(imdb, country == "USA")
imdb <- imdb[, -2]
The movie_title column has a strange character at the end of every title, so let's tidy that up next. We will also drop any entries with NA values, along with any movies that have 0 Facebook likes.
imdb$movie_title <- str_replace_all(imdb$movie_title, "Â\\s*", "")
imdb <- imdb[complete.cases(imdb),]
imdb <- imdb[imdb$movie_facebook_likes != 0,]
imdb <- imdb[!duplicated(imdb$movie_title), ]
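The three top-10 tables that follow can be produced along these lines; in this sketch, top.gross, top.likes, and top.score are my assumed names rather than the original's:
top.gross <- head(imdb[order(imdb$gross, decreasing = TRUE), ], 10)
top.likes <- head(imdb[order(imdb$movie_facebook_likes, decreasing = TRUE), ], 10)
top.score <- head(imdb[order(imdb$imdb_score, decreasing = TRUE), ], 10)
top.gross[, c("movie_title", "title_year", "gross")]  # the first table, printed below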
## movie_title title_year gross
## 1 Avatar 2009 760505847
## 27 Titanic 1997 658672302
## 30 Jurassic World 2015 652177271
## 18 The Avengers 2012 623279547
## 67 The Dark Knight 2008 533316061
## 241 Star Wars: Episode I - The Phantom Menace 1999 474544677
## 3025 Star Wars: Episode IV - A New Hope 1977 460935665
## 9 Avengers: Age of Ultron 2015 458991599
## 4 The Dark Knight Rises 2012 448130642
## 3081 E.T. the Extra-Terrestrial 1982 434949459
Avatar is king of the gross according to IMDB, followed by Titanic, Jurassic World, The Avengers, and The Dark Knight. Only 4 of the top 10 movies come from this decade; 4 come from the previous millennium, and E.T. the Extra-Terrestrial is the oldest of them all.
## movie_title title_year movie_facebook_likes
## 97 Interstellar 2014 349000
## 297 Django Unchained 2012 199000
## 11 Batman v Superman: Dawn of Justice 2016 197000
## 180 The Revenant 2015 190000
## 98 Inception 2010 175000
## 100 The Hobbit: An Unexpected Journey 2012 166000
## 4 The Dark Knight Rises 2012 164000
## 279 The Martian 2015 153000
## 30 Jurassic World 2015 150000
## 1579 The Grand Budapest Hotel 2014 149000
In contrast, the movies with the most Facebook likes all come from this decade. Inception is the oldest of the 10, but falls in the middle of the group.
## movie_title title_year imdb_score
## 1938 The Shawshank Redemption 1994 9.3
## 3467 The Godfather 1972 9.2
## 67 The Dark Knight 2008 9.0
## 2838 The Godfather: Part II 1974 9.0
## 340 The Lord of the Rings: The Return of the King 2003 8.9
## 1875 Schindler's List 1993 8.9
## 3356 Pulp Fiction 1994 8.9
## 98 Inception 2010 8.8
## 684 Fight Club 1999 8.8
## 837 Forrest Gump 1994 8.8
Lastly, the highest-rated movies on IMDB are older and more varied. The 1970s make a much stronger showing here, with The Godfather and The Godfather: Part II both appearing. Half the movies in the top ten come from the 1990s, perhaps reflecting the age of IMDB's user base?
Let’s see if there are any overlaps.
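The common object printed below holds any titles shared by all three lists; a sketch of one way to construct it, reusing the assumed top.* frames from the earlier sketch:
common <- top.gross[top.gross$movie_title %in% top.likes$movie_title &
                    top.gross$movie_title %in% top.score$movie_title, ]
common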
common
## [1] movie_title title_year
## [3] director_name actor_1_name
## [5] actor_2_name actor_3_name
## [7] gross budget
## [9] num_voted_users cast_total_facebook_likes
## [11] movie_facebook_likes num_user_for_reviews
## [13] imdb_score
## <0 rows> (or 0-length row.names)
Nope. There are no movies common to the top 10 of each list. So let's see what other trends we can find.
There doesn't seem to be much of a trend when comparing a movie's likes with its cast's total likes; on average, the movie garners more likes than the cast combined.
On the other hand, there is a rough suggestion that a movie's Facebook likes track its gross.
Comparing gross with IMDB rating is just a mess; there seems to be no relationship between the two.
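To put rough numbers on these impressions, the pairwise correlations can be checked directly; a quick sketch on the cleaned imdb frame:
cor(imdb$movie_facebook_likes, imdb$cast_total_facebook_likes)
cor(imdb$movie_facebook_likes, imdb$gross)
cor(imdb$gross, imdb$imdb_score)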
Let's look at Facebook likes vs. gross one more time, limiting ourselves to 2012 - 2017. Facebook went public in 2012, so we will assume that is roughly when it hit peak popularity.
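A sketch of that restriction, with fb.era as my assumed name for the subset:
fb.era <- subset(imdb, title_year >= 2012 & title_year <= 2017)
cor(fb.era$movie_facebook_likes, fb.era$gross)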
Without running this through a machine learning algorithm it's hard to know for certain, but maybe. It makes a kind of sense that Facebook likes would mirror gross: both express how an audience reacts to a movie, and a like is a rawer signal than deliberately rating a film.