As a hockey player, it is obvious to me that the NHL is just better than the WHA (insert personal bias). From there, I decided to look at another element to make a personal decision about which team to cheer for. The factors that go into that decision depend on your view. Some suggestions are the highest win-to-loss ratio, the oldest team, or even the highest number of Stanley Cups.
To choose two teams to compare, I decided to focus on who has the most Stanley Cup wins: the Montreal Canadiens (Bell Centre) vs. the Toronto Maple Leafs (Scotiabank Arena). Let's do a sentiment analysis of these arenas, because that's where the fans gather and it is a chance to view the team spirit. For this, I use the reviews posted on Tripadvisor.
Packages loaded: tidyverse 2.0.0, tidytext, ggwordcloud, textdata, rvest, httr, and magrittr.
User Agent:
The reviews are spread across multiple pages, so the code has to request each page in turn. Below is an example of how to do this with one of the URLs. Tripadvisor does this so that its data is not easily scraped: you have to loop over every page and pull out each element, such as the reviewer, the date, and any other elements you would like.
Example web scrape:
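# Now make a loop to grab data from all URLs
# for (i in seq_along(setreviews)) {
#   scot_urls[i] <- gen_url(scot_base, setreviews[i], scot_ending)
# }
# print(scot_urls)
# toronto_page <- function(toronto_urls) {
#   tor <- tibble()  # start empty so bind_rows() has something to append to
#   for (i in seq_along(toronto_urls)) {
#     print(paste("collecting page", i, "of", length(toronto_urls), ".", sep = " "))
#     Sys.sleep(runif(1, 7, 15))  # random pause between requests
#     # scrape_page() is a hypothetical single-page scraper; the original
#     # called toronto_page() here, which would recurse without end
#     tor <-
#       scrape_page(toronto_urls[i]) %>%
#       bind_rows(tor)
#   }
#   return(tor)
# }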
# Identify the name of the Maple Leafs page
h1_textA <- maple_html %>%
  html_elements("h1") %>%
  html_text2()
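The paging idea described above can be sketched in base R as follows. This is only an illustration under stated assumptions: `gen_url()` is a guess at what the helper of that name does, the URL pieces are placeholders rather than real Tripadvisor addresses, the offsets assume ten reviews per page, and `collect_pages()` and its `scrape_fun` argument are hypothetical names.

```r
# Hypothetical helper: glue a base URL, a review offset, and an ending together.
gen_url <- function(base, offset, ending) {
  paste0(base, offset, ending)
}

# Placeholder URL pieces (assumptions, not real Tripadvisor addresses).
scot_base   <- "https://example.com/ShowUserReviews-or"
scot_ending <- "-Scotiabank_Arena.html"

# Assumed offsets: ten reviews per page, first five pages.
setreviews <- seq(0, 40, by = 10)

# Build one URL per offset without growing a vector inside a loop.
scot_urls <- vapply(
  setreviews,
  function(offset) gen_url(scot_base, offset, scot_ending),
  character(1)
)

# Visit each URL with a random pause; scrape_fun is supplied by the caller
# and must return a data frame for a single page.
collect_pages <- function(urls, scrape_fun, min_wait = 7, max_wait = 15) {
  pages <- vector("list", length(urls))
  for (i in seq_along(urls)) {
    message("collecting page ", i, " of ", length(urls))
    Sys.sleep(runif(1, min_wait, max_wait))
    pages[[i]] <- scrape_fun(urls[i])
  }
  do.call(rbind, pages)
}
```

Collecting pages in a pre-sized list and binding them once at the end avoids needing an initial value for the running result.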
Table of words used
Here are tables showing the most common words used in the first ten reviews of each arena.
Scotiabank Arena (Toronto Maple Leafs):
# A tibble: 171 × 2
word n
<chr> <int>
1 arena 8
2 game 8
3 hockey 4
4 concerts 3
5 food 3
6 raptors 3
7 seats 3
8 subway 3
9 watch 3
10 2 2
# ℹ 161 more rows
Bell Centre (Montreal Canadiens):
# A tibble: 245 × 2
word n
<chr> <int>
1 centre 8
2 bell 7
3 hockey 7
4 game 6
5 arena 5
6 building 5
7 canadiens 5
8 expensive 5
9 ice 5
10 seats 5
# ℹ 235 more rows
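Word tables like the ones above can be produced roughly as follows. This is a minimal sketch, not the original code: the `reviews` tibble and its `arena` and `text` columns are assumptions, and the two review texts are invented for illustration.

```r
library(dplyr)
library(tibble)
library(tidytext)

# Toy reviews standing in for the scraped text (invented for illustration).
reviews <- tibble(
  arena = c("Scotiabank Arena", "Bell Centre"),
  text  = c(
    "The arena is close to the subway and the arena has food",
    "The hockey game was loud and the seats were expensive"
  )
)

word_counts <- reviews %>%
  unnest_tokens(word, text) %>%            # one lowercase word per row
  anti_join(stop_words, by = "word") %>%   # drop common words such as "the"
  count(arena, word, sort = TRUE)          # most frequent words first

word_counts
```

Passing `by = "word"` explicitly keeps the join from printing a message about which column it matched on.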
Question 1:
Are Montreal fans more satisfied with their experience than Maple Leafs fans?
The color-coded analysis indicates that the Bell Centre tends to receive more positive reviews compared to Scotiabank Arena. Specifically, the Bell Centre exhibits higher positive scores, especially in categories such as joy, trust, and overall positivity. Despite a slightly elevated score in sadness, the significant positive differences in various aspects suggest that the overall experience at the Bell Centre, as reflected in the reviews, is more favorable than that at Scotiabank Arena.
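The categories named above (joy, trust, sadness, positive) resemble those of the NRC emotion lexicon, although the text does not say which lexicon was used. The sketch below shows how such per-arena emotion counts could be computed. Everything in it is an assumption for illustration: the `reviews` tibble, its column names, and especially `toy_lexicon`, which is a tiny hand-made stand-in and not the real NRC data.

```r
library(dplyr)
library(tibble)
library(tidytext)

# Toy reviews (invented for illustration).
reviews <- tibble(
  arena = c("Bell Centre", "Bell Centre", "Scotiabank Arena"),
  text  = c(
    "Wonderful atmosphere, friendly staff",
    "Sad to leave, wonderful night",
    "Friendly ushers"
  )
)

# Tiny stand-in lexicon: one word can carry several labels, as in
# emotion lexicons. These labels are made up, not taken from NRC.
toy_lexicon <- tibble(
  word      = c("wonderful", "wonderful", "friendly", "friendly", "sad"),
  sentiment = c("joy", "positive", "trust", "positive", "sadness")
)

emotion_counts <- reviews %>%
  unnest_tokens(word, text) %>%
  # a word may appear many times and map to many labels
  inner_join(toy_lexicon, by = "word", relationship = "many-to-many") %>%
  count(arena, sentiment, name = "n")

emotion_counts
```

When the two arenas have different numbers of reviews, comparing each emotion's share of an arena's matched words may be fairer than comparing raw counts.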
Question 2:
Are the reviews for the Leafs or for Montreal more positive?
Sentiment analysis classifies the words in a text as positive or negative in order to gauge its overall tone. As we are interested in the fan base, we are trying to learn what type of environment encourages a more positive outlook.
Positive words are commonly used to express praise, approval, or admiration. They contribute to a constructive and uplifting communication style. Negative words are employed to convey criticism, dissatisfaction, or concerns. They play a role in addressing issues or expressing discontent.
From this analysis, we can see side by side which words appear for each arena and whether each one is positive or negative. From this table we can conclude that the Bell Centre has more positive words than negative ones, consistent with the graph above.
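A positive versus negative comparison of this kind can be sketched with the Bing lexicon that ships with tidytext. This is an assumption about the approach, not the original code: the `reviews` tibble, its columns, and the review texts are invented, and `net` is a hypothetical summary column.

```r
library(dplyr)
library(tibble)
library(tidyr)
library(tidytext)

# Toy reviews (invented for illustration).
reviews <- tibble(
  arena = c("Bell Centre", "Bell Centre", "Scotiabank Arena"),
  text  = c(
    "Amazing night and a great game",
    "Terrible lines",
    "Good seats but bad food"
  )
)

polarity <- reviews %>%
  unnest_tokens(word, text) %>%
  inner_join(get_sentiments("bing"), by = "word") %>%  # keep only scored words
  count(arena, sentiment) %>%
  pivot_wider(names_from = sentiment, values_from = n, values_fill = 0) %>%
  mutate(net = positive - negative)  # positive minus negative word count

polarity
```

A positive `net` means an arena's reviews contain more positive than negative words.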
Question 3:
How does time affect reviews?
This graph takes time into account as a factor in the reviews. It shows that there were more reviews made in 2023. This does not factor in whether they were good or bad reviews, but it does show that more people had things to say about their experience.
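Counting reviews per year can be sketched as below. The `reviews` tibble, its `arena` and `date` columns, and the dates themselves are assumptions made up for illustration. The sketch also swaps in a bar chart in place of a smoothed trend line, since a loess smoother can be unstable when the data cover only a few distinct years.

```r
library(dplyr)
library(tibble)
library(lubridate)
library(ggplot2)

# Toy review dates (invented for illustration).
reviews <- tibble(
  arena = c("Bell Centre", "Bell Centre",
            "Scotiabank Arena", "Scotiabank Arena", "Scotiabank Arena"),
  date  = ymd(c("2022-11-05", "2023-01-14",
                "2022-03-02", "2023-02-20", "2023-10-09"))
)

# Number of reviews per arena and calendar year.
reviews_per_year <- reviews %>%
  mutate(year = year(date)) %>%
  count(arena, year, name = "n_reviews")

# Side-by-side bars, one group per year.
p <- ggplot(reviews_per_year,
            aes(x = factor(year), y = n_reviews, fill = arena)) +
  geom_col(position = "dodge") +
  labs(x = "Year", y = "Number of reviews", fill = "Arena")

p
```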
Conclusion:
When comparing two distinct hockey towns, we gain insight into the nuances that differentiate them. This exploration allows us to delve into the unique cultures surrounding each area and comprehend the reasons why people passionately support specific teams. Upon reviewing the data set, it becomes clear that the differences in these hockey towns contribute significantly to the varied fan experiences and loyalties.
The Bell Centre and the Scotiabank Arena are the home arenas of the Montreal Canadiens and the Toronto Maple Leafs, respectively, in the National Hockey League (NHL). The Canadiens are based in Montreal, Quebec, Canada, while the Maple Leafs are based in Toronto, Ontario, Canada. Even though these are two Canadian cities in close proximity, this analysis shows that the Bell Centre holds a higher favorability than the Scotiabank Arena.