In our analysis of the Game of Thrones dataset, we compare screentime and the number of episodes each character appears in, for all characters. We then label the ten characters with the highest screentime.
Type quarto render nom.qmd in the terminal -> equivalent to pressing Render.
Type quarto preview nom.qmd in the terminal -> starts a web server -> doesn't work!!
Ctrl+Alt+I -> inserts a code chunk.
# loading the libraries
library("readr")
library("ggplot2")
library("dplyr")
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
library("ggrepel")
# importing the data
screentimes <- read_csv("GOT_screentimes_1.csv")
Rows: 191 Columns: 6
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (4): name, imdb_url, portrayed_by_name, portrayed_by_imdb_url
dbl (2): screentime, episodes
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
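The message above suggests declaring the column types explicitly. A minimal sketch of what that could look like, assuming the six columns reported in the column specification; the inline two-row CSV and the names `toy_csv`, `got_types` and `toy` are hypothetical stand-ins so the example runs without the real file:

```r
library(readr)

# Hypothetical stand-in for GOT_screentimes_1.csv (invented values, same six columns).
toy_csv <- I(paste(
  "name,imdb_url,portrayed_by_name,portrayed_by_imdb_url,screentime,episodes",
  "Character A,url_a,Actor A,url_pa,12.5,10",
  "Character B,url_b,Actor B,url_pb,,3",
  sep = "\n"
))

# Column types matching the reported specification: four chr, two dbl.
got_types <- cols(
  name = col_character(),
  imdb_url = col_character(),
  portrayed_by_name = col_character(),
  portrayed_by_imdb_url = col_character(),
  screentime = col_double(),
  episodes = col_double()
)

# With col_types supplied, read_csv() does not print the column specification message.
toy <- read_csv(toy_csv, col_types = got_types)
print(toy)
```

For the real data, the same `col_types` argument could be passed to `read_csv("GOT_screentimes_1.csv", ...)`. If only the message is unwanted, `show_col_types = FALSE` is the shorter option.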
# creating a new dataset with only the ten highest screentimes
screentimes_high <- top_n(screentimes, 10, screentime)
# creating a scatterplot with both datasets (all characters as points, with name labels for the ten highest)
# screentime vs episodes
ggplot(screentimes, aes(screentime, episodes)) +
  geom_point() +
  geom_text_repel(data = screentimes_high, aes(label = name), min.segment.length = 0)
Warning: Removed 15 rows containing missing values or values outside the scale range
(`geom_point()`).
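Since the plot sets no axis limits, the warning most likely means that 15 characters have a missing value in screentime or episodes. A minimal base R sketch of how one might check this, using an invented data frame (`toy`) in place of the real screentimes data:

```r
# Hypothetical stand-in for the screentimes data (values invented for illustration).
toy <- data.frame(
  name = c("A", "B", "C", "D"),
  screentime = c(30.5, NA, 12.0, 8.25),
  episodes = c(10, 4, NA, 2),
  stringsAsFactors = FALSE
)

plot_cols <- c("screentime", "episodes")

# Count missing values in each plotted column.
na_counts <- colSums(is.na(toy[, plot_cols]))
print(na_counts)

# Rows with at least one missing coordinate; geom_point() cannot draw these.
dropped <- toy[!complete.cases(toy[, plot_cols]), ]
print(dropped$name)
```

Running the same two checks on screentimes would show whether the number of incomplete rows matches the 15 reported in the warning.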