NewQuartoLayout

Loading Packages

library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.2     ✔ tibble    3.3.0
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.1.0     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(rvest)


Attaching package: 'rvest'

The following object is masked from 'package:readr':

    guess_encoding

library(xml2)     
library(httr)     
library(magrittr)


Attaching package: 'magrittr'

The following object is masked from 'package:purrr':

    set_names

The following object is masked from 'package:tidyr':

    extract

Loading the Data

This dataset contains the titles of 10 Wikipedia articles, which we will then analyze further.

wiki_pages <- 
  read_csv("https://myxavier-my.sharepoint.com/:x:/g/personal/olsonp2_xavier_edu/ESQO5P58qiVOrvm4m08kpyUB5TMhiLF3zjuLiHG7y6jIiQ?download=1")

Rows: 10 Columns: 1
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): wiki_pages

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
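
The message above suggests declaring the column types up front. A minimal sketch of that, assuming the file has a single character column named `wiki_pages` as reported above; the inline CSV text and the name `pages_demo` are hypothetical stand-ins so the example runs without the SharePoint link:

```r
library(readr)

# Hypothetical inline CSV standing in for the downloaded file;
# I() tells read_csv() to treat the string as literal data, not a path
csv_text <- I("wiki_pages\nData science\nWeb scraping\n")

# Declaring col_types avoids the column specification message
pages_demo <- read_csv(
  csv_text,
  col_types = cols(wiki_pages = col_character())
)
```

Passing `show_col_types = FALSE` instead would also quiet the message, but it leaves the types to be guessed.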

Viewing the Data

view(wiki_pages)

Visualization

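# Create the column first so the loop does not assign into a missing column
wiki_pages$word_count <- NA_integer_

# seq_len(nrow()) walks the rows; seq_along() on a tibble walks its columns
for (i in seq_len(nrow(wiki_pages))) {
  # Note: nchar() counts the characters in each title, not words
  wiki_pages$word_count[i] <- nchar(wiki_pages$wiki_pages[i])
}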
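
The per-title count can also be computed without an explicit loop, since `nchar()` is vectorised. A minimal sketch, assuming a tibble with a `wiki_pages` column of titles; the three titles and the names `pages_demo`, `char_count`, and `title_words` are hypothetical and not part of the original data:

```r
library(dplyr)
library(stringr)
library(tibble)

# Hypothetical stand-in for the CSV: one column of article titles
pages_demo <- tibble(
  wiki_pages = c("Data science", "R (programming language)", "Web scraping")
)

pages_demo <- pages_demo %>%
  mutate(
    char_count  = nchar(wiki_pages),             # characters in each title
    title_words = str_count(wiki_pages, "\\S+")  # whitespace-separated words in each title
  )
```

Keeping the two counts in separately named columns makes it clear that `nchar()` measures characters, while `str_count()` with the `"\\S+"` pattern is one way to count words.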

wiki_pages %>%
  ggplot(aes( x = wiki_pages, y = word_count)) +
  geom_bar(stat = "summary")

No summary function supplied, defaulting to `mean_se()`
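
When each title occupies exactly one row, `geom_col()` plots the stored value directly and no summary function is involved. A minimal sketch under that assumption; the data, the name `pages_demo`, and the axis labels are hypothetical:

```r
library(ggplot2)
library(tibble)

# Hypothetical stand-in data: three made-up titles with their character counts
pages_demo <- tibble(
  wiki_pages = c("Data science", "R (programming language)", "Web scraping"),
  char_count = nchar(wiki_pages)
)

# geom_col() draws one bar per row at the height given by y
p <- ggplot(pages_demo, aes(x = wiki_pages, y = char_count)) +
  geom_col() +
  coord_flip() +  # horizontal bars keep long titles readable
  labs(x = "Article title", y = "Characters in title")

p
```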