“Ni” or “Ecky-ecky-ecky-ecky-pikang-zoop-boing-goodem-zu-owly-zhiv”?

This dashboard scrapes scene scripts from Another Bleeding Monty Python Website.

The scraped scenes are:

  • Scene12: Knights With A Repetitive Tendency to Repetitively Say Ni Repetitively

  • Scene17: How to Find that Perfect Shrubbery

  • Scene18: Shrubbery or Herring? That is the Question

Danger-O-meter for frequency of the word Ni!

Description

The Danger-O-meter for the word Ni was built with tokenize_words() from the tokenizers package, because it keeps the word “ni”. Unfortunately, this function strips the dashes automatically, so it splits the “word” Ecky-ecky-ecky-ecky-pikang-zoop-boing-goodem-zu-owly-zhiv into bits and pieces. With this approach, the total number of words in the three scripts was 359.
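The trade-off described above can be illustrated with a small sketch. It is written in Python rather than R and does not call the tokenizers package: the function name split_words and the sample sentence are made up for illustration, and the regular expression only approximates what tokenize_words() does.

```python
import re

def split_words(text):
    """Hypothetical helper: lowercase the text and keep runs of letters,
    digits and apostrophes, so dashes act as word separators."""
    return re.findall(r"[a-z0-9']+", text.lower())

# Made-up sample sentence, not taken from the scraped scripts.
sample = ("We are the Knights Who Say Ni! "
          "Ecky-ecky-ecky-ecky-pikang-zoop-boing-goodem-zu-owly-zhiv")

tokens = split_words(sample)

# The short word "ni" survives as its own token ...
ni_count = tokens.count("ni")

# ... but the long dashed "word" is broken into its parts.
long_word = "ecky-ecky-ecky-ecky-pikang-zoop-boing-goodem-zu-owly-zhiv"
long_word_kept = long_word in tokens
ecky_count = tokens.count("ecky")
```

Under these assumptions, ni_count is 1, long_word_kept is False, and the pieces of the long word (four separate “ecky” tokens among them) each add to the total word count.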

Danger-O-meter for frequency of the word Ecky-ecky-ecky-ecky-pikang-zoop-boing-goodem-zu-owly-zhiv!

Description

The Danger-O-meter for the word Ecky-ecky-ecky-ecky-pikang-zoop-boing-goodem-zu-owly-zhiv was built by first removing punctuation while keeping the dashes, and then counting the number of terms with the tm package. This approach keeps the “word” Ecky-ecky-ecky-ecky-pikang-zoop-boing-goodem-zu-owly-zhiv intact. Unfortunately, it loses the word “ni”. With this approach, the total number of words in the three scripts was 315.
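A rough sketch of this second approach, again in Python rather than R and without calling the tm package. The function name count_terms and the sample sentence are made up for illustration. The minimum term length of three characters is an assumption about why “ni” disappears (tm's term counting ignores terms shorter than three characters by default); the dashboard itself does not state the cause.

```python
import re
from collections import Counter

def count_terms(text, min_length=3):
    """Hypothetical helper: drop punctuation but keep dashes inside words,
    then count only the terms with at least min_length characters."""
    # A term is a run of letters/digits, optionally joined by single dashes.
    terms = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())
    return Counter(t for t in terms if len(t) >= min_length)

# Made-up sample sentence, not taken from the scraped scripts.
sample = ("Ni! Ni! We want a shrubbery. "
          "Ecky-ecky-ecky-ecky-pikang-zoop-boing-goodem-zu-owly-zhiv!")

counts = count_terms(sample)

long_word = "ecky-ecky-ecky-ecky-pikang-zoop-boing-goodem-zu-owly-zhiv"
long_word_count = counts[long_word]  # the dashed "word" stays in one piece
ni_count = counts["ni"]              # "ni" is too short and is dropped
```

Under these assumptions, long_word_count is 1 and ni_count is 0. Keeping the long word as a single term while dropping short words would also be consistent with the lower total word count reported for this approach.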

Histogram of the tokenizers approach

Histogram of the tm approach

About the Dashboard

Created by: Anssi Tarkiainen

This report was generated on January 22, 2021.

Created for testing & learning purposes.