Exercises: Lists and data imports

t.test

Run the following t-test:

t.test(rnorm(100))

Extract the two endpoints of the confidence interval from the result and subtract them to compute the length of the interval.
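
One possible approach, given as a sketch rather than the only solution: the object returned by t.test is a list, so its components can be inspected with names and extracted with $. The call to set.seed is an addition made here only so that the random sample is reproducible.

```r
set.seed(1)                  # assumption: fixed seed, only for reproducibility
res <- t.test(rnorm(100))    # the result is a list of class "htest"
names(res)                   # inspect which components are available

ci <- res$conf.int           # numeric vector of length 2: lower and upper end
ci_length <- ci[2] - ci[1]   # diff(ci) gives the same number
ci_length
```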

lm

Fit the following linear model:

lm(hp~cyl+disp, data=mtcars)

Extract the names of the variables and their associated p-values into a named vector or a data frame.
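
A minimal sketch of one way to do this, assuming the p-values are taken from the coefficient table of summary (which also contains a row for the intercept). The object names fit, coefs, p_values and p_df are chosen here for illustration.

```r
fit <- lm(hp ~ cyl + disp, data = mtcars)

# summary() returns a list; its "coefficients" component is a matrix with
# one row per term and the columns Estimate, Std. Error, t value, Pr(>|t|)
coefs <- summary(fit)$coefficients

p_values <- coefs[, "Pr(>|t|)"]   # named numeric vector
p_values

# The same information as a data frame
p_df <- data.frame(variable = names(p_values), p_value = unname(p_values))
p_df
```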

Names of months in any language

The locale function from the readr package returns a list that includes, among other things, the names of the days and the months in a given language (for example, locale("de") gives the names in German).

Find the month names in that list and store them in a vector.
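
A possible way to explore the list, sketched under the assumption that readr is installed: str shows the top-level components, and the month names sit in the mon element of the date_names component. The name months_de is chosen here for illustration.

```r
library(readr)

loc <- locale("de")
str(loc, max.level = 1)          # look for the component that holds date names

months_de <- loc$date_names$mon  # full month names as a character vector
months_de
```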

Tallying words

We are going to count how many times each word is used in a text. You may use any text that interests you, or the one at https://www.un.org/en/about-us/universal-declaration-of-human-rights.

At each step, check what kind of data structure you are working with. At one step you will need to use lists.

Copy and paste the text into a .txt file using Windows Notepad or any other text editor.
Read the text using readLines.
Use strsplit to split the lines (or paragraphs) into words. The argument split="[.,; ]" may be useful.
Now we have a list. You may want to practise indexing it: what is the first word of the second line?
To transform the list into a vector, you may use unlist.
Tally the words and see which ones are used most often. Suggestions: table, sort (with decreasing=TRUE), tolower, head.
Count the occurrences of words again, this time restricting the count to words with more than three letters.
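
The steps above can be sketched as follows. This is one possible solution, not the only one. To keep it self-contained, a short made-up text is written to a temporary file; in the exercise, path would instead point to the file you saved.

```r
# Assumption: a short hypothetical text stands in for the file you saved
path <- tempfile(fileext = ".txt")
writeLines(c("The black cat sat on the mat.",
             "The dog, the black cat; and the bird.",
             "",
             "The end."), path)

lines <- readLines(path)                 # character vector, one element per line
words_list <- strsplit(lines, "[.,; ]")  # list, one character vector per line
class(words_list)
words_list[[2]][1]                       # first word of the second line

words <- tolower(unlist(words_list))     # flatten the list into one vector
words <- words[words != ""]              # drop empty strings left between adjacent separators

counts <- sort(table(words), decreasing = TRUE)
head(counts)

# Same tally, restricted to words with more than three letters
long_counts <- sort(table(words[nchar(words) > 3]), decreasing = TRUE)
long_counts
```

Note that splitting on "[.,; ]" leaves empty strings wherever two separators are adjacent (for example a comma followed by a space), which is why they are filtered out before tallying.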

Extra exercise:

Count how many words each line has. Suggestions: sapply (or lapply) and length.
How many lines do not contain any words?
Remove the lines without any words from the list.
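
A sketch of the extra exercise, again using a short made-up text in place of your own file. The names n_words and non_empty are chosen here for illustration.

```r
# Assumption: hypothetical lines standing in for the result of readLines
lines <- c("The black cat sat on the mat.",
           "The dog, the black cat; and the bird.",
           "",
           "The end.")

words_list <- strsplit(lines, "[.,; ]")
# Drop the empty strings produced by adjacent separators, line by line
words_list <- lapply(words_list, function(w) w[w != ""])

n_words <- sapply(words_list, length)  # lengths(words_list) is an equivalent shortcut
n_words

sum(n_words == 0)                      # number of lines without any words

non_empty <- words_list[n_words > 0]   # keep only the lines that contain words
length(non_empty)
```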

Mortality (csv file)

Download the mortality data for Catalonia from https://www.idescat.cat/indicadors/?id=aec&n=15270. Suggestion: select comma as the separator.

Import the dataset into R. Suggestion: use the “Import Dataset” button in RStudio (in the Environment tab) to open the “text (readr)” assistant.

Nomenclator (Excel file)

Download the list of place names in Catalonia in Excel format from https://www.icgc.cat/ca/Llibres-en-PDF/Toponimia/Nomenclator-oficial-de-toponimia-major-de-Catalunya-2009-2015 (see at the bottom of the page).

Import it into R (the “Import Dataset” button is useful again).

Count how many municipality names in Catalonia start with “Sant”. Suggestions: unique, grepl("^Sant", x).
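
The counting step could look like the sketch below. The vector municipalities is a small hypothetical stand-in for the municipality column of the imported data; the actual column name depends on the Excel file.

```r
# Assumption: a few municipality names standing in for the imported column
municipalities <- c("Sant Celoni", "Santa Pau", "Girona",
                    "Sant Celoni", "Santpedor", "Sant Pol de Mar")

unique_names <- unique(municipalities)       # each municipality only once
starts_sant <- grepl("^Sant", unique_names)  # TRUE where the name begins with "Sant"
sum(starts_sant)                             # TRUE counts as 1

# "^Sant" also matches names such as "Santa Pau" or "Santpedor";
# a stricter pattern with a trailing space keeps only the word "Sant"
sum(grepl("^Sant ", unique_names))
```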