Task 1. Four different scales

There are four different types of measurement scales: nominal, ordinal, interval, and ratio. For this assignment, please do the following:

  1. In your own words, explain the main characteristics of each scale.
  2. Provide one original example for each scale that is different from the examples shown in the slides.

Please address all four scales below:

Task 2. Describe your data

Load packages

library(tidyverse) ## Wrangling data
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.6
## ✔ forcats   1.0.1     ✔ stringr   1.6.0
## ✔ ggplot2   4.0.1     ✔ tibble    3.3.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.2
## ✔ purrr     1.2.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(psych) ## basic statistics
## 
## Attaching package: 'psych'
## 
## The following objects are masked from 'package:ggplot2':
## 
##     %+%, alpha
library(rio) ## import and export datasets
library(DescTools) ## calculate Mode
## 
## Attaching package: 'DescTools'
## 
## The following objects are masked from 'package:psych':
## 
##     AUC, ICC, SD

Load dataset

This dataset is a subset of data published by the OkCupid website and includes the ages of 350 users.

ok_cupid_data <- c(22,32,39,39,49,28,23,
36,38,29,29,44,29,40,
37,30,41,41,47,28,33,
22,35,40,40,24,31,27,
30,34,30,30,26,28,35,
28,34,41,41,29,43,47,
32,25,46,46,40,44,21,
30,20,20,20,28,23,62,
25,22,27,27,55,35,31,
37,31,28,28,19,30,50,
34,31,36,36,52,58,49,
27,36,28,28,32,32,33,
23,30,23,23,29,40,41,
29,28,22,22,36,59,46,
29,29,36,36,21,23,40,
40,28,36,36,29,38,29,
33,22,17,17,35,32,52,
26,29,55,55,25,42,39,
31,30,20,20,26,30,39,
34,28,28,28,34,28,24,
26,26,24,24,20,22,25,
23,33,49,49,32,35,24,
31,34,24,24,62,32,58,
30,30,22,22,32,41,27,
33,27,28,28,23,23,28,
27,29,28,28,47,29,35,
23,25,42,42,27,28,25,
22,24,26,26,28,41,35,
30,35,26,26,35,34,43,
31,20,27,27,24,58,26,
28,27,29,29,32,26,26,
28,31,32,32,20,51,45,
37,34,42,42,28,46,36,
19,33,40,40,38,66,47,
27,30,21,21,28,33,25,
27,28,33,33,44,33,52,
33,41,23,23,52,53,27,
26,28,49,49,34,27,48,
28,44,32,32,27,21,23,
36,37,23,23,36,19,18,
29,24,32,32,29,36,27,
35,29,60,60,31,51,30,
31,30,37,37,28,29,25,
39,28,30,30,31,29,35,
28,30,39,39,42,33,25,
26,25,24,24,33,46,47,
27,34,28,28,28,27,52,
29,39,49,49,38,37,29,
31,32,32,32,27,39,47,
25,47,42,42,20,34,36
)

Check data

Direction: Use the head() function to examine the first six values of the dataset (head() returns six values by default).

head(ok_cupid_data)
## [1] 22 32 39 39 49 28
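head() shows six elements unless told otherwise; its n argument changes how many are returned. The sketch below uses a short stand-in vector so that it runs on its own. The name ages_demo is hypothetical, and its values are the first eight ages of the dataset above.

```r
# Hypothetical stand-in for ok_cupid_data: the first eight ages in the dataset
ages_demo <- c(22, 32, 39, 39, 49, 28, 23, 36)

head(ages_demo)          # default: first 6 values
head(ages_demo, n = 5)   # first 5 values only
tail(ages_demo, n = 3)   # last 3 values, handy for checking the end of the vector
length(ages_demo)        # number of observations
```

Calling length() alongside head() is a cheap way to confirm that the number of observations is what you expect.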

Easy way of describing data

Direction: Use simple descriptive statistics to summarize the data (e.g., mean, median, and standard deviation) using the describe() function.

psych::describe(ok_cupid_data)
##    vars   n  mean   sd median trimmed  mad min max range skew kurtosis   se
## X1    1 350 32.83 9.24     30   31.83 7.41  17  66    49 1.01     0.79 0.49
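Two columns of this table can be checked by hand, which helps when reading describe() output: range is the maximum minus the minimum, and se is the standard error of the mean, sd / sqrt(n). A minimal sketch using the rounded values printed in the row above (the variable names are only illustrative):

```r
# Values copied from the describe() row above (rounded as printed)
n      <- 350
sd_val <- 9.24
min_v  <- 17
max_v  <- 66

range_val <- max_v - min_v      # 49, matches the range column
se_val    <- sd_val / sqrt(n)   # standard error of the mean

range_val
round(se_val, 2)                # 0.49, matches the se column
```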

Central tendency

mean

Direction: Use the mean() function to calculate the mean age of the 350 individuals.

mean(ok_cupid_data)
## [1] 32.82571
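The mean is the sum of the values divided by how many there are. As a small sketch of what mean() does, the snippet below reproduces it by hand on the six ages printed by head() earlier (ages_demo is a hypothetical name used only for this illustration):

```r
# Hypothetical demo vector: the six ages shown by head()
ages_demo <- c(22, 32, 39, 39, 49, 28)

manual_mean <- sum(ages_demo) / length(ages_demo)  # 209 / 6
manual_mean
all.equal(manual_mean, mean(ages_demo))            # TRUE: same result as mean()
```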

median

Direction: Use the median() function to calculate the median age of the 350 individuals.

median(ok_cupid_data)
## [1] 30
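The median is the middle value of the sorted data; when the number of observations is even, it is the average of the two middle values. A minimal sketch of that rule on a six-value demo vector (ages_demo is a hypothetical name; the values are the six ages printed by head()):

```r
# Hypothetical demo vector: the six ages shown by head()
ages_demo <- c(22, 32, 39, 39, 49, 28)

sorted <- sort(ages_demo)   # 22 28 32 39 39 49
n <- length(sorted)

if (n %% 2 == 0) {
  # even n: average the two middle values
  manual_median <- (sorted[n / 2] + sorted[n / 2 + 1]) / 2
} else {
  # odd n: take the single middle value
  manual_median <- sorted[(n + 1) / 2]
}

manual_median               # 35.5, the average of 32 and 39
median(ages_demo)           # same value
```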

mode

Direction: Use the DescTools::Mode() function to calculate the mode (i.e., the most frequent value) of the dataset.

DescTools::Mode(ok_cupid_data)
## [1] 28
## attr(,"freq")
## [1] 35
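The freq attribute in this output is the number of times the mode occurs. If DescTools is not available, the same information can be obtained with base R's table(). This is a sketch of one possible approach, not how DescTools::Mode() is implemented; ages_demo is a hypothetical demo vector.

```r
# Hypothetical demo vector: the six ages shown by head()
ages_demo <- c(22, 32, 39, 39, 49, 28)

freq <- table(ages_demo)           # count of each distinct age
top  <- freq[freq == max(freq)]    # keep every value tied for the highest count

mode_values <- as.numeric(names(top))
mode_values                        # 39
max(freq)                          # 2: the mode occurs twice
```

Keeping all tied values matters because a dataset can have more than one mode.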

Dispersion

range

Direction: Use the range() function to find the range (minimum and maximum) of the 350 individuals’ ages.

range(ok_cupid_data)
## [1] 17 66
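Note that range() returns the minimum and maximum rather than a single number. The width of the range is their difference, which for the output above is 66 - 17 = 49, the value in the range column of describe(). A short sketch on a hypothetical demo vector:

```r
# Hypothetical demo vector: the six ages shown by head()
ages_demo <- c(22, 32, 39, 39, 49, 28)

r <- range(ages_demo)             # c(min, max)
r                                 # 22 49
diff(r)                           # 27: width of the range
max(ages_demo) - min(ages_demo)   # same value, computed directly
```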

variance

Direction: Use the var() function to calculate the variance of the 350 individuals’ ages.

var(ok_cupid_data)
## [1] 85.42226
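var() computes the sample variance: the sum of squared deviations from the mean, divided by n - 1 rather than n. A minimal sketch reproducing it by hand on a hypothetical demo vector:

```r
# Hypothetical demo vector: the six ages shown by head()
ages_demo <- c(22, 32, 39, 39, 49, 28)

n          <- length(ages_demo)
deviations <- ages_demo - mean(ages_demo)    # distance of each age from the mean
manual_var <- sum(deviations^2) / (n - 1)    # sample variance (n - 1 denominator)

manual_var
all.equal(manual_var, var(ages_demo))        # TRUE: same result as var()
```

Because the deviations are squared, the variance is expressed in squared years, which is why the standard deviation is usually easier to interpret.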

SD

Direction: Use the sd() function to calculate the standard deviation (SD) of the 350 individuals’ ages.

sd(ok_cupid_data)
## [1] 9.242416
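The standard deviation is the square root of the variance, so the two results reported in this document can be cross-checked against each other. A small sketch using the printed values, plus the same check on a hypothetical demo vector:

```r
# Values copied from the var() and sd() output in this document
reported_var <- 85.42226
reported_sd  <- 9.242416

sqrt(reported_var)   # about 9.2424, agrees with the sd() output

# Same relationship on a hypothetical demo vector
ages_demo <- c(22, 32, 39, 39, 49, 28)
all.equal(sd(ages_demo), sqrt(var(ages_demo)))   # TRUE
```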

Visualization of the data

histogram

Direction: Use the hist() function to visualize the 350 individuals’ ages, and evaluate the distribution of the dataset.

hist(ok_cupid_data)
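The default plot can be made easier to read by setting the bin edges and axis labels explicitly. The following is a sketch on a small demo subset (the first 14 ages of the dataset); the 5-year bin width, the title, and the name ages_demo are illustrative choices, not part of the assignment.

```r
# Hypothetical demo subset: the first 14 ages in the dataset
ages_demo <- c(22, 32, 39, 39, 49, 28, 23, 36, 38, 29, 29, 44, 29, 40)

h <- hist(ages_demo,
          breaks = seq(20, 50, by = 5),   # 5-year bins; must cover min and max
          main   = "Age of users (demo subset)",
          xlab   = "Age (years)")

h$breaks   # bin edges
h$counts   # number of users in each bin
```

hist() returns its bin edges and counts invisibly, so storing the result lets you read the exact numbers behind each bar instead of estimating them from the plot.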

Interpretation

Most of the OkCupid users are between 20 and 40 years old, with the largest concentration from the mid-20s to the mid-30s (mode = 28, median = 30). The distribution has a single peak but is not a symmetric bell curve: it is right-skewed (skew = 1.01), with a long tail toward older ages and a few users in their 60s.

box plot

Direction: Use the boxplot() function to visualize the 350 individuals’ ages, and interpret the boxplot.

boxplot(ok_cupid_data)
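To read a boxplot accurately, it helps to see the numbers it is drawn from. Base R's boxplot.stats() returns the five values that boxplot() draws (lower whisker, lower hinge, median, upper hinge, upper whisker) and any points flagged as outliers. A sketch on a small demo subset (ages_demo is a hypothetical name; the values are the first 14 ages of the dataset):

```r
# Hypothetical demo subset: the first 14 ages in the dataset
ages_demo <- c(22, 32, 39, 39, 49, 28, 23, 36, 38, 29, 29, 44, 29, 40)

bs <- boxplot.stats(ages_demo)

# Lower whisker, lower hinge (about Q1), median, upper hinge (about Q3), upper whisker
bs$stats   # 22 29 34 39 49

# Points more than 1.5 box lengths beyond the hinges are drawn individually
bs$out     # empty here: this small subset has no outliers
```

About 25% of the observations lie below the lower hinge, 50% lie inside the box, and 25% lie above the upper hinge.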

Boxplot Interpretation

The line inside the box marks the median (30), so about half of the users are 30 or younger and about half are 30 or older. The box itself spans the middle 50% of ages, from the first to the third quartile. The oldest users (the maximum is 66) appear as individual outlier points above the upper whisker, outside the box.

Please knit your file to HTML and upload it in HTML format to Canvas.