The data are last year's NOCTI results from my school (Franklin County CTC).
#Import Untidy Dataset
library(readxl)
Untidy_dataset <- read_excel("~/Box Sync/PracticeAssignment_4/Untidy_dataset.xlsx")
Untidy_dataset
# A tibble: 6 x 4
  `Sending District` Advanced Competent Basic
  <chr>                 <dbl>     <dbl> <dbl>
1 Chambersburg             31        12     5
2 Fannett Metal             6         1     0
3 Greencastle               6         4    11
4 Shippensburg             21        17     1
5 Tuscarora                12         3     4
6 Waynesoro                18         1     4
#Import Tidy Dataset
Tidy_dataset <- read_excel("~/Box Sync/PracticeAssignment_4/Tidy_dataset.xlsx",
                           col_types = c("text", "text", "text"))
Tidy_dataset
# A tibble: 157 x 3
   Student_ID Sending_District NOCTI_results
   <chr>      <chr>            <chr>
 1 1          Chambersburg     Advanced
 2 2          Chambersburg     Advanced
 3 3          Chambersburg     Advanced
 4 4          Chambersburg     Advanced
 5 5          Chambersburg     Advanced
 6 6          Chambersburg     Advanced
 7 7          Chambersburg     Advanced
 8 8          Chambersburg     Advanced
 9 9          Chambersburg     Advanced
10 10         Chambersburg     Advanced
# ... with 147 more rows
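The two tables hold the same information: the counts in the untidy table add up to the 157 rows of the tidy one. Below is a minimal sketch of how the wide table could be reshaped into the long one. It assumes the tidyr package (not loaded in the original) and retypes the counts by hand instead of reading the Excel file; the object names `untidy`, `long`, and `tidy` are illustrative only.

```r
library(tidyr)

# Counts retyped from the untidy table above
# (district names spelled exactly as in the source file).
untidy <- data.frame(
  Sending_District = c("Chambersburg", "Fannett Metal", "Greencastle",
                       "Shippensburg", "Tuscarora", "Waynesoro"),
  Advanced  = c(31, 6, 6, 21, 12, 18),
  Competent = c(12, 1, 4, 17, 3, 1),
  Basic     = c(5, 0, 11, 1, 4, 4)
)

# Step 1: one row per district and result level, with the count in column n.
long <- pivot_longer(untidy,
                     cols = c(Advanced, Competent, Basic),
                     names_to = "NOCTI_results",
                     values_to = "n")

# Step 2: repeat each row n times so that every student gets a row.
# uncount() removes the count column and drops rows where n is 0.
tidy <- uncount(long, n)

# Step 3: add an ID column stored as text, as in the tidy table above.
tidy$Student_ID <- as.character(seq_len(nrow(tidy)))
tidy <- tidy[, c("Student_ID", "Sending_District", "NOCTI_results")]

nrow(tidy)  # one row per student
```

The row order this produces (district by district, then result level) is an assumption; only the first ten rows of the actual tidy file are shown above.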
This dataset comes from an online class and shows the students' pretest scores. The first variable (Gender) is a nominal measurement: its values are used only to classify or name the observations and carry no order or magnitude.
The second variable (Pretest_Score) is a ratio measurement. At this level of measurement, the observations have equal intervals and also a true zero point (a score of 0 means no points were earned).
testscore_dataset <- read_excel("~/Box Sync/PracticeAssignment_4/test_score_dataset.xlsx")
testscore_dataset
# A tibble: 10 x 2
   Gender Pretest_Score
   <chr>          <dbl>
 1 Female            72
 2 Male              70
 3 NA                74
 4 Female            80
 5 Female            75
 6 Female            72
 7 Male              81
 8 Female            74
 9 Female            87
10 Male              83
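Because Gender is nominal, the only sensible summaries are counts per category, while Pretest_Score supports arithmetic such as a mean. A small sketch of that difference, with the values retyped from the table above (the vector names `gender`, `gender_f`, and `pretest` are illustrative):

```r
# Values retyped from the table above; row 3 has a missing Gender.
gender  <- c("Female", "Male", NA, "Female", "Female",
             "Female", "Male", "Female", "Female", "Male")
pretest <- c(72, 70, 74, 80, 75, 72, 81, 74, 87, 83)

# A nominal variable is naturally stored as an unordered factor.
gender_f <- factor(gender)

# Counting categories is meaningful; useNA = "ifany" also reports the missing value.
table(gender_f, useNA = "ifany")

# A numeric score supports arithmetic summaries.
mean(pretest)
```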
This dataset contains unemployment statistics from the US Department of Labor. The first variable (Year) is an interval variable. The interval level of measurement not only classifies and orders the measurements but also specifies that the distances between adjacent points are equal along the whole scale.
The second variable (Percent_Unemployed) is a ratio variable. In addition to having equal intervals, it has a true zero point: 0 percent means no unemployment.
unemployment_dataset <- read_excel("~/Box Sync/PracticeAssignment_4/unemployment_dataset.xlsx")
unemployment_dataset
# A tibble: 10 x 2
    Year Percent_Unemployed
   <dbl>              <dbl>
 1  2000                4.0
 2  2001                4.7
 3  2002                5.8
 4  2003                6.0
 5  2004                5.5
 6  2005                5.1
 7  2006                4.6
 8  2007                4.6
 9  2008                5.8
10  2009                9.3
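The practical difference between the two levels shows up in which arithmetic makes sense. For an interval variable such as Year, differences are meaningful but quotients are not; for a ratio variable such as Percent_Unemployed, quotients are meaningful too. A short sketch with the values retyped from the table above (the names `year` and `pct_unemployed` are illustrative):

```r
# Values retyped from the table above.
year           <- 2000:2009
pct_unemployed <- c(4.0, 4.7, 5.8, 6.0, 5.5, 5.1, 4.6, 4.6, 5.8, 9.3)

# Interval scale: differences are meaningful (number of years between rows).
year[10] - year[1]
diff(year)

# Ratio scale: quotients are meaningful
# (how many times higher the 2009 rate is than the 2000 rate).
pct_unemployed[10] / pct_unemployed[1]

# By contrast, year[10] / year[1] would run without error but has no
# useful interpretation, because year 0 is not an absence of time.
```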
This dataset contains violent crime statistics from the FBI. Both variables (population and violent_crime) are ratio variables: in addition to having equal intervals, each has a true zero point, so a value of zero would mean no people or no violent crimes.
crimerate_dataset <- read_excel("~/Box Sync/PracticeAssignment_4/us_crimerate_dataset.xlsx")
crimerate_dataset
# A tibble: 10 x 2
   population violent_crime
        <dbl>         <dbl>
 1  260327021       1857670
 2  262803276       1798792
 3  265228572       1688540
 4  267783607       1636096
 5  270248003       1533887
 6  272690813       1426044
 7  281421906       1425486
 8  285317559       1439480
 9  287973924       1423677
10  290788976       1383676
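Since both variables are on a ratio scale, dividing one by the other gives an interpretable quantity. As an illustration (not part of the original assignment), the sketch below computes violent crimes per 100,000 people from the values in the table above; the name `rate_per_100k` is made up for this example.

```r
# Values retyped from the table above.
population <- c(260327021, 262803276, 265228572, 267783607, 270248003,
                272690813, 281421906, 285317559, 287973924, 290788976)
violent_crime <- c(1857670, 1798792, 1688540, 1636096, 1533887,
                   1426044, 1425486, 1439480, 1423677, 1383676)

# Dividing two ratio variables yields a rate that is itself on a ratio scale.
rate_per_100k <- violent_crime / population * 1e5

# Rounded for display, one value per row of the table.
round(rate_per_100k, 1)
```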
I did not use the ordinal level of measurement, which describes an ordered relationship among a variable's observations without fixing the distances between them. Examples include sizes (small, medium, large), rankings of favorite sports, class rankings, wellness rankings, and Likert scales.
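For completeness, here is a sketch of how an ordinal variable could be represented in R as an ordered factor. The Likert responses are invented for illustration and do not come from any of the datasets above; the names `likert_levels`, `responses`, and `likert` are hypothetical.

```r
# Hypothetical Likert responses (made up for this example).
likert_levels <- c("Strongly disagree", "Disagree", "Neutral",
                   "Agree", "Strongly agree")
responses <- c("Agree", "Neutral", "Strongly agree", "Disagree", "Agree")

# ordered = TRUE tells R that the levels have a rank order, lowest first.
likert <- factor(responses, levels = likert_levels, ordered = TRUE)

# Order comparisons are allowed on an ordered factor.
likert >= "Agree"

# The highest response given.
max(likert)
```

Comparisons and `max()` work because the order is defined, but a mean would not be appropriate: the distances between levels are not assumed to be equal.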