DSC 011 S26 Lecture 13 Demo

Statistical Algorithms

Author

Leo Garcia Ortiz

Published

February 23, 2026

Preliminaries: Assignment of Teams and Team Tables

The class will split into randomly assigned, OS-specific teams of about seven students. Teams and team tables are determined by random sampling without replacement (that is, a random permutation of each roster), as follows:


# 70 students as of Feb 23
windows11 <- c("aflores-hernandez","akwong25","bdelacruz-angeles",
               "bhernandezarteaga","cbettencourt2","cchen271","cornelas3",
               "craigelijaesoriano","davila-castaneda","dvargas38",
               "ecastillo-quevedo","efernando","ekjotjohal","elliottwhitney",
               "fromerobojorquez","genaxiong","ggonzalez-ramirez",
               "ghendrickson","gurindersahota","jasminesamayoa",
               "jcontrerastrinidad","jessiemorales","jlegaspina","joneal2",
               "jwong290","kchen129","leogarciaortiz","lillieyang",
               "lindaespinozamunoz","lorenackerman","mdesilva","rbujji",
               "roderickma","skodur","sraman7","tolaniyan","trevoroh")

macOS <- c("adimasagurto","ahmiyasalter","alannahtanner","aleroux",
          "alizeibarra","apatterson9","asingh368",
          "eflores136","elmermartinez","emendozagonzalez",
          "emoya8","isidrohernandez","jaisingh","jangel15","jardindo",
          "jessecaclark","jmandujano4","jperez460","kamryntaylor",
          "kchen132","kvu56","lalagos","malachifuqua","manroopkaur",
          "mayraarias","msuccari","omarkhalil","rbeattie","seanjimenez",
          "vchezhiyan","xcortes2")


language_table_students <- c("edmondcheng", "angeliachunyu")
teamsize  <- 7

# Set seed for reproducibility
set.seed(20260223)  # numeric seed; the quoted string "20260223" is coerced to this same integer
shuffled_win11  <- sample(windows11, replace = FALSE)
tables    <- c(rep(1:3,each=teamsize),rep(4:5,each=(teamsize+1)))
teams_win11     <- split(shuffled_win11, tables)


shuffled_macOS  <- sample(macOS, replace = FALSE)
tables    <- c(rep(6,(teamsize-1)),rep(7,(teamsize-2)),rep(8:9,each=teamsize),rep(10,(teamsize-1)))
teams_macOS   <- split(shuffled_macOS, tables)
teams_macOS$`7` <- c(teams_macOS$`7`,language_table_students)

teams <- lapply(c(teams_win11,teams_macOS),sort)

invisible(lapply(seq_along(teams), function(i) {
  cat("Team at Table", i, ":", teams[[i]], "\n")
}))
## Team at Table 1 : bhernandezarteaga craigelijaesoriano jasminesamayoa jessiemorales kchen129 rbujji roderickma 
## Team at Table 2 : aflores-hernandez cchen271 ekjotjohal elliottwhitney jwong290 leogarciaortiz lindaespinozamunoz 
## Team at Table 3 : cbettencourt2 efernando ghendrickson jcontrerastrinidad lorenackerman sraman7 tolaniyan 
## Team at Table 4 : cornelas3 dvargas38 ecastillo-quevedo fromerobojorquez gurindersahota jlegaspina mdesilva trevoroh 
## Team at Table 5 : akwong25 bdelacruz-angeles davila-castaneda genaxiong ggonzalez-ramirez joneal2 lillieyang skodur 
## Team at Table 6 : jaisingh jessecaclark jmandujano4 kamryntaylor malachifuqua mayraarias 
## Team at Table 7 : alannahtanner alizeibarra angeliachunyu edmondcheng emoya8 kvu56 msuccari 
## Team at Table 8 : ahmiyasalter asingh368 emendozagonzalez isidrohernandez kchen132 omarkhalil seanjimenez 
## Team at Table 9 : adimasagurto aleroux elmermartinez jperez460 manroopkaur rbeattie vchezhiyan 
## Team at Table 10 : apatterson9 eflores136 jangel15 jardindo lalagos xcortes2
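
The same shuffle-then-split idea can be tried on a much smaller scale. The sketch below is only an illustration: the roster names, the seed and the table sizes are all made up and have nothing to do with the class list above.

```r
# Hypothetical roster of 7 students (NOT the real class list)
toy_roster <- c("ana", "ben", "cam", "dee", "eli", "fay", "gus")

set.seed(1)  # arbitrary seed, used only to make the sketch reproducible

# With replace = FALSE and the default size, sample() returns a random
# permutation: every name appears exactly once, in shuffled order
toy_shuffled <- sample(toy_roster, replace = FALSE)

# One table label per student: tables 1 and 2 seat two, table 3 seats three
toy_tables <- c(rep(1:2, each = 2), rep(3, 3))

# split() groups the shuffled names by their table label
toy_teams <- split(toy_shuffled, toy_tables)
print(toy_teams)
```

Because the table labels are fixed in advance and only the order of names is random, the team sizes are the same on every run; only the membership changes with the seed.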

Acknowledgments

Instructions for Completing and Submitting This Assignment

  1. Download and open today’s template notebook in RStudio
  2. Personalize the file by writing your name in the YAML header (replace “FirstName LastName”) — be sure to do this or you will lose points!
  3. Save the file under your name in your course directory: in RStudio select File → Save as..., navigate to your course directory, and rename the file to include your name (e.g., FirstName_LastName_DataTypes_Quiz.qmd)
  4. Render to HTML
  5. Follow instructions from the HTML rendered output by editing your personalized notebook.
  6. As you work through the assignment, keep editing and rendering the file, asking your team for help, until every problem reports CORRECT. Two or more students together may ask the instructors for help.
  7. Render to HTML and submit to Catcourses. Turn in as much CORRECT work as you can by the end of class today. Submission by end of class qualifies you for credit.
  8. Resubmit your best work by midnight tonight for a better grade or fully accepted work; only your latest and best work gets graded.

Assignment

Indexing Observations from Samples in R

Demonstration 1: Extract the 10th element of the Nile dataset by evaluating Nile[10]

answer <- NULL
print_and_check(answer,"ce90aade6a1ec7d0b5158db878be46a6")
## NULL
## [1] "INCORRECT"
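
For reference, here is a minimal sketch of how positional indexing works in R, using a made-up vector rather than the Nile data. The vector `x` and its values are hypothetical.

```r
# Hypothetical measurements; R positions start at 1, not 0
x <- c(12, 7, 30, 18, 25)

third         <- x[3]    # one element, selected by position
first_two     <- x[1:2]  # several elements, selected by a vector of positions
without_first <- x[-1]   # a negative index drops that position

print(third)
print(first_two)
print(without_first)
```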

Tabulating Categorical Data in R

Demonstration 2: Tabulate the state variable of the Puromycin dataset using the code table(Puromycin$state)

answer <- NULL
print_and_check(answer,"3a6a355b2081a0265dc75d62de9966b8")
## NULL
## [1] "INCORRECT"

Demonstration 3: Tabulate the Species variable of the iris dataset using the code table(iris$Species)

answer <- NULL
print_and_check(answer,"bb229e287dff815bb5ee0444b6107d55")
## NULL
## [1] "INCORRECT"
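
As a small illustration of what table() does, the sketch below counts the categories in a made-up character vector. The variable `shirt` and its values are hypothetical and are not part of either dataset above.

```r
# Hypothetical categorical sample
shirt <- c("red", "blue", "red", "green", "blue", "red")

# table() counts how many observations fall in each category
shirt_counts <- table(shirt)
print(shirt_counts)

# A single count can be looked up by category name
red_count <- as.integer(shirt_counts["red"])
print(red_count)
```

The categories appear in sorted order in the result, and the counts always add up to the number of observations.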

Computing Order Statistics

Demonstration 4: Compute Order Statistics of a Numerical Variable

To compute the order statistics of a sample of ordinal or numerical values of one variable, sort the values from smallest to largest: the k-th order statistic is the k-th value in the sorted sequence. The order among observations with tied values is arbitrary. To compute the order statistics of the conc variable in the Puromycin data, call the sort() function on it.

answer <- NULL
print_and_check(answer,"9dde8196b5047ae558f5191333c50a16")
## NULL
## [1] "INCORRECT"
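
The idea can be seen on a tiny made-up sample. In this sketch the vector `x` and the index `k` are hypothetical; it shows sort() producing the order statistics and order() reporting where each one came from.

```r
# Hypothetical sample with one tied pair (1.5 appears twice)
x <- c(4.2, 1.5, 3.3, 1.5, 9.0)

# The sorted values are the order statistics, smallest to largest
sorted <- sort(x)
print(sorted)

# The k-th order statistic is the k-th sorted value
k <- 2
kth <- sorted[k]
print(kth)

# order() gives the original positions of the sorted values,
# so x[order(x)] reproduces sort(x)
positions <- order(x)
print(positions)
```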

Computing Extrema and Range

Demonstration 5: Compute Minimum, Maximum and Range of a Numerical Variable

You can use order statistics to compute the minimum, the maximum and the range of the data (the range is the maximum minus the minimum). Down in the R Console, use the max() and min() functions together to compute the range of the Puromycin$conc statistical variable. Once you get the correct answer, copy your working code into the code chunk below as the right-hand side (RHS) of the assignment to answer.

answer <- NULL
print_and_check(answer,"26b2c2c65f33955d2d2d81c246776d61")
## NULL
## [1] "INCORRECT"
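
One point that often causes confusion: R's own range() function returns the pair (minimum, maximum), not their difference. The sketch below shows the distinction on a made-up vector; `x` and `spread` are hypothetical names.

```r
# Hypothetical sample
x <- c(4.2, 1.5, 3.3, 1.5, 9.0)

# The minimum and maximum are the first and last order statistics
lo <- min(x)
hi <- max(x)

# The range in the statistical sense is a single number
spread <- hi - lo
print(spread)

# range() returns c(min, max); diff() turns that pair into the same number
extremes <- range(x)
print(extremes)
print(diff(extremes))
```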

Ranking Data

Demonstration 6: Add a Column of Ranks to the Puromycin Dataset

Down in the R Console, use the assignment operator (<-) and the rank() function to add a new column named conc_rank that holds the ranks of the concentration variable (conc). An example is shown on the slide. Once you get the correct answer, copy your working code into the code chunk below, above the line answer <- Puromycin. Reassigning an existing column is harmless, so you can safely rerun the same assignment code that creates or changes one column of the R data frame.

answer <- Puromycin
print_and_check(answer,"861a72d6dfcc18efced6326a02853895")
##    conc rate     state
## 1  0.02   76   treated
## 2  0.02   47   treated
## 3  0.06   97   treated
## 4  0.06  107   treated
## 5  0.11  123   treated
## 6  0.11  139   treated
## 7  0.22  159   treated
## 8  0.22  152   treated
## 9  0.56  191   treated
## 10 0.56  201   treated
## 11 1.10  207   treated
## 12 1.10  200   treated
## 13 0.02   67 untreated
## 14 0.02   51 untreated
## 15 0.06   84 untreated
## 16 0.06   86 untreated
## 17 0.11   98 untreated
## 18 0.11  115 untreated
## 19 0.22  131 untreated
## 20 0.22  124 untreated
## 21 0.56  144 untreated
## 22 0.56  158 untreated
## 23 1.10  160 untreated
## [1] "INCORRECT"
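
The pattern of adding a rank column can be sketched on a made-up data frame. Everything here (the data frame `toy`, its `dose` column and the new `dose_rank` column) is hypothetical and separate from the Puromycin data.

```r
# Hypothetical data frame with one tied pair (0.5 appears twice)
toy <- data.frame(dose = c(0.5, 0.1, 0.5, 2.0))

# Assigning to a column name that does not exist yet creates the column;
# assigning to an existing name overwrites it, so rerunning is harmless
toy$dose_rank <- rank(toy$dose)
print(toy)
```

By default rank() gives tied values the average of the ranks they occupy, so the two 0.5 values, which take positions 2 and 3, both receive rank 2.5.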

Demonstration 7: Compute Ranks for Sunfish Pigmentation Data

This demonstration is already completed for you; you only need to look at the code and its output.

pigment <- rep(c("no","faint","mod","heavy","solid"),c(13,68,44,21,8))
pigment_factor <- ordered(pigment,levels=c("no","faint","mod","heavy","solid"))
sunfish <- data.frame(pigment=pigment_factor,rank=rank(pigment_factor))
answer <- sunfish
print_and_check(answer,"519288620549207c834ad81dbe320c21")
##     pigment  rank
## 1        no   7.0
## 2        no   7.0
## 3        no   7.0
## 4        no   7.0
## 5        no   7.0
## 6        no   7.0
## 7        no   7.0
## 8        no   7.0
## 9        no   7.0
## 10       no   7.0
## 11       no   7.0
## 12       no   7.0
## 13       no   7.0
## 14    faint  47.5
## 15    faint  47.5
## 16    faint  47.5
## 17    faint  47.5
## 18    faint  47.5
## 19    faint  47.5
## 20    faint  47.5
## 21    faint  47.5
## 22    faint  47.5
## 23    faint  47.5
## 24    faint  47.5
## 25    faint  47.5
## 26    faint  47.5
## 27    faint  47.5
## 28    faint  47.5
## 29    faint  47.5
## 30    faint  47.5
## 31    faint  47.5
## 32    faint  47.5
## 33    faint  47.5
## 34    faint  47.5
## 35    faint  47.5
## 36    faint  47.5
## 37    faint  47.5
## 38    faint  47.5
## 39    faint  47.5
## 40    faint  47.5
## 41    faint  47.5
## 42    faint  47.5
## 43    faint  47.5
## 44    faint  47.5
## 45    faint  47.5
## 46    faint  47.5
## 47    faint  47.5
## 48    faint  47.5
## 49    faint  47.5
## 50    faint  47.5
## 51    faint  47.5
## 52    faint  47.5
## 53    faint  47.5
## 54    faint  47.5
## 55    faint  47.5
## 56    faint  47.5
## 57    faint  47.5
## 58    faint  47.5
## 59    faint  47.5
## 60    faint  47.5
## 61    faint  47.5
## 62    faint  47.5
## 63    faint  47.5
## 64    faint  47.5
## 65    faint  47.5
## 66    faint  47.5
## 67    faint  47.5
## 68    faint  47.5
## 69    faint  47.5
## 70    faint  47.5
## 71    faint  47.5
## 72    faint  47.5
## 73    faint  47.5
## 74    faint  47.5
## 75    faint  47.5
## 76    faint  47.5
## 77    faint  47.5
## 78    faint  47.5
## 79    faint  47.5
## 80    faint  47.5
## 81    faint  47.5
## 82      mod 103.5
## 83      mod 103.5
## 84      mod 103.5
## 85      mod 103.5
## 86      mod 103.5
## 87      mod 103.5
## 88      mod 103.5
## 89      mod 103.5
## 90      mod 103.5
## 91      mod 103.5
## 92      mod 103.5
## 93      mod 103.5
## 94      mod 103.5
## 95      mod 103.5
## 96      mod 103.5
## 97      mod 103.5
## 98      mod 103.5
## 99      mod 103.5
## 100     mod 103.5
## 101     mod 103.5
## 102     mod 103.5
## 103     mod 103.5
## 104     mod 103.5
## 105     mod 103.5
## 106     mod 103.5
## 107     mod 103.5
## 108     mod 103.5
## 109     mod 103.5
## 110     mod 103.5
## 111     mod 103.5
## 112     mod 103.5
## 113     mod 103.5
## 114     mod 103.5
## 115     mod 103.5
## 116     mod 103.5
## 117     mod 103.5
## 118     mod 103.5
## 119     mod 103.5
## 120     mod 103.5
## 121     mod 103.5
## 122     mod 103.5
## 123     mod 103.5
## 124     mod 103.5
## 125     mod 103.5
## 126   heavy 136.0
## 127   heavy 136.0
## 128   heavy 136.0
## 129   heavy 136.0
## 130   heavy 136.0
## 131   heavy 136.0
## 132   heavy 136.0
## 133   heavy 136.0
## 134   heavy 136.0
## 135   heavy 136.0
## 136   heavy 136.0
## 137   heavy 136.0
## 138   heavy 136.0
## 139   heavy 136.0
## 140   heavy 136.0
## 141   heavy 136.0
## 142   heavy 136.0
## 143   heavy 136.0
## 144   heavy 136.0
## 145   heavy 136.0
## 146   heavy 136.0
## 147   solid 150.5
## 148   solid 150.5
## 149   solid 150.5
## 150   solid 150.5
## 151   solid 150.5
## 152   solid 150.5
## 153   solid 150.5
## 154   solid 150.5
## [1] "CORRECT"
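
The repeated rank values in this output follow from rank()'s default handling of ties, which assigns every tied observation the average of the positions the group occupies. As a sketch, the same five values can be recovered from the category counts alone; the helper names `first_pos`, `last_pos` and `midrank` are introduced here for illustration only.

```r
# Category counts from the rep() call above
counts <- c(no = 13, faint = 68, mod = 44, heavy = 21, solid = 8)

# Positions occupied by each category in the sorted sample
last_pos  <- cumsum(counts)          # 13, 81, 125, 146, 154
first_pos <- last_pos - counts + 1   # 1, 14, 82, 126, 147

# The average of a run of consecutive integers is the mean of its endpoints
midrank <- (first_pos + last_pos) / 2
print(midrank)

# Cross-check against rank() applied to the expanded sample
expanded <- rep(seq_along(counts), counts)
print(unique(rank(expanded)))
```

For example, the 13 "no" fish occupy positions 1 through 13, whose average is 7, and the 68 "faint" fish occupy positions 14 through 81, whose average is 47.5.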

Cutting Numerical Variables into Ordinal

Demonstration 8: Cut Rates of Puromycin Data into Ordinal Categories

This demonstration is already completed for you; you only need to look at the code and its output.

Puromycin$rate_bin <- cut(Puromycin$rate,3,labels=c("low","medium","high"))
answer <- Puromycin
print_and_check(answer,"e3f4b70750c3b726db492b196b06603d")
##    conc rate     state rate_bin
## 1  0.02   76   treated      low
## 2  0.02   47   treated      low
## 3  0.06   97   treated      low
## 4  0.06  107   treated   medium
## 5  0.11  123   treated   medium
## 6  0.11  139   treated   medium
## 7  0.22  159   treated     high
## 8  0.22  152   treated   medium
## 9  0.56  191   treated     high
## 10 0.56  201   treated     high
## 11 1.10  207   treated     high
## 12 1.10  200   treated     high
## 13 0.02   67 untreated      low
## 14 0.02   51 untreated      low
## 15 0.06   84 untreated      low
## 16 0.06   86 untreated      low
## 17 0.11   98 untreated      low
## 18 0.11  115 untreated   medium
## 19 0.22  131 untreated   medium
## 20 0.22  124 untreated   medium
## 21 0.56  144 untreated   medium
## 22 0.56  158 untreated     high
## 23 1.10  160 untreated     high
## [1] "INCORRECT"
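
When cut() is given a single number of bins, it divides the range of the data into that many equal-width intervals and then moves the two outer limits slightly outward so the extreme values are included. The sketch below, which assumes the unmodified built-in Puromycin data from the datasets package, rebuilds the same bins from explicit edges. The names `rate`, `edges`, `binned` and `binned_explicit` are introduced here for illustration.

```r
# Work on a copy of the rate column from the pristine built-in dataset
rate <- datasets::Puromycin$rate

# Three equal-width bins, as in the demonstration above
binned <- cut(rate, 3, labels = c("low", "medium", "high"))

# Four equally spaced edges from the minimum to the maximum give three bins
edges <- seq(min(rate), max(rate), length.out = 4)
print(edges)

# include.lowest = TRUE keeps the minimum inside the first bin, which plays
# the role of cut()'s small outward adjustment of the outer limits
binned_explicit <- cut(rate, breaks = edges,
                       labels = c("low", "medium", "high"),
                       include.lowest = TRUE)

# Both approaches assign every observation to the same bin
print(all(binned == binned_explicit))
print(table(binned))
```

Note that equal-width bins need not hold equal numbers of observations; the counts depend on how the data are spread over the range.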