# 70 students as of Feb 25
windows11 <- c("aflores-hernandez","akwong25","bdelacruz-angeles",
"bhernandezarteaga","cbettencourt2","cchen271","cornelas3",
"craigelijaesoriano","davila-castaneda","dvargas38",
"ecastillo-quevedo","efernando","ekjotjohal","elliottwhitney",
"fromerobojorquez","genaxiong","ggonzalez-ramirez",
"ghendrickson","gurindersahota","jasminesamayoa",
"jcontrerastrinidad","jessiemorales","jlegaspina","joneal2",
"jwong290","kchen129","leogarciaortiz","lillieyang",
"lindaespinozamunoz","lorenackerman","mdesilva","rbujji",
"roderickma","skodur","sraman7","tolaniyan","trevoroh")
macOS <- c("adimasagurto","ahmiyasalter","alannahtanner","aleroux",
"alizeibarra","apatterson9","asingh368",
"eflores136","elmermartinez","emendozagonzalez",
"emoya8","isidrohernandez","jaisingh","jangel15","jardindo",
"jessecaclark","jmandujano4","jperez460","kamryntaylor",
"kchen132","kvu56","lalagos","malachifuqua","manroopkaur",
"mayraarias","msuccari","omarkhalil","rbeattie","seanjimenez",
"vchezhiyan","xcortes2")
language_table_students <- c("edmondcheng", "angeliachunyu")
teamsize <- 7
# Set seed for reproducibility (an integer; a quoted string is coerced to the same value)
set.seed(20260225)
shuffled_win11 <- sample(windows11, replace = FALSE)
tables <- c(rep(1:3,each=teamsize),rep(4:5,each=(teamsize+1)))
teams_win11 <- split(shuffled_win11, tables)
shuffled_macOS <- sample(macOS, replace = FALSE)
tables <- c(rep(6,(teamsize-1)),rep(7,(teamsize-2)),rep(8:9,each=teamsize),rep(10,(teamsize-1)))
teams_macOS <- split(shuffled_macOS, tables)
teams_macOS$`7` <- c(teams_macOS$`7`,language_table_students)
teams <- lapply(c(teams_win11,teams_macOS),sort)
invisible(lapply(seq_along(teams), function(i) {
  cat("Team at Table", i, ":", teams[[i]], "\n")
}))
## Team at Table 1 : cbettencourt2 elliottwhitney jlegaspina lorenackerman mdesilva skodur tolaniyan
## Team at Table 2 : cornelas3 craigelijaesoriano ggonzalez-ramirez ghendrickson jessiemorales lindaespinozamunoz trevoroh
## Team at Table 3 : akwong25 dvargas38 ecastillo-quevedo jasminesamayoa joneal2 kchen129 sraman7
## Team at Table 4 : aflores-hernandez bdelacruz-angeles bhernandezarteaga cchen271 efernando genaxiong gurindersahota jwong290
## Team at Table 5 : davila-castaneda ekjotjohal fromerobojorquez jcontrerastrinidad leogarciaortiz lillieyang rbujji roderickma
## Team at Table 6 : apatterson9 emendozagonzalez jangel15 kamryntaylor kvu56 rbeattie
## Team at Table 7 : alizeibarra angeliachunyu edmondcheng eflores136 emoya8 msuccari omarkhalil
## Team at Table 8 : alannahtanner aleroux asingh368 jardindo jperez460 kchen132 malachifuqua
## Team at Table 9 : adimasagurto elmermartinez isidrohernandez jaisingh manroopkaur seanjimenez xcortes2
## Team at Table 10 : ahmiyasalter jessecaclark jmandujano4 lalagos mayraarias vchezhiyan
DSC 011 S26 Lecture 14 Demo (KEY)
Quantiles
Preliminaries: Assignment of Teams and Team Tables
The class will split into OS-specific, randomly assigned teams of about seven. The teams and team tables are determined by sampling each roster without replacement (that is, by randomly permuting it), as in the code and output shown above.
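The shuffle-then-split idea can be illustrated on a toy roster. This is only a sketch: the names in `roster`, the seed, and the table sizes are made up for illustration and are not the class values.

```r
# Hypothetical roster of seven made-up IDs (not the class list)
roster <- c("ana", "ben", "cy", "dee", "eli", "fay", "gus")

set.seed(1)                    # any fixed integer seed makes the shuffle repeatable
shuffled <- sample(roster)     # sampling all seven without replacement = a random permutation

# The first four shuffled names go to table 1, the remaining three to table 2
table_id <- rep(1:2, times = c(4, 3))

# split() groups the shuffled names by table; sort() alphabetizes within each table
toy_teams <- lapply(split(shuffled, table_id), sort)
toy_teams
```

Every name lands at exactly one table because `sample()` without replacement never repeats or drops an element.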
Instructions for Completing and Submitting This Assignment
- Download and open today’s template notebook in RStudio
- Personalize the file by writing your name in the YAML header (replace “FirstName LastName”) — be sure to do this or you will lose points!
- Save with your name in RStudio and move to course directory: In RStudio select File → Save As..., find your course directory, and save the file there under a name that includes your name (e.g., FirstName_LastName_Quantiles_Demo.qmd)
- Render to HTML
- Follow instructions from the HTML rendered output by editing your personalized notebook.
- As you work on the assignment, keep rendering and editing the file, asking your team for help until you get CORRECT for every problem. Two or more students may ask for help from the instructors.
- Render to HTML and submit to Catcourses. Turn in as much CORRECT work as you can by the end of class today. Submission by end of class qualifies you for credit.
- Resubmit your best work by midnight tonight for a better grade or fully accepted work – only your latest and best work gets graded.
Assignment
Ranking Data
Demonstration 1: Add a Column of Ranks to the Puromycin Dataset
In the R Console, use the assignment operator (<-) and the rank() function to add a column of ranks of the concentration variable to the Puromycin data. Name the new column conc_rank by using the selection operator ($), that is, assign to Puromycin$conc_rank. Then copy the line that creates the new column into the code chunk below, placing it before the line that assigns the modified Puromycin data (with its extra column) to answer.
Puromycin$conc_rank <- rank(Puromycin$conc)
answer <- Puromycin
print_and_check(answer, "7d8eb9495ea95ab0a32febac478cdb283aa6887c2d5552d042ff8d749c66b341")
## conc rate state conc_rank
## 1 0.02 76 treated 2.5
## 2 0.02 47 treated 2.5
## 3 0.06 97 treated 6.5
## 4 0.06 107 treated 6.5
## 5 0.11 123 treated 10.5
## 6 0.11 139 treated 10.5
## 7 0.22 159 treated 14.5
## 8 0.22 152 treated 14.5
## 9 0.56 191 treated 18.5
## 10 0.56 201 treated 18.5
## 11 1.10 207 treated 22.0
## 12 1.10 200 treated 22.0
## 13 0.02 67 untreated 2.5
## 14 0.02 51 untreated 2.5
## 15 0.06 84 untreated 6.5
## 16 0.06 86 untreated 6.5
## 17 0.11 98 untreated 10.5
## 18 0.11 115 untreated 10.5
## 19 0.22 131 untreated 14.5
## 20 0.22 124 untreated 14.5
## 21 0.56 144 untreated 18.5
## 22 0.56 158 untreated 18.5
## 23 1.10 160 untreated 22.0
## [VALUE] CORRECT
Demonstration 2: Compute Ranks for Sunfish Pigmentation Data
This demonstration is already completed for you; you only need to look at the code and its output.
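When reading the ranks in this demonstration, it helps to know that rank() by default gives tied observations the average of the positions they occupy (ties.method = "average"). The sketch below shows this on a made-up six-element ordered factor, then recomputes the average rank of each pigment category from the category counts 13, 68, 44, 21, and 8 used in this demonstration. The object names are illustrative only.

```r
# Made-up ordinal sample with ties
toy <- ordered(c("no", "no", "faint", "faint", "faint", "solid"),
               levels = c("no", "faint", "solid"))
toy_ranks <- rank(toy)
toy_ranks
# The two "no" values occupy positions 1 and 2, so each gets mean(1:2) = 1.5;
# the three "faint" values occupy positions 3 to 5, so each gets 4; "solid" gets 6.

# Average rank per category, computed from the category counts alone
counts <- c(no = 13, faint = 68, mod = 44, heavy = 21, solid = 8)
last_pos  <- cumsum(counts)            # last position occupied by each category
first_pos <- last_pos - counts + 1     # first position occupied by each category
avg_rank  <- (first_pos + last_pos) / 2
avg_rank
```

The same rule explains the rank 2.5 in Demonstration 1: four observations share the smallest concentration, so each receives mean(1:4).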
pigment <- rep(c("no","faint","mod","heavy","solid"),c(13,68,44,21,8))
pigment_factor <- ordered(pigment,levels=c("no","faint","mod","heavy","solid"))
sunfish <- data.frame(pigment=pigment_factor,rank=rank(pigment_factor))
answer <- sunfish
print_and_check(answer,"1f42e3a9227dc5af4f0ad8d49e9590e2f0e2305c5a33f362afbe28cd35ffd9bc")
## pigment rank
## 1 no 7.0
## 2 no 7.0
## 3 no 7.0
## 4 no 7.0
## 5 no 7.0
## 6 no 7.0
## 7 no 7.0
## 8 no 7.0
## 9 no 7.0
## 10 no 7.0
## 11 no 7.0
## 12 no 7.0
## 13 no 7.0
## 14 faint 47.5
## 15 faint 47.5
## 16 faint 47.5
## 17 faint 47.5
## 18 faint 47.5
## 19 faint 47.5
## 20 faint 47.5
## 21 faint 47.5
## 22 faint 47.5
## 23 faint 47.5
## 24 faint 47.5
## 25 faint 47.5
## 26 faint 47.5
## 27 faint 47.5
## 28 faint 47.5
## 29 faint 47.5
## 30 faint 47.5
## 31 faint 47.5
## 32 faint 47.5
## 33 faint 47.5
## 34 faint 47.5
## 35 faint 47.5
## 36 faint 47.5
## 37 faint 47.5
## 38 faint 47.5
## 39 faint 47.5
## 40 faint 47.5
## 41 faint 47.5
## 42 faint 47.5
## 43 faint 47.5
## 44 faint 47.5
## 45 faint 47.5
## 46 faint 47.5
## 47 faint 47.5
## 48 faint 47.5
## 49 faint 47.5
## 50 faint 47.5
## 51 faint 47.5
## 52 faint 47.5
## 53 faint 47.5
## 54 faint 47.5
## 55 faint 47.5
## 56 faint 47.5
## 57 faint 47.5
## 58 faint 47.5
## 59 faint 47.5
## 60 faint 47.5
## 61 faint 47.5
## 62 faint 47.5
## 63 faint 47.5
## 64 faint 47.5
## 65 faint 47.5
## 66 faint 47.5
## 67 faint 47.5
## 68 faint 47.5
## 69 faint 47.5
## 70 faint 47.5
## 71 faint 47.5
## 72 faint 47.5
## 73 faint 47.5
## 74 faint 47.5
## 75 faint 47.5
## 76 faint 47.5
## 77 faint 47.5
## 78 faint 47.5
## 79 faint 47.5
## 80 faint 47.5
## 81 faint 47.5
## 82 mod 103.5
## 83 mod 103.5
## 84 mod 103.5
## 85 mod 103.5
## 86 mod 103.5
## 87 mod 103.5
## 88 mod 103.5
## 89 mod 103.5
## 90 mod 103.5
## 91 mod 103.5
## 92 mod 103.5
## 93 mod 103.5
## 94 mod 103.5
## 95 mod 103.5
## 96 mod 103.5
## 97 mod 103.5
## 98 mod 103.5
## 99 mod 103.5
## 100 mod 103.5
## 101 mod 103.5
## 102 mod 103.5
## 103 mod 103.5
## 104 mod 103.5
## 105 mod 103.5
## 106 mod 103.5
## 107 mod 103.5
## 108 mod 103.5
## 109 mod 103.5
## 110 mod 103.5
## 111 mod 103.5
## 112 mod 103.5
## 113 mod 103.5
## 114 mod 103.5
## 115 mod 103.5
## 116 mod 103.5
## 117 mod 103.5
## 118 mod 103.5
## 119 mod 103.5
## 120 mod 103.5
## 121 mod 103.5
## 122 mod 103.5
## 123 mod 103.5
## 124 mod 103.5
## 125 mod 103.5
## 126 heavy 136.0
## 127 heavy 136.0
## 128 heavy 136.0
## 129 heavy 136.0
## 130 heavy 136.0
## 131 heavy 136.0
## 132 heavy 136.0
## 133 heavy 136.0
## 134 heavy 136.0
## 135 heavy 136.0
## 136 heavy 136.0
## 137 heavy 136.0
## 138 heavy 136.0
## 139 heavy 136.0
## 140 heavy 136.0
## 141 heavy 136.0
## 142 heavy 136.0
## 143 heavy 136.0
## 144 heavy 136.0
## 145 heavy 136.0
## 146 heavy 136.0
## 147 solid 150.5
## 148 solid 150.5
## 149 solid 150.5
## 150 solid 150.5
## 151 solid 150.5
## 152 solid 150.5
## 153 solid 150.5
## 154 solid 150.5
## [VALUE] CORRECT
Cutting Numerical Variables into Ordinal
Demonstration: Cut Rates of Puromycin Data into Ordinal Categories
This demonstration is already completed for you; you only need to look at the code and its output.
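When cut() is given a single number of breaks, it divides the range of the data into that many equal-width intervals. The sketch below is a small illustration on six rate values copied from the Puromycin output (including its minimum, 47, and maximum, 207, so the bin boundaries match those for the full column). The names `rates`, `width`, and `inner_breaks` are introduced here for illustration.

```r
# Six rates taken from the Puromycin data, spanning its full range
rates <- c(47, 97, 107, 152, 159, 207)

# Equal-width bins: the range (207 - 47 = 160) is divided into three parts
width <- diff(range(rates)) / 3
inner_breaks <- min(rates) + width * 1:2   # the two interior bin boundaries
inner_breaks

# cut() applies the same boundaries and attaches the ordinal labels
bins <- cut(rates, 3, labels = c("low", "medium", "high"))
data.frame(rates, bins)
```

Values up to the first interior boundary are labelled low, values above the second are labelled high, and the rest are medium, which is why 97 is low while 107 is medium.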
Puromycin$rate_bin <- cut(Puromycin$rate,3,labels=c("low","medium","high"))
answer <- Puromycin
print_and_check(answer,"ca30b4a708f5a154e86e038213b03de4e38710ba01c2186fdb567a0609b80a8d")
## conc rate state rate_bin
## 1 0.02 76 treated low
## 2 0.02 47 treated low
## 3 0.06 97 treated low
## 4 0.06 107 treated medium
## 5 0.11 123 treated medium
## 6 0.11 139 treated medium
## 7 0.22 159 treated high
## 8 0.22 152 treated medium
## 9 0.56 191 treated high
## 10 0.56 201 treated high
## 11 1.10 207 treated high
## 12 1.10 200 treated high
## 13 0.02 67 untreated low
## 14 0.02 51 untreated low
## 15 0.06 84 untreated low
## 16 0.06 86 untreated low
## 17 0.11 98 untreated low
## 18 0.11 115 untreated medium
## 19 0.22 131 untreated medium
## 20 0.22 124 untreated medium
## 21 0.56 144 untreated medium
## 22 0.56 158 untreated high
## 23 1.10 160 untreated high
## [VALUE] INCORRECT
Computing Percentiles in R
Demonstration 3: Computing percentiles, and specifically the 6th percentile, of the Nile dataset
- In the Console, use the quantile() function on the Nile dataset to see its default output (the minimum, 1st quartile, median, 3rd quartile, and maximum). Then copy and paste this working R expression from the Console into the code chunk below, replacing NULL as the argument to quote() to test correctness.
answer <- quote(quantile(Nile))
print_and_check_expr(
  answer,
  value_key = "70729c2977f73064eb5ffc9500de62bfae84ee3f842be2c09cc875665d712540",
  required_fns = "quantile"
)
## 0% 25% 50% 75% 100%
## 456.0 798.5 893.5 1032.5 1370.0
## [VALUE] CORRECT
## [CODE] Uses quantile(): YES -- CORRECT
- In the Console, evaluate the expression 1:100/100 to see a numeric vector of the proportions used to compute all 100 percentiles of the Nile data. Then copy and paste this working R expression from the Console into the code chunk below, replacing NULL as the argument to quote() to test its correctness.
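The expression 1:100/100 works because the colon operator is evaluated before division in R, so it means (1:100)/100. A short sketch of that precedence follows; the variable names are illustrative.

```r
p <- 1:100/100              # parsed as (1:100)/100: one hundred proportions
wrong <- 1:(100/100)        # forcing the division first gives 1:1, a single value

length(p)                   # 100
length(wrong)               # 1

# An equivalent, more explicit way to build the same vector
p_seq <- seq(0.01, 1, by = 0.01)
all.equal(p, p_seq)         # TRUE, up to floating-point tolerance
```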
answer <- quote(1:100/100)
print_and_check_expr(
  answer,
  value_key = "4847c34597c5f6608bb98bd504bce9b304664a0b5c0756aec4a465f5a6879e72"
)
## [1] 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.11 0.12 0.13 0.14 0.15
## [16] 0.16 0.17 0.18 0.19 0.20 0.21 0.22 0.23 0.24 0.25 0.26 0.27 0.28 0.29 0.30
## [31] 0.31 0.32 0.33 0.34 0.35 0.36 0.37 0.38 0.39 0.40 0.41 0.42 0.43 0.44 0.45
## [46] 0.46 0.47 0.48 0.49 0.50 0.51 0.52 0.53 0.54 0.55 0.56 0.57 0.58 0.59 0.60
## [61] 0.61 0.62 0.63 0.64 0.65 0.66 0.67 0.68 0.69 0.70 0.71 0.72 0.73 0.74 0.75
## [76] 0.76 0.77 0.78 0.79 0.80 0.81 0.82 0.83 0.84 0.85 0.86 0.87 0.88 0.89 0.90
## [91] 0.91 0.92 0.93 0.94 0.95 0.96 0.97 0.98 0.99 1.00
## [VALUE] CORRECT
- In the Console, add Nile as the first argument and probs = 1:100/100 as the second argument of quantile() to compute all 100 percentiles of the Nile data. Then copy and paste this working R expression from the Console into the code chunk below, replacing NULL as the argument to quote() to test correctness.
print_and_check_expr(
  quote(quantile(Nile, probs = 1:100/100)),
  value_key = "b1da29a7b765dc40c139585d5d2cbe03aea3f55224276c21c00b875c3b90b6f2",
  required_fns = "quantile"
)
## 1% 2% 3% 4% 5% 6% 7% 8% 9% 10%
## 647.07 675.46 691.52 693.92 697.80 700.82 701.93 713.04 717.64 725.20
## 11% 12% 13% 14% 15% 16% 17% 18% 19% 20%
## 738.46 741.76 743.74 744.00 745.70 748.52 757.30 763.10 767.24 770.40
## 21% 22% 23% 24% 25% 26% 27% 28% 29% 30%
## 773.37 779.46 792.55 796.76 798.50 800.48 809.03 812.72 814.42 819.20
## 31% 32% 33% 34% 35% 36% 37% 38% 39% 40%
## 821.69 823.36 828.69 831.66 832.65 836.20 839.26 843.10 845.00 845.00
## 41% 42% 43% 44% 45% 46% 47% 48% 49% 50%
## 845.59 847.16 854.84 861.12 863.10 864.54 869.77 874.00 882.16 893.50
## 51% 52% 53% 54% 55% 56% 57% 58% 59% 60%
## 898.96 903.40 908.82 913.84 916.90 918.44 920.72 928.04 937.05 941.60
## 61% 62% 63% 64% 65% 66% 67% 68% 69% 70%
## 949.46 958.76 961.11 965.16 971.10 978.06 984.66 988.56 994.31 999.50
## 71% 72% 73% 74% 75% 76% 77% 78% 79% 80%
## 1012.90 1020.00 1020.00 1022.60 1032.50 1040.00 1042.30 1050.00 1060.50 1100.00
## 81% 82% 83% 84% 85% 86% 87% 88% 89% 90%
## 1100.00 1101.80 1111.70 1120.00 1123.00 1140.00 1141.30 1151.20 1160.00 1160.00
## 91% 92% 93% 94% 95% 96% 97% 98% 99% 100%
## 1160.90 1170.80 1182.10 1210.00 1210.50 1220.40 1230.60 1250.20 1261.10 1370.00
## [VALUE] CORRECT
## [CODE] Uses quantile(): YES -- CORRECT
- In the Console, use the up-arrow to recall the last expression computing percentiles of the Nile data. In R you can select specific values from a returned vector by using the indexed selection operator ([i]), where i is an integer, or vector of integers, between 1 and the length of the vector. Edit your previous command by adding [6] after the expression that evaluates to all percentiles of Nile in order to extract only the 6th element, which is the 6th percentile of the Nile data. Once it works, copy and paste this working R expression from the Console into the code chunk below, replacing NULL as the argument to quote() to test correctness.
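For intuition about where the 6th percentile comes from, the sketch below reproduces it by hand using the interpolation rule that quantile() applies by default (its type = 7): locate position (n - 1) * p + 1 among the sorted values and interpolate linearly between the two neighbouring observations. The variable names are illustrative, and selecting by the name "6%" is shown only as an alternative to the positional [6].

```r
x <- sort(as.numeric(Nile))   # sorted observations, with the time-series attributes dropped
n <- length(x)                # 100 annual flows
p <- 0.06

h  <- (n - 1) * p + 1         # fractional position in the sorted data (6.94)
lo <- floor(h)                # the observation just below that position
by_hand <- x[lo] + (h - lo) * (x[lo + 1] - x[lo])

# The same value from quantile(), selected by name instead of by position
by_function <- quantile(Nile, probs = 1:100/100)[["6%"]]

all.equal(by_hand, by_function)
```

Selecting by name avoids counting positions, but it only works because the names of the result are the percentage labels.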
print_and_check_expr(
  quote(quantile(Nile, probs = 1:100/100)[6]),
  value_key = "7e37b39f08125b5e4030df1d5d6cf610599a393ede2be464121f0e01b205c9f5",
  required_fns = "quantile"
)
## 6%
## 700.82
## [VALUE] CORRECT
## [CODE] Uses quantile(): YES -- CORRECT
Computing and Labelling Percentiles on a Histogram
Demonstration 4: Computing and labelling percentiles for the bins of the default histogram of the Nile dataset
- To compute the proportion of the Nile data contained in the bins of the default Nile histogram and then label the histogram with them, first compute the proportion of the Nile sample contained in each successive bar (or absence of a bar) from the histogram's return value, and show that the proportions sum to one. Each bin of this histogram is 100 units wide, so multiplying a bin's density by 100 gives the proportion of observations in that bin:
return_value <- hist(Nile, plot = FALSE)
return_value$density * 100
## [1] 0.01 0.00 0.05 0.20 0.25 0.19 0.12 0.11 0.06 0.01
print(paste0("Sum of values: ", sum(return_value$density * 100)))
## [1] "Sum of values: 1"
Next we use the cumsum() function to compute cumulative sums of the elements:
cumsum(return_value$density * 100)
## [1] 0.01 0.01 0.06 0.26 0.51 0.70 0.82 0.93 0.99 1.00
Notice that the cumulative sum at the second element remained 1%, because that bin added no observations: none occurred between 500 and 600 units of Annual Flow.
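As a cross-check on these numbers, the sketch below recomputes the bin proportions in two other ways: from the bin counts, and from the density multiplied by the actual bin widths rather than a hard-coded 100. It also compares the cumulative proportions with the ECDF evaluated at each bin's upper edge. The object names are illustrative, not part of the assignment.

```r
h <- hist(Nile, plot = FALSE)

bin_width <- diff(h$breaks)                   # width of each bin (all 100 here)
prop_from_density <- h$density * bin_width    # density times width = proportion in the bin
prop_from_counts  <- h$counts / sum(h$counts) # the same proportions straight from the counts

all.equal(prop_from_density, prop_from_counts)

# The cumulative proportion up to a bin's right edge equals the ECDF at that edge
upper_edges <- h$breaks[-1]
all.equal(cumsum(prop_from_counts), ecdf(as.numeric(Nile))(upper_edges))
```

Using diff(h$breaks) instead of a fixed 100 keeps the calculation correct if the histogram is drawn with different breaks.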
Now let's turn these cumulative proportions into percentages and append a percent sign to each of them:
paste0(cumsum(return_value$density * 100) * 100, "%")
## [1] "1%" "1%" "6%" "26%" "51%" "70%" "82%" "93%" "99%" "100%"
Pass the previous expression as the value of the optional labels argument to hist() to see the cumulative sample proportions labelled over the bars.
hist(Nile,
     main = "Annual Flow of the River Nile (1871-1970)",
     xlab = "Annual Flow (10^8 m^3)",
     ylim = c(0, 30),
     labels = paste0(cumsum(return_value$density * 100) * 100, "%"))
Plotting Empirical Cumulative Distribution Functions
Demonstration 5: Plot the ECDF of the control arm of the cell cycle data
Pass the vector-valued expression c(34, 22, 12), representing the sample from the control arm of the cell cycle data, as the argument to ecdf() in the following expression:
plot(ecdf(c(34, 22, 12)),
     main = "ECDF of Cells/Dish, Control Arm of Cell Cycle Experiment")
Demonstration 6: Plot the ECDF of the Nile data
Modify the expression below to compute the ECDF of the Nile data.
plot(ecdf(c(Nile)), main = "ECDF of Nile Data")
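To connect the plots to numbers: ecdf() returns a function, and calling that function at a value x gives the proportion of the sample that is less than or equal to x. A minimal sketch using the three control-arm values from Demonstration 5 follows; the names `control` and `Fn` are illustrative.

```r
control <- c(34, 22, 12)      # control-arm sample from Demonstration 5
Fn <- ecdf(control)           # Fn is itself a function

# Evaluate the step function below, at, and between the observed values
steps <- Fn(c(11, 12, 22, 33, 34))
steps                         # 0, 1/3, 2/3, 2/3, 1

# The same quantity computed directly as a proportion
mean(control <= 22)           # 2/3
```

Each observed value raises the step function by 1/3 because the sample has three observations, and the function is flat between observations.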