Name: Jared Ali
netID: 230005904
Collaborated with:
Your homework must be submitted in Word or PDF format, created by calling “Knit Word” or “Knit PDF” from RStudio on your R Markdown document. Submission in other formats may receive a grade of 0. Your responses must be supported by both textual explanations and the code you generate to produce your result. Note that all R code used to produce your results must be shown in your knitted file. You can collaborate with your classmates, but you must identify their names above, and you must submit your own homework as a knitted file.
Throughout, try to use as few lines of code as possible. The majority of tasks can be completed in one line.
Create a vector called x.vec to contain the
integers 1 through 100. Check that it has length 100. Report the data
type being stored in x.vec. Add up the numbers in
x.vec, by calling one built-in R function.
x.vec <- 1:100
length(x.vec)
## [1] 100
typeof(x.vec)
## [1] "integer"
sum(x.vec)
## [1] 5050
Convert x.vec into a matrix with 20 rows and 5 columns,
and store this as x.mat. Here x.mat should be
filled out in the default order (column major order). Check the
dimensions of x.mat, and the data type as well. Compute the
sums of each of the 5 columns of x.mat, by calling a
built-in R function. Check (using a comparison operator) that the sum of
column sums of x.mat equals the sum of x.vec.
x.mat <- matrix(x.vec, nrow = 20, ncol = 5)
dim(x.mat)
## [1] 20  5
typeof(x.mat)
## [1] "integer"
colSums(x.mat)
## [1]  210  610 1010 1410 1810
sum(colSums(x.mat)) == sum(x.vec)
## [1] TRUE
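To see what column major order means in practice, here is a minimal sketch on a small matrix. The names m, by.row and col.totals are hypothetical and not part of the assignment.

```r
# Hypothetical 2 x 3 example: matrix() fills column by column by default
m <- matrix(1:6, nrow = 2, ncol = 3)
# Same data filled row by row, for contrast
by.row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)

m[, 1]       # first column of the column-major fill
by.row[1, ]  # first row of the row-major fill

# The column sums add up to the sum of the original vector
col.totals <- colSums(m)
sum(col.totals) == sum(1:6)
```

In the column-major matrix the first column holds 1 and 2, while with byrow = TRUE those two values start the first row instead.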
Extract and display rows 1, 5, and 17 of x.mat, with a
single line of code. Answer the following questions: how many elements
in row 2 of x.mat are larger than 40? How many elements in
column 3 are in between 45 and 50 (exclusive)? How many elements in
column 5 are odd? Hint: take advantage of the sum()
function applied to Boolean vectors.
x.mat[c(1, 5, 17), ]
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1   21   41   61   81
## [2,]    5   25   45   65   85
## [3,]   17   37   57   77   97
sum(x.mat[2, ] > 40)
## [1] 3
sum(x.mat[, 3] > 45 & x.mat[, 3] < 50)
## [1] 4
sum(x.mat[, 5] %% 2 == 1)
## [1] 10
Three elements in row 2 are larger than 40 (42, 62 and 82). Four elements in column 3 are strictly between 45 and 50 (46 through 49). Ten elements in column 5 are odd.
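The hint about sum() works because R coerces TRUE to 1 and FALSE to 0, so summing a Boolean vector counts its TRUE entries. A minimal sketch on a small hypothetical matrix m (the names m, n.big, n.between and n.odd are illustrative only):

```r
# Hypothetical 4 x 3 matrix, filled column by column with 1..12
m <- matrix(1:12, nrow = 4, ncol = 3)

# A comparison yields a logical vector; sum() counts the TRUE entries
n.big <- sum(m[2, ] > 5)                   # row 2 holds 2, 6, 10
n.between <- sum(m[, 2] > 5 & m[, 2] < 8)  # column 2 holds 5, 6, 7, 8
n.odd <- sum(m[, 3] %% 2 == 1)             # column 3 holds 9, 10, 11, 12
```

Each count here is 2, which can be checked by reading the row or column contents listed in the comments.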
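Using Boolean indexing, modify x.vec so that every even
number in this vector is incremented by 10, and every odd number is left
alone. Print out the result to the console. Repeat using
ifelse() to do the same thing, again using just a single
line of code. Hint: the remainder of division by 2 for even numbers is 0.
x.vec[x.vec %% 2 == 0] <- x.vec[x.vec %% 2 == 0] + 10
x.vec
##   [1]   1  12   3  14   5  16   7  18   9  20  11  22  13  24  15  26  17  28
##  [19]  19  30  21  32  23  34  25  36  27  38  29  40  31  42  33  44  35  46
##  [37]  37  48  39  50  41  52  43  54  45  56  47  58  49  60  51  62  53  64
##  [55]  55  66  57  68  59  70  61  72  63  74  65  76  67  78  69  80  71  82
##  [73]  73  84  75  86  77  88  79  90  81  92  83  94  85  96  87  98  89 100
##  [91]  91 102  93 104  95 106  97 108  99 110
ifelse(1:100 %% 2 == 0, 1:100 + 10, 1:100)
##   [1]   1  12   3  14   5  16   7  18   9  20  11  22  13  24  15  26  17  28
##  [19]  19  30  21  32  23  34  25  36  27  38  29  40  31  42  33  44  35  46
##  [37]  37  48  39  50  41  52  43  54  45  56  47  58  49  60  51  62  53  64
##  [55]  55  66  57  68  59  70  61  72  63  74  65  76  67  78  69  80  71  82
##  [73]  73  84  75  86  77  88  79  90  81  92  83  94  85  96  87  98  89 100
##  [91]  91 102  93 104  95 106  97 108  99 110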
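The two techniques named in the prompt can be compared on a short vector. This is only a sketch: v, is.even, v.bool and v.ifelse are hypothetical names, and the vector is 1 through 10 rather than x.vec. Note that an atomic vector is indexed with [ ], not with $, which is reserved for lists and data frames.

```r
# Hypothetical short vector standing in for x.vec
v <- 1:10
is.even <- v %% 2 == 0   # TRUE where the remainder of division by 2 is 0

# Boolean indexing: overwrite only the even positions
v.bool <- v
v.bool[is.even] <- v.bool[is.even] + 10L

# ifelse(): choose elementwise between the incremented and original value
v.ifelse <- ifelse(is.even, v + 10L, v)

all(v.bool == v.ifelse)  # both approaches agree
```

Boolean indexing changes the vector in place, while ifelse() returns a new vector and leaves its input untouched.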
Consider the list x.list created below. Complete the
following tasks, each with a single line of code: extract all but the
second element of x.list, seeking here a list as the final
answer. Extract the first and third elements of x.list,
then extract the second element of the resulting list, seeking here a
vector as the final answer. Extract the second element of
x.list as a vector, and then extract the first 10 elements
of this vector, seeking here a vector as the final answer. Note: pay
close attention to what is asked and use either single brackets
[ ] or double brackets [[ ]] as appropriate.
x.list = list(rnorm(6), letters, sample(c(TRUE,FALSE), size=4, replace=TRUE))
x.list[-2]
x.list[c(1, 3)][[2]]
x.list[[2]][1:10]
##  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
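The difference between single and double brackets can be sketched on a list with fixed contents, so that the results are reproducible. The names demo.list, sub.list, inner and first.ten are hypothetical.

```r
# Hypothetical list with deterministic contents
demo.list <- list(1:3, letters, c(TRUE, FALSE))

# Single brackets return a list (here: everything except element 2)
sub.list <- demo.list[-2]

# Single brackets keep a list; double brackets then pull out one element
inner <- demo.list[c(1, 3)][[2]]

# Double brackets return the vector itself, which can be indexed further
first.ten <- demo.list[[2]][1:10]
```

Here sub.list is still a list of length 2, while inner and first.ten are plain vectors.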
Below we construct a data frame, of 50 states x 10 variables. The
first 8 variables are numeric and the last 2 are factors. The numeric
variables here come from the built-in state.x77 matrix,
which records various demographic factors on 50 US states, measured in
the 1970s. You can learn more about this state data set by typing
?state.x77 into your R console.
state.df = data.frame(state.x77, Region=state.region, Division=state.division)
Add a column to state.df, containing the state
abbreviations that are stored in the built-in vector
state.abb. Name this column Abbr. You can do
this in (at least) two ways: by using a call to
data.frame(), or by directly defining
state.df$Abbr. Display the first 3 rows and all 11 columns
of the new state.df.
state.df$Abbr <- state.abb
head(state.df, 3)
Remove the Region column from state.df.
You can do this in (at least) two ways: by using negative indexing, or
by directly setting state.df$Region to be
NULL. Display the first 3 rows and all 10 columns of state.df.
state.df$Region <- NULL
head(state.df, 3)
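Both ways of dropping a column that the prompt mentions can be sketched on a toy data frame. The names toy.df, drop.neg and drop.null are hypothetical. The sketch also shows that comparing a column with NULL using == yields an empty logical vector, so such a comparison cannot be used to select or remove columns.

```r
# Hypothetical toy data frame with three columns
toy.df <- data.frame(a = 1:3, b = 4:6, c = 7:9)

# Comparing with NULL gives logical(0), not TRUE or FALSE
length(toy.df$b == NULL)

# Way 1: negative indexing by column position
drop.neg <- toy.df[, -2]

# Way 2: assign NULL to the column
drop.null <- toy.df
drop.null$b <- NULL

all.equal(drop.neg, drop.null)
```

Negative indexing needs the column position, whereas assigning NULL works directly with the column name.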
Add two columns to state.df, containing the x and y
coordinates (longitude and latitude, respectively) of the center of the
states, that are stored in the (existing) list
state.center. Hint: take a look at this list in the
console, to see what its elements are named. Name these two columns
Center.x and Center.y. Display the first 3
rows and all 12 columns of state.df.
state.df <- data.frame(state.df, Center.x = state.center$x, Center.y = state.center$y)
head(state.df, 3)
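Make a new data frame in two different ways: with manual indexing,
called state.sub1, and with subset(),
called state.sub2. Check that they are equal to each other,
using an appropriate function call.
# Assumption: the start of the prompt is missing; the condition used here
# (longitude less than -100) is borrowed from the next question.
state.sub1 <- state.df[state.df$Center.x < -100, ]
state.sub2 <- subset(state.df, Center.x < -100)
all.equal(state.sub1, state.sub2)
## [1] TRUE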
Make a new data frame, called state.sub3, which contains only
the states whose longitude is less than -100, and whose murder rate is
above 9%. Print this new data frame to the console. Among the states in
this new data frame, which has the highest average life expectancy?
state.sub3 <- subset(state.df, Center.x < -100 & Murder > 9)
state.sub3
rownames(state.sub3)[which.max(state.sub3$Life.Exp)]
We’re going to look at a data set on 97 men who have prostate cancer (from the book The Elements of Statistical Learning). There are 9 variables measured on these 97 men:
lpsa: log PSA score
lcavol: log cancer volume
lweight: log prostate weight
age: age of patient
lbph: log of the amount of benign prostatic hyperplasia
svi: seminal vesicle invasion
lcp: log of capsular penetration
gleason: Gleason score
pgg45: percent of Gleason scores 4 or 5
Use the following to load the prostate cancer data set into your R
session, and store it as a data frame pros.df:
pros.df <- read.table("http://www.stat.cmu.edu/~ryantibs/statcomp/data/pros.dat")
Using sapply(), calculate the mean of each variable.
Also, calculate the standard deviation of each variable. Each should
require just one line of code. Display your results.
sapply(pros.df, mean)
sapply(pros.df, sd)
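sapply() expects a function object as its second argument, for example mean itself rather than a call such as mean("x"). It then applies that function to every column of the data frame. A minimal sketch on a toy data frame (toy.df, col.means and col.sds are hypothetical names):

```r
# Hypothetical toy data frame with two numeric columns
toy.df <- data.frame(a = c(1, 2, 3), b = c(2, 4, 6))

# Pass the function itself; sapply() calls it once per column
col.means <- sapply(toy.df, mean)
col.sds <- sapply(toy.df, sd)

col.means  # named numeric vector, one entry per column
col.sds
```

The result is a named vector whose names are the column names, so one call covers all variables at once.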
Using lapply(),
plot each column, excluding SVI, on the y-axis with SVI on the x-axis.
This should require just one line of code.
lapply(pros.df[names(pros.df) != "svi"], function(v) plot(pros.df$svi, v))
Now, use lapply() to perform t-tests for each variable
in the data set, between SVI and non-SVI groups. To be precise, you will
perform a t-test for each variable excluding the SVI variable itself.
For convenience, we’ve defined a function t.test.by.ind()
below, which takes a numeric variable x, and then an
indicator variable ind (of 0s and 1s) that defines the
groups. Run this function on the columns of pros.df,
excluding the SVI column itself, and save the result as
tests. What kind of data structure is tests?
Print it to the console.
t.test.by.ind = function(x, ind) {
  stopifnot(all(ind %in% c(0, 1)))
  return(t.test(x[ind == 0], x[ind == 1]))
}
tests <- lapply(pros.df[names(pros.df) != "svi"], t.test.by.ind, ind = pros.df$svi)
class(tests)
## [1] "list"
tests
tests is a list with one element per variable other than svi (8 elements in total). Each element is an object of class "htest", as returned by t.test().
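Extra arguments given to lapply() after the function are passed on to every call, which is how the group indicator reaches t.test.by.ind(). The sketch below uses simulated data in place of the prostate data; toy.df, its columns u, v and grp, and toy.tests are hypothetical.

```r
# Helper as defined in the assignment text
t.test.by.ind <- function(x, ind) {
  stopifnot(all(ind %in% c(0, 1)))
  return(t.test(x[ind == 0], x[ind == 1]))
}

# Simulated stand-in data: two numeric columns and a 0/1 group indicator
set.seed(1)
toy.df <- data.frame(u = rnorm(20), v = rnorm(20),
                     grp = rep(c(0, 1), each = 10))

# Drop the indicator column, then pass it to every call through ind =
toy.tests <- lapply(toy.df[names(toy.df) != "grp"], t.test.by.ind,
                    ind = toy.df$grp)

class(toy.tests)       # the container is a plain list
class(toy.tests[[1]])  # each element is an "htest" object
```

Selecting columns by name keeps the code valid even if the column order changes.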
Using lapply() again, extract the p-values from the
tests object you created in the last question, with just a
single line of code. Hint: first, take a look at the first element of
tests, what kind of object is it, and how is the p-value
stored? Second, run the command "[["(pros.df, "lcavol") in
your console. What does this do? It is similar to the concept in the
figure for lapply() in the notes. Now use what you’ve
learned to extract p-values from the tests object.
lapply(tests, "[[", "p.value")
The first element of tests is an "htest" object, which is a list that stores the p-value in its p.value component. The command "[["(pros.df, "lcavol") extracts the lcavol column (log cancer volume) of pros.df as a numeric vector, exactly like pros.df[["lcavol"]].
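In R, "[[" is itself a function, so it can be handed to lapply() together with the name of the component to extract. A minimal sketch on a hypothetical nested list (toy, a, b and p are illustrative names, and the numbers are made up):

```r
# Hypothetical list of results, each storing a p.value component
toy <- list(first = list(p.value = 0.03, stat = 2.1),
            second = list(p.value = 0.40, stat = 0.8))

# Operator form and function form do the same thing
a <- toy[["first"]]
b <- "[["(toy, "first")
identical(a, b)

# lapply() calls "[["(element, "p.value") for every element
p <- lapply(toy, "[[", "p.value")
```

The result p is a list with the same names as toy, holding only the p.value entries.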
Use the following to load the prostate cancer data set into your R
session, and store it as a matrix pros.dat:
pros.dat =
as.matrix(read.table("http://www.stat.cmu.edu/~ryantibs/statcomp/data/pros.dat"))
What are the dimensions of pros.dat (i.e., how many
rows and how many columns)? Using integer indexing, print the first 6
rows and all columns; again using integer indexing, print the last 6
rows and all columns.
dim(pros.dat)
## [1] 97  9
pros.dat[1:6, ]
## lcavol lweight age lbph svi lcp gleason pgg45
## 1 -0.579818495 2.769459 50 -1.38629436 0 -1.38629436 6 0
## 2 -0.994252273 3.319626 58 -1.38629436 0 -1.38629436 6 0
## 3 -0.510825624 2.691243 74 -1.38629436 0 -1.38629436 7 20
## 4 -1.203972804 3.282789 58 -1.38629436 0 -1.38629436 6 0
## 5 0.751416089 3.432373 62 -1.38629436 0 -1.38629436 6 0
## 6 -1.049822124 3.228826 50 -1.38629436 0 -1.38629436 6 0
## lpsa
## 1 -0.4307829
## 2 -0.1625189
## 3 -0.1625189
## 4 -0.1625189
## 5 0.3715636
## 6 0.7654678
pros.dat[92:97, ]
## lcavol lweight age lbph svi lcp gleason pgg45
## 92 2.532902848 3.677566 61 1.34807315 1 -1.38629436 7 15
## 93 2.830267834 3.876396 68 -1.38629436 1 1.32175584 7 60
## 94 3.821003607 3.896909 44 -1.38629436 1 2.16905370 7 40
## 95 2.907447359 3.396185 52 -1.38629436 1 2.46385324 7 10
## 96 2.882563575 3.773910 68 1.55814462 1 1.55814462 7 80
## 97 3.471966453 3.974998 68 0.43825493 1 2.90416508 7 20
## lpsa
## 92 4.1295508
## 93 4.3851468
## 94 4.6844434
## 95 5.1431245
## 96 5.4775090
## 97 5.5829322
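Using head() and
tail() (i.e., do not use integer indexing), print
the first 6 rows and all columns, and also the last 6 rows and all
columns.
head(pros.dat)
## lcavol lweight age lbph svi lcp gleason pgg45
## 1 -0.579818495 2.769459 50 -1.38629436 0 -1.38629436 6 0
## 2 -0.994252273 3.319626 58 -1.38629436 0 -1.38629436 6 0
## 3 -0.510825624 2.691243 74 -1.38629436 0 -1.38629436 7 20
## 4 -1.203972804 3.282789 58 -1.38629436 0 -1.38629436 6 0
## 5 0.751416089 3.432373 62 -1.38629436 0 -1.38629436 6 0
## 6 -1.049822124 3.228826 50 -1.38629436 0 -1.38629436 6 0
## lpsa
## 1 -0.4307829
## 2 -0.1625189
## 3 -0.1625189
## 4 -0.1625189
## 5 0.3715636
## 6 0.7654678
tail(pros.dat)
## lcavol lweight age lbph svi lcp gleason pgg45
## 92 2.532902848 3.677566 61 1.34807315 1 -1.38629436 7 15
## 93 2.830267834 3.876396 68 -1.38629436 1 1.32175584 7 60
## 94 3.821003607 3.896909 44 -1.38629436 1 2.16905370 7 40
## 95 2.907447359 3.396185 52 -1.38629436 1 2.46385324 7 10
## 96 2.882563575 3.773910 68 1.55814462 1 1.55814462 7 80
## 97 3.471966453 3.974998 68 0.43825493 1 2.90416508 7 20
## lpsa
## 92 4.1295508
## 93 4.3851468
## 94 4.6844434
## 95 5.1431245
## 96 5.4775090
## 97 5.5829322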
Does the matrix pros.dat have names assigned to its
rows and columns, and if so, what are they? Use rownames()
and colnames() to find out. Note: these would have been
automatically created by the read.table() function that we
used above to read the data file into our R session.
rownames(pros.dat)
## [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "15"
## [16] "16" "17" "18" "19" "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "30"
## [31] "31" "32" "33" "34" "35" "36" "37" "38" "39" "40" "41" "42" "43" "44" "45"
## [46] "46" "47" "48" "49" "50" "51" "52" "53" "54" "55" "56" "57" "58" "59" "60"
## [61] "61" "62" "63" "64" "65" "66" "67" "68" "69" "70" "71" "72" "73" "74" "75"
## [76] "76" "77" "78" "79" "80" "81" "82" "83" "84" "85" "86" "87" "88" "89" "90"
## [91] "91" "92" "93" "94" "95" "96" "97"
colnames(pros.dat)
## [1] "lcavol" "lweight" "age" "lbph" "svi" "lcp" "gleason"
## [8] "pgg45" "lpsa"
Yes, both are named. The row names are the character strings “1” through “97”. The column names are “lcavol”, “lweight”, “age”, “lbph”, “svi”, “lcp”, “gleason”, “pgg45” and “lpsa”.
Extract the two columns of
pros.dat that measure the log cancer volume and the log
cancer weight, and store the result as a matrix
pros.dat.sub. (Recall the explanation of variables at the
top of this lab.) Check that its dimensions make sense to you, and that
its first 6 rows are what you’d expect. Did R automatically assign
column names to pros.dat.sub?
pros.dat.sub <- pros.dat[, c("lcavol", "lweight")]
dim(pros.dat.sub)
## [1] 97  2
head(pros.dat.sub)
## lcavol lweight
## 1 -0.579818495 2.769459
## 2 -0.994252273 3.319626
## 3 -0.510825624 2.691243
## 4 -1.203972804 3.282789
## 5 0.751416089 3.432373
## 6 -1.049822124 3.228826
Yes, pros.dat.sub keeps the column names lcavol and lweight from pros.dat.
Compute the log density for each of the 97 men, using \[ \log(\text{density}) = \log(\text{weight}/\text{volume}) = \log(\text{weight}) - \log(\text{volume}) \] There are multiple ways to accomplish this calculation. You should be able to perform this computation for all 97 men with a single line of code, taking advantage of R’s ability to vectorize.
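Because lweight and lcavol are already on the log scale, the identity above reduces to a single vectorized subtraction of two columns. The sketch below is an illustration only: toy.mat is a hypothetical two-row matrix holding the lcavol and lweight values printed for men 1 and 5, and log.density and check are illustrative names.

```r
# Hypothetical two-row stand-in for the lcavol and lweight columns
toy.mat <- cbind(lcavol = c(-0.579818495, 0.751416089),
                 lweight = c(2.769459, 3.432373))

# Vectorized: one subtraction handles every row at once
log.density <- toy.mat[, "lweight"] - toy.mat[, "lcavol"]

# Cross-check against the definition log(weight / volume)
check <- log(exp(toy.mat[, "lweight"]) / exp(toy.mat[, "lcavol"]))
all.equal(log.density, check)
```

The same one-line subtraction applies unchanged to a matrix with 97 rows, since R subtracts the columns element by element.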
# lweight and lcavol are already on the log scale, so
# log density = log weight - log volume, computed for all 97 men at once
ldens <- pros.dat[, "lweight"] - pros.dat[, "lcavol"]
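As a quick sanity check of the identity above, here is a minimal sketch on made-up numbers. The vectors `weight` and `volume` are hypothetical and are not taken from the prostate data. It suggests that subtracting the logs elementwise agrees with taking the log of the ratio, and that a single vectorized line handles every entry at once.

```r
# Hypothetical toy measurements (NOT from pros.dat)
weight <- c(20.5, 31.0, 44.2)
volume <- c(1.8, 2.6, 9.7)

# One vectorized line: log(weight / volume) written as a difference of logs
ldens.toy <- log(weight) - log(volume)

# The two forms should agree up to floating-point error
all.equal(ldens.toy, log(weight / volume))
```

Because the prostate columns are stored on the log scale already, only the subtraction step is needed there.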
pros.dat, using cbind(). The new
pros.dat matrix should now have 10 columns. Set the last
column name to be ldens. Print its first 6 rows, to check
that you’ve done all this right.
# Append log density as a 10th column named ldens, then show the first 6 rows
pros.dat <- cbind(pros.dat, ldens = pros.dat[, "lweight"] - pros.dat[, "lcavol"])
head(pros.dat, 6)
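The append-and-rename pattern can be sketched on a small stand-in matrix. Everything here (`toy.mat`, `new.col`, the column names `a`, `b`, `diff`) is hypothetical and chosen only for illustration.

```r
# Hypothetical 4 x 2 matrix standing in for a data matrix
toy.mat <- matrix(1:8, nrow = 4, dimnames = list(NULL, c("a", "b")))

# Derived column computed in one vectorized line
new.col <- toy.mat[, "b"] - toy.mat[, "a"]

# Append the column, then set the name of the last column explicitly
toy.mat <- cbind(toy.mat, new.col)
colnames(toy.mat)[ncol(toy.mat)] <- "diff"

# Inspect the first rows to confirm the shape and the new name
head(toy.mat, 2)
```

Naming the argument inside `cbind()` (for example `cbind(toy.mat, diff = new.col)`) is an alternative that sets the column name in the same step.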
Let’s reload the prostate data so the prompts do not depend on the previous question.
pros.dat =
as.matrix(read.table("http://www.stat.cmu.edu/~ryantibs/statcomp/data/pros.dat"))
We will also recreate the pros.dat.svi and
pros.dat.no.svi from Lab.
pros.dat.svi <- pros.dat[pros.dat[,"svi"]==1,]
pros.dat.no.svi <- pros.dat[pros.dat[,"svi"]==0,]
pros.dat.svi.sd of length
ncol(pros.dat) (of length 9). The second line defines an
index variable i and sets it equal to 1. Write a third line
of code to compute the standard deviation of the ith column
of pros.dat.svi, using a built-in R function, and store
this value in the ith element of
pros.dat.svi.sd.
pros.dat.svi.sd = vector(length=ncol(pros.dat.svi))
i = 1
# Standard deviation of the ith column, stored in the ith element
pros.dat.svi.sd[i] = sd(pros.dat.svi[,i])
pros.dat.no.svi.sd of length
ncol(pros.dat) (of length 9), the second should define an
index variable i and set it equal to 1, and the third
should fill the ith element of
pros.dat.no.svi.sd with the standard deviation of the
ith column of pros.dat.no.svi.
pros.dat.no.svi.sd = vector(length=ncol(pros.dat))
i = 1
# Standard deviation of the ith column, stored in the ith element
pros.dat.no.svi.sd[i] = sd(pros.dat.no.svi[,i])
for() loop to compute the standard deviations
of the columns of pros.dat.svi and
pros.dat.no.svi, and store the results in the vectors
pros.dat.svi.sd and pros.dat.no.svi.sd,
respectively, that were created above. Note: you should have a single
for() loop here, not two for loops. And if it helps,
consider breaking this task down into two steps: as the first step,
write a for() loop that iterates an index variable
i over the integers between 1 and the number of columns of
pros.dat (don’t just manually write 9 here, pull out the
number of columns programmatically), with an empty body. As the second
step, paste relevant pieces of your solution code from part a and b into
the body of the for() loop. Print out the resulting vectors
pros.dat.svi.sd and pros.dat.no.svi.sd to the
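console.
# Result vectors, one entry per column (same definitions as in parts a and b)
pros.dat.svi.sd = vector(length=ncol(pros.dat))
pros.dat.no.svi.sd = vector(length=ncol(pros.dat))
# A single loop over the columns fills both vectors
for (i in 1:ncol(pros.dat)) {
  pros.dat.svi.sd[i] = sd(pros.dat.svi[,i])
  pros.dat.no.svi.sd[i] = sd(pros.dat.no.svi[,i])
}
pros.dat.svi.sd
pros.dat.no.svi.sd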
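The loop-and-fill pattern described in the prompt can be checked on simulated data. This is only a sketch: `toy.mat` and `toy.sd` are hypothetical names, and the comparison against `apply()` is included purely as a cross-check, not as the approach the prompt asks for.

```r
set.seed(1)
# Hypothetical 10 x 3 matrix of simulated values
toy.mat <- matrix(rnorm(30), nrow = 10, ncol = 3)

# Pre-allocate the result vector, one slot per column
toy.sd <- vector(length = ncol(toy.mat))

# Number of columns is pulled out programmatically, not hard-coded
for (i in 1:ncol(toy.mat)) {
  toy.sd[i] <- sd(toy.mat[, i])
}

# Cross-check: apply() over columns should give the same values
all.equal(toy.sd, apply(toy.mat, 2, sd))
```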
Use the pros.dat matrix from the
read.table() function for the following parts.
pros.dat.denom, according to the formula above. Take
advantage of vectorization. Make sure not to include any hard constants
(e.g., don’t just manually write 21 here for \(n\)); as always, programmatically define
all the relevant quantities. Then compute a vector of t-statistics for
the 9 variables in our data set, called pros.dat.t.stat,
according to the formula above, and using pros.dat.denom.
Again, take advantage of vectorization; this calculation should require
just a single line of code. Print out the t-statistics to the
console.
# T = (Xbar - Ybar) / sqrt(sX^2/n + sY^2/m), with X = SVI group (size n) and Y = non-SVI group (size m)
pros.dat.denom <- sqrt(pros.dat.svi.sd^2 / nrow(pros.dat.svi) + pros.dat.no.svi.sd^2 / nrow(pros.dat.no.svi))
pros.dat.t.stat <- (colMeans(pros.dat.svi) - colMeans(pros.dat.no.svi)) / pros.dat.denom
print(pros.dat.t.stat)
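To make the formula \(T = (\bar{X} - \bar{Y}) / \sqrt{s_X^2/n + s_Y^2/m}\) concrete, here is a small sketch on simulated samples. The vectors `x` and `y` are hypothetical. The hand computation is compared with the statistic reported by the built-in `t.test()`, which performs the unequal-variance (Welch) test by default.

```r
set.seed(2)
# Hypothetical samples of different sizes and spreads
x <- rnorm(12, mean = 1)
y <- rnorm(20, mean = 0, sd = 2)
n <- length(x)
m <- length(y)

# Denominator and t-statistic computed directly from the formula
denom <- sqrt(var(x) / n + var(y) / m)
t.stat <- (mean(x) - mean(y)) / denom

# Cross-check against the built-in Welch two-sample t-test
all.equal(t.stat, unname(t.test(x, y)$statistic))
```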
pros.dat.df. This might look like a complicated/ugly
calculation, but really, it’s not too bad: it only involves arithmetic
operators, and taking advantage of vectorization, the calculation should
only require a single line of code. Hint: to simplify this line of code,
it will help to first set short variable names for variables/quantities
you will be using, as in sx = pros.dat.svi.sd,
n = nrow(pros.dat.svi), and so on. Print out these degrees
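of freedom values to the console.
sx = pros.dat.svi.sd; sy = pros.dat.no.svi.sd
n = nrow(pros.dat.svi); m = nrow(pros.dat.no.svi)
# Welch-Satterthwaite degrees of freedom (assumed to be the formula the prompt refers to)
pros.dat.df <- (sx^2/n + sy^2/m)^2 / ((sx^2/n)^2/(n-1) + (sy^2/m)^2/(m-1))
pros.dat.df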
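The degrees-of-freedom formula is not shown in this part of the document, so the sketch below assumes it is the usual Welch-Satterthwaite approximation for the unequal-variance t-test. The samples `x` and `y` and the helper names `a` and `b` are hypothetical. The result is compared with the `parameter` value that `t.test()` reports.

```r
set.seed(3)
# Hypothetical samples
x <- rnorm(15, sd = 1)
y <- rnorm(25, sd = 3)
n <- length(x)
m <- length(y)

# Short names for the two variance-over-sample-size terms
a <- var(x) / n
b <- var(y) / m

# Welch-Satterthwaite degrees of freedom (assumed formula)
df.welch <- (a + b)^2 / (a^2 / (n - 1) + b^2 / (m - 1))

# Cross-check against the degrees of freedom used by t.test()
all.equal(df.welch, unname(t.test(x, y)$parameter))
```

Setting short names first keeps the final expression to a single readable line, which mirrors the hint in the prompt.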
The function pt() evaluates the distribution function of
the t-distribution. E.g.,
pt(x, df=v, lower.tail=FALSE)
returns the probability that a t-distributed random variable, with
v degrees of freedom, exceeds the value x.
Importantly, pt() is vectorized: if x is a
vector, and so is v, then the above returns, in vector
format: the probability that a t-distributed variate with
v[1] degrees of freedom exceeds x[1], the
probability that a t-distributed variate with v[2] degrees
of freedom exceeds x[2], and so on.
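The vectorized behavior described above can be seen in a short sketch. The values in `x` and `v` are arbitrary illustrations; the elementwise `sapply()` version is there only to confirm that the single vectorized call pairs `x[k]` with `v[k]`.

```r
# Arbitrary illustrative values and degrees of freedom
x <- c(0, 1.5, 2.5)
v <- c(5, 10, 30)

# One vectorized call: upper-tail probability for each (x[k], v[k]) pair
upper.tail <- pt(x, df = v, lower.tail = FALSE)

# The same probabilities computed one pair at a time
one.by.one <- sapply(seq_along(x), function(k) pt(x[k], df = v[k], lower.tail = FALSE))

all.equal(upper.tail, one.by.one)
```

Since the t-distribution is symmetric about 0, the first entry (for `x = 0`) is 0.5 regardless of the degrees of freedom.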
pt() as in the above line, but replace
x by the absolute values of the t-statistics you computed
for the 9 variables in our data set, and v by the degrees
of freedom values associated with these t-statistics. Multiply the
output by 2, and store it as a vector pros.dat.p.val. These
are called p-values for the t-tests of mean difference
between SVI and non-SVI patients, over the 9 variables in our data set.
Print out the p-values to the console. Identify the variables for which
the p-value is smaller than 0.05 (hence deemed to have a significant
difference between SVI and non-SVI patients). Identify the variable with
the smallest p-value (the most significant difference between SVI and
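non-SVI patients).
pros.dat.p.val <- 2 * pt(abs(pros.dat.t.stat), df=pros.dat.df, lower.tail=FALSE)
pros.dat.p.val
# Variables with p-value below 0.05
colnames(pros.dat)[which(pros.dat.p.val < 0.05)]
# Variable with the smallest p-value
colnames(pros.dat)[which.min(pros.dat.p.val)]
The variables with p-values below 0.05 are the ones listed by the second-to-last line above, and the variable with the smallest p-value is the one returned by which.min().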
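The two-sided p-value recipe (upper-tail probability of the absolute t-statistic, multiplied by 2) can be sketched end to end on simulated samples. The vectors `x` and `y` are hypothetical, and the degrees of freedom are assumed to follow the Welch-Satterthwaite approximation. The result is compared with the p-value from `t.test()`.

```r
set.seed(4)
# Hypothetical samples
x <- rnorm(18, mean = 0.8)
y <- rnorm(22, mean = 0, sd = 1.5)
n <- length(x)
m <- length(y)
a <- var(x) / n
b <- var(y) / m

# t-statistic and (assumed) Welch-Satterthwaite degrees of freedom
t.stat <- (mean(x) - mean(y)) / sqrt(a + b)
df.welch <- (a + b)^2 / (a^2 / (n - 1) + b^2 / (m - 1))

# Two-sided p-value: twice the upper-tail probability of |t|
p.val <- 2 * pt(abs(t.stat), df = df.welch, lower.tail = FALSE)

# Cross-check against the built-in Welch test
all.equal(p.val, t.test(x, y)$p.value)
```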