Arithmetic Operators Section
Division
2/3
[1] 0.6666667
Square Root
sqrt(2)
[1] 1.414214
Logarithms (log() gives the natural logarithm by default)
log(2)
[1] 0.6931472
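A few related calls, as a minimal sketch. The base argument of log(), along with log10(), exp(), %/% and %%, are standard base R; the numbers used here are only illustrative.

```r
# Natural logarithm (base e) is the default
log(2)

# Logarithm with an explicit base: log base 2 of 8
log(8, base = 2)

# Common (base 10) logarithm
log10(1000)

# exp() is the inverse of log()
exp(log(2))

# Integer division and remainder, for comparison with 2/3
7 %/% 2   # quotient
7 %% 2    # remainder
```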
Relational Operators Section: test whether one value is less than, greater than, or equal to another.
Check whether 3 is equal to 8. Since they are not equal, this returns FALSE.
3==8
[1] FALSE
Check whether 3 is not equal to 8. Since 3 is not equal to 8, this returns TRUE.
3 != 8
[1] TRUE
Check whether 3 is less than or equal to 8. Since 3 is less than 8, this returns TRUE.
3 <= 8
[1] TRUE
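The remaining relational operators work the same way. The sketch below also shows one caveat that is an addition to these notes: exact equality on decimal results can fail because of rounding, and all.equal() is one base R way to compare within a tolerance. The name root_sq is hypothetical.

```r
# The remaining relational operators
3 < 8    # less than
3 > 8    # greater than
3 >= 8   # greater than or equal to

# Exact equality can surprise with decimal arithmetic
root_sq <- sqrt(2)^2
root_sq == 2                    # typically FALSE because of rounding
isTRUE(all.equal(root_sq, 2))   # TRUE: equal within a small tolerance
```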
Logical Operators Section: & for logical AND, | for logical OR, and ! for NOT.
Logical disjunction (OR). FALSE OR FALSE returns FALSE.
FALSE | FALSE
[1] FALSE
Logical conjunction (AND). TRUE AND FALSE returns FALSE.
TRUE & FALSE
[1] FALSE
Negation (NOT). NOT FALSE returns TRUE.
! FALSE
[1] TRUE
Combination of statements. Here we test whether 2 < 3 OR 1 == 5. If either statement is TRUE the result is TRUE, if both are TRUE the result is TRUE, and only if both are FALSE is the result FALSE.
2 < 3 | 1 == 5
[1] TRUE
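How R groups a combined statement can be sketched as follows. The examples are illustrative and rely only on R's documented operator precedence: comparisons first, then &, then |.

```r
# Comparison operators are evaluated before & and |,
# so this reads as (2 < 3) | (1 == 5)
2 < 3 | 1 == 5

# & binds more tightly than |, so this reads as TRUE | (FALSE & FALSE)
TRUE | FALSE & FALSE

# Parentheses change the grouping
(TRUE | FALSE) & FALSE

# & and | work element by element on vectors
c(TRUE, FALSE) & c(TRUE, TRUE)
```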
Assigning Values to Variables
Assign a value with the <- operator: foo <- 2 + 2 stores 4 in foo. Multiplying foo by 3 then gives 12.
foo * 3
[1] 12
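The assignment step can be sketched in full as follows. It assumes foo was created with the <- operator, which is the usual R convention.

```r
# Assign the result of 2 + 2 to foo using the <- operator
foo <- 2 + 2

# Typing the name prints the stored value
foo

# Using foo in an expression does not change foo itself
foo * 3
foo
```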
Use the ls() function to obtain a list of the variables currently defined.
ls()
[1] "a" "anova_out" "anxiety_ak" "anxiety_hi" "assets.df" "b"
[7] "Batteries" "coffeeA" "coffeeB" "crop_out" "d" "data"
[13] "df" "DFAsg1" "dif" "domestic_rate" "e" "f"
[19] "file" "first_matrix" "flu_med" "flu_nomed" "foo" "foreign_rate"
[25] "g" "gardens" "growth" "h" "Is_x1_greater" "isWinnerTaller"
[31] "k" "metal_data" "myvector" "myvector1" "myvector1a" "myvector2"
[37] "myvector2a" "opponent" "pooledSD" "presid_name" "sample1" "sample2"
[43] "score_buli" "score_bull" "score_normal" "second_matrix" "spcoffe" "sumAB"
[49] "sumTotal" "temp_matrix" "test_matrix" "test_vector" "Two_Means_Ex1" "Two_Means_Ex2"
[55] "twoway_out" "winner" "x" "x1" "x2" "y"
[61] "year"
To remove a variable from the list, use rm() (short for remove), here rm(foo).
Show the list of variables again to confirm that foo was removed.
ls()
[1] "a" "anova_out" "anxiety_ak" "anxiety_hi" "assets.df" "b"
[7] "Batteries" "coffeeA" "coffeeB" "crop_out" "d" "data"
[13] "df" "DFAsg1" "dif" "domestic_rate" "e" "f"
[19] "file" "first_matrix" "flu_med" "flu_nomed" "foreign_rate" "g"
[25] "gardens" "growth" "h" "Is_x1_greater" "isWinnerTaller" "k"
[31] "metal_data" "myvector" "myvector1" "myvector1a" "myvector2" "myvector2a"
[37] "opponent" "pooledSD" "presid_name" "sample1" "sample2" "score_buli"
[43] "score_bull" "score_normal" "second_matrix" "spcoffe" "sumAB" "sumTotal"
[49] "temp_matrix" "test_matrix" "test_vector" "Two_Means_Ex1" "Two_Means_Ex2" "twoway_out"
[55] "winner" "x" "x1" "x2" "y" "year"
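Instead of reading through the whole listing, the check can be done directly. This is a small sketch that recreates foo first so that it runs on its own.

```r
# Define a variable, confirm it is listed, then remove it
foo <- 2 + 2
"foo" %in% ls()   # is foo among the defined variables?

rm(foo)
"foo" %in% ls()   # foo is gone after rm()
```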
Vectors
A vector is a sequence of elements that share the same data type.
Use the c() function (c as in concatenate or combine) to create a vector. With bar <- c(2, 5, 10, 2, 1), bar is the vector containing 2, 5, 10, 2, 1.
To see the values in bar, just type bar.
bar
[1] 2 5 10 2 1
Here we create the vector baz with the values 2, 2, 3, 3, 3, for example with baz <- c(2, 2, 3, 3, 3).
Then type baz to display the vector.
baz
[1] 2 2 3 3 3
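A minimal sketch of building these vectors with c(). The names both and mixed are hypothetical and appear only in this example. The last part illustrates the "same data type" rule: mixing types makes R convert everything to one common type.

```r
# Build the two vectors used in this section
bar <- c(2, 5, 10, 2, 1)
baz <- c(2, 2, 3, 3, 3)

# c() can also join existing vectors end to end
both <- c(bar, baz)
both

# Mixing types forces a single common type (here: character)
mixed <- c(1, "a", TRUE)
class(mixed)
```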
Use the rep() (replicate) function to repeat the value 2 five times.
rep(2,5)
[1] 2 2 2 2 2
The colon operator creates a vector of consecutive numbers, in this case from 1 to 5.
1:5
[1] 1 2 3 4 5
Create a sequence from 1 to 10 in steps of 2, that is, each number is the previous number plus 2.
For example, the first number is 1, then 1 + 2 = 3 is the next number, then 3 + 2 = 5, and so on. The sequence stops at 9 because 9 + 2 = 11 is greater than 10.
seq(1,10,by=2)
[1] 1 3 5 7 9
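A few variations on rep(), the colon operator, and seq(), as an illustrative sketch. The times and length.out arguments are part of base R; the values are chosen only for the example.

```r
# Repeat a whole vector, or each element a given number of times
rep(c(2, 3), times = 2)          # 2 3 2 3
rep(c(2, 3), times = c(2, 3))    # 2 2 3 3 3

# Count down with the colon operator
5:1

# Ask seq() for a number of points instead of a step size
seq(1, 10, length.out = 4)       # 1 4 7 10
```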
Adding vectors. Here we add bar and baz element by element: the 1st item in bar (2) is added to the 1st item in baz (2), giving 4, then the 2nd item in bar is added to the 2nd item in baz, 5 + 2 = 7, and so on.
bar + baz
[1] 4 7 13 5 4
Comparing vectors. Here we compare the 1st item in bar to the 1st item in baz, the 2nd item in bar to the 2nd item in baz, and so on. Item 1 in bar is 2 and item 1 in baz is 2, so the first result is TRUE. Item 2 in bar is 5 and item 2 in baz is 2, so the second result is FALSE.
bar == baz
[1] TRUE FALSE FALSE FALSE FALSE
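Element-by-element operations can be summarized with a few base R helpers. This is a sketch using the same bar and baz values; which(), sum(), all() and any() are standard functions.

```r
bar <- c(2, 5, 10, 2, 1)
baz <- c(2, 2, 3, 3, 3)

# A single number is recycled across every element
bar + 1
bar * 2

# Positions where the two vectors agree, and how many there are
which(bar == baz)
sum(bar == baz)

# Are all elements equal? Is at least one equal?
all(bar == baz)
any(bar == baz)
```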
To find the length (size) of a vector, use length(). There are 5 items in bar, so 5 is returned.
length(bar)
[1] 5
Find the minimum value in a vector. The smallest value in bar is 1.
min(bar)
[1] 1
Find the average (mean) of a vector. The mean of bar is (2 + 5 + 10 + 2 + 1) / 5 = 4.
mean(bar)
[1] 4
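Other summaries follow the same pattern. The sketch below also checks the mean by hand as the total divided by the number of items.

```r
bar <- c(2, 5, 10, 2, 1)

max(bar)      # largest value
sum(bar)      # total of all values
range(bar)    # smallest and largest together

# The mean computed by hand: total divided by the number of items
sum(bar) / length(bar)
```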
Access parts of a vector or recalling a value of the vector bar.
First we display the vector bar to recall what values bar holds.
bar
[1] 2 5 10 2 1
To recall a value from the vector bar, write bar followed by square brackets containing the position of the value you want. In this case we recall position 1 of the vector, which is 2.
bar[1]
[1] 2
To obtain the value in the last position, use length(bar), which is 5, as the position. The value in the last position of bar is 1.
bar[length(bar)]
[1] 1
Extract multiple values from a vector by passing a vector of positions. Here we extract the values in the 2nd, 3rd, and 4th positions of bar, which are 5, 10, and 2.
bar[c(2,3,4)]
[1] 5 10 2
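Some further indexing forms, as a sketch. Negative and logical indexing are standard R features that these notes do not cover; bar_copy is a hypothetical name used so that bar itself stays unchanged.

```r
bar <- c(2, 5, 10, 2, 1)

# A range of positions
bar[2:4]

# Negative positions drop elements instead of selecting them
bar[-1]

# A logical condition keeps the elements where it is TRUE
bar[bar > 2]

# Positions can also be used to replace a value, here in a copy
bar_copy <- bar
bar_copy[1] <- 100
bar_copy
```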
Vectors can also store string (character) values or logical values. Here quxx is a vector of strings.
quxx
[1] "a" "b" "cde" "fg"
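A sketch of how such a vector might be created and inspected. It assumes quxx was built with c(); the name long_strings is hypothetical.

```r
# A character (string) vector
quxx <- c("a", "b", "cde", "fg")
class(quxx)
nchar(quxx)     # number of characters in each string

# A logical vector, here produced by a comparison
long_strings <- nchar(quxx) > 1
long_strings
quxx[long_strings]
```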
DATA FRAMES Section:
Data frames are like spreadsheets, with rows as observations and columns as variables.
To create a data frame, we use the data.frame() function.
Most often you will be using data frames loaded from a file. Here we loaded the schoolsurvey data from a web site and called it dat.
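A minimal sketch of data.frame() with made-up values. The data frame toy_df and its columns name, age, and smokes are hypothetical and are not the schoolsurvey data.

```r
# Hypothetical data: one row per respondent, one column per variable
toy_df <- data.frame(
  name   = c("Ann", "Ben", "Cal"),
  age    = c(21, 35, 28),
  smokes = c(FALSE, TRUE, FALSE)
)

toy_df          # print the whole data frame
nrow(toy_df)    # number of observations (rows)
ncol(toy_df)    # number of variables (columns)
toy_df$age      # one column, returned as a vector
toy_df[2, ]     # one row, returned as a data frame
```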
HOW TO MAKE A RANDOM SAMPLE
Select a random sample by using the sample() function.
Here we are going to randomly select 5 values between 1 and 10.
The first argument gives the vector of data to select elements from, here 1:10.
The second argument (size) gives the size of the sample to select.
sample(1:10, size=5)
[1] 4 1 2 8 6
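Because the result is random, each run gives different numbers. The sketch below shows one way to make a run repeatable with set.seed(), an addition to these notes. The seed value 42 and the name draw are arbitrary choices.

```r
# Fix the random seed so the same sample is drawn on each run
set.seed(42)            # 42 is an arbitrary choice
draw <- sample(1:10, size = 5)
draw

# Sampling is without replacement by default, so no value repeats
anyDuplicated(draw) == 0

# replace = TRUE allows the same value to be drawn more than once
sample(1:10, size = 5, replace = TRUE)
```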
Taking a sample from a data frame is a little more complicated and takes two steps:
1. Use sample() to select a sample of size n from a vector of the row numbers of the data frame.
2. Use the index operator [ to select those rows from the data frame.
Suppose we would like to select a random sample of size 5. First, define a variable, in this case n, with the size of the sample, i.e. n <- 5.
Now select a sample of size 5 from the vector 1 to 10 (in this part bar is a data frame with 10 rows). Use the function nrow(), which finds the number of rows in bar, instead of entering the number manually.
To create a vector with all the integers between 1 and the number of rows in bar and sample from it, use samplerows <- sample(1:nrow(bar), size=n)
To display the sampled row numbers, type samplerows.
samplerows
[1] 10 6 9 8 7
The variable samplerows contains the row numbers of bar that make up a random sample from all the rows in bar.
Create a new data frame called barsample with a random sample of rows from bar.
Extract those rows from bar by indexing with samplerows: barsample <- bar[samplerows, ]
The same can be done in one line of code: barsample <- bar[sample(1:nrow(bar), size=n), ]
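The two steps can be sketched end to end as follows. The 10-row data frame built here, with columns id and score, is a hypothetical stand-in for bar, since the original data are not shown. The name barsample2 is also hypothetical.

```r
# Hypothetical stand-in for the 10-row data frame called bar
bar <- data.frame(id = 1:10, score = seq(50, 95, by = 5))

n <- 5                                        # sample size
samplerows <- sample(1:nrow(bar), size = n)   # step 1: pick row numbers
barsample <- bar[samplerows, ]                # step 2: extract those rows
barsample

# The same thing in one line
barsample2 <- bar[sample(1:nrow(bar), size = n), ]
```

The comma inside the brackets matters: bar[samplerows, ] selects rows and keeps every column.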
USING TABLES SECTION
The table() command summarizes data in tables. Its simplest usage is table(x), where x is a categorical variable.
For example, a survey asks people if they smoke or not. The data are
Yes, No, No, Yes, Yes
We can enter this into R with the c() command, x <- c("Yes", "No", "No", "Yes", "Yes"), and summarize it with the table() command as follows:
table(x)
x
No Yes
2 3
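A runnable sketch of the survey example. The name counts is hypothetical; prop.table() is a base R function that turns the counts into proportions.

```r
# The five survey answers
x <- c("Yes", "No", "No", "Yes", "Yes")

counts <- table(x)
counts                 # counts per category, in alphabetical order
counts["Yes"]          # look up one category by name
prop.table(counts)     # proportions instead of counts
```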
NUMERIC MEASURES OF CENTER AND SPREAD
Suppose CEO yearly compensations are sampled and the following are found (in millions):
12, 0.4, 5, 2, 50, 8, 3, 1, 4, 0.25
Store the values in the vector sals. To get the mean of sals:
mean(sals)
[1] 8.565
To find the variance of sals
var(sals)
[1] 225.5145
Find the Standard Deviation of sals
sd(sals)
[1] 15.01714
Tukey’s five number summary (minimum, lower quartile, median, upper quartile, maximum) is useful for boxplots.
To obtain summary statistics for sals, which also include the mean:
summary(sals)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.250   1.250   3.500   8.565   7.250  50.000
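A sketch of related measures. The data vector here is an assumption: it uses the ten values 12, 0.4, 5, 2, 50, 8, 3, 1, 4, 0.25, which reproduce the mean, variance, and quartiles printed above. fivenum() is the base R function for Tukey's five numbers, and its hinges can differ slightly from the quartiles that summary() reports.

```r
# Ten values consistent with the outputs shown (assumed data)
sals <- c(12, 0.4, 5, 2, 50, 8, 3, 1, 4, 0.25)

median(sals)    # middle value, less affected by the 50 than the mean
fivenum(sals)   # min, lower hinge, median, upper hinge, max
IQR(sals)       # interquartile range: 3rd quartile minus 1st quartile

# sd() is the square root of var()
sqrt(var(sals))
```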
HOW ABOUT THE MODE SECTION:
In R we can write our own functions, and a first example of a function is shown below in order to compute the mode of a vector of observations x.
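Create a function to find the mode, i.e. the most frequent value. The helper vector ux holds the unique values of x and must be defined inside the function.
getMode <- function(x) { ux <- unique(x); ux[which.max(tabulate(match(x, ux)))] }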
Find the most frequent value in the vector baz by using the function getMode.
getMode(baz)
[1] 3
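A commented sketch of one way to write getMode. It assumes ux is the vector of unique values, which is what the expression above requires. The intermediate name counts is hypothetical. The sketch also shows how ties are handled under this definition.

```r
getMode <- function(x) {
  ux <- unique(x)                    # distinct values, in order of first appearance
  counts <- tabulate(match(x, ux))   # how often each distinct value occurs
  ux[which.max(counts)]              # value with the highest count
}

baz <- c(2, 2, 3, 3, 3)
getMode(baz)

# With a tie, the value that appears first in x is returned
getMode(c(5, 7, 7, 5))

# It also works for character vectors
getMode(c("Yes", "No", "No", "Yes", "Yes"))
```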
