Attempt all questions. DO NOT use ChatGPT or Bard; try each question yourself first and, if you get stuck, consult the cheat sheets from the Posit blog, the R for Data Science book, or ask your peers.

Vectors:

Create a vector containing elements 10, 22, 27, 19, 20 and assign it with a name.

# your code here
vec <- c(10,22,27,19,20)

Use R as a calculator to compute the following values. a). 27(38-15) b). ln(14^7) c). sqrt(436/12)

# your code here
27*(38-15)
## [1] 621
log(14^7)  # log() uses base e (natural logarithm) by default
## [1] 18.4734
sqrt(436/12)
## [1] 6.027714

  1. Create the following vector: b = (87, 86, 85, …, 56)
  2. What are the 19th, 20th, and 21st elements of b?

# your code here
b <- 87:56;b
##  [1] 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63
## [26] 62 61 60 59 58 57 56
b[19:21]
## [1] 69 68 67

Compute the following statistics of b: 1) sum 2) median 3) standard deviation

# your code here
sum(b)
## [1] 2288
median(b)
## [1] 71.5
sd(b)
## [1] 9.380832

Create a vector that contains 100 elements, each with the value 1. (Hint: use ?rep for help.) Use the option results = 'hide' in the code chunk to hide the result.

# your code here
(vec2 <- rep(1,100))
##   [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
##  [38] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
##  [75] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
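
Printing 100 ones is not very informative. As a small sketch, one way to confirm the vector without displaying it is to check its length and contents. The name `ones` is illustrative and not part of the worksheet. Note that wrapping an assignment in parentheses, as in `(vec2 <- rep(1,100))`, forces R to print the value; a plain assignment prints nothing.

```r
ones <- rep(1, times = 100)  # plain assignment: nothing is printed
length(ones)                 # number of elements: 100
all(ones == 1)               # TRUE when every element equals 1
sum(ones)                    # 100, since each element is 1
```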

Matrices:

Create a matrix with 16 elements and 4 rows, filled once row-wise and once column-wise, and store the two versions as Ex1 and Ex2.

  1. Access 3rd row 3rd column element of the matrices

  2. Access 2nd row 1st column element of them

# your code here
m <- sort(sample(1:12, 16, replace = TRUE))
ex1 <- matrix(m, 4, byrow = TRUE); ex1
##      [,1] [,2] [,3] [,4]
## [1,]    1    2    3    4
## [2,]    4    4    5    7
## [3,]    8    8    9   10
## [4,]   11   11   11   12
ex2 <- matrix(m, 4, byrow = FALSE); ex2
##      [,1] [,2] [,3] [,4]
## [1,]    1    4    8   11
## [2,]    2    4    8   11
## [3,]    3    5    9   11
## [4,]    4    7   10   12
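
The two access questions can be answered with `matrix[row, column]` indexing. The sketch below hard-codes the values shown in the printed output above, because `sample()` draws different numbers on every run; that hard-coding is an assumption made only so the example is reproducible.

```r
# Values copied from the printed matrices above (the original draw was random)
m <- c(1, 2, 3, 4, 4, 4, 5, 7, 8, 8, 9, 10, 11, 11, 11, 12)
ex1 <- matrix(m, nrow = 4, byrow = TRUE)   # filled row by row
ex2 <- matrix(m, nrow = 4, byrow = FALSE)  # filled column by column

# 1. Element in the 3rd row, 3rd column
ex1[3, 3]  # 9
ex2[3, 3]  # 9

# 2. Element in the 2nd row, 1st column
ex1[2, 1]  # 4
ex2[2, 1]  # 2
```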

Imagine you have data on temperature readings for different cities in Africa for 3 months. How would you create a matrix in R to store this data, with cities as rows and months as columns?

# your code here
temp <- sample(36:42, 15, replace = TRUE)
cities <- c("Nairobi","Accra","Cairo","Capetown","Mogadishu")
temp_data <- matrix(temp,5)
colnames(temp_data) <- month.abb[6:8]
row.names(temp_data) <- cities
temp_data
##           Jun Jul Aug
## Nairobi    37  40  39
## Accra      41  42  38
## Cairo      42  36  39
## Capetown   40  40  41
## Mogadishu  42  36  42
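
As an alternative sketch, the row and column names can be supplied when the matrix is created, through the `dimnames` argument of `matrix()`. The readings are copied from the printed output above so the result is reproducible; the names `readings`, `months` and `temp_fixed` are illustrative.

```r
cities <- c("Nairobi", "Accra", "Cairo", "Capetown", "Mogadishu")
months <- month.abb[6:8]            # "Jun" "Jul" "Aug"
readings <- c(37, 41, 42, 40, 42,   # Jun (matrix() fills column by column)
              40, 42, 36, 40, 36,   # Jul
              39, 38, 39, 41, 42)   # Aug

# dimnames takes a list: row names first, then column names
temp_fixed <- matrix(readings, nrow = 5, dimnames = list(cities, months))

temp_fixed["Cairo", "Jul"]  # 36, looked up by name instead of position
dim(temp_fixed)             # 5 rows, 3 columns
```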

You need to calculate the average temperature for each city over the 3 months. How would you access and manipulate specific elements of the matrix in R to achieve this? Write the code for both row and column-wise calculations

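# your code here
rowMeans(temp_data)  # average per city (row-wise)
##   Nairobi     Accra     Cairo  Capetown Mogadishu 
##  38.66667  40.33333  39.00000  40.33333  40.00000
colMeans(temp_data)  # average per month (column-wise)
##  Jun  Jul  Aug 
## 40.4 38.8 39.8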
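
A more general route is `apply()`, where the second argument selects the direction: 1 for rows (cities) and 2 for columns (months). The sketch below rebuilds the matrix from the printed values so it can be run on its own; `temp_fixed`, `city_avg` and `month_avg` are illustrative names.

```r
temp_fixed <- matrix(
  c(37, 40, 39,
    41, 42, 38,
    42, 36, 39,
    40, 40, 41,
    42, 36, 42),
  nrow = 5, byrow = TRUE,
  dimnames = list(c("Nairobi", "Accra", "Cairo", "Capetown", "Mogadishu"),
                  c("Jun", "Jul", "Aug"))
)

city_avg  <- apply(temp_fixed, 1, mean)  # one mean per row (city)
month_avg <- apply(temp_fixed, 2, mean)  # one mean per column (month)

# A single city can be averaged by selecting its row by name
nairobi_avg <- mean(temp_fixed["Nairobi", ])  # (37 + 40 + 39) / 3
```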

Data Frames:

Suppose you have information about different employees, including their name, department, age, and salary. How would you create a data frame in R to store this data? Include different data types for each variable.

# your code here
names <- c("denzell","joshua","lawrence","ismail","amos","abuga")
department <- c("transport","marketing","nutrition","finance","security","recreation")
age <- c(20,19,21,21,24,22)
salary <- c(75000,60000,70000,90000,85000,45000)
data <- data.frame(names,department,age,salary)
data
##      names department age salary
## 1  denzell  transport  20  75000
## 2   joshua  marketing  19  60000
## 3 lawrence  nutrition  21  70000
## 4   ismail    finance  21  90000
## 5     amos   security  24  85000
## 6    abuga recreation  22  45000
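
The question asks for different data types, while the answer above stores only character and numeric columns. One possible variant, offered as a sketch, keeps names as character, department as a factor, age as integer and salary as numeric. The choice of types and the name `employees` are assumptions for illustration.

```r
employees <- data.frame(
  names      = c("denzell", "joshua", "lawrence", "ismail", "amos", "abuga"),
  department = factor(c("transport", "marketing", "nutrition",
                        "finance", "security", "recreation")),
  age        = c(20L, 19L, 21L, 21L, 24L, 22L),  # L suffix makes integers
  salary     = c(75000, 60000, 70000, 90000, 85000, 45000),
  stringsAsFactors = FALSE  # keep names as character on older R versions too
)

# Inspect the class of each column
col_types <- sapply(employees, class)
col_types
```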

You want to filter the data frame to find employees in the “Marketing” department with a salary above Ksh50,000. Write the code using appropriate indexing and logical operators.

# your code here
subset(data, department == "marketing" & salary > 50000)
##    names department age salary
## 2 joshua  marketing  19  60000
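
The same filter can be written with bracket indexing and a logical condition, which is closer to what the question asks for. This sketch recreates the data frame so it runs on its own. String comparison in R is case-sensitive, so the condition uses the lowercase "marketing" stored in the data rather than "Marketing".

```r
data <- data.frame(
  names      = c("denzell", "joshua", "lawrence", "ismail", "amos", "abuga"),
  department = c("transport", "marketing", "nutrition",
                 "finance", "security", "recreation"),
  age        = c(20, 19, 21, 21, 24, 22),
  salary     = c(75000, 60000, 70000, 90000, 85000, 45000)
)

# Logical vector: TRUE for rows that satisfy both conditions
keep <- data$department == "marketing" & data$salary > 50000

# Rows are selected before the comma; leaving the column slot empty keeps all columns
marketing_high <- data[keep, ]
marketing_high
```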

Lists:

Create a list that stores different types of data: a numeric vector, a character string, and another list.

# your code here
my_list1 <- list(24.5,c(12,45,67,82,46),"somebody to you-the vamps",list(25:40,c(23,65,71,91),"somebody you .."));my_list1
## [[1]]
## [1] 24.5
## 
## [[2]]
## [1] 12 45 67 82 46
## 
## [[3]]
## [1] "somebody to you-the vamps"
## 
## [[4]]
## [[4]][[1]]
##  [1] 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
## 
## [[4]][[2]]
## [1] 23 65 71 91
## 
## [[4]][[3]]
## [1] "somebody you .."

You want to add a new element (a logical value) to the end of the list. How would you achieve this? Include different methods.

# your code here
my_list2 <- c(my_list1,TRUE);my_list2
## [[1]]
## [1] 24.5
## 
## [[2]]
## [1] 12 45 67 82 46
## 
## [[3]]
## [1] "somebody to you-the vamps"
## 
## [[4]]
## [[4]][[1]]
##  [1] 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
## 
## [[4]][[2]]
## [1] 23 65 71 91
## 
## [[4]][[3]]
## [1] "somebody you .."
## 
## 
## [[5]]
## [1] TRUE
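
The question asks for different methods, and `c()` is only one of them. Two other common options are `append()` and assigning to the position one past the end of the list. The sketch below uses a shorter, illustrative list (`base_list`) in place of `my_list1`.

```r
base_list <- list(24.5, c(12, 45), "a string", list(1:3, "nested"))

# Method 1: c() combines the list with the new element
by_c <- c(base_list, TRUE)

# Method 2: append() adds values after the last element by default
by_append <- append(base_list, list(TRUE))

# Method 3: assign to the position one past the current end
by_index <- base_list
by_index[[length(by_index) + 1]] <- TRUE

# All three produce a list of length 5 whose last element is TRUE
length(by_c)
by_index[[5]]
```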

Factors:

Imagine you have survey data where participants chose their favorite R packages from a set of options. How would you create a factor variable in R to store this data?

# your code here
packages <- c("ggplot","shiny","tidyverse","dplyr","tidyr")
raw.data <- c("dplyr","shiny","tidyverse","shiny","dplyr","ggplot","dplyr","ggplot","tidyverse","dplyr","tidyr","shiny","dplyr")
faves <- factor(raw.data,levels = packages);faves
##  [1] dplyr     shiny     tidyverse shiny     dplyr     ggplot    dplyr    
##  [8] ggplot    tidyverse dplyr     tidyr     shiny     dplyr    
## Levels: ggplot shiny tidyverse dplyr tidyr
summary(faves)
##    ggplot     shiny tidyverse     dplyr     tidyr 
##         2         3         2         5         1

You want to analyze the number of people who chose each color. How would you use the table function and factor levels to get a frequency table? Write the code and interpret the results.

# your code here
color <- c("red","blue","green","purple")
collected <- c("blue","purple","red","blue","green","purple","green","red","red","blue","green","purple","red","blue","green","blue","red","purple")
col_fact <- factor(collected,levels = color)
summary(col_fact)
##    red   blue  green purple 
##      5      5      4      4
col_frequency <- table(col_fact);col_frequency
## col_fact
##    red   blue  green purple 
##      5      5      4      4

Bonus

Describe the difference between is.finite(x) and !is.infinite(x).

# your answer here
cat("is.finite(x) is TRUE only for ordinary numbers, so it is FALSE for Inf, -Inf, NA and NaN; !is.infinite(x) is TRUE for everything except Inf and -Inf, so it is also TRUE for NA and NaN.")
## is.finite(x) is TRUE only for ordinary numbers, so it is FALSE for Inf, -Inf, NA and NaN; !is.infinite(x) is TRUE for everything except Inf and -Inf, so it is also TRUE for NA and NaN.
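
A short check makes the difference concrete: the two expressions agree on ordinary numbers and on infinities, and disagree only on NA and NaN. The vector `x` below is an illustrative example.

```r
x <- c(1, Inf, -Inf, NA, NaN)

finite_x  <- is.finite(x)     # TRUE FALSE FALSE FALSE FALSE
not_inf_x <- !is.infinite(x)  # TRUE FALSE FALSE TRUE TRUE

# Positions where the two results differ: the NA and the NaN
differs_at <- which(finite_x != not_inf_x)
differs_at  # 4 5
```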

Install the swirl package by running install.packages("swirl"), then complete the course "R Programming: The basics of programming in R". Write your code in a markdown file, publish it on your RPubs account, and push it to GitHub in a repository named dekut_r_sessions.