Attempt all questions. DO NOT use ChatGPT or Bard; instead, try each question yourself first. If you get stuck, refer to the cheat sheets from the Posit blog or the R for Data Science book, or ask your peers.

Vectors:

Create a vector containing the elements 10, 22, 27, 19, 20 and assign it to a name.

# your code here
vec <- c(10,22,27,19,20)

Use R as a calculator to compute the following values: a) 27(38-15) b) ln(14^7) c) sqrt(436/12)

# your code here
27*(38-15)

## [1] 621

log(14^7,base = exp(1))

## [1] 18.4734

sqrt(436/12)

## [1] 6.027714
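
As a quick cross-check of part b), here is a minimal sketch. It relies on the fact that `log()` in base R defaults to the natural logarithm, so the `base` argument is optional, and on the power rule ln(14^7) = 7 ln(14). The variable names are only illustrative.

```r
# log() uses base e by default, so base = exp(1) is optional
explicit_base <- log(14^7, base = exp(1))
default_base  <- log(14^7)

# power rule: ln(14^7) = 7 * ln(14)
power_rule <- 7 * log(14)

all.equal(explicit_base, default_base)  # TRUE
all.equal(explicit_base, power_rule)    # TRUE
```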

Create the following vector: b = (87, 86, 85, …, 56)
What are the 19th, 20th, and 21st elements of b?

# your code here
b <- 87:56;b

##  [1] 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63
## [26] 62 61 60 59 58 57 56

b[19:21]

## [1] 69 68 67
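
The indexing above can be sanity-checked with a short sketch. Because b counts down by 1 from 87, its i-th element should equal 88 - i. The names `b_seq` and `i` are illustrative and not part of the assignment.

```r
b <- 87:56

# an equivalent construction using seq() with a negative step
b_seq <- seq(from = 87, to = 56, by = -1)
all(b == b_seq)   # TRUE

# b counts down by 1 from 87, so the i-th element is 88 - i
i <- 19:21
b[i]              # 69 68 67
88 - i            # 69 68 67
```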

Compute the following statistics of b: 1) sum 2) median 3) standard deviation

# your code here
sum(b)

## [1] 2288

median(b)

## [1] 71.5

sd(b)

## [1] 9.380832
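
Since b is a run of consecutive integers, the three results can be cross-checked by hand. The sketch below uses the arithmetic-series formulas; the `*_formula` names are illustrative.

```r
b <- 87:56
n <- length(b)                         # 32 elements

# arithmetic series: n * (first + last) / 2
sum_formula <- n * (87 + 56) / 2

# the sequence is symmetric, so the median is the midpoint of first and last
median_formula <- (87 + 56) / 2

# sample variance of n consecutive integers is n * (n + 1) / 12
sd_formula <- sqrt(n * (n + 1) / 12)

all.equal(sum_formula, sum(b))         # TRUE
all.equal(median_formula, median(b))   # TRUE
all.equal(sd_formula, sd(b))           # TRUE
```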

Create a vector that contains 100 elements, each with the value 1. (Hint: use ?rep for help.) Use the option results = 'hide' in the code chunk to hide the result.

# your code here
# with the chunk option results = 'hide', the printed vector is not shown
(vec2 <- rep(1, 100))
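
A few alternative ways to build the same vector are sketched below, mainly to show that `rep()` is not the only option. In an R Markdown file the hiding itself is done in the chunk header, for example `{r, results = 'hide'}`. The names `ones_*` are illustrative.

```r
# three ways to build a vector of one hundred 1s
ones_rep <- rep(1, times = 100)
ones_len <- rep_len(1, length.out = 100)
ones_num <- numeric(100) + 1   # start from 100 zeros, then add 1

length(ones_rep)               # 100
identical(ones_rep, ones_len)  # TRUE
identical(ones_rep, ones_num)  # TRUE
```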

Matrices:

Create a matrix with 16 elements and 4 rows, filled both row-wise and column-wise, and store the two versions as Ex1 and Ex2.

Access the element in the 3rd row and 3rd column of each matrix.
Access the element in the 2nd row and 1st column of each matrix.

# your code here
m <- sort(sample(1:12, 16, replace = TRUE))
ex1 <- matrix(m, nrow = 4, byrow = TRUE); ex1

##      [,1] [,2] [,3] [,4]
## [1,]    1    2    3    4
## [2,]    4    4    5    7
## [3,]    8    8    9   10
## [4,]   11   11   11   12

ex2 <- matrix(m, nrow = 4, byrow = FALSE); ex2

##      [,1] [,2] [,3] [,4]
## [1,]    1    4    8   11
## [2,]    2    4    8   11
## [3,]    3    5    9   11
## [4,]    4    7   10   12
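
The two element-access parts of the question use `[row, column]` indexing. Because the matrices above are built from a random sample, the sketch below uses the fixed values 1 to 16 instead, so the results are reproducible. The names `ex1_demo` and `ex2_demo` are illustrative stand-ins for the matrices above.

```r
vals <- 1:16
ex1_demo <- matrix(vals, nrow = 4, byrow = TRUE)   # filled row by row
ex2_demo <- matrix(vals, nrow = 4, byrow = FALSE)  # filled column by column

# 3rd row, 3rd column
ex1_demo[3, 3]  # 11
ex2_demo[3, 3]  # 11

# 2nd row, 1st column
ex1_demo[2, 1]  # 5
ex2_demo[2, 1]  # 2

# filling by row is the transpose of filling by column
identical(ex1_demo, t(ex2_demo))  # TRUE
```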

Imagine you have data on temperature readings for different cities in Africa over 3 months. How would you create a matrix in R to store this data, with cities as rows and months as columns?

# your code here
temp <- sample(36:42, 15, replace = TRUE)
cities <- c("Nairobi", "Accra", "Cairo", "Capetown", "Mogadishu")
temp_data <- matrix(temp, nrow = 5)
colnames(temp_data) <- month.abb[6:8]
rownames(temp_data) <- cities
temp_data

##           Jun Jul Aug
## Nairobi    37  40  39
## Accra      41  42  38
## Cairo      42  36  39
## Capetown   40  40  41
## Mogadishu  42  36  42

You need to calculate the average temperature for each city over the 3 months. How would you access and manipulate specific elements of the matrix in R to achieve this? Write the code for both row and column-wise calculations

# your code here
# colMeans() averages down each column, giving one mean per month across the five cities
colMeans(temp_data)

##  Jun  Jul  Aug 
## 40.4 38.8 39.8
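
The per-city averages the question asks for are row means. The sketch below rebuilds the matrix from the values printed above (the original was a random sample, so it cannot be regenerated exactly) and shows the row-wise calculation alongside the column-wise one. The name `temp_demo` is illustrative.

```r
# values copied from the printed temp_data matrix above
temp_demo <- matrix(
  c(37, 40, 39,
    41, 42, 38,
    42, 36, 39,
    40, 40, 41,
    42, 36, 42),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    c("Nairobi", "Accra", "Cairo", "Capetown", "Mogadishu"),
    c("Jun", "Jul", "Aug")
  )
)

# row-wise: average temperature per city over the 3 months
city_means <- rowMeans(temp_demo)
city_means

# the same calculation with apply(); MARGIN = 1 means "over rows"
apply(temp_demo, 1, mean)

# column-wise: average temperature per month across cities
month_means <- colMeans(temp_demo)
month_means

# accessing specific elements by name
temp_demo["Cairo", "Jul"]        # a single reading
mean(temp_demo["Nairobi", ])     # one city's average
```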

Data Frames

Suppose you have information about different employees, including their name, department, age, and salary. How would you create a data frame in R to store this data? Include different data types for each variable.

# your code here
names <- c("denzell","joshua","lawrence","ismail","amos","abuga")
department <- c("transport","marketing","nutrition","finance","security","recreation")
age <- c(20,19,21,21,24,22)
salary <- c(75000,60000,70000,90000,85000,45000)
data <- data.frame(names,department,age,salary)
data

##      names department age salary
## 1  denzell  transport  20  75000
## 2   joshua  marketing  19  60000
## 3 lawrence  nutrition  21  70000
## 4   ismail    finance  21  90000
## 5     amos   security  24  85000
## 6    abuga recreation  22  45000
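
To make the "different data types" part of the question explicit, the sketch below builds a smaller data frame with one column per type and inspects the classes. The `full_time` column and the name `staff` are hypothetical additions for illustration only.

```r
staff <- data.frame(
  name       = c("denzell", "joshua"),                # character
  department = factor(c("transport", "marketing")),   # factor
  age        = c(20L, 19L),                           # integer
  salary     = c(75000, 60000),                       # numeric (double)
  full_time  = c(TRUE, FALSE),                        # logical (hypothetical column)
  stringsAsFactors = FALSE
)

# inspect the structure and the class of each column
str(staff)
col_classes <- sapply(staff, class)
col_classes
```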

You want to filter the data frame to find employees in the “Marketing” department with a salary above Ksh 50,000. Write the code using appropriate indexing and logical operators.

# your code here
subset(data, department == "marketing" & salary > 50000)

##    names department age salary
## 2 joshua  marketing  19  60000
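
Since the question mentions indexing and logical operators, the same filter can also be sketched with a logical vector inside square brackets. The data frame is rebuilt here under the illustrative name `employees` so that the example is self-contained.

```r
employees <- data.frame(
  names      = c("denzell", "joshua", "lawrence", "ismail", "amos", "abuga"),
  department = c("transport", "marketing", "nutrition", "finance", "security", "recreation"),
  age        = c(20, 19, 21, 21, 24, 22),
  salary     = c(75000, 60000, 70000, 90000, 85000, 45000)
)

# build a logical vector with & (element-wise AND), then index the rows with it
keep <- employees$department == "marketing" & employees$salary > 50000
marketing_high <- employees[keep, ]
marketing_high

# matching is case-sensitive; tolower() makes "Marketing" match "marketing"
keep_any_case <- tolower(employees$department) == tolower("Marketing") &
  employees$salary > 50000
identical(keep, keep_any_case)  # TRUE
```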

Lists:

Create a list that stores different types of data: a numeric vector, a character string, and another list.

# your code here
my_list1 <- list(24.5,c(12,45,67,82,46),"somebody to you-the vamps",list(25:40,c(23,65,71,91),"somebody you .."));my_list1

## [[1]]
## [1] 24.5
## 
## [[2]]
## [1] 12 45 67 82 46
## 
## [[3]]
## [1] "somebody to you-the vamps"
## 
## [[4]]
## [[4]][[1]]
##  [1] 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
## 
## [[4]][[2]]
## [1] 23 65 71 91
## 
## [[4]][[3]]
## [1] "somebody you .."

You want to add a new element (a logical value) to the end of the list. How would you achieve this? Include different methods.

# your code here
my_list2 <- c(my_list1,TRUE);my_list2

## [[1]]
## [1] 24.5
## 
## [[2]]
## [1] 12 45 67 82 46
## 
## [[3]]
## [1] "somebody to you-the vamps"
## 
## [[4]]
## [[4]][[1]]
##  [1] 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
## 
## [[4]][[2]]
## [1] 23 65 71 91
## 
## [[4]][[3]]
## [1] "somebody you .."
## 
## 
## [[5]]
## [1] TRUE
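
The question asks for different methods, so here is a sketch of three common ways to add an element to the end of a list: `c()`, `append()`, and assignment by `[[ ]]` index. A shorter stand-in list, `base_list`, is used for illustration.

```r
base_list <- list(24.5, c(12, 45), "text", list(1:3, "nested"))

# method 1: c() combines the list with the new value
via_c <- c(base_list, TRUE)

# method 2: append() adds values after the last element by default
via_append <- append(base_list, TRUE)

# method 3: assign to the position one past the current length
via_index <- base_list
via_index[[length(via_index) + 1]] <- TRUE

length(via_c)                  # 5
identical(via_c, via_append)   # TRUE
identical(via_c, via_index)    # TRUE
```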

Factors:

Imagine you have survey data where participants chose their favorite R packages from a set of options. How would you create a factor variable in R to store this data?

# your code here
packages <- c("ggplot2","shiny","tidyverse","dplyr","tidyr")
raw.data <- c("dplyr","shiny","tidyverse","shiny","dplyr","ggplot2","dplyr","ggplot2","tidyverse","dplyr","tidyr","shiny","dplyr")
faves <- factor(raw.data,levels = packages);faves

##  [1] dplyr     shiny     tidyverse shiny     dplyr     ggplot2   dplyr    
##  [8] ggplot2   tidyverse dplyr     tidyr     shiny     dplyr    
## Levels: ggplot2 shiny tidyverse dplyr tidyr

summary(faves)

##   ggplot2     shiny tidyverse     dplyr     tidyr 
##         2         3         2         5         1

You want to analyze the number of people who chose each color. How would you use the table function and factor levels to get a frequency table? Write the code and interpret the results.

# your code here
color <- c("red","blue","green","purple")
collected <- c("blue","purple","red","blue","green","purple","green","red","red","blue","green","purple","red","blue","green","blue","red","purple")
col_fact <- factor(collected,levels = color)
summary(col_fact)

##    red   blue  green purple 
##      5      5      4      4

col_frequency <- table(col_fact);col_frequency

## col_fact
##    red   blue  green purple 
##      5      5      4      4
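
To help with the interpretation part of the question, the sketch below converts the counts to proportions with `prop.table()` and picks out the most popular colors. The names `counts` and `top` are illustrative.

```r
color <- c("red", "blue", "green", "purple")
collected <- c("blue", "purple", "red", "blue", "green", "purple", "green",
               "red", "red", "blue", "green", "purple", "red", "blue",
               "green", "blue", "red", "purple")
col_fact <- factor(collected, levels = color)

counts <- table(col_fact)

# share of respondents per color
round(prop.table(counts), 3)

# color(s) with the highest count
top <- names(counts)[as.vector(counts) == max(counts)]
top
```

Out of 18 responses, red and blue tie as the most popular with 5 votes each (about 27.8% each), while green and purple each received 4 votes (about 22.2% each). Setting the factor levels explicitly fixes the order of the columns and would also show a count of 0 for any listed color that nobody chose.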

Bonus

Describe the difference between is.finite(x) and !is.infinite(x).

# your answer here
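cat("For a numeric vector x, is.finite(x) is TRUE only for ordinary finite numbers, so it is FALSE for NA, NaN, Inf and -Inf. !is.infinite(x) is FALSE only for Inf and -Inf, so it is TRUE for NA and NaN. The two differ on missing and not-a-number values.")

## For a numeric vector x, is.finite(x) is TRUE only for ordinary finite numbers, so it is FALSE for NA, NaN, Inf and -Inf. !is.infinite(x) is FALSE only for Inf and -Inf, so it is TRUE for NA and NaN. The two differ on missing and not-a-number values.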
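
The difference is easiest to see on a small test vector that contains a finite number, a missing value, NaN, and both infinities. A minimal sketch:

```r
x <- c(1, NA, NaN, Inf, -Inf)

finite_flags       <- is.finite(x)
not_infinite_flags <- !is.infinite(x)

finite_flags        # TRUE FALSE FALSE FALSE FALSE
not_infinite_flags  # TRUE  TRUE  TRUE FALSE FALSE

# positions where the two tests disagree: the NA and the NaN
which(finite_flags != not_infinite_flags)  # 2 3
```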


Install the swirl package by running install.packages("swirl") and complete the "R Programming: The basics of programming in R" course. Write your code in a markdown file, publish it on your RPubs account, and push it to GitHub in a repository named dekut_r_sessions.

R data structures assignment

dekut r community

2024-02-13