Assignment 1 (10%)

[Mubarak Ahmed]

[DNO - Student ID: 501345730]

This assignment can be submitted using either Python or R, whichever you prefer. If using R, you must submit an RMD file with its knitted file (PDF or HTML). To learn more about knitting and R markdown, visit R Markdown.

If using Python, you must submit an IPYNB file and its exported PDF/HTML with clearly printed/shown answers.

Failing to submit both files ({RMD + knitted PDF/HTML} OR {IPYNB + PDF/HTML}) will be subject to a 30% mark deduction.

NOTE: IF YOU USE RSTUDIO, YOU SHOULD NEVER HAVE install.packages IN YOUR CODE; OTHERWISE, THE Knit OPTION WILL RAISE AN ERROR. COMMENT OUT ALL PACKAGE INSTALLATIONS BUT KEEP library() CALLS.

NOTE: If you answer the questions in R, all your answers should be in R (ignore Python questions). If you answer the questions in Python, all your answers should be in Python (ignore R questions).

Question 1 (40 Points)

Q. 1a (5 points)

Create a vector of all even numbers from 3 to 49. Hint: Use seq() in R or range() in Python.

x <- seq(4, 48, by = 2) # The even numbers between 3 and 49 run from 4 to 48
x

##  [1]  4  6  8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48

Q. 1b (5 points)

Create and print a vector x with all integers from 1 to 80, and a vector y with the odd integers in the same range. Hint: use the seq() function in R or range() in Python. Calculate the difference in lengths of the vectors x and y. Hint: use length() in R or len() in Python.

x <- 1:80
y <- seq(1, 79, by=2)
x

##  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
## [26] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
## [51] 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75
## [76] 76 77 78 79 80

y

##  [1]  1  3  5  7  9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49
## [26] 51 53 55 57 59 61 63 65 67 69 71 73 75 77 79

length_diff <- length(x)-length(y) # length of x is 80 & length of y is 40, so difference is 40
length_diff

## [1] 40

Q. 1c (10 points)

Create a new vector, “x_square”, with the square of elements at indices 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 from the variable “x”. (Hint: Use indexing rather than a for loop.)

Calculate the mean and median of the LAST 7 values from x_square.

indices <- seq(3, 41, by = 2)
indices

##  [1]  3  5  7  9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41

x_square <- x[indices]^2 # This squares all the elements at given indices
x_square

##  [1]    9   25   49   81  121  169  225  289  361  441  529  625  729  841  961
## [16] 1089 1225 1369 1521 1681

last_seven <- tail(x_square, 7)
mean_last_seven <- mean(last_seven) # This calculates the mean of last 7 values of x_square
median_last_seven <- median(last_seven) # This calculates the median of last 7 values of x_square
mean_last_seven # This prints the mean

## [1] 1241

median_last_seven # This prints the median

## [1] 1225

Q. 1d (10 Points)

For a given factor variable factorVar <- factor(c("3.5", "8.1", "5.2", "9.8")) in R or factorVar = ["3.5", "8.1", "5.2", "9.8"] in Python, convert it into a numeric data type and show proof that it has been converted to numeric.

factorVar <- factor(c("3.5", "8.1", "5.2", "9.8"))
numericVar <- as.numeric(as.character(factorVar))
numericVar # This prints out the numeric values

## [1] 3.5 8.1 5.2 9.8

class(numericVar) # This shows the class as numeric

## [1] "numeric"
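The intermediate as.character() step matters because a factor stores integer level codes internally. The sketch below is only an illustration (the names codes and values are hypothetical, not part of the assignment). It contrasts a direct conversion, which returns the codes, with the two-step conversion used above, which returns the labels parsed as numbers.

```r
# Illustrative sketch: why as.character() comes before as.numeric().
factorVar <- factor(c("3.5", "8.1", "5.2", "9.8"))

levels(factorVar)                               # levels are stored in sorted order

codes  <- as.numeric(factorVar)                 # internal level codes, not the values
values <- as.numeric(as.character(factorVar))   # labels converted to numbers

codes
values
is.numeric(values)                              # TRUE confirms the numeric type
```

Here codes holds the position of each element among the sorted levels, while values holds 3.5, 8.1, 5.2 and 9.8.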

Q. 1e (10 points)

A comma-separated values file dataset.csv contains missing values represented by Not a Number (NaN) and a question mark (?). How can you read this type of file in R?

# na.strings lists the text values that should be read as NA ("null", "?" and "NaN").
# header = FALSE ensures the first row is read as data rather than as column names.
data <- read.csv("E:\\University\\CIND-123\\Asignments\\dataset.csv", header = FALSE, na.strings = c("null", "?", "NaN"))
head(data) # This displays the first rows of the data

# The following code replaces the NA values with the median value of the data
data_matrix <- as.matrix(data) # This converts the data frame into a matrix
median_value <- median(data_matrix, na.rm = TRUE) # This calculates the median of the matrix, ignoring NA values
median_value

## [1] 85.5

data[is.na(data)] <- median_value # This replaces every NA value in the data frame with the median value
data # This displays the data with the NA values replaced by the median value, i.e. 85.5
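Because dataset.csv itself is not reproduced here, the following self-contained sketch shows the same reading technique on a small made-up table. The contents of csv_text and the name demo are assumptions for illustration only. read.csv() accepts a text argument in place of a file path, so the example runs without any file on disk.

```r
# Illustrative sketch with made-up data (not the contents of dataset.csv).
csv_text <- "10,?,30\nNaN,50,60\n70,80,?"

# Treat both "?" and "NaN" as missing values while reading.
demo <- read.csv(text = csv_text, header = FALSE, na.strings = c("?", "NaN"))

demo                  # the three marked cells appear as NA
sum(is.na(demo))      # number of missing cells
sapply(demo, class)   # column types after the missing markers are removed
```

Declaring the markers in na.strings keeps a column containing "?" from being read as character, which is what would happen with the default setting.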

Question 2 (60 Points)

Compute the following using R or Python commands.

Q. 2a (10 points)

\[\sum_{n=1}^{100}\frac{4^{n}}{(n+1)!}\] Hint: Use factorial(n) to compute \(n!\)

n <- 1:100
terms <- 4^n
factorials <- factorial(n + 1)
values <- terms / factorials
result_q2a <- sum(values)
result_q2a

## [1] 12.39954
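As a sanity check on this value, the infinite series has a closed form obtained by shifting the index to \(m = n + 1\) and recognising the exponential series. The terms beyond \(n = 100\) are negligibly small, so the finite sum should agree with the limit to the printed precision:

\[\sum_{n=1}^{\infty}\frac{4^{n}}{(n+1)!} = \frac{1}{4}\sum_{m=2}^{\infty}\frac{4^{m}}{m!} = \frac{e^{4}-1-4}{4} = \frac{e^{4}-5}{4} \approx 12.39954\]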

Q. 2b (10 points)

\[\sum_{n=1}^{25}\left(\frac{3^{n}}{n^3} + \frac{n^{4}}{4^{n}}\right)\]

n <- 1:25
term1 <- 3^n / n^3
term2 <- n^4 / 4^n
values <- term1 + term2
result_q2b <- sum(values)
result_q2b

## [1] 87186876

Q. 2c (10 points)

\[\sum_{n=0}^{100} \frac{(n+1)5^n}{25^{n+1}}\]

n <- 0:100
numerator <- (n + 1) * 5^n # Parentheses are required; without them R evaluates n + (1 * 5^n)
denominator <- 25^(n + 1)
values <- numerator / denominator
result_q2c <- sum(values)
result_q2c

## [1] 0.0625
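As an independent check, the series can be evaluated analytically. Using \(5^{n}/25^{n+1} = \frac{1}{25}\left(\frac{1}{5}\right)^{n}\) and the identity \(\sum_{n=0}^{\infty}(n+1)r^{n} = \frac{1}{(1-r)^{2}}\) for \(|r| < 1\), the infinite sum is

\[\sum_{n=0}^{\infty} \frac{(n+1)5^n}{25^{n+1}} = \frac{1}{25}\sum_{n=0}^{\infty}(n+1)\left(\frac{1}{5}\right)^{n} = \frac{1}{25}\cdot\frac{1}{\left(1-\frac{1}{5}\right)^{2}} = \frac{1}{16} = 0.0625\]

The terms beyond \(n = 100\) are far below machine precision, so a correct evaluation of the finite sum should print 0.0625.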

Q. 2d (10 points)

\[\prod_{n=2}^{24} \left(3n + \frac{3}{\sqrt[4]{n}}\right)\]

n <- 2:24
terms <- 3 * n + 3 / n^(1/4)
result_q2d <- prod(terms)
result_q2d

## [1] 3.06002e+35
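A product of this size can be cross-checked by summing logarithms and exponentiating. This is a common way to guard against overflow in longer products, although it is not needed for a result of this magnitude. The sketch below uses the hypothetical names direct and via_logs and only confirms that the two routes agree.

```r
# Illustrative cross-check: product computed directly and via a sum of logs.
n <- 2:24
terms <- 3 * n + 3 / n^(1/4)

direct   <- prod(terms)              # same computation as above
via_logs <- exp(sum(log(terms)))     # log of a product equals the sum of logs

all.equal(direct, via_logs)          # TRUE when both agree within tolerance
```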

Q. 2e (10 points)

\[\sum_{n=1}^{9}\left(\frac{3^{n}}{n^3} + \frac{n^{5}}{4^{5}}+\frac{n^{7}}{7^{n}}+\frac{n^{9}}{9^{n}}\right)\](5 points)

n <- 1:9
term1 <- 3^n / n^3
term2 <- n^5 / 4^5
term3 <- n^7 / 7^n
term4 <- n^9 / 9^n
values <- term1 + term2 + term3 + term4
result_q2e <- sum(values)
result_q2e

## [1] 338.3396

Q. 2f (10 points)

Describe the purpose of is.logical() , is.character() , is.numeric() , and is.na() functions in R.

In Python, describe the purpose of isinstance() , type() , and pd.isna() functions from the pandas library.

Hint: Please use x <- c("a", FALSE, "b", NA, 2, TRUE) in R or x = ["a", False, "b", None, 2, True] in Python to explain your description.

x <- c("a", FALSE, "b", NA, 2, TRUE)

print(is.logical(x))  # This checks whether x is a logical vector. Because x contains strings, c() coerces every element to character, so x is not logical and this returns FALSE.

## [1] FALSE

print(is.character(x))  # This checks whether x is a character vector. Since "a" and "b" are strings, the whole vector is stored as character and this returns TRUE.

## [1] TRUE

print(is.numeric(x))  # This checks whether x is a numeric vector. After coercion, even the value 2 is stored as the string "2", so this returns FALSE.

## [1] FALSE

print(is.na(x))  # This tests each element for NA. The 4th element of x is NA, so it returns TRUE for that element and FALSE for the rest.

## [1] FALSE FALSE FALSE  TRUE FALSE FALSE

## [1] FALSE FALSE FALSE  TRUE FALSE FALSE
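The results above follow from how c() builds an atomic vector: all elements are coerced to a single common type. The sketch below is illustrative only (x_list is a hypothetical name). It shows the coerced vector next to a list, which keeps each element's original type.

```r
# Illustrative sketch: atomic vectors hold one type, lists keep each type.
x <- c("a", FALSE, "b", NA, 2, TRUE)
class(x)     # the whole vector is character
x            # FALSE, 2 and TRUE are now the strings "FALSE", "2" and "TRUE"

x_list <- list("a", FALSE, "b", NA, 2, TRUE)
sapply(x_list, class)   # each element reports its own class
```

Applying is.logical(), is.character() or is.numeric() to the individual elements of x_list would therefore give per-element answers, whereas on x they describe the vector as a whole.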

Data Analytics Basic Methods