---
title: "R practice"
output: html_notebook
---
```{r, eval=FALSE}
# Quick reference (not evaluated: mat, U, V, A1, A2, B, X and FUN are placeholders)
7 %% 2                        # %% is the modulus (remainder) operator: 1
sink()                        # Stop diverting output to a file
letters[1:5]                  # "a" "b" "c" "d" "e"
LETTERS[3:8]                  # "C" "D" "E" "F" "G" "H"
diag(1:5)                     # Diagonal matrix with 1:5 on the diagonal
matrix(NA, 5, 4)              # Empty matrix (all NA)
mat[1:3, c(2, 4)]             # Rows 1-3, columns 2 and 4
mat[1:3, 3] <- c(13, 19, 31)  # Replace part of a column
U * V                         # Element-wise multiplication
U[U < 5 | U > 17] <- 0        # Replace values meeting a condition
cbind(A1, A2)                 # Combine columns (side by side)
rbind(A1, B)                  # Combine rows (stack vertically)
sapply(X, FUN)                # Apply FUN to every element of X
```
# Introduction to R & RStudio

📌 Key Points:
R is a statistical programming language, open-source and free.

RStudio is an IDE (Integrated Development Environment) that makes coding in R easier.

R must be installed before RStudio.

CRAN is the official repository for R packages.

Rtools is needed for compiling packages with C/C++/Fortran code on Windows.


❓ Concept Check:
What is the difference between R and RStudio?

Why is R considered "open-source"?

When do you need Rtools?


# Important Commands:
```{r}
# Check R version
version
R.version.string
# Check working directory
getwd()
# Set working directory
setwd("C:\\Users\\Dell\\OneDrive\\Desktop\\STATISTICS\\2nd Year\\R programming\\Incourse prep")
```


```{r}
# Execute a script file (here "30 Nov.R") from the working directory
source("30 Nov.R")
```
# Vectors, Matrices & Linear Algebra
```{r}
# Start diverting output to a file named "output.txt"
sink("output.txt")
# Any output below will be written to "output.txt" instead of the console
print("This will be written to the file")
summary(c(1, 3, 5, 7, 9))
# Stop diverting output and return to the console
sink()
```

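A vector is an ordered collection of elements of the same type.

Create vectors with `c()`, `seq()`, `rep()`, or the `:` operator.

A matrix is a 2D array whose elements all share one type.

Create matrices with `matrix()`.

Use `%*%` for matrix multiplication; `*` multiplies element by element.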
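
The points above can be sketched in a few lines. The object names (`v_c`, `P`, `Q`, and so on) are illustrative and not part of the original notes.

```{r}
# Vectors: four common ways to build them
v_c   <- c(2, 4, 6)             # combine values
v_seq <- seq(1, 9, by = 2)      # 1 3 5 7 9
v_rep <- rep(0, times = 3)      # 0 0 0
v_col <- 1:5                    # 1 2 3 4 5

# Matrices are filled column by column unless byrow = TRUE
P <- matrix(1:4, nrow = 2)      # columns are (1, 2) and (3, 4)
Q <- diag(2)                    # 2 x 2 identity matrix

P %*% Q                         # matrix product: same values as P
P * P                           # element-wise product: each entry squared
```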

Important functions: 
t(), 
solve(), 
det(), 
eigen(), 
matrix(), 
dim(), 
rbind(), 
cbind(), 
diag()

# Data Frames & Lists

Data frames are tables with columns of different types.

Lists can hold different types of objects.

- `data.frame()`, `list()`
- `read.csv()`, `write.csv()`
- `read.xlsx()`, `write.xlsx()` (from the `openxlsx` package)
- `head()`, `summary()`: view data
- `lapply()`, `sapply()`: apply functions to lists

```{r}
# Data frame
df <- data.frame(
  Name = c("Alice", "Bob"),
  Age = c(25, 30),
  Score = c(85, 90)
)

# Write to CSV
write.csv(df, "data.csv", row.names = FALSE)   # Creates data.csv in the working directory

# List
my_list <- list(
  vec = 1:5,
  mat = matrix(1:4, nrow = 2),
  df = df
)

my_list
```
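`lapply()` and `sapply()` are listed above but not demonstrated. The following is a small sketch of how they differ, using made-up data (`scores` and `demo_df` are illustrative names).

```{r}
# A small list of numeric vectors (illustrative data)
scores <- list(math = c(80, 90, 70), stats = c(85, 95))

lapply(scores, mean)   # returns a list: one mean per element
sapply(scores, mean)   # simplifies the result to a named numeric vector

# A data frame is a list of columns, so sapply() works column by column
demo_df <- data.frame(Age = c(25, 30), Score = c(85, 90))
sapply(demo_df, mean)
```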
# Problem 1: Student Marks Simulation (From Assignment 1)

## Simulating and Analyzing Student Exam Scores in R

In this assignment, you will simulate exam scores for a class and analyze a random sample of students using **R**.

## Final Deliverables
Submit the following:
  
**A.** R code (separately written) for **Questions 1–7**  
**B.** The data frame **`SampleData.df`** showing the IDs and marks of the 10 selected students  
**C.** The numerical results from **Task 7**, clearly formatted (either printed neatly or shown in a data frame)

---

## Class Information
- Total number of students: **91**
- Student IDs: **001 to 091**

---

## Task 1: Simulate Exam Marks
1. Generate exam marks for all 91 students using a **normal distribution**:
   - Mean = **77**
   - Standard Deviation = **1**
2. Use the function `rnorm()` to generate the marks.
3. Store the generated marks in a vector named **`Marks`**.
4. Set the random seed using `set.seed(209)` to make the results reproducible.
5. Create a vector **`ID`** containing student IDs as integers from **1 to 91**.
```{r}
set.seed(209)
ID <- 1:91
Marks <- rnorm(91, mean = 77, sd = 1)
```
## Task 2: Randomly Select 10 Students
1. Randomly select **10 students** from the class **without replacement**.
2. Use the function `sample()` to select the student IDs.
3. Set the random seed using `set.seed(2)` before sampling.
4. Extract the exam marks corresponding to the selected student IDs.
```{r}
set.seed(2)
sample_id <- sample(ID, 10, replace = FALSE)
sample_marks <- Marks[sample_id]
```

## Task 3: Sort the Sample
1. Sort the selected student IDs in **ascending order**.
2. Rearrange the corresponding marks so they match the sorted IDs.

## Task 4: Create a Data Frame
Create a data frame named **`SampleData.df`** with the following columns:
- **ID**: Sorted sampled student IDs  
- **Marks**: Corresponding exam marks (rounded to **2 decimal places**)
```{r}
ord <- order(sample_id)

SampleData.df <- data.frame(
  ID = sample_id[ord],
  Marks = round(sample_marks[ord], 2)
)

```
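The sorting in Tasks 3 and 4 relies on `order()` rather than `sort()`: `order()` returns positions, so the same positions can reorder both vectors and keep each mark with its ID. A toy sketch with made-up values:

```{r}
# Toy data: IDs in sampled order with their marks (illustrative values)
toy_id    <- c(42, 7, 19)
toy_marks <- c(76.1, 78.3, 77.0)

toy_ord <- order(toy_id)     # positions that sort toy_id ascending: 2 3 1
toy_id[toy_ord]              # 7 19 42
toy_marks[toy_ord]           # 78.3 77.0 76.1 (each mark stays with its ID)
```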
## Task 5: Save the Data Frame
Save `SampleData.df` in two formats:

### Files to Save
- **Excel file**: `"Sample_Marks.xlsx"`
  - Use `openxlsx::write.xlsx()`
- **CSV file**: `"SampleDataGPA.csv"`
  - Use the base R function `write.csv()`

### Instructions
1. Create a new folder on your **Desktop** named **`YourFirstName123`**:
   - Replace `YourFirstName` with your actual first name
   - Replace `123` with the last three digits of your class roll number
2. Set this folder as your working directory using `setwd()` before saving the files.
```{r}
setwd("C:\\Users\\Dell\\OneDrive\\Desktop\\STATISTICS\\2nd Year\\R programming\\Incourse prep")

openxlsx::write.xlsx(SampleData.df, "Sample_Marks.xlsx")
write.csv(SampleData.df, "SampleDataGPA.csv", row.names = FALSE)

```
## Task 6: Read the Files Back into R
1. Read the Excel file using `openxlsx::read.xlsx()` and store it in:
   - **`SampleData_from_excel`**
2. Read the CSV file using `read.csv()` and store it in:
   - **`SampleData_from_csv`**
3. Display the first few rows of both data frames using the `head()` function.
```{r}
SampleData_from_excel <- openxlsx::read.xlsx("Sample_Marks.xlsx")
SampleData_from_csv <- read.csv("SampleDataGPA.csv")

head(SampleData_from_excel)
head(SampleData_from_csv)

```
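One way to confirm that a saved file reads back unchanged is a round-trip check. The sketch below uses only base R and a temporary file, so it does not touch the working directory. The data values and the names `rt_df`, `rt_path`, and `rt_back` are illustrative.

```{r}
# Illustrative data frame; tempfile() avoids writing into the working directory
rt_df   <- data.frame(ID = c(3L, 8L, 15L), Marks = c(76.52, 77.10, 78.04))
rt_path <- tempfile(fileext = ".csv")

write.csv(rt_df, rt_path, row.names = FALSE)
rt_back <- read.csv(rt_path)

all.equal(rt_df, rt_back)    # TRUE when the round trip preserved the data
```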
## Task 7: Statistical Analysis (Using the 10 Sampled Marks)
Calculate the following statistics:
1. Mean  
2. Standard Deviation  
3. Standard Error  
4. 2nd Central Moment  
5. 4th Central Moment  
6. Skewness  
7. Kurtosis  
```{r}
x <- SampleData.df$Marks
n <- length(x)
x_bar <- mean(x)

results <- data.frame(
  Mean = mean(x),
  SD = sd(x),
  SE = sd(x)/sqrt(n),
  Central_Moment_2 = mean((x - x_bar)^2),
  Central_Moment_4 = mean((x - x_bar)^4),
  Skewness = mean((x - x_bar)^3)/(sd(x)^3),
  Kurtosis = mean((x - x_bar)^4)/(sd(x)^4)
)

results
```
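The chunk above divides the third and fourth central moments (divisor `n`) by powers of `sd()` (divisor `n - 1`). That is one convention. Another common one uses pure moment ratios, $m_3 / m_2^{3/2}$ and $m_4 / m_2^2$, which the sketch below illustrates on toy data. The function names are illustrative, and which convention the assignment expects is not stated in the source.

```{r}
# k-th central moment with divisor n
central_moment <- function(x, k) mean((x - mean(x))^k)

# Moment-ratio skewness and kurtosis (not excess kurtosis: a normal distribution has 3)
skew_moment <- function(x) central_moment(x, 3) / central_moment(x, 2)^1.5
kurt_moment <- function(x) central_moment(x, 4) / central_moment(x, 2)^2

toy_x <- c(1, 2, 3, 4, 10)   # mean is 4
central_moment(toy_x, 2)     # 10
skew_moment(toy_x)           # 36 / 10^1.5
kurt_moment(toy_x)           # 278.8 / 10^2 = 2.788
```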
# Problem 2: Matrix Operations
```{r}
# Create matrices
A <- matrix(1:4, nrow = 2)
A
B <- matrix(c(2, 0, 1, 3), nrow = 2)
B
# Compute: A² + 2B
result <- A %*% A + 2 * B
result

# Check if invertible
if(det(result) != 0) {
  inv_result <- solve(result)
  inv_result
}

```
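A computed inverse can be checked by multiplying it back against the original matrix. The sketch below rebuilds the same matrices under separate, illustrative names (`A_chk`, `B_chk`, `R_chk`) and compares the product with the identity using `all.equal()`, which tolerates floating-point rounding.

```{r}
# Same matrices as in the chunk above, under separate names
A_chk <- matrix(1:4, nrow = 2)
B_chk <- matrix(c(2, 0, 1, 3), nrow = 2)
R_chk <- A_chk %*% A_chk + 2 * B_chk         # rows are (11, 17) and (10, 28)

det(R_chk)                                   # 138, up to rounding error
all.equal(R_chk %*% solve(R_chk), diag(2))   # TRUE: the product is the identity
```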
# IMPORTANT EXAM PROBLEM PATTERN:
```{r}
# Task from lecture: Fill even rows with even numbers
A <- matrix(NA, 5, 4)
even_numbers <- seq(2, 20, by = 2)
k <- 1

for (i in 1:nrow(A)) {
    if (i %% 2 == 0) {  # Even rows
        for (j in 1:ncol(A)) {
            A[i, j] <- even_numbers[k]
            k <- k + 1
        }
    }
}
A
```
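The nested loop above can also be written as a single indexed assignment. This is an alternative sketch, not the lecture's method, and the names `A_vec`, `even_rows`, and `n_fill` are illustrative.

```{r}
A_vec <- matrix(NA, 5, 4)
even_rows <- seq(2, nrow(A_vec), by = 2)             # rows 2 and 4
n_fill <- length(even_rows) * ncol(A_vec)            # 8 cells to fill

# byrow = TRUE makes the numbers run left to right within each even row
A_vec[even_rows, ] <- matrix(seq(2, by = 2, length.out = n_fill),
                             nrow = length(even_rows), byrow = TRUE)
A_vec
```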
# Eigenvalues and Eigenvectors

For a square matrix A, if Av = λv for a nonzero vector v, then:

- λ is an eigenvalue (a scalar)
- v is an eigenvector (a vector)

```{r}
M <- matrix(c(2, 1, 1, 2), nrow = 2)
eigen_result <- eigen(M)

eigen_result$values    # Eigenvalues
eigen_result$vectors   # Eigenvectors
```
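The defining relation Av = λv can be checked numerically on the output of `eigen()`. A short sketch using the same matrix under illustrative names (`M_chk`, `e_chk`, `lambda1`, `v1`):

```{r}
M_chk <- matrix(c(2, 1, 1, 2), nrow = 2)
e_chk <- eigen(M_chk)

lambda1 <- e_chk$values[1]       # largest eigenvalue (3 for this matrix)
v1      <- e_chk$vectors[, 1]    # matching unit-length eigenvector

all.equal(as.vector(M_chk %*% v1), lambda1 * v1)   # TRUE: M v = lambda v
sum(v1^2)                                          # 1, since eigen() normalizes the vectors
```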

# Matrix Operations Task

### Problem Statement

1. **Create a Vector:**
   Set the seed using the last 3 digits of your roll number (e.g., 245). Generate a vector `v` containing 8 random integers between 1 and 15, selected without replacement.
2. **Construct Matrices:**
   Using vector `v`, create the following four matrices:
   * **A1:** a 2×4 matrix, filled row-wise.
   * **B1:** a 4×2 matrix, filled row-wise.
   * **A2:** a 2×4 matrix, filled column-wise.
   * **B2:** a 4×2 matrix, filled column-wise.
3. **Verify Transpose:**
   Find the transpose of **A1**. Check if it is identical to **B1**, **A2**, or **B2**.
4. **Matrix Multiplication:**
   Calculate the product of **A1** and **B1**. Is the resulting matrix square? Is it invertible?

```{r}
# 1. Create vector v
set.seed(245)  # Using last 3 digits of roll number
v <- sample(1:15, 8, replace = FALSE)

# 2. Create matrices
A1 <- matrix(v, nrow = 2, ncol = 4, byrow = TRUE)   # 2x4, row-wise
B1 <- matrix(v, nrow = 4, ncol = 2, byrow = TRUE)   # 4x2, row-wise
A2 <- matrix(v, nrow = 2, ncol = 4)                  # 2x4, column-wise
B2 <- matrix(v, nrow = 4, ncol = 2)                  # 4x2, column-wise

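# 3. Check transposition
tA1 <- t(A1)        # 4x2
identical(tA1, B1)  # FALSE: same shape, different fill order
identical(tA1, A2)  # FALSE: A2 is 2x4, so the dimensions differ
identical(tA1, B2)  # TRUE: transposing a row-wise 2x4 gives the column-wise 4x2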

# 4. Matrix multiplication
result <- A1 %*% B1
is_square <- nrow(result) == ncol(result)
is_invertible <- if(is_square) det(result) != 0 else FALSE
```
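The transpose relationship does not depend on the random seed: transposing a matrix filled row-wise gives the matrix of swapped dimensions filled column-wise from the same vector. A sketch with a fixed vector (`w` and the matrix names are illustrative):

```{r}
w <- 1:8                                        # any length-8 vector works here

row_2x4 <- matrix(w, nrow = 2, byrow = TRUE)    # built like A1
col_4x2 <- matrix(w, nrow = 4)                  # built like B2
row_4x2 <- matrix(w, nrow = 4, byrow = TRUE)    # built like B1

identical(t(row_2x4), col_4x2)   # TRUE: transposing swaps the fill direction
identical(t(row_2x4), row_4x2)   # FALSE for this w
```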
## Question

Write an R function named `gaussian_solver()` that solves a system of linear equations using **Gaussian Elimination with partial pivoting**.

The function should meet the following requirements:

1. The function must take two inputs:
   - `A`: a square coefficient matrix of order $n \times n$
   - `b`: a column vector of constants of length $n$

2. Inside the function:
   - Create an **augmented matrix** by combining `A` and `b`
   - Perform **forward elimination** to convert the augmented matrix into an upper triangular form
   - Use **partial pivoting** at each step to improve numerical stability
   - Perform **back substitution** to compute the solution vector

3. Store the solution in a numeric vector `x`:
   - Assign names to the solution variables as `x1, x2, ..., xn`

4. The function should return the result as a **list** containing:
   - `solution`: the vector of solved values

Your function should work for any valid system of linear equations where a unique solution exists.
```{r}
gaussian_solver_simple <- function(A, b) {
    n <- nrow(A)
    aug <- cbind(A, b)
    
    # Forward elimination
    for (i in seq_len(n - 1)) {
        # Find pivot row
        max_row <- which.max(abs(aug[i:n, i])) + i - 1
        if (max_row != i) {
            aug[c(i, max_row), ] <- aug[c(max_row, i), ]
        }
        
        # Eliminate below
        for (j in (i+1):n) {
            factor <- aug[j, i] / aug[i, i]
            aug[j, ] <- aug[j, ] - factor * aug[i, ]
        }
    }
    
    # Back substitution
    x <- numeric(n)
    for (i in n:1) {
        if (i == n) {
            x[i] <- aug[i, n+1] / aug[i, i]
        } else {
            x[i] <- (aug[i, n+1] - sum(aug[i, (i+1):n] * x[(i+1):n])) / aug[i, i]
        }
    }
    
    return(x)  # Just return the solution
}

# Test it
A <- matrix(c(2, 1, 0, 0, 0,
               1, 2, 1, 0, 0,
               0, 1, 2, 1, 0,
               0, 0, 1, 2, 1,
               0, 0, 0, 1, 2), 
             nrow = 5, byrow = TRUE)
b <- c(3, 5, 6, 7, 5)
solution <- gaussian_solver_simple(A, b)
print(solution)
```
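The simplified function above returns a bare vector. The sketch below is one possible way to meet the stated requirements in full: it is named `gaussian_solver()`, labels the solution `x1, ..., xn`, and returns a list with a `solution` element. The `tol` argument, the singularity check, and the test data names `A_gs` and `b_gs` are additions for illustration and are not part of the original question.

```{r}
gaussian_solver <- function(A, b, tol = 1e-12) {
  A <- as.matrix(A)
  n <- nrow(A)
  stopifnot(ncol(A) == n, length(b) == n)
  aug <- cbind(A, as.numeric(b))               # augmented matrix [A | b]

  # Forward elimination with partial pivoting
  for (i in seq_len(n)) {
    p <- which.max(abs(aug[i:n, i])) + i - 1   # row holding the largest pivot candidate
    if (abs(aug[p, i]) < tol) stop("Matrix is singular or nearly singular")
    if (p != i) aug[c(i, p), ] <- aug[c(p, i), ]
    for (j in seq_len(n - i) + i) {            # rows i+1..n; empty when i == n
      aug[j, ] <- aug[j, ] - (aug[j, i] / aug[i, i]) * aug[i, ]
    }
  }

  # Back substitution, from the last row upward
  x <- numeric(n)
  for (i in n:1) {
    known <- if (i < n) sum(aug[i, (i + 1):n] * x[(i + 1):n]) else 0
    x[i] <- (aug[i, n + 1] - known) / aug[i, i]
  }
  names(x) <- paste0("x", seq_len(n))
  list(solution = x)
}

# Usage: 2*x1 + x2 = 3 and x1 + 3*x2 = 5, compared against base R's solve()
A_gs <- matrix(c(2, 1,
                 1, 3), nrow = 2, byrow = TRUE)
b_gs <- c(3, 5)
gaussian_solver(A_gs, b_gs)$solution
all.equal(unname(gaussian_solver(A_gs, b_gs)$solution), solve(A_gs, b_gs))
```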




