R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:

summary(cars)
##      speed           dist       
##  Min.   : 4.0   Min.   :  2.00  
##  1st Qu.:12.0   1st Qu.: 26.00  
##  Median :15.0   Median : 36.00  
##  Mean   :15.4   Mean   : 42.98  
##  3rd Qu.:19.0   3rd Qu.: 56.00  
##  Max.   :25.0   Max.   :120.00

Including Plots

You can also embed plots, for example:

Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.

Question 1.i. Aim: To identify two attributes from the mtcars dataset that exhibit a negative correlation using the Pearson correlation method and visualize their relationship using a scatter plot.

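# Select the numeric attributes to compare
data <- mtcars[, c("disp", "hp", "drat", "wt", "qsec")]
# Pairwise Pearson correlation matrix
cor_matrix <- cor(data, method = "pearson")
print(cor_matrix)
# wt and drat are negatively correlated; plot one against the other
plot(data$wt, data$drat,
     main = "Negative Correlation: wt vs drat",
     xlab = "Weight (wt)", ylab = "Rear Axle Ratio (drat)",
     col = "blue", pch = 19)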
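Rather than reading the matrix by eye, the most negative pair can also be picked out programmatically. The following is a minimal sketch of one way to do that; the names `vars`, `cm`, `idx`, `most_negative_pair`, and `most_negative_value` are illustrative and not part of the original analysis.

```r
# Same five attributes as above, under a separate (hypothetical) name
vars <- mtcars[, c("disp", "hp", "drat", "wt", "qsec")]
cm <- cor(vars, method = "pearson")

# Blank out the diagonal and lower triangle so each pair is considered once
cm[lower.tri(cm, diag = TRUE)] <- NA

# Locate the smallest (most negative) remaining coefficient
idx <- which(cm == min(cm, na.rm = TRUE), arr.ind = TRUE)
most_negative_pair <- c(rownames(cm)[idx[1, "row"]], colnames(cm)[idx[1, "col"]])
most_negative_value <- cm[idx[1, "row"], idx[1, "col"]]

print(most_negative_pair)
print(most_negative_value)
```

Among these five attributes, this selects the same pair that is plotted above, drat and wt.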

Question 1.ii. Aim: To identify two attributes from the mtcars dataset that display a positive correlation using the Spearman correlation method and visualize their relationship using a scatter plot.

spearman_corr <- cor(data, method = "spearman")
print(spearman_corr)
plot(data$disp, data$hp,
     main = "Positive Correlation: disp vs hp",
     xlab = "Displacement (disp)", ylab = "Horsepower (hp)",
     col = "green", pch = 19)
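The Spearman coefficient is the Pearson coefficient computed on the ranks of the values, which is why it measures monotonic rather than strictly linear association. A small sketch to illustrate this for disp and hp; the variable names `rho_builtin` and `rho_by_rank` are illustrative.

```r
disp <- mtcars$disp
hp <- mtcars$hp

# Spearman correlation as computed by cor()
rho_builtin <- cor(disp, hp, method = "spearman")

# Pearson correlation of the ranks (rank() averages tied values by default)
rho_by_rank <- cor(rank(disp), rank(hp), method = "pearson")

# The two values should agree
print(c(builtin = rho_builtin, by_rank = rho_by_rank))
```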

Question 2.i. Aim: To estimate the linear regression line for the given midterm grades (x) and final exam grades (y) and express it in the form y = mx + c.

x <- c(77, 50, 71, 72, 81, 94, 96, 99, 67)
y <- c(82, 66, 78, 34, 47, 85, 99, 99, 68)
model <- lm(y ~ x)
print(coef(model))
plot(x, y, main = "Linear Regression: Midterm vs Final Grades",
     xlab = "Midterm Grades", ylab = "Final Grades", col = "blue", pch = 19)
abline(model, col = "red")
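To write the fitted line explicitly in the form y = mx + c, the slope and intercept can be computed from the least-squares formulas m = cov(x, y) / var(x) and c = mean(y) - m * mean(x), and then compared with the coefficients returned by `lm()`. This is a sketch; `m_hat`, `c_hat`, `equation`, and `fit` are illustrative names.

```r
x <- c(77, 50, 71, 72, 81, 94, 96, 99, 67)
y <- c(82, 66, 78, 34, 47, 85, 99, 99, 68)

# Closed-form least-squares estimates for simple linear regression
m_hat <- cov(x, y) / var(x)           # slope m
c_hat <- mean(y) - m_hat * mean(x)    # intercept c

# Format as y = mx + c; %+.4f prints the sign of the intercept explicitly
equation <- sprintf("y = %.4f * x %+.4f", m_hat, c_hat)
print(equation)

# Cross-check against lm(), whose coefficients are ordered (intercept, slope)
fit <- lm(y ~ x)
print(coef(fit))
```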

Question 2.ii. Aim: To predict the final exam grade of a student who scored 85 on the midterm, using the derived regression line equation.

predicted_grade <- predict(model, newdata = data.frame(x = 85))
print(predicted_grade)
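The prediction is simply the line evaluated at x = 85. With only nine observations a point estimate can be misleading on its own, so it may help to also request a prediction interval from `predict()`. A minimal sketch; `fit`, `new_student`, `by_hand`, and `with_interval` are illustrative names, and the 95% level is an assumption rather than something the question specifies.

```r
x <- c(77, 50, 71, 72, 81, 94, 96, 99, 67)
y <- c(82, 66, 78, 34, 47, 85, 99, 99, 68)
fit <- lm(y ~ x)

new_student <- data.frame(x = 85)

# Evaluate c + m * 85 directly from the coefficients
by_hand <- coef(fit)[["(Intercept)"]] + coef(fit)[["x"]] * 85

# Same point estimate, plus lower and upper bounds of a 95% prediction interval
with_interval <- predict(fit, newdata = new_student,
                         interval = "prediction", level = 0.95)

print(by_hand)
print(with_interval)
```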

Question 3. Aim: To analyze the relationship between Profit and two independent variables, Administration and Marketing Spend, using only the rows where State is New York.

data <- data.frame(
  R.D.Spend = c(165349.2, 162597.7, 0, 131876.9, 0, 91992.39, 0, 61136.38, 46014.02, 28663.76),
  Administration = c(136897.8, 151377.6, 118671.9, 99814.71, 147198.9, 135495.07, 127864.55, 152211.77, 85047.44, 127056.21),
  Marketing.Spend = c(471784.1, 443898.5, 383199.6, 362861.36, 127716.82, 252664.93, 353183.81, 211025.09, 205517.6, 201126.8),
  State = c("New York", "California", "New York", "New York", "California", "New York", "New York", "New York", "New York", "Florida"),
  Profit = c(192261.8, 191792.1, 182901.99, 156991.1, 156122.51, 132602.0, 105733.54, 97483.56, 96479.51, 90708.19)
)

library(dplyr)  # provides %>% and filter()

new_york_data <- data %>% filter(State == "New York")

model <- lm(Profit ~ Administration + Marketing.Spend, data = new_york_data)

summary(model)
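The same fit can be reproduced in base R without dplyr, and the quantities of interest can be pulled out of the model object directly instead of being read off the printed summary. The sketch below assumes the same ten rows as above; the names `spend`, `ny`, `ny_fit`, `n_obs`, `resid_df`, and `r_squared` are illustrative.

```r
spend <- data.frame(
  R.D.Spend = c(165349.2, 162597.7, 0, 131876.9, 0, 91992.39, 0, 61136.38, 46014.02, 28663.76),
  Administration = c(136897.8, 151377.6, 118671.9, 99814.71, 147198.9, 135495.07, 127864.55, 152211.77, 85047.44, 127056.21),
  Marketing.Spend = c(471784.1, 443898.5, 383199.6, 362861.36, 127716.82, 252664.93, 353183.81, 211025.09, 205517.6, 201126.8),
  State = c("New York", "California", "New York", "New York", "California", "New York", "New York", "New York", "New York", "Florida"),
  Profit = c(192261.8, 191792.1, 182901.99, 156991.1, 156122.51, 132602.0, 105733.54, 97483.56, 96479.51, 90708.19)
)

# Base R equivalent of the dplyr filter step
ny <- subset(spend, State == "New York")

ny_fit <- lm(Profit ~ Administration + Marketing.Spend, data = ny)

n_obs <- nrow(ny)                          # rows used in the fit
resid_df <- df.residual(ny_fit)            # observations minus estimated coefficients
r_squared <- summary(ny_fit)$r.squared     # proportion of variance explained

print(coef(ny_fit))
print(c(n_obs = n_obs, resid_df = resid_df, r_squared = r_squared))
```

Only seven rows belong to New York, which leaves four residual degrees of freedom for a model with three coefficients, so the estimates and their p-values should be interpreted with caution.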