This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
summary(cars)
##      speed           dist
##  Min.   : 4.0   Min.   :  2.00
##  1st Qu.:12.0   1st Qu.: 26.00
##  Median :15.0   Median : 36.00
##  Mean   :15.4   Mean   : 42.98
##  3rd Qu.:19.0   3rd Qu.: 56.00
##  Max.   :25.0   Max.   :120.00
You can also embed plots, for example:
Note that the echo = FALSE parameter was added to the
code chunk to prevent printing of the R code that generated the plot.
Question 1.i. Aim: To identify two attributes from the mtcars dataset that exhibit a negative correlation using the Pearson correlation method, and to visualize their relationship using a scatter plot.
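# Select the numeric attributes to compare
data <- mtcars[, c("disp", "hp", "drat", "wt", "qsec")]
# Pearson correlation matrix for every pair of attributes
cor_matrix <- cor(data, method = "pearson")
print(cor_matrix)
# wt and drat have a negative coefficient in the matrix, so plot that pair
plot(data$wt, data$drat,
     main = "Negative Correlation: wt vs drat",
     xlab = "Weight (wt)", ylab = "Rear Axle Ratio (drat)",
     col = "blue", pch = 19)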
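Rather than reading the pair off the printed matrix, the most negatively correlated pair can be located programmatically. The following is a minimal sketch, assuming the same five mtcars columns as above; the names `vars`, `cm`, `idx`, and `most_negative` are illustrative and not part of the original answer.

```r
# Same five attributes as in the answer above
vars <- mtcars[, c("disp", "hp", "drat", "wt", "qsec")]
cm <- cor(vars, method = "pearson")

# Blank the diagonal and lower triangle so each pair is considered once
cm[lower.tri(cm, diag = TRUE)] <- NA

# Position of the smallest (most negative) remaining coefficient
idx <- which(cm == min(cm, na.rm = TRUE), arr.ind = TRUE)
most_negative <- c(rownames(cm)[idx[1, "row"]], colnames(cm)[idx[1, "col"]])

print(most_negative)
print(min(cm, na.rm = TRUE))
```

Masking the lower triangle avoids reporting the same pair twice and keeps the self-correlations of 1 out of the search.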
Question 1.ii. Aim: To identify two attributes from the mtcars dataset that display a positive correlation using the Spearman correlation method, and to visualize their relationship using a scatter plot.
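# Spearman (rank-based) correlation matrix for the same attributes
spearman_corr <- cor(data, method = "spearman")
print(spearman_corr)
# disp and hp have a positive coefficient in the matrix, so plot that pair
plot(data$disp, data$hp,
     main = "Positive Correlation: disp vs hp",
     xlab = "Displacement (disp)", ylab = "Horsepower (hp)",
     col = "green", pch = 19)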
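To make the difference from Pearson concrete: Spearman's coefficient is the Pearson coefficient computed on the ranks of the values. A small sketch of that equivalence for disp and hp follows; the variable names `rho_builtin` and `rho_ranks` are illustrative only.

```r
# Spearman correlation as computed by cor()
rho_builtin <- cor(mtcars$disp, mtcars$hp, method = "spearman")

# The same quantity obtained by ranking both variables first
# (rank() averages tied ranks by default) and applying Pearson
rho_ranks <- cor(rank(mtcars$disp), rank(mtcars$hp), method = "pearson")

print(c(builtin = rho_builtin, via_ranks = rho_ranks))
```

Because only the ordering of the values matters, Spearman captures any monotonic relationship, not just a linear one.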
Question 2.i. Aim: To estimate the linear regression line for the given midterm grades (x) and final exam grades (y) and express it in the form y = mx + c.
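# Midterm grades (x) and final exam grades (y)
x <- c(77, 50, 71, 72, 81, 94, 96, 99, 67)
y <- c(82, 66, 78, 34, 47, 85, 99, 99, 68)
# Least-squares fit; coef() returns the intercept (c) first, then the slope (m)
model <- lm(y ~ x)
print(coef(model))
# Scatter plot of the grades with the fitted regression line overlaid
plot(x, y, main = "Linear Regression: Midterm vs Final Grades",
     xlab = "Midterm Grades", ylab = "Final Grades", col = "blue", pch = 19)
abline(model, col = "red")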
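Since the aim asks for the line in the form y = mx + c, it may help to print the equation directly from the fitted coefficients and cross-check it against the textbook least-squares formulas. This is a sketch under the same data; `m_hat`, `c_hat`, `m_manual`, and `c_manual` are names introduced here for illustration.

```r
x <- c(77, 50, 71, 72, 81, 94, 96, 99, 67)
y <- c(82, 66, 78, 34, 47, 85, 99, 99, 68)
model <- lm(y ~ x)

# Pull slope (m) and intercept (c) out of the fitted model
m_hat <- unname(coef(model)["x"])
c_hat <- unname(coef(model)["(Intercept)"])
cat(sprintf("y = %.4f * x + %.4f\n", m_hat, c_hat))

# Cross-check with the closed-form estimates:
#   m = cov(x, y) / var(x),  c = mean(y) - m * mean(x)
m_manual <- cov(x, y) / var(x)
c_manual <- mean(y) - m_manual * mean(x)
print(c(m_manual = m_manual, c_manual = c_manual))
```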
Question 2.ii. Aim: To predict the final exam grade of a student who scored 85 on the midterm, using the derived regression line equation.
# Evaluate the fitted line at x = 85; the column name must match the predictor
predicted_grade <- predict(model, newdata = data.frame(x = 85))
print(predicted_grade)
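The prediction is simply the regression equation evaluated at x = 85. The sketch below substitutes into y = mx + c by hand and compares the result with `predict()`; the names `coefs`, `by_hand`, and `by_predict` are illustrative.

```r
x <- c(77, 50, 71, 72, 81, 94, 96, 99, 67)
y <- c(82, 66, 78, 34, 47, 85, 99, 99, 68)
model <- lm(y ~ x)

# Substitute x = 85 into y = c + m * x using the fitted coefficients
coefs <- coef(model)
by_hand <- unname(coefs["(Intercept)"] + coefs["x"] * 85)

# The same value from predict()
by_predict <- unname(predict(model, newdata = data.frame(x = 85)))

print(c(by_hand = by_hand, by_predict = by_predict))
```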
Question 3. Aim: To analyze the relationship between Profit and two independent variables, Administration and Marketing Spend, specifically for records whose State is New York.
data <- data.frame(
R.D.Spend = c(165349.2, 162597.7, 0, 131876.9, 0, 91992.39, 0, 61136.38, 46014.02, 28663.76),
Administration = c(136897.8, 151377.6, 118671.9, 99814.71, 147198.9, 135495.07, 127864.55, 152211.77, 85047.44, 127056.21),
Marketing.Spend = c(471784.1, 443898.5, 383199.6, 362861.36, 127716.82, 252664.93, 353183.81, 211025.09, 205517.6, 201126.8),
State = c("New York", "California", "New York", "New York", "California", "New York", "New York", "New York", "New York", "Florida"),
Profit = c(192261.8, 191792.1, 182901.99, 156991.1, 156122.51, 132602.0, 105733.54, 97483.56, 96479.51, 90708.19)
)
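# filter() and %>% come from dplyr, so load it first
library(dplyr)
# Keep only the rows where State is "New York"
new_york_data <- data %>% filter(State == "New York")
# Multiple linear regression of Profit on Administration and Marketing.Spend
model <- lm(Profit ~ Administration + Marketing.Spend, data = new_york_data)
summary(model)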
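The same New York subset can also be built in base R, which avoids depending on dplyr, and the coefficient table can be extracted as a matrix for further inspection. This is a sketch only: it repeats the values from the data frame above but keeps just the columns the model uses, and the names `profit_data`, `ny`, `fit`, and `coef_table` are introduced here for illustration.

```r
# Values copied from the data frame above, limited to the columns the model needs
profit_data <- data.frame(
  Administration = c(136897.8, 151377.6, 118671.9, 99814.71, 147198.9,
                     135495.07, 127864.55, 152211.77, 85047.44, 127056.21),
  Marketing.Spend = c(471784.1, 443898.5, 383199.6, 362861.36, 127716.82,
                      252664.93, 353183.81, 211025.09, 205517.6, 201126.8),
  State = c("New York", "California", "New York", "New York", "California",
            "New York", "New York", "New York", "New York", "Florida"),
  Profit = c(192261.8, 191792.1, 182901.99, 156991.1, 156122.51,
             132602.0, 105733.54, 97483.56, 96479.51, 90708.19)
)

# Base R equivalent of data %>% filter(State == "New York")
ny <- subset(profit_data, State == "New York")

fit <- lm(Profit ~ Administration + Marketing.Spend, data = ny)

# Matrix with columns: Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- coef(summary(fit))
print(coef_table)
```

With only seven New York rows and three estimated coefficients, four residual degrees of freedom remain, so the standard errors and p-values in this table should be read with caution.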