library(MASS)
library(ggplot2)
library(caret)
## Loading required package: lattice
# Define colors for each species
lookup <- c(setosa = 'blue', versicolor = 'green', virginica = 'orange')
# Assign colors based on species
col.ind <- lookup[as.character(iris$Species)]  # index by name, not by factor code
# Scatterplot matrix with colored points
pairs(iris[-5], pch = 21, col = "gray", bg = col.ind)
# Perform LDA on the iris dataset
lda.fit <- lda(Species ~ ., data = iris)
lda.fit
## Call:
## lda(Species ~ ., data = iris)
##
## Prior probabilities of groups:
##     setosa versicolor  virginica
##  0.3333333  0.3333333  0.3333333
##
## Group means:
##            Sepal.Length Sepal.Width Petal.Length Petal.Width
## setosa            5.006       3.428        1.462       0.246
## versicolor        5.936       2.770        4.260       1.326
## virginica         6.588       2.974        5.552       2.026
##
## Coefficients of linear discriminants:
##                     LD1         LD2
## Sepal.Length  0.8293776  0.02410215
## Sepal.Width   1.5344731  2.16452123
## Petal.Length -2.2012117 -0.93192121
## Petal.Width  -2.8104603  2.83918785
##
## Proportion of trace:
##    LD1    LD2
## 0.9912 0.0088
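The "Proportion of trace" line can be reproduced by hand. The sketch below assumes the same model as above and uses the `svd` component of the fitted object (the singular values stored by `MASS::lda`); squaring and normalising them gives each discriminant's share of the between-group separation.

```r
library(MASS)

# Re-fit the same model so this chunk runs on its own
lda.fit <- lda(Species ~ ., data = iris)

# Squared singular values, normalised to sum to 1
prop.trace <- lda.fit$svd^2 / sum(lda.fit$svd^2)
names(prop.trace) <- colnames(lda.fit$scaling)
round(prop.trace, 4)
```

Almost all of the separation sits on LD1, which is why a plot of LD1 against LD2 shows the species spread mainly along the horizontal axis.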
# Make predictions
lda.pred <- predict(lda.fit)
head(lda.pred$x)
##        LD1        LD2
## 1 8.061800  0.3004206
## 2 7.128688 -0.7866604
## 3 7.489828 -0.2653845
## 4 6.813201 -0.6706311
## 5 8.132309  0.5144625
## 6 7.701947  1.4617210
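The discriminant scores in `lda.pred$x` are a linear projection of the centred predictors. As a minimal sketch, assuming the default priors stored in the fit, the scores can be rebuilt from the group means and the coefficient matrix (`scaling`); the names `centre` and `scores` are introduced here for illustration only.

```r
library(MASS)

# Re-fit the same model so this chunk runs on its own
lda.fit <- lda(Species ~ ., data = iris)
X <- as.matrix(iris[, 1:4])

# Centre on the prior-weighted average of the group means,
# then project onto the discriminant coefficients
centre <- colSums(lda.fit$prior * lda.fit$means)
scores <- scale(X, center = centre, scale = FALSE) %*% lda.fit$scaling

# Compare with the scores returned by predict()
all.equal(unname(scores), unname(predict(lda.fit)$x))
```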
# Plot the first two discriminant functions
plot(LD2 ~ LD1, data = lda.pred$x, pch=21, col="gray", bg=col.ind)
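The plot shows the separation visually; a confusion matrix puts a number on it. The sketch below is an addition, not part of the original exercise: it tabulates the predicted class against the true species on the training data, then repeats the accuracy check with leave-one-out cross-validation via the `CV = TRUE` argument of `lda()`, since accuracy measured on the training data tends to be optimistic.

```r
library(MASS)

# Re-fit the same model so this chunk runs on its own
lda.fit  <- lda(Species ~ ., data = iris)
lda.pred <- predict(lda.fit)

# Training-set confusion matrix and accuracy
conf.mat  <- table(Predicted = lda.pred$class, Actual = iris$Species)
conf.mat
train.acc <- sum(diag(conf.mat)) / sum(conf.mat)
train.acc

# Leave-one-out cross-validated accuracy
lda.cv <- lda(Species ~ ., data = iris, CV = TRUE)
cv.acc <- mean(lda.cv$class == iris$Species)
cv.acc
```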
# Q2: Section 4.7.3 example
library(ISLR)
data(Smarket)
lda_fit <- lda(Direction ~ Lag1 + Lag2, data = Smarket, subset = Year < 2005)
lda_fit
## Call:
## lda(Direction ~ Lag1 + Lag2, data = Smarket, subset = Year <
## 2005)
##
## Prior probabilities of groups:
##     Down       Up
## 0.491984 0.508016
##
## Group means:
##             Lag1        Lag2
## Down  0.04279022  0.03389409
## Up   -0.03954635 -0.03132544
##
## Coefficients of linear discriminants:
##             LD1
## Lag1 -0.6420190
## Lag2 -0.5135293
# Make predictions for 2005 data
lda_pred <- predict(lda_fit, Smarket[Smarket$Year == 2005, ])
lda_class <- lda_pred$class
# Create confusion matrix
table(lda_class, Smarket$Direction[Smarket$Year == 2005])
##
## lda_class Down  Up
##      Down   35  35
##      Up     76 106
# Calculate accuracy
mean(lda_class == Smarket$Direction[Smarket$Year == 2005])
## [1] 0.5595238
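Overall accuracy hides how the errors are distributed. As a small worked example, the counts from the confusion matrix above can be typed in directly to get per-class rates and a majority-class baseline. The variable names are illustrative, and the matrix is entered by hand rather than taken from `table()` so that the chunk runs without the ISLR package.

```r
# Counts copied from the confusion matrix above
# (rows = predicted class, columns = actual class)
conf.mat <- matrix(c(35, 76, 35, 106), nrow = 2,
                   dimnames = list(Predicted = c("Down", "Up"),
                                   Actual    = c("Down", "Up")))

accuracy  <- sum(diag(conf.mat)) / sum(conf.mat)                # (35 + 106) / 252
sens.up   <- conf.mat["Up", "Up"] / sum(conf.mat[, "Up"])       # 106 / 141
spec.down <- conf.mat["Down", "Down"] / sum(conf.mat[, "Down"]) # 35 / 111
prec.up   <- conf.mat["Up", "Up"] / sum(conf.mat["Up", ])       # 106 / 182

# Baseline: always predict the more frequent actual class
baseline <- max(colSums(conf.mat)) / sum(conf.mat)              # 141 / 252

round(c(accuracy = accuracy, sens.up = sens.up, spec.down = spec.down,
        prec.up = prec.up, baseline = baseline), 4)
```

On these counts the model's accuracy equals the always-"Up" baseline (both are 141/252), so the fit is right mostly on "Up" days and identifies only 35 of the 111 "Down" days.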