Problem and Data
A simple classification dataset: the task is to identify the animal from its number of legs, body segments and wings.
Legs and Segments (2D)
Utilising only the legs and segments data, a simple scatter plot can be produced.
library(plot3D)  # provides text2D, scatter2D, scatter3D and text3D
# Create the 2D plot with a text label for each animal
text2D(y = Animals$Legs, x = Animals$Segments, colvar = NULL,
       ylab = "Legs", xlab = "Segments",
       labels = Animals$Name, cex = 1,
       adj = 1.2, font = 2, ylim = c(4, 16), xlim = c(1, 12))
# Add the points on top of the labels
scatter2D(y = Animals$Legs, x = Animals$Segments, colvar = NULL,
          ylim = c(4, 16),
          pch = 16, add = TRUE, colkey = FALSE, xlim = c(1, 12))

As we can see, Ant and Bee have the same data, which makes telling these two apart impossible without further data. Let's try fitting an SVM to predict the animal names based on the two variables.
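Before fitting, the tie between Ant and Bee can be confirmed in code rather than by eye. This is a minimal sketch in base R, assuming a small stand-in table called demo_animals (a hypothetical name, with illustrative values that are not the original Animals data):

```r
# Hypothetical stand-in for the Animals table; the values are illustrative only
demo_animals <- data.frame(
  Name     = c("Ant", "Bee", "Spider", "Woodlouse"),
  Legs     = c(6, 6, 8, 14),
  Segments = c(3, 3, 2, 7),
  Wings    = c(0, 4, 0, 0)
)

# Keep only the two predictors used in the 2D model
features <- demo_animals[, c("Legs", "Segments")]

# duplicated() flags only the later copies of a row, so scan from both ends
# to flag every member of a tied group
is_tied <- duplicated(features) | duplicated(features, fromLast = TRUE)

tied_names <- as.character(demo_animals$Name[is_tied])
print(tied_names)
```

Any names printed here share their Legs and Segments values with at least one other row, so no classifier built on those two columns alone can separate them.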
library(e1071)  # provides svm() and its plot and predict methods
# Fit the Support Vector Machine with a linear kernel
svmfit = svm(Name ~ Legs + Segments, data = Animals, kernel = "linear", scale = FALSE)
# Plot it
plot(svmfit, Animals[,1:3])

We can then use the model to predict classifications. Feeding the original table back into the model returns the correct name for every animal apart from Ant. This is because Ant and Bee have the same classification data (2 legs, 3 segments), so the model has no way to tell them apart.
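The prediction step described above can be sketched as follows. This assumes the e1071 package is installed and uses a hypothetical stand-in table, demo_animals, with illustrative values rather than the original Animals data. The exact predicted labels depend on the data, so none are claimed here.

```r
library(e1071)

# Hypothetical stand-in for the Animals table; the values are illustrative only
demo_animals <- data.frame(
  Name     = factor(c("Ant", "Bee", "Spider", "Woodlouse")),
  Legs     = c(6, 6, 8, 14),
  Segments = c(3, 3, 2, 7)
)

# Same model specification as in the text: linear kernel, no scaling
demo_fit <- svm(Name ~ Legs + Segments, data = demo_animals,
                kernel = "linear", scale = FALSE)

# Feed the training table back into the model
predicted <- predict(demo_fit, newdata = demo_animals)

# Put actual and predicted names side by side
comparison <- data.frame(Actual = demo_animals$Name, Predicted = predicted)
print(comparison)
```

Because the Ant and Bee rows are identical in both predictors, they always receive the same predicted label, so at most one of them can be classified correctly.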
# Fit the Support Vector Machine again, this time with a polynomial kernel
svmfit2 = svm(Name ~ Legs + Segments, data = Animals, kernel = "polynomial", scale = FALSE)
# Plot it
plot(svmfit2, Animals[,1:3])

Legs, Segments and Wings (3D)
By adding in an additional variable, classifying Ants and Bees becomes possible. Looking at the 3D scatter plot, it is possible to see that each animal now occupies its own distinct point in space.
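The separation can also be checked numerically before plotting. This is a rough sketch in base R using pairwise Euclidean distances, again on a hypothetical stand-in table (demo_animals, with illustrative values that are not the original data):

```r
# Hypothetical stand-in for the Animals table; the values are illustrative only
demo_animals <- data.frame(
  Name     = c("Ant", "Bee", "Spider", "Woodlouse"),
  Legs     = c(6, 6, 8, 14),
  Segments = c(3, 3, 2, 7),
  Wings    = c(0, 4, 0, 0)
)

# Pairwise Euclidean distances using two predictors, then all three
dist_2d <- dist(demo_animals[, c("Legs", "Segments")])
dist_3d <- dist(demo_animals[, c("Legs", "Segments", "Wings")])

# A minimum distance of zero means at least two animals sit on the same point
overlap_2d <- min(dist_2d) == 0
overlap_3d <- min(dist_3d) == 0
print(c(overlap_2d = overlap_2d, overlap_3d = overlap_3d))
```

With these illustrative values the overlap disappears once Wings is included, which is the same effect the 3D scatter plot shows visually.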
# Create 3D Scatter Plot, coloured by number of wings
scatter3D(x = Animals$Legs, y = Animals$Segments, z = Animals$Wings,
          colvar = Animals$Wings, col = gg.col(100),
          pch = 19, cex = 1.3, bty = "b2", theta = 30, phi = 20,
          xlab = "Legs", ylab = "Segments", zlab = "Wings",
          ticktype = "detailed", clab = "Wings")  # colour key shows colvar (Wings)
# Add Text Labels
text3D(x = Animals$Legs, y = Animals$Segments, z = Animals$Wings, labels = Animals$Name,
       add = TRUE, colkey = FALSE, cex = 1)

Next, an SVM is fitted to the three variables and its predictions are tested. Note that, with three predictors, this model cannot be plotted in the same way as the 2D fits.