Tokenization

Tidy format

Exploratory

\(n\)-gram

  • Separation of words:
  • New count of words:
  • Network graphics (package igraph):
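
The first two steps above can be sketched in base R for the bigram case (n = 2). This is a minimal illustration, not the code used in this analysis: the corpus `texts`, the helper `bigrams_of`, and the column names `word1`, `word2` and `n` are all assumptions made for the example.

```r
# Toy corpus; a hypothetical stand-in for the text column analysed here.
texts <- c("forest fire near the town",
           "fire near the river",
           "heavy rain near the town")

# Tokenize one document into lowercase words, then pair each word
# with its successor so that every row holds one bigram.
bigrams_of <- function(txt) {
  words <- strsplit(tolower(txt), "[^a-z']+")[[1]]
  words <- words[nzchar(words)]
  if (length(words) < 2) {
    return(data.frame(word1 = character(0), word2 = character(0)))
  }
  data.frame(word1 = head(words, -1), word2 = tail(words, -1))
}

# Separation of words: one row per bigram, split into its two words.
bigrams <- do.call(rbind, lapply(texts, bigrams_of))

# New count of words: frequency of each (word1, word2) pair,
# sorted so the most frequent pair comes first.
bigram_counts <- aggregate(n ~ word1 + word2,
                           data = cbind(bigrams, n = 1L), FUN = sum)
bigram_counts <- bigram_counts[order(-bigram_counts$n), ]
print(bigram_counts)
```

A table of this shape (two word columns followed by a count) can serve as an edge list for the network step, for example through `igraph::graph_from_data_frame()`, which reads the first two columns as edges and keeps the remaining columns as edge attributes.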

Modelling

Model SVM Linear


Call:
svm.default(x = matrixTfidfTrain, y = as.factor(dataTrain$target), scale = TRUE, 
    type = "C-classification", kernel = "linear", cost = 1, probability = TRUE)


Parameters:
   SVM-Type:  C-classification 
 SVM-Kernel:  linear 
       cost:  1 

Number of Support Vectors:  2329

Model SVM Radial


Call:
svm.default(x = matrixTfidfTrain, y = as.factor(dataTrain$target), scale = TRUE, 
    type = "C-classification", kernel = "radial", cost = 1, probability = TRUE)


Parameters:
   SVM-Type:  C-classification 
 SVM-Kernel:  radial 
       cost:  1 

Number of Support Vectors:  5086
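
The two summaries differ only in the `kernel` argument. As a rough sketch of how such a pair of models might be fitted with the e1071 package, the example below repeats the arguments shown in the calls on small simulated data. The matrix `x` and labels `y` are hypothetical stand-ins for `matrixTfidfTrain` and `dataTrain$target`, and the helper `fit_svm` is introduced only for this illustration.

```r
library(e1071)

set.seed(42)
# Hypothetical stand-in for matrixTfidfTrain: 40 documents x 5 term weights.
x <- matrix(runif(200), nrow = 40, ncol = 5,
            dimnames = list(NULL, paste0("term", 1:5)))
# Hypothetical stand-in for dataTrain$target: label driven by two terms.
y <- as.factor(ifelse(x[, 1] + x[, 2] > 1, 1, 0))

# Same arguments as in the printed calls; only the kernel changes.
fit_svm <- function(kernel) {
  svm(x = x, y = y, scale = TRUE, type = "C-classification",
      kernel = kernel, cost = 1, probability = TRUE)
}

svm_linear <- fit_svm("linear")
svm_radial <- fit_svm("radial")

# Total number of support vectors, the figure reported in each summary.
print(c(linear = svm_linear$tot.nSV, radial = svm_radial$tot.nSV))

# Class predictions; probability = TRUE attaches per-class probabilities
# as the "probabilities" attribute of the result.
pred <- predict(svm_radial, x, probability = TRUE)
print(head(attr(pred, "probabilities")))
```

On the real data the radial model retained 5086 support vectors against 2329 for the linear one; the counts produced by this toy example are not comparable to those figures.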

Acknowledgments

