Classification model of eighth-grade students' attitudes and perceptions toward mathematics.

International large-scale assessments in education (ILSA) are empirical studies that provide a disciplined, systematic, and quantifiable way of collecting data on educational settings worldwide. Their methodology relies on statistical techniques that cover data collection, organization, analysis, and interpretation, all with the aim of understanding the education systems and student performance of the participating countries within a reliable international framework (Hastedt & Rocher, 2020).

Among the organizations that design, plan, and organize these ILSA assessments are the International Association for the Evaluation of Educational Achievement (IEA) and the Organisation for Economic Co-operation and Development (OECD). These organizations differ in their study philosophy, type of organization, assessment methodology, review process, participation, decision-making, and fees. The IEA focuses on researching, understanding, and improving education worldwide through large-scale comparative studies. These studies are characterized by their high quality and provide educators, policymakers, and parents with information on the performance of their students.

ILSA assessments include the Programme for International Student Assessment (PISA), the Trends in International Mathematics and Science Study (TIMSS), the Progress in International Reading Literacy Study (PIRLS), and the Civic Education Study (CIVED). These studies differ in their purposes, content areas, and administration cycles. For example, PISA is conducted every three years, whereas TIMSS is conducted every four years. The IEA has administered, organized, and planned TIMSS since 1995, while the OECD has administered PISA since 2000. In terms of content, CIVED focuses on democracy, citizenship, national identity, social cohesion, and diversity, whereas PIRLS assesses the reading comprehension of fourth-grade students (Di Giacomo et al., 2013).

Below I show some variables extracted from the database of one of these international assessments:

Analysis variables:

## 'data.frame':    11114 obs. of  9 variables:
##  $ var1         : int  1 1 2 1 1 2 1 1 1 2 ...
##  $ var2         : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ var3         : int  3 4 5 2 3 3 5 5 5 3 ...
##  $ var4         : int  2 2 2 3 1 1 3 3 2 3 ...
##  $ var5         : int  3 3 2 4 3 3 2 2 2 2 ...
##  $ var6         : int  2 3 1 3 3 1 3 2 3 2 ...
##  $ var7         : int  3 2 2 4 2 4 3 2 2 3 ...
##  $ Puntaje      : num  473 547 455 389 536 ...
##  $ Nivel_Puntaje: int  2 4 3 1 2 2 3 2 3 3 ...

Categories of each variable

#Variable 1: Are you a girl or a boy?

student.imp$var1<-as.factor(student.imp$var1)
levels(student.imp$var1)[1]<-"Girl"
levels(student.imp$var1)[2]<-"Boy"
table(student.imp$var1)
## 
## Girl  Boy 
## 5543 5571
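
The conversion and relabeling above can also be written as a single call to `factor()`. A minimal sketch, assuming that code 1 means "Girl" and code 2 means "Boy" as in the code above; `codes` holds only the first ten values printed by `str()` and `sex` is a hypothetical name, so the counts differ from the full table.

```r
# First ten var1 codes as printed by str() (1 = Girl, 2 = Boy)
codes <- c(1, 1, 2, 1, 1, 2, 1, 1, 1, 2)

# factor() converts and labels in one step; `levels` fixes the code order
sex <- factor(codes, levels = c(1, 2), labels = c("Girl", "Boy"))

# Frequency table of the labeled factor
table(sex)
```

Passing `levels` explicitly keeps the labels attached to the right codes even if one code happens to be absent from the data.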

#Variable 2: Do you have a computer at home?

student.imp$var2<-as.factor(student.imp$var2)
levels(student.imp$var2)[1]<-"Computer"
levels(student.imp$var2)[2]<-"No computer"
table(student.imp$var2)
## 
##    Computer No computer 
##       10519         595
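
Counts such as these are often easier to compare as shares. A small sketch using `prop.table()` on the two counts printed above; the vector `counts` is typed in by hand here and is not an object from the original analysis.

```r
# Counts copied from the table(student.imp$var2) output above
counts <- c("Computer" = 10519, "No computer" = 595)

# Share of each category, as a percentage of all 11114 students
pct <- round(100 * prop.table(counts), 1)
pct
```

Roughly one student in twenty reports having no computer at home, so any comparison involving this group rests on a much smaller sample.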

#Variable 3: About how often are you absent from your school?

student.imp$var3<-as.factor(student.imp$var3)
levels(student.imp$var3)[1]<-"Once a week"
levels(student.imp$var3)[2]<-"Once every two weeks"
levels(student.imp$var3)[3]<-"one a month"
levels(student.imp$var3)[4]<-"once every two months"
levels(student.imp$var3)[5]<-"Never or almost never"
table(student.imp$var3)
## 
##           Once a week  Once every two weeks           one a month 
##                   458                   863                  1811 
## once every two months Never or almost never 
##                  2395                  5587

#Variable 4: How often do you feel tired when you arrive at school?

student.imp$var4<-as.factor(student.imp$var4)
levels(student.imp$var4)[1]<-"Every day"
levels(student.imp$var4)[2]<-"Almost every day"
levels(student.imp$var4)[3]<-"Sometimes"
levels(student.imp$var4)[4]<-"Never"
table(student.imp$var4)
## 
##        Every day Almost every day        Sometimes            Never 
##             3644             3241             3872              357

#Variable 5: I enjoy learning Mathematics

student.imp$var5<-as.factor(student.imp$var5)
levels(student.imp$var5)[1]<-"I agree a lot"
levels(student.imp$var5)[2]<-"Agree a little"
levels(student.imp$var5)[3]<-"Disagree a little"
levels(student.imp$var5)[4]<-"Disagree a lot"
table(student.imp$var5)
## 
##     I agree a lot    Agree a little Disagree a little    Disagree a lot 
##              3159              4351              1991              1613

#Variable 6: I learn many interesting things in Mathematics

student.imp$var6<-as.factor(student.imp$var6)
levels(student.imp$var6)[1]<-"I agree_lot"
levels(student.imp$var6)[2]<-"Agree_little"
levels(student.imp$var6)[3]<-"Disagree_little"
levels(student.imp$var6)[4]<-"Disagree_lot"
table(student.imp$var6)
## 
##     I agree_lot    Agree_little Disagree_little    Disagree_lot 
##            3201            4260            2409            1244

#Variable 7: Mathematics is one of my favourite subjects

student.imp$var7<-as.factor(student.imp$var7)
levels(student.imp$var7)[1]<-"I agree_lot"
levels(student.imp$var7)[2]<-"Agree_little"
levels(student.imp$var7)[3]<-"Disagree_little"
levels(student.imp$var7)[4]<-"Disagree_lot"
table(student.imp$var7)
## 
##     I agree_lot    Agree_little Disagree_little    Disagree_lot 
##            3054            3852            2141            2067

#Variable: Puntaje (mathematics score)

summary(student.imp[,8])
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   200.3   449.8   517.6   517.2   587.2   819.8

#Variable: Nivel_Puntaje (score level)

## 
## Below400  401-475  476-550  551-625 more 626 
##     1323     2404     3151     2754     1482
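
The code that produced this table is not shown. A minimal sketch of how the integer codes of `Nivel_Puntaje` could be turned into the labels printed above, assuming that codes 1 to 5 correspond to the five labels in the order shown; `codes` holds only the first ten values printed by `str()`, and `score_labels` and `nivel` are hypothetical names.

```r
# First ten Nivel_Puntaje codes as printed by str()
codes <- c(2, 4, 3, 1, 2, 2, 3, 2, 3, 3)

# Labels copied from the table above; the code-to-label order is an assumption
score_labels <- c("Below400", "401-475", "476-550", "551-625", "more 626")

# levels = 1:5 keeps all five categories, even those absent from the sample
nivel <- factor(codes, levels = 1:5, labels = score_labels)
table(nivel)
```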

Plots of the variables of interest

#Plot of the gender variable versus the other variables.

library(ggplot2)
# Stacked bars: answers to var6, split by gender (var1)
ggplot(student.imp, aes(x = var6, fill = var1)) +
  geom_bar(position = "stack") +
  labs(x = "I learn many interesting things in mathematics", fill = "Are you a girl or a boy?")

# Stacked bars: answers to var7, split by gender (var1)
ggplot(student.imp, aes(x = var7, fill = var1)) +
  geom_bar(position = "stack") +
  labs(x = "I like mathematics", fill = "Are you a girl or a boy?")

Plot of the variable: Do you have a computer at home? versus the other variables.

# Stacked bars: answers to var5, var6 and var7, split by computer at home (var2)
ggplot(student.imp, aes(x = var5, fill = var2)) +
  geom_bar(position = "stack") +
  labs(x = "I enjoy learning mathematics", fill = "Do you have a computer at home?")

ggplot(student.imp, aes(x = var6, fill = var2)) +
  geom_bar(position = "stack") +
  labs(x = "I learn many interesting things in mathematics", fill = "Do you have a computer at home?")

ggplot(student.imp, aes(x = var7, fill = var2)) +
  geom_bar(position = "stack") +
  labs(x = "I like mathematics", fill = "Do you have a computer at home?")

Plot of the variable: About how often are you absent from your school? versus the other variables.

# Side-by-side bars: answers to var5, var6 and var7, split by absence frequency (var3)
ggplot(student.imp, aes(x = var5, fill = var3)) +
  geom_bar(position = position_dodge(preserve = "single")) +
  labs(x = "I enjoy learning mathematics", fill = "About how often are you absent from your school?")

ggplot(student.imp, aes(x = var6, fill = var3)) +
  geom_bar(position = position_dodge(preserve = "single")) +
  labs(x = "I learn many interesting things in mathematics", fill = "About how often are you absent from your school?")

ggplot(student.imp, aes(x = var7, fill = var3)) +
  geom_bar(position = position_dodge(preserve = "single")) +
  labs(x = "I like mathematics", fill = "About how often are you absent from your school?") +
  theme_minimal()

Plot of the variable: How often do you feel tired when you arrive at school? versus the other variables.

# Side-by-side bars: answers to var5, var6 and var7, split by tiredness on arrival (var4)
ggplot(student.imp, aes(x = var5, fill = var4)) +
  geom_bar(position = position_dodge(preserve = "single")) +
  labs(x = "I enjoy learning mathematics", fill = "How often do you feel tired when you arrive at school?")

ggplot(student.imp, aes(x = var6, fill = var4)) +
  geom_bar(position = position_dodge(preserve = "single")) +
  labs(x = "I learn many interesting things in mathematics", fill = "How often do you feel tired when you arrive at school?")

ggplot(student.imp, aes(x = var7, fill = var4)) +
  geom_bar(position = position_dodge(preserve = "single")) +
  labs(x = "I like mathematics", fill = "How often do you feel tired when you arrive at school?") +
theme_minimal()
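
The bar charts above repeat the same layers with different columns and labels. A minimal sketch of a helper that captures that pattern, assuming ggplot2 is installed; `plot_bars` and the `toy` data frame are hypothetical names introduced here for illustration and are not part of the original analysis.

```r
library(ggplot2)

# Hypothetical helper: bar chart of one factor, filled by a second factor.
# Column names are passed as strings and looked up with the .data pronoun.
plot_bars <- function(data, x_var, fill_var, x_lab, fill_lab,
                      position = "stack") {
  ggplot(data, aes(x = .data[[x_var]], fill = .data[[fill_var]])) +
    geom_bar(position = position) +
    labs(x = x_lab, fill = fill_lab) +
    theme_minimal()
}

# Toy data (made-up rows) with the same column names and labels as student.imp
toy <- data.frame(
  var5 = factor(c("I agree a lot", "Agree a little",
                  "I agree a lot", "Disagree a lot")),
  var1 = factor(c("Girl", "Boy", "Boy", "Girl"))
)

p <- plot_bars(toy, "var5", "var1",
               x_lab = "I enjoy learning mathematics",
               fill_lab = "Are you a girl or a boy?")
p
```

For the dodged charts, the same helper could be called with `position = position_dodge(preserve = "single")`.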

Plot of the score variable (Puntaje) versus the other variables

# Jittered points: mathematics score (Puntaje) within each category of var1 ... var7
ggplot(student.imp, aes(x = Puntaje, y = var1, color = var1)) +
  geom_jitter(alpha = 0.7, size = 1) +
  labs(x = "Mathematics score", y = "Are you a girl or a boy?") +
  theme_minimal() +
  theme(legend.position = "none")

ggplot(student.imp, aes(x = Puntaje, y = var2, color = var2)) +
  geom_jitter(alpha = 0.7, size = 1) +
  labs(x = "Mathematics score", y = "Do you have a computer at home?") +
  theme_minimal() +
  theme(legend.position = "none")

ggplot(student.imp, aes(x = Puntaje, y = var3, color = var3)) +
  geom_jitter(alpha = 0.7, size = 1) +
  labs(x = "Mathematics score", y = "About how often are you absent from your school?") +
  theme_minimal() +
  theme(legend.position = "none")

ggplot(student.imp, aes(x = Puntaje, y = var4, color = var4)) +
  geom_jitter(alpha = 0.7, size = 1) +
  labs(x = "Mathematics score", y = "How often do you feel tired when you arrive at school?") +
  theme_minimal() +
  theme(legend.position = "none")

ggplot(student.imp, aes(x = Puntaje, y = var5, color = var5)) +
  geom_jitter(alpha = 0.7, size = 1) +
  labs(x = "Mathematics score", y = "I enjoy learning mathematics") +
  theme_minimal() +
  theme(legend.position = "none")

ggplot(student.imp, aes(x = Puntaje, y = var6, color = var6)) +
  geom_jitter(alpha = 0.7, size = 1) +
  labs(x = "Mathematics score", y = "I learn many interesting things in mathematics") +
  theme_minimal() +
  theme(legend.position = "none")

ggplot(student.imp, aes(x = Puntaje, y = var7, color = var7)) +
  geom_jitter(alpha = 0.7, size = 1) +
  labs(x = "Mathematics score", y = "I like mathematics") +
  theme_minimal() +
  theme(legend.position = "none")

BAYESIAN ANALYSIS

## 
## Naive Bayes Classifier for Discrete Predictors
## 
## Call:
## naiveBayes.default(x = X, y = Y, laplace = laplace)
## 
## A-priori probabilities:
## Y
##  Below400   401-475   476-550   551-625  more 626 
## 0.1190390 0.2163038 0.2835163 0.2477956 0.1333453 
## 
## Conditional probabilities:
##           var1
## Y               Girl       Boy
##   Below400 0.4799698 0.5200302
##   401-475  0.4983361 0.5016639
##   476-550  0.5068232 0.4931768
##   551-625  0.5148874 0.4851126
##   more 626 0.4689609 0.5310391
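
The call that produced the output above is not shown. A minimal sketch of what it may have looked like, assuming it follows the same pattern as the models for var2 to var7 and that e1071 is installed; the `toy` data frame contains made-up rows and the object name `model1` is an assumption.

```r
library(e1071)

# Toy data (made-up rows) with the same column names as student.imp
toy <- data.frame(
  Nivel_Puntaje = factor(c("Below400", "401-475", "401-475",
                           "476-550", "476-550", "476-550")),
  var1 = factor(c("Boy", "Girl", "Boy", "Girl", "Girl", "Boy"))
)

# Assumed form of the hidden call: gender (var1) as the only predictor
model1 <- naiveBayes(Nivel_Puntaje ~ var1, data = toy)

model1$apriori       # class counts behind the a-priori probabilities
model1$tables$var1   # P(var1 | Nivel_Puntaje), one row per score level
```

In the real output the two columns stay close to 0.5 at every score level, which suggests that gender alone carries little information about the score level.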
library(e1071)

# Naive Bayes with computer at home (var2) as the only predictor
model2 <- naiveBayes(Nivel_Puntaje ~ var2, data = student.imp)
model2
## 
## Naive Bayes Classifier for Discrete Predictors
## 
## Call:
## naiveBayes.default(x = X, y = Y, laplace = laplace)
## 
## A-priori probabilities:
## Y
##  Below400   401-475   476-550   551-625  more 626 
## 0.1190390 0.2163038 0.2835163 0.2477956 0.1333453 
## 
## Conditional probabilities:
##           var2
## Y            Computer No computer
##   Below400 0.86696901  0.13303099
##   401-475  0.92013311  0.07986689
##   476-550  0.95176135  0.04823865
##   551-625  0.97893972  0.02106028
##   more 626 0.98852901  0.01147099
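
For a categorical predictor, the conditional probabilities above are the row proportions of the cross-table of score level by predictor. A sketch in base R that reproduces them; the "No computer" counts per level are not printed in the source and were recovered here by multiplying each class size by its printed conditional probability, which gives whole numbers that add up to 595.

```r
# Class sizes copied from the Nivel_Puntaje table
levels_y <- c("Below400", "401-475", "476-550", "551-625", "more 626")
n_class  <- c(1323, 2404, 3151, 2754, 1482)

# "No computer" counts per level, recovered as class size x printed probability
n_no_pc <- c(176, 192, 152, 58, 17)

# Cross-table: rows are score levels, columns are var2 categories
tab <- cbind("Computer" = n_class - n_no_pc, "No computer" = n_no_pc)
rownames(tab) <- levels_y

# Row proportions give P(var2 | Nivel_Puntaje)
cond <- prop.table(tab, margin = 1)
round(cond, 8)
```

The share of students without a computer falls steadily from the lowest score level to the highest.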
# Naive Bayes with absence frequency (var3) as the only predictor
model3 <- naiveBayes(Nivel_Puntaje ~ var3, data = student.imp)
model3
## 
## Naive Bayes Classifier for Discrete Predictors
## 
## Call:
## naiveBayes.default(x = X, y = Y, laplace = laplace)
## 
## A-priori probabilities:
## Y
##  Below400   401-475   476-550   551-625  more 626 
## 0.1190390 0.2163038 0.2835163 0.2477956 0.1333453 
## 
## Conditional probabilities:
##           var3
## Y          Once a week Once every two weeks one a month once every two months
##   Below400 0.149659864          0.114890401 0.157974301           0.163265306
##   401-475  0.053660566          0.098585691 0.191763727           0.200499168
##   476-550  0.025706125          0.084100286 0.170739448           0.214217709
##   551-625  0.015613653          0.053376906 0.146332607           0.245824256
##   more 626 0.004723347          0.041835358 0.134952767           0.232793522
##           var3
## Y          Never or almost never
##   Below400           0.414210128
##   401-475            0.455490849
##   476-550            0.505236433
##   551-625            0.538852578
##   more 626           0.585695007
# Naive Bayes with tiredness on arrival (var4) as the only predictor
model4 <- naiveBayes(Nivel_Puntaje ~ var4, data = student.imp)
model4
## 
## Naive Bayes Classifier for Discrete Predictors
## 
## Call:
## naiveBayes.default(x = X, y = Y, laplace = laplace)
## 
## A-priori probabilities:
## Y
##  Below400   401-475   476-550   551-625  more 626 
## 0.1190390 0.2163038 0.2835163 0.2477956 0.1333453 
## 
## Conditional probabilities:
##           var4
## Y           Every day Almost every day  Sometimes      Never
##   Below400 0.41421013       0.21164021 0.32728647 0.04686319
##   401-475  0.37853577       0.24833611 0.33860233 0.03452579
##   476-550  0.33989210       0.29704856 0.33640114 0.02665820
##   551-625  0.28249818       0.33442266 0.35548293 0.02759622
##   more 626 0.22739541       0.34210526 0.39541161 0.03508772
# Naive Bayes with "I enjoy learning Mathematics" (var5) as the only predictor
model5 <- naiveBayes(Nivel_Puntaje ~ var5, data = student.imp)
model5
## 
## Naive Bayes Classifier for Discrete Predictors
## 
## Call:
## naiveBayes.default(x = X, y = Y, laplace = laplace)
## 
## A-priori probabilities:
## Y
##  Below400   401-475   476-550   551-625  more 626 
## 0.1190390 0.2163038 0.2835163 0.2477956 0.1333453 
## 
## Conditional probabilities:
##           var5
## Y          I agree a lot Agree a little Disagree a little Disagree a lot
##   Below400    0.20634921     0.35374150        0.20483749     0.23507181
##   401-475     0.19925125     0.37562396        0.20465890     0.22046589
##   476-550     0.24912726     0.41542368        0.18882894     0.14662012
##   551-625     0.33514887     0.40740741        0.16739288     0.09005084
##   more 626    0.47165992     0.37044534        0.11605938     0.04183536
# Naive Bayes with "I learn many interesting things in Mathematics" (var6) as the only predictor
model6 <- naiveBayes(Nivel_Puntaje ~ var6, data = student.imp)
model6
## 
## Naive Bayes Classifier for Discrete Predictors
## 
## Call:
## naiveBayes.default(x = X, y = Y, laplace = laplace)
## 
## A-priori probabilities:
## Y
##  Below400   401-475   476-550   551-625  more 626 
## 0.1190390 0.2163038 0.2835163 0.2477956 0.1333453 
## 
## Conditional probabilities:
##           var6
## Y          I agree_lot Agree_little Disagree_little Disagree_lot
##   Below400  0.29629630   0.33786848      0.20634921   0.15948602
##   401-475   0.27703827   0.37188020      0.20549085   0.14559068
##   476-550   0.27229451   0.37416693      0.23357664   0.11996192
##   551-625   0.28068264   0.40486565      0.23021060   0.08424110
##   more 626  0.34547908   0.42172740      0.18353576   0.04925776
# Naive Bayes with "Mathematics is one of my favourite subjects" (var7) as the only predictor
model7 <- naiveBayes(Nivel_Puntaje ~ var7, data = student.imp)
model7
## 
## Naive Bayes Classifier for Discrete Predictors
## 
## Call:
## naiveBayes.default(x = X, y = Y, laplace = laplace)
## 
## A-priori probabilities:
## Y
##  Below400   401-475   476-550   551-625  more 626 
## 0.1190390 0.2163038 0.2835163 0.2477956 0.1333453 
## 
## Conditional probabilities:
##           var7
## Y          I agree_lot Agree_little Disagree_little Disagree_lot
##   Below400  0.18972033   0.27739985      0.20710506   0.32577475
##   401-475   0.19550749   0.31073211      0.21921797   0.27454243
##   476-550   0.24119327   0.36305935      0.20691844   0.18882894
##   551-625   0.32498184   0.38307916      0.17792302   0.11401598
##   more 626  0.45748988   0.36369771      0.13360324   0.04520918
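
Each model above combines its a-priori probabilities with one conditional table through Bayes' rule: the posterior probability of a score level is proportional to the prior times the conditional probability of the observed answer. A worked sketch in base R for a student who answers "Disagree_lot" on var7, using only numbers printed above; the vector names are hypothetical.

```r
levels_y <- c("Below400", "401-475", "476-550", "551-625", "more 626")

# A-priori probabilities P(Y), copied from the output above
prior <- c(0.1190390, 0.2163038, 0.2835163, 0.2477956, 0.1333453)

# P(var7 = "Disagree_lot" | Y), last column of the var7 table
lik <- c(0.32577475, 0.27454243, 0.18882894, 0.11401598, 0.04520918)
names(prior) <- names(lik) <- levels_y

# Bayes' rule: multiply, then normalise so the posterior sums to 1
unnorm    <- prior * lik
posterior <- unnorm / sum(unnorm)

round(posterior, 3)
names(which.max(posterior))  # most probable score level for this answer
```

With several predictors, a naive Bayes classifier multiplies the conditional probabilities of all observed answers, under the assumption that the predictors are independent given the score level.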

Classification tree analysis

## n= 11114 
## 
## node), split, n, deviance, yval
##       * denotes terminal node
## 
##  1) root 11114 97973620 517.2172  
##    2) var3=Once a week 458  3537969 422.2812 *
##    3) var3=Once every two weeks,one a month,once every two months,Never or almost never 10656 90130360 521.2976  
##      6) var5=Disagree a little,Disagree a lot 3421 25100630 494.2533 *
##      7) var5=I agree a lot,Agree a little 7235 61344520 534.0852  
##       14) var2=No computer 343  2619691 463.8038 *
##       15) var2=Computer 6892 56946270 537.5830  
##         30) var5=Agree a little 4016 30658050 525.6979 *
##         31) var5=I agree a lot 2876 24928780 554.1792 *
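
The printed tree can be read as nested rules. Each node lists its split, the number of students, the deviance, and `yval`, and nodes marked with `*` are terminal. Because `yval` is a mean score (517.2 at the root, matching the mean of Puntaje), the output appears to be a regression tree on Puntaje rather than a tree on the score levels. The sketch below hard-codes the printed splits in a hypothetical helper, `predict_puntaje`, to show how one student is routed to a terminal node.

```r
# Follow the printed splits from the root down to a terminal node (marked *)
predict_puntaje <- function(var3, var5, var2) {
  # Node 2: absent once a week
  if (var3 == "Once a week") return(422.2812)
  # Node 6: does not enjoy learning mathematics
  if (var5 %in% c("Disagree a little", "Disagree a lot")) return(494.2533)
  # Node 14: enjoys mathematics but has no computer at home
  if (var2 == "No computer") return(463.8038)
  # Node 30: agrees a little and has a computer
  if (var5 == "Agree a little") return(525.6979)
  # Node 31: agrees a lot and has a computer
  554.1792
}

predict_puntaje("Never or almost never", "I agree a lot", "Computer")
```

Only var3, var5, and var2 appear in the splits, so the other variables do not affect the predicted score in this tree.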
library(rpart.plot)
# Draw the fitted tree stored in modelo1
rpart.plot::rpart.plot(modelo1)
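
The call that created `modelo1` is not shown. A minimal sketch of how a tree of this kind could be fitted with rpart, using simulated toy data; the formula, the simulated effect, and the names `toy` and `modelo_toy` are assumptions made for illustration and do not reproduce the original model.

```r
library(rpart)

set.seed(1)
n <- 200

# Simulated predictors with the same labels as var2 and var5
toy <- data.frame(
  var2 = factor(sample(c("Computer", "No computer"), n,
                       replace = TRUE, prob = c(0.9, 0.1))),
  var5 = factor(sample(c("I agree a lot", "Agree a little",
                         "Disagree a little", "Disagree a lot"), n,
                       replace = TRUE))
)

# Simulated score: lower on average without a computer (made-up effect)
toy$Puntaje <- 520 - 60 * (toy$var2 == "No computer") + rnorm(n, sd = 40)

# A numeric response makes rpart fit a regression tree (method = "anova")
modelo_toy <- rpart(Puntaje ~ var2 + var5, data = toy)
print(modelo_toy)
```

A tree on the score levels themselves would instead use the factor `Nivel_Puntaje` as the response, in which case rpart fits a classification tree.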