Analysis of Math Completion Status of High School Curriculum

INTRODUCTION
ANALYSIS
RESULTS
CONCLUSION

INTRODUCTION

STEM (Science, Technology, Engineering, and Mathematics) is one of the most frequently heard terms in education today. To be competitive in the global economy of the 21st century, we need to educate all students in these fields. Mathematics is the building block of a STEM degree; it must be emphasized in our schools and colleges in order to make the subject more attractive and to allow students to pursue careers in science and technology that meet industrial demand. Mathematics also gives us the ability to learn and think logically in any field of endeavor.

The quality of teaching and learning in mathematics is a major challenge for educators. General concern about mathematics achievement has been evident for the last 20 years. The current debate among scholars is what students should learn to be successful in mathematics. The discussion emphasizes new instructional design techniques to produce individuals who can understand and apply fundamental mathematics concepts. A central and persisting issue is how to provide instructional environments, conditions, methods, and solutions that achieve learning goals for students with different skill and ability levels. Innovative instructional approaches and techniques should be developed to ensure that students become successful learners (Tuncay S and Omur A).

It is important for educators to adopt instructional design techniques to attain higher achievement rates in mathematics (Rasmussen & Marrongelle, 2006). Considering students’ needs and comprehension of higher-order mathematical knowledge, instructional design provides a systematic process and a framework for analytically planning, developing, and adapting mathematics instruction (Saritas, 2004). “[Instructional design] is an effective way to alleviate many pressing problems in education. Instructional design is a linking science – a body of knowledge that prescribes instructional actions to optimize desired instructional outcomes, such as achievement and effect” (Reigeluth, 1983, p.5).

ANALYSIS

About the Data

The data describe five schools teaching the same math course over one semester, with 35 lessons in total. There are 30 sections in all. At the time of data collection, the semester was three-quarters complete.

School : 5 schools (A, B, C, D and E). Nominal
Section : school section (session). Ordinal
Very_Ahead : very ahead (more than 5 lessons ahead). Integer
Middling : middling (5 lessons ahead to 0 lessons ahead). Integer
Behind : behind (1 to 5 lessons behind). Integer
More_Behind : more behind (6 to 10 lessons behind). Integer
Very_Behind : very behind (more than 10 lessons behind). Integer
Completed : completed (finished with the course). Integer
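
The category boundaries listed above can be sketched as a small helper. This is only an illustration under stated assumptions: the function name `classify_progress` and the idea of a signed `offset` (lessons ahead as positive, lessons behind as negative) are hypothetical and do not come from the dataset, which stores counts per category rather than individual offsets. Completed is a separate status and is not derived from the offset.

```r
# Hypothetical helper: map a signed lesson offset to the categories above.
# offset > 0 means lessons ahead, offset < 0 means lessons behind.
classify_progress <- function(offset) {
  cut(offset,
      breaks = c(-Inf, -11, -6, -1, 5, Inf),   # right-closed intervals
      labels = c("Very_Behind", "More_Behind", "Behind",
                 "Middling", "Very_Ahead"),
      right = TRUE)
}

# Made-up offsets, one per category boundary
example_offsets <- c(-12, -7, -3, 0, 5, 6)
data.frame(offset = example_offsets,
           category = classify_progress(example_offsets))
```

With whole-number offsets, the breaks place 0 to 5 lessons ahead in Middling, 1 to 5 behind in Behind, 6 to 10 behind in More_Behind, and anything beyond those limits in the two extreme categories.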

There are 30 observations with 8 variables. The Section column is converted to an ordinal variable because sections are always ordered. None of the 30 records has missing values, and the dataset contains no duplicated records.
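
A minimal sketch of these checks is shown here. The data frame is a made-up miniature, not the study data, and the way Section is coded (as a number) is an assumption; only the column names follow the variable list.

```r
# Hypothetical miniature of the dataset; all values are invented
school_df <- data.frame(
  School   = c("A", "A", "B", "C"),
  Section  = c(1, 2, 1, 1),
  Middling = c(5L, 8L, 7L, 2L),
  stringsAsFactors = FALSE
)

school_df$School  <- factor(school_df$School)                   # nominal
school_df$Section <- factor(school_df$Section, ordered = TRUE)  # ordinal

n_missing    <- sum(is.na(school_df))       # count of missing cells
n_duplicated <- sum(duplicated(school_df))  # count of repeated rows

dim(school_df)   # rows (observations) and columns (variables)
n_missing
n_duplicated
```

On the real data, `dim()` would be expected to report 30 rows and 8 columns, with both counts equal to zero.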

School

There are five schools, coded with the nominal values A, B, C, D and E. Of the 30 observations, 13 come from School A, 12 from School B, 3 from School C, and one each from School D and School E. School A therefore has the highest number of recorded sections.

Section

There are 30 sections in total. The Section column is converted to an ordinal variable because sections are treated as ordered.

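library(ggplot2)  # plotting package used throughout
## bar chart of how often each Section value occurs
d <- ggplot(school_df, aes(Section)) + geom_bar(color='green', fill='blue') + theme_classic() + ggtitle('Frequency distribution of Section')
d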

Very Ahead

This variable records the number of students who are more than five lessons ahead. No school recorded any student as very ahead.

Middling

Middling ranges from a minimum of 2 to a maximum of 19 students per section, with a mean of 7.40 and a median of 7.50.

## middling: four views of the same distribution
library(cowplot)  # provides plot_grid()
m <- ggplot(school_df,aes(Middling) ) + geom_histogram(color='green', fill='blue', bins = 20) + theme_classic()
n <- ggplot(school_df,aes(Middling) ) + geom_dotplot(color='green', fill='blue') + theme_classic()
o <- ggplot(school_df,aes(Middling) ) + geom_density(color='green', fill='blue') + theme_classic()
p <- ggplot(school_df,aes(Middling) ) + geom_freqpoly(color='red') + theme_classic()  # lines take no fill

plot_grid(m,n,o,p, nrow = 2, ncol = 2, labels = 'AUTO')

## `stat_bindot()` using `bins = 30`. Pick better value with `binwidth`.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Behind

The number of students who are behind ranges from a minimum of 4 to a maximum of 56 per section.

## Behind
m <- ggplot(school_df,aes(Behind) ) + geom_histogram(color='green', fill='blue', bins = 20) + theme_classic()
n <- ggplot(school_df,aes(Behind) ) + geom_dotplot(color='green', fill='blue') + theme_classic()
o <- ggplot(school_df,aes(Behind) ) + geom_density(color='green', fill='blue') + theme_classic()
p <- ggplot(school_df,aes(Behind) ) + geom_freqpoly(color='red') + theme_classic()  # lines take no fill

plot_grid(m,n,o,p, nrow = 2, ncol = 2, labels = 'AUTO')

## `stat_bindot()` using `bins = 30`. Pick better value with `binwidth`.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

More Behind

The number of students who are more behind ranges from 0 to 12, with a first quartile of 1, a median of 2 and a mean of 3.3. Because the mean exceeds the median and the maximum lies far above both, the distribution is skewed to the right.
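
The direction of skew can be checked numerically as well as by eye. The sketch below uses invented counts, not the study data, and a hand-written moment-based `skewness` helper (an assumption for illustration; base R has no built-in skewness function).

```r
# Hypothetical per-section counts, invented for illustration only
counts <- c(0, 0, 1, 1, 2, 2, 2, 3, 5, 12)

# Moment-based sample skewness: third central moment over variance^1.5
skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

mean(counts)      # pulled upward by the large value
median(counts)    # less affected by the large value
skewness(counts)  # positive means a longer right tail
```

A mean above the median together with a positive skewness value indicates a longer right tail; a negative value would indicate a longer left tail.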

## more behind

m <- ggplot(school_df,aes(More_Behind) ) + geom_histogram(color='green', fill='blue', bins = 20) + theme_classic()
n <- ggplot(school_df,aes(More_Behind) ) + geom_dotplot(color='green', fill='blue') + theme_classic()
o <- ggplot(school_df,aes(More_Behind) ) + geom_density(color='green', fill='blue') + theme_classic()
p <- ggplot(school_df,aes(More_Behind) ) + geom_freqpoly(color='red') + theme_classic()  # lines take no fill

plot_grid(m,n,o,p, nrow = 2, ncol = 2, labels = 'AUTO')

## `stat_bindot()` using `bins = 30`. Pick better value with `binwidth`.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Very Behind

The minimum value is 0 and the maximum is 24; the first quartile is 1.25, the median is 5.5, the mean is 6.97 and the third quartile is 11.5. With the mean above the median and a long upper tail, the distribution is skewed to the right. There are no missing values in this column and the variable is numeric.

## very behind
m <- ggplot(school_df,aes(Very_Behind) ) + geom_histogram(color='green', fill='blue', bins = 20) + theme_classic()
n <- ggplot(school_df,aes(Very_Behind) ) + geom_dotplot(color='green', fill='blue') + theme_classic()
o <- ggplot(school_df,aes(Very_Behind) ) + geom_density(color='green', fill='blue') + theme_classic()
p <- ggplot(school_df,aes(Very_Behind) ) + geom_freqpoly(color='red') + theme_classic()  # lines take no fill

plot_grid(m,n,o,p, nrow = 2, ncol = 2, labels = 'AUTO')

## `stat_bindot()` using `bins = 30`. Pick better value with `binwidth`.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Completed

This variable is an integer with a minimum of 1 and a maximum of 27. There are no missing values. The mean (10.53) is close to the median (10.00), which suggests that the distribution is roughly symmetric.

## completed
m <- ggplot(school_df,aes(Completed) ) + geom_histogram(color='green', fill='blue', bins = 20) + theme_classic()
n <- ggplot(school_df,aes(Completed) ) + geom_dotplot(color='green', fill='blue') + theme_classic()
o <- ggplot(school_df,aes(Completed) ) + geom_density(color='green', fill='blue') + theme_classic()
p <- ggplot(school_df,aes(Completed) ) + geom_freqpoly(color='red') + theme_classic()  # lines take no fill

plot_grid(m,n,o,p, nrow = 2, ncol = 2, labels = 'AUTO')

## `stat_bindot()` using `bins = 30`. Pick better value with `binwidth`.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

School and Section

Schools D and E each have one recorded section, School C has 3, School B has 12 and School A has 13.

## school and section

p <- ggplot(school_df,aes(School,Section) ) + geom_count(color='red', fill='blue', size=7) + theme_classic()
p

School and Middling

School A has the highest number of middling students, including one section with an exceptional count of 19. Compared with the other schools, this value of 19 can be considered an outlier.
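
One common way to decide whether a value counts as an outlier is the 1.5 × IQR rule that box plots use for their whiskers. The sketch below applies it to invented counts (only the value 19 echoes the text; the rest are made up), so it illustrates the rule rather than reproducing the analysis.

```r
# Hypothetical Middling counts; invented except for the value 19
middling <- c(2, 4, 5, 6, 7, 7, 8, 8, 9, 10, 19)

q     <- unname(quantile(middling, c(0.25, 0.75)))  # first and third quartiles
fence <- 1.5 * (q[2] - q[1])                        # 1.5 times the IQR

lower_fence <- q[1] - fence
upper_fence <- q[2] + fence

# Values outside the fences are flagged as outliers
outliers <- middling[middling < lower_fence | middling > upper_fence]
outliers
```

Values beyond the fences are drawn as separate points in `geom_boxplot()`, which is how such a value stands out in the box plot panel.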

## school and middling
m <- ggplot(school_df,aes(School, Middling) ) + geom_boxplot(color='green', fill='blue') + theme_classic()
n <- ggplot(school_df,aes(School, Middling) ) + geom_dotplot(color='green', fill='blue') + theme_classic()
o <- ggplot(school_df,aes(School, Middling) ) + geom_violin(color='green', fill='blue') + theme_classic()
p <- ggplot(school_df,aes(School, Middling) ) + geom_col(color='green', fill='blue') + theme_classic()

plot_grid(m,n,o,p, nrow = 2, ncol = 2, labels = 'AUTO')

## `stat_bindot()` using `bins = 30`. Pick better value with `binwidth`.

School and Behind

School E has the highest recorded number of students who are behind, although it contributes only one section.

## School and Behind

m <- ggplot(school_df,aes(School, Behind) ) + geom_boxplot(color='green', fill='blue') + theme_classic()
n <- ggplot(school_df,aes(School, Behind) ) + geom_dotplot(color='green', fill='blue') + theme_classic()
o <- ggplot(school_df,aes(School, Behind) ) + geom_violin(color='green', fill='blue') + theme_classic()
p <- ggplot(school_df,aes(School, Behind) ) + geom_col(color='green', fill='blue') + theme_classic()

plot_grid(m,n,o,p, nrow = 2, ncol = 2, labels = 'AUTO')

## `stat_bindot()` using `bins = 30`. Pick better value with `binwidth`.

School and More Behind

School A has the highest number of students who are more behind compared with the other schools.

## school and more behind
##label <- c('Boxplot', 'Dotplot','Violine','Column Plot')

m <- ggplot(school_df,aes(School, More_Behind) ) + geom_boxplot(color='green', fill='blue') + theme_classic()
n <- ggplot(school_df,aes(School, More_Behind) ) + geom_dotplot(color='green', fill='blue') + theme_classic()
o <- ggplot(school_df,aes(School, More_Behind) ) + geom_violin(color='green', fill='blue') + theme_classic()
p <- ggplot(school_df,aes(School, More_Behind) ) + geom_col(color='green', fill='blue') + theme_classic()

plot_grid(m,n,o,p, nrow = 2, ncol = 2, labels = 'AUTO')

## `stat_bindot()` using `bins = 30`. Pick better value with `binwidth`.

School and Very Behind

School A has the highest number of students who are very behind, including an exceptional count of 24 that can be considered an outlier relative to the other schools.

## school and very behind

m <- ggplot(school_df,aes(School, Very_Behind) ) + geom_boxplot(color='green', fill='blue') + theme_classic()
n <- ggplot(school_df,aes(School, Very_Behind) ) + geom_dotplot(color='green', fill='blue') + theme_classic()
o <- ggplot(school_df,aes(School, Very_Behind) ) + geom_violin(color='green', fill='blue') + theme_classic()
p <- ggplot(school_df,aes(School, Very_Behind) ) + geom_col(color='green', fill='blue') + theme_classic()

plot_grid(m,n,o,p, nrow = 2, ncol = 2, labels = 'AUTO')

## `stat_bindot()` using `bins = 30`. Pick better value with `binwidth`.

School and Completed

School E has the highest number of students who completed the course.

## school and completed
m <- ggplot(school_df,aes(School, Completed) ) + geom_boxplot(color='green', fill='blue') + theme_classic()
n <- ggplot(school_df,aes(School, Completed) ) + geom_dotplot(color='green', fill='blue') + theme_classic()
o <- ggplot(school_df,aes(School, Completed) ) + geom_violin(color='green', fill='blue') + theme_classic()
p <- ggplot(school_df,aes(School, Completed) ) + geom_col(color='green', fill='blue') + theme_classic()

plot_grid(m,n,o,p, nrow = 2, ncol = 2, labels = 'AUTO')

## `stat_bindot()` using `bins = 30`. Pick better value with `binwidth`.

Correlation Plot

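## Quick correlation plots of the dataset
## Columns 1 and 3 are dropped; given the variable order above these are School and Very_Ahead
library(psych)  # provides pairs.panels()
pairs.panels(school_df[,-c(1,3)], gap=0)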

RESULTS

Middling and Behind

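The number of middling students and the number of students behind are positively correlated across sections (r = 0.41), and with a p-value of 0.025, less than 0.05, the correlation is statistically significant. Sections with more middling students also tend to have more students behind. Because this is a section-level association, it suggests but does not establish that a middling student might eventually fall behind.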

## people middling and people behind
cor.test(school_df$Middling, school_df$Behind)

## 
##  Pearson's product-moment correlation
## 
## data:  school_df$Middling and school_df$Behind
## t = 2.364, df = 28, p-value = 0.02525
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.05582621 0.66974183
## sample estimates:
##       cor 
## 0.4078917
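
The figures in this output are tied together by standard formulas for Pearson's correlation: the t statistic follows from r and the degrees of freedom, and the confidence interval follows from the Fisher z transformation. The sketch below recomputes them from the reported r and the 30 sections; the variable names are chosen here for illustration.

```r
r  <- 0.4078917   # sample correlation reported in the output
n  <- 30          # number of sections (observations)
df <- n - 2       # degrees of freedom of the t test

# t statistic and two-sided p-value for H0: true correlation = 0
t_stat  <- r * sqrt(df) / sqrt(1 - r^2)
p_value <- 2 * pt(-abs(t_stat), df)

# 95% confidence interval via the Fisher z transformation
z  <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

round(c(t = t_stat, p = p_value, lower = ci[1], upper = ci[2]), 4)
```

The recomputed values should agree with the t, p-value and interval printed by `cor.test()` up to rounding, and the same check can be applied to the other correlation tests in this section.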

Middling and More Behind

Middling and More_Behind are positively correlated (r = 0.21), but with a p-value of 0.27, greater than 0.05, the correlation is not statistically significant. The data do not provide evidence that middling students go on to fall more behind.

## middling and more behind
cor.test(school_df$Middling, school_df$More_Behind)

## 
##  Pearson's product-moment correlation
## 
## data:  school_df$Middling and school_df$More_Behind
## t = 1.1339, df = 28, p-value = 0.2664
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.1630434  0.5298085
## sample estimates:
##       cor 
## 0.2095337

Middling and Very Behind

Middling and Very_Behind are positively correlated (r = 0.31), but with a p-value of 0.094, greater than 0.05, the correlation is not statistically significant.

## middling and very behind
cor.test(school_df$Middling, school_df$Very_Behind)

## 
##  Pearson's product-moment correlation
## 
## data:  school_df$Middling and school_df$Very_Behind
## t = 1.7336, df = 28, p-value = 0.09398
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.05510577  0.60387806
## sample estimates:
##       cor 
## 0.3113445

Middling and Completed

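Middling and Completed are positively correlated (r = 0.42), and with a p-value of 0.022, less than 0.05, the correlation is statistically significant. Sections with more middling students also tend to have more students who completed the course.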

## middling and completed
cor.test(school_df$Middling, school_df$Completed)

## 
##  Pearson's product-moment correlation
## 
## data:  school_df$Middling and school_df$Completed
## t = 2.4251, df = 28, p-value = 0.02201
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.06631535 0.67550686
## sample estimates:
##       cor 
## 0.4166307

Behind and More Behind

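With a p-value of 0.0013, less than 0.05, the correlation between Behind and More_Behind is statistically significant and moderately strong (r = 0.56). Sections with more students behind also tend to have more students who are more behind, which is consistent with students who fall behind being likely to fall more behind.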

## behind and more behind
cor.test(school_df$Behind, school_df$More_Behind)

## 
##  Pearson's product-moment correlation
## 
## data:  school_df$Behind and school_df$More_Behind
## t = 3.5667, df = 28, p-value = 0.001325
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.2487501 0.7651287
## sample estimates:
##       cor 
## 0.5589297

Behind and Very Behind

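With a p-value of 8.6e-06, far below 0.05, the correlation between Behind and Very_Behind is statistically significant and strong (r = 0.72). Sections with more students behind also tend to have more students who are very behind, which is consistent with students who fall behind being likely to fall very behind.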

## behind and very behind
cor.test(school_df$Behind, school_df$Very_Behind)

## 
##  Pearson's product-moment correlation
## 
## data:  school_df$Behind and school_df$Very_Behind
## t = 5.4284, df = 28, p-value = 8.609e-06
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.4795168 0.8556158
## sample estimates:
##       cor 
## 0.7160796

More Behind and Very Behind

With a p-value of 7.2e-05, far below 0.05, the correlation between More_Behind and Very_Behind is statistically significant and fairly strong (r = 0.66). Sections with more students who are more behind also tend to have more students who are very behind.

## more behind and very behind
cor.test(school_df$More_Behind, school_df$Very_Behind)

## 
##  Pearson's product-moment correlation
## 
## data:  school_df$More_Behind and school_df$Very_Behind
## t = 4.6514, df = 28, p-value = 7.191e-05
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.3935567 0.8243978
## sample estimates:
##       cor 
## 0.6602163

Behind and Completed

With a p-value of 0.19, greater than 0.05, the weak positive correlation between Behind and Completed (r = 0.24) is not statistically significant. The data show no clear relationship between the number of students behind and the number who completed the course.

## behind and completed
cor.test(school_df$Behind, school_df$Completed)

## 
##  Pearson's product-moment correlation
## 
## data:  school_df$Behind and school_df$Completed
## t = 1.3333, df = 28, p-value = 0.1932
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.1271213  0.5556912
## sample estimates:
##       cor 
## 0.2443381

CONCLUSION

School A has the highest number of students who are more behind and very behind when compared with the other schools, but it also has the highest number of recorded sections. School E has only one recorded section, and that section has the highest number of students who are behind. Schools D and E each have one recorded section, School C has 3, School B has 12 and School A has 13. No student in any school is ahead of their section. The schools need to pay attention to middling students and encourage them, or start a program that motivates them to work ahead of their section. Sections with more students behind also tend to have more students who are more behind and very behind, which suggests that students who start to fall behind are likely to fall further behind, and this will hinder them from completing the course.

References
Lefkowitz, Mike. Why is math so important? (blog.mindresearch.org)
http://itdl.org/Journal/Dec_09/article03.htm