R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button, a document is generated that includes both the content and the output of any embedded R code chunks. You can embed an R code chunk like this:

# Path to the semicolon-delimited CSV file with one row per student
student <- '/Users/manojmaganti/Downloads/SAI TEJA/DATA SETS/00001/student.csv'
sai <- read.csv(student, sep = ";", header = TRUE)
class(sai)
## [1] "data.frame"
str(sai)
## 'data.frame':    4424 obs. of  37 variables:
##  $ Marital.status                                : int  1 1 1 1 2 2 1 1 1 1 ...
##  $ Application.mode                              : int  17 15 1 17 39 39 1 18 1 1 ...
##  $ Application.order                             : int  5 1 5 2 1 1 1 4 3 1 ...
##  $ Course                                        : int  171 9254 9070 9773 8014 9991 9500 9254 9238 9238 ...
##  $ Daytime.evening.attendance.                   : int  1 1 1 1 0 0 1 1 1 1 ...
##  $ Previous.qualification                        : int  1 1 1 1 1 19 1 1 1 1 ...
##  $ Previous.qualification..grade.                : num  122 160 122 122 100 ...
##  $ Nacionality                                   : int  1 1 1 1 1 1 1 1 62 1 ...
##  $ Mother.s.qualification                        : int  19 1 37 38 37 37 19 37 1 1 ...
##  $ Father.s.qualification                        : int  12 3 37 37 38 37 38 37 1 19 ...
##  $ Mother.s.occupation                           : int  5 3 9 5 9 9 7 9 9 4 ...
##  $ Father.s.occupation                           : int  9 3 9 3 9 7 10 9 9 7 ...
##  $ Admission.grade                               : num  127 142 125 120 142 ...
##  $ Displaced                                     : int  1 1 1 1 0 0 1 1 0 1 ...
##  $ Educational.special.needs                     : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Debtor                                        : int  0 0 0 0 0 1 0 0 0 1 ...
##  $ Tuition.fees.up.to.date                       : int  1 0 0 1 1 1 1 0 1 0 ...
##  $ Gender                                        : int  1 1 1 0 0 1 0 1 0 0 ...
##  $ Scholarship.holder                            : int  0 0 0 0 0 0 1 0 1 0 ...
##  $ Age.at.enrollment                             : int  20 19 19 20 45 50 18 22 21 18 ...
##  $ International                                 : int  0 0 0 0 0 0 0 0 1 0 ...
##  $ Curricular.units.1st.sem..credited.           : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Curricular.units.1st.sem..enrolled.           : int  0 6 6 6 6 5 7 5 6 6 ...
##  $ Curricular.units.1st.sem..evaluations.        : int  0 6 0 8 9 10 9 5 8 9 ...
##  $ Curricular.units.1st.sem..approved.           : int  0 6 0 6 5 5 7 0 6 5 ...
##  $ Curricular.units.1st.sem..grade.              : num  0 14 0 13.4 12.3 ...
##  $ Curricular.units.1st.sem..without.evaluations.: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Curricular.units.2nd.sem..credited.           : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Curricular.units.2nd.sem..enrolled.           : int  0 6 6 6 6 5 8 5 6 6 ...
##  $ Curricular.units.2nd.sem..evaluations.        : int  0 6 0 10 6 17 8 5 7 14 ...
##  $ Curricular.units.2nd.sem..approved.           : int  0 6 0 5 6 5 8 0 6 2 ...
##  $ Curricular.units.2nd.sem..grade.              : num  0 13.7 0 12.4 13 ...
##  $ Curricular.units.2nd.sem..without.evaluations.: int  0 0 0 0 0 5 0 0 0 0 ...
##  $ Unemployment.rate                             : num  10.8 13.9 10.8 9.4 13.9 16.2 15.5 15.5 16.2 8.9 ...
##  $ Inflation.rate                                : num  1.4 -0.3 1.4 -0.8 -0.3 0.3 2.8 2.8 0.3 1.4 ...
##  $ GDP                                           : num  1.74 0.79 1.74 -3.12 0.79 -0.92 -4.06 -4.06 -0.92 3.51 ...
##  $ Target                                        : Factor w/ 3 levels "Dropout","Enrolled",..: 1 3 1 3 3 3 3 1 3 1 ...
# Summarise the first ten columns (renamed from `numeric`, which shadows a base R function)
first_ten <- sai[, 1:10]
summary(first_ten)
##  Marital.status  Application.mode Application.order     Course    
##  Min.   :1.000   Min.   : 1.00    Min.   :0.000     Min.   :  33  
##  1st Qu.:1.000   1st Qu.: 1.00    1st Qu.:1.000     1st Qu.:9085  
##  Median :1.000   Median :17.00    Median :1.000     Median :9238  
##  Mean   :1.179   Mean   :18.67    Mean   :1.728     Mean   :8857  
##  3rd Qu.:1.000   3rd Qu.:39.00    3rd Qu.:2.000     3rd Qu.:9556  
##  Max.   :6.000   Max.   :57.00    Max.   :9.000     Max.   :9991  
##  Daytime.evening.attendance. Previous.qualification
##  Min.   :0.0000              Min.   : 1.000        
##  1st Qu.:1.0000              1st Qu.: 1.000        
##  Median :1.0000              Median : 1.000        
##  Mean   :0.8908              Mean   : 4.578        
##  3rd Qu.:1.0000              3rd Qu.: 1.000        
##  Max.   :1.0000              Max.   :43.000        
##  Previous.qualification..grade.  Nacionality      Mother.s.qualification
##  Min.   : 95.0                  Min.   :  1.000   Min.   : 1.00         
##  1st Qu.:125.0                  1st Qu.:  1.000   1st Qu.: 2.00         
##  Median :133.1                  Median :  1.000   Median :19.00         
##  Mean   :132.6                  Mean   :  1.873   Mean   :19.56         
##  3rd Qu.:140.0                  3rd Qu.:  1.000   3rd Qu.:37.00         
##  Max.   :190.0                  Max.   :109.000   Max.   :44.00         
##  Father.s.qualification
##  Min.   : 1.00         
##  1st Qu.: 3.00         
##  Median :19.00         
##  Mean   :22.28         
##  3rd Qu.:37.00         
##  Max.   :44.00
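# Spread of the attendance indicator and of the previous-qualification grades
standard_deviation <- sd(sai$Daytime.evening.attendance., na.rm = TRUE)
variation <- var(sai$Previous.qualification..grade., na.rm = TRUE)
# Total of the application-order values across all students
summ <- sum(sai$Application.order, na.rm = TRUE)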

print(standard_deviation)
## [1] 0.3118967
print(variation)
## [1] 173.9321
print(summ)
## [1] 7644
plot(sai$Mother.s.qualification, sai$Father.s.qualification, main = "Scatter Plot of Mother's vs. Father's Qualification", xlab = "Mother's qualification (code)", ylab = "Father's qualification (code)")

hist(sai$Previous.qualification..grade., main = "Histogram of Previous Qualification Grades", xlab = "Previous qualification grade")

pie(table(sai$Application.order), main="Pie Chart of Application orders")

boxplot(sai$Admission.grade, main = "Box Plot of Admission Grades", ylab = "Admission grade")

Data Documentation

This dataset concerns student dropout and academic failure in higher education. Each observation is one student; the data frame loaded above has 4424 observations of 37 variables. The variables include economic factors, such as the mother's and father's occupation, and academic factors, such as the parents' education and the student's admission grade.
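Several of the variables in the `str()` output above, such as `Gender` and `Debtor`, are stored as integers but appear to be category codes rather than quantities. A minimal sketch of converting such columns to factors follows. The helper name `as_factor_cols` is hypothetical, and the `toy_codes` data frame is invented purely for illustration; it is not taken from `student.csv`.

```r
# Hypothetical helper: convert the named integer-coded columns to factors,
# leaving all other columns unchanged.
as_factor_cols <- function(df, cols) {
  df[cols] <- lapply(df[cols], factor)
  df
}

# Invented toy data, for illustration only (not from student.csv).
toy_codes <- data.frame(
  Gender = c(1L, 0L, 1L),
  Debtor = c(0L, 0L, 1L),
  Admission.grade = c(120, 130, 140)
)

converted <- as_factor_cols(toy_codes, c("Gender", "Debtor"))
str(converted)
```

With the real data, the same call would take `sai` and whichever columns are confirmed to be categorical in the dataset's documentation.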

Goals/Purpose

The goals are to identify the major factors affecting the occurrence of student dropout and to develop solutions that reduce the risk at an early stage of students' academic paths.
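One simple starting point for identifying such factors is to compare the share of each `Target` outcome across the levels of a candidate variable. The sketch below assumes this row-proportion approach; `outcome_share_by` is a hypothetical helper, and `toy_target` is invented data for illustration only. The real `Target` column has three levels, of which only "Dropout" and "Enrolled" are visible in the `str()` output, so the toy data uses just those two.

```r
# Hypothetical helper: proportion of each outcome within each level of a grouping column.
outcome_share_by <- function(df, group_col, target_col = "Target") {
  counts <- table(df[[group_col]], df[[target_col]])
  prop.table(counts, margin = 1)  # row proportions: each row sums to 1
}

# Invented toy data, for illustration only (not from student.csv).
toy_target <- data.frame(
  Scholarship.holder = c(0, 0, 0, 1, 1, 1),
  Target = factor(c("Dropout", "Dropout", "Enrolled",
                    "Enrolled", "Enrolled", "Dropout"))
)

rates <- outcome_share_by(toy_target, "Scholarship.holder")
print(round(rates, 2))
```

On the real data this would be called as `outcome_share_by(sai, "Scholarship.holder")`, or with any other candidate column. A large difference between rows suggests a variable worth examining further, though it does not by itself establish a causal effect.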