This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button, a document will be generated that includes both the content and the output of any embedded R code chunks. You can embed an R code chunk like this:
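# Path to the semicolon-separated student data file
student <- '/Users/manojmaganti/Downloads/SAI TEJA/DATA SETS/00001/student.csv'
# Read the file into a data frame; the first row holds the column names
sai <- read.csv(student, sep = ";", header = TRUE)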
class(sai)
## [1] "data.frame"
str(sai)
## 'data.frame': 4424 obs. of 37 variables:
## $ Marital.status : int 1 1 1 1 2 2 1 1 1 1 ...
## $ Application.mode : int 17 15 1 17 39 39 1 18 1 1 ...
## $ Application.order : int 5 1 5 2 1 1 1 4 3 1 ...
## $ Course : int 171 9254 9070 9773 8014 9991 9500 9254 9238 9238 ...
## $ Daytime.evening.attendance. : int 1 1 1 1 0 0 1 1 1 1 ...
## $ Previous.qualification : int 1 1 1 1 1 19 1 1 1 1 ...
## $ Previous.qualification..grade. : num 122 160 122 122 100 ...
## $ Nacionality : int 1 1 1 1 1 1 1 1 62 1 ...
## $ Mother.s.qualification : int 19 1 37 38 37 37 19 37 1 1 ...
## $ Father.s.qualification : int 12 3 37 37 38 37 38 37 1 19 ...
## $ Mother.s.occupation : int 5 3 9 5 9 9 7 9 9 4 ...
## $ Father.s.occupation : int 9 3 9 3 9 7 10 9 9 7 ...
## $ Admission.grade : num 127 142 125 120 142 ...
## $ Displaced : int 1 1 1 1 0 0 1 1 0 1 ...
## $ Educational.special.needs : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Debtor : int 0 0 0 0 0 1 0 0 0 1 ...
## $ Tuition.fees.up.to.date : int 1 0 0 1 1 1 1 0 1 0 ...
## $ Gender : int 1 1 1 0 0 1 0 1 0 0 ...
## $ Scholarship.holder : int 0 0 0 0 0 0 1 0 1 0 ...
## $ Age.at.enrollment : int 20 19 19 20 45 50 18 22 21 18 ...
## $ International : int 0 0 0 0 0 0 0 0 1 0 ...
## $ Curricular.units.1st.sem..credited. : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Curricular.units.1st.sem..enrolled. : int 0 6 6 6 6 5 7 5 6 6 ...
## $ Curricular.units.1st.sem..evaluations. : int 0 6 0 8 9 10 9 5 8 9 ...
## $ Curricular.units.1st.sem..approved. : int 0 6 0 6 5 5 7 0 6 5 ...
## $ Curricular.units.1st.sem..grade. : num 0 14 0 13.4 12.3 ...
## $ Curricular.units.1st.sem..without.evaluations.: int 0 0 0 0 0 0 0 0 0 0 ...
## $ Curricular.units.2nd.sem..credited. : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Curricular.units.2nd.sem..enrolled. : int 0 6 6 6 6 5 8 5 6 6 ...
## $ Curricular.units.2nd.sem..evaluations. : int 0 6 0 10 6 17 8 5 7 14 ...
## $ Curricular.units.2nd.sem..approved. : int 0 6 0 5 6 5 8 0 6 2 ...
## $ Curricular.units.2nd.sem..grade. : num 0 13.7 0 12.4 13 ...
## $ Curricular.units.2nd.sem..without.evaluations.: int 0 0 0 0 0 5 0 0 0 0 ...
## $ Unemployment.rate : num 10.8 13.9 10.8 9.4 13.9 16.2 15.5 15.5 16.2 8.9 ...
## $ Inflation.rate : num 1.4 -0.3 1.4 -0.8 -0.3 0.3 2.8 2.8 0.3 1.4 ...
## $ GDP : num 1.74 0.79 1.74 -3.12 0.79 -0.92 -4.06 -4.06 -0.92 3.51 ...
## $ Target : Factor w/ 3 levels "Dropout","Enrolled",..: 1 3 1 3 3 3 3 1 3 1 ...
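The `str()` output shows that most columns are stored as `int`, even though several of them (for example `Marital.status`, `Gender`, `Course`) look like category codes rather than quantities. The following is a minimal sketch of recoding such columns as factors. It uses a small made-up data frame, `toy`, in place of `sai` so that it runs on its own. Which columns are truly categorical is an assumption here and should be checked against the dataset's documentation.

```r
# Toy stand-in for `sai`; the values are illustrative, not taken from student.csv.
toy <- data.frame(
  Marital.status  = c(1L, 1L, 2L, 1L),
  Gender          = c(1L, 0L, 0L, 1L),
  Admission.grade = c(127.3, 142.5, 124.8, 119.6)
)

# Columns assumed (hypothetically) to hold category codes rather than measurements.
code_cols <- c("Marital.status", "Gender")

# Convert each coded column to a factor; numeric measurements are left alone.
toy[code_cols] <- lapply(toy[code_cols], factor)

str(toy)
summary(toy)  # factors are summarised as counts per level instead of means
```

After this kind of recoding, `summary()` reports how many students fall into each category, which is usually easier to interpret than the mean of a code.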
# Summarise the first ten columns (named first_ten to avoid masking base R's numeric())
first_ten <- sai[, 1:10]
summary(first_ten)
## Marital.status Application.mode Application.order Course
## Min. :1.000 Min. : 1.00 Min. :0.000 Min. : 33
## 1st Qu.:1.000 1st Qu.: 1.00 1st Qu.:1.000 1st Qu.:9085
## Median :1.000 Median :17.00 Median :1.000 Median :9238
## Mean :1.179 Mean :18.67 Mean :1.728 Mean :8857
## 3rd Qu.:1.000 3rd Qu.:39.00 3rd Qu.:2.000 3rd Qu.:9556
## Max. :6.000 Max. :57.00 Max. :9.000 Max. :9991
## Daytime.evening.attendance. Previous.qualification
## Min. :0.0000 Min. : 1.000
## 1st Qu.:1.0000 1st Qu.: 1.000
## Median :1.0000 Median : 1.000
## Mean :0.8908 Mean : 4.578
## 3rd Qu.:1.0000 3rd Qu.: 1.000
## Max. :1.0000 Max. :43.000
## Previous.qualification..grade. Nacionality Mother.s.qualification
## Min. : 95.0 Min. : 1.000 Min. : 1.00
## 1st Qu.:125.0 1st Qu.: 1.000 1st Qu.: 2.00
## Median :133.1 Median : 1.000 Median :19.00
## Mean :132.6 Mean : 1.873 Mean :19.56
## 3rd Qu.:140.0 3rd Qu.: 1.000 3rd Qu.:37.00
## Max. :190.0 Max. :109.000 Max. :44.00
## Father.s.qualification
## Min. : 1.00
## 1st Qu.: 3.00
## Median :19.00
## Mean :22.28
## 3rd Qu.:37.00
## Max. :44.00
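# Standard deviation of the daytime/evening attendance indicator
standard_deviation <- sd(sai$Daytime.evening.attendance., na.rm = TRUE)
# Variance of the previous qualification grade
variation <- var(sai$Previous.qualification..grade., na.rm = TRUE)
# Sum of the application order values
summ <- sum(sai$Application.order, na.rm = TRUE)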
print(standard_deviation)
## [1] 0.3118967
print(variation)
## [1] 173.9321
print(summ)
## [1] 7644
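The same three statistics can be computed for several columns in one pass rather than one variable at a time. The sketch below assumes a small made-up data frame, `toy`, whose column names mirror `sai` but whose values are illustrative only. The helper name `stats` is also hypothetical.

```r
# Toy stand-in for `sai`; the values are illustrative, not taken from student.csv.
toy <- data.frame(
  Admission.grade   = c(127.3, 142.5, 124.8, 119.6),
  Age.at.enrollment = c(20L, 19L, 19L, 45L),
  Application.order = c(5L, 1L, 5L, 2L)
)

# For each column, return its standard deviation, variance, and sum.
# sapply() collects the results into a matrix with one column per variable.
stats <- sapply(toy, function(x) c(
  sd  = sd(x, na.rm = TRUE),
  var = var(x, na.rm = TRUE),
  sum = sum(x, na.rm = TRUE)
))

print(round(stats, 2))
```

Keeping the results in one matrix makes it easy to compare the spread of different variables side by side.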
plot(sai$Mother.s.qualification, sai$Father.s.qualification, main="Scatter Plot of Mother's vs. Father's Qualification", xlab="Mother's qualification (code)", ylab="Father's qualification (code)")
hist(sai$Previous.qualification..grade., main="Histogram of Previous Qualification Grades", xlab="Previous qualification grade")
pie(table(sai$Application.order), main="Pie Chart of Application Order")
boxplot(sai$Admission.grade, main="Box Plot of Admission Grades", ylab="Admission grade")
This dataset concerns student dropout and academic failure in higher education. Each observation is one student. The data include socioeconomic factors, such as the mother's and father's occupations, and academic factors, such as the parents' education and admission grades.
The goals are to identify the major factors behind student dropout and to develop solutions that reduce the risk at an early stage of students' academic paths.
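One simple first step toward identifying such factors is to compare the dropout rate across groups of students. The sketch below assumes a made-up data frame, `toy`, with a `Scholarship.holder` indicator and a `Target` column. The values are illustrative only, and only the `"Dropout"` and `"Enrolled"` levels visible in the `str()` output are used.

```r
# Toy stand-in for `sai`; the values are illustrative, not taken from student.csv.
toy <- data.frame(
  Scholarship.holder = c(0L, 0L, 1L, 1L, 0L, 1L),
  Target = c("Dropout", "Enrolled", "Enrolled", "Enrolled", "Dropout", "Dropout")
)

# TRUE for students whose outcome is "Dropout", FALSE otherwise.
is_dropout <- toy$Target == "Dropout"

# Share of dropouts within each scholarship group (mean of a logical = proportion).
dropout_rate <- tapply(is_dropout, toy$Scholarship.holder, mean)

print(round(dropout_rate, 2))
```

A large gap between groups would only suggest that a variable deserves a closer look. It would not by itself show that the variable causes dropout.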