The data set is Students Performance in Exams from Kaggle. It contains the math, reading, and writing scores of high school students in the United States, together with each student's gender, race/ethnicity, parental level of education, lunch type (standard or free/reduced), and test preparation course status.
'data.frame': 1000 obs. of 8 variables:
$ gender : chr "female" "female" "female" "male" ...
$ race.ethnicity : chr "group B" "group C" "group B" "group A" ...
$ parental.level.of.education: chr "bachelor's degree" "some college" "master's degree" "associate's degree" ...
$ lunch : chr "standard" "standard" "standard" "free/reduced" ...
$ test.preparation.course : chr "none" "completed" "none" "none" ...
$ math.score : int 72 69 90 47 76 71 88 40 64 38 ...
$ reading.score : int 72 90 95 57 78 83 95 43 64 60 ...
$ writing.score : int 74 88 93 44 75 78 92 39 67 50 ...
The str() function shows the structure of the data set: the type of each column (character, integer, or numeric). The column type determines which plots are appropriate, such as histograms, boxplots, and scatterplots.
[1] "gender" "race.ethnicity"
[3] "parental.level.of.education" "lunch"
[5] "test.preparation.course" "math.score"
[7] "reading.score" "writing.score"
The names() function returns the column names of the chosen data set.
gender race.ethnicity parental.level.of.education
Length:1000 Length:1000 Length:1000
Class :character Class :character Class :character
Mode :character Mode :character Mode :character
lunch test.preparation.course math.score reading.score
Length:1000 Length:1000 Min. : 0.00 Min. : 17.00
Class :character Class :character 1st Qu.: 57.00 1st Qu.: 59.00
Mode :character Mode :character Median : 66.00 Median : 70.00
Mean : 66.09 Mean : 69.17
3rd Qu.: 77.00 3rd Qu.: 79.00
Max. :100.00 Max. :100.00
writing.score
Min. : 10.00
1st Qu.: 57.75
Median : 69.00
Mean : 68.05
3rd Qu.: 79.00
Max. :100.00
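The summary() function returns descriptive statistics for each numeric column: minimum, 1st quartile, median, mean, 3rd quartile, and maximum. For character columns it reports only the length, class, and storage mode (not the statistical mode). These values are the basis for the box plots in the next section.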
The math scores appear slightly skewed to the left: a few very low scores (outliers) form a long lower tail. A left skew normally pulls the mean below the median, but here the two are almost equal (mean 66.09, median 66).
The reading scores also contain some outliers. The mean is about 69.
The writing scores also contain some outliers. The mean is about 68.
The plot shows that there are more female students (518) than male students (482).
Male students tend to perform better in math than female students.
Most female and male students belong to group C and group D, and very few to group A.
Most female and male students have parents with some college or an associate's degree, and very few have parents with a master's degree.
race.ethnicity n
1 group A 89
2 group B 190
3 group C 319
4 group D 262
5 group E 140
Most students belong to group C and group D, and the fewest belong to group A.
[1] "bachelor's degree" "some college" "master's degree"
[4] "associate's degree" "high school" "some high school"
The plot shows that "some college" is the most common parental level of education, followed by associate's degree; master's degree is the least common.
Reading and writing scores are highly correlated.
Here math score is plotted against writing score, with points colored by parental level of education.
Here math score is plotted against writing score, with points colored by race/ethnicity.
Here math score is plotted against writing score, with points colored by type of lunch.
Here math score is plotted against writing score, with points colored by test preparation course.
Students who completed the test preparation course score higher.
Female students score higher than male students in reading and writing, while male students score higher in math.
Students whose parents hold a master's or bachelor's degree score higher than the others.
Students who received a standard lunch score higher than those who received a free/reduced lunch.
The distribution of math scores appears slightly left-skewed (a long lower tail), although the mean (66.09) and the median (66) are almost equal.
---
title: "Student Performance in Exams"
output:
  flexdashboard::flex_dashboard:
    orientation: columns
    vertical_layout: fill
    social: menu
    theme: united
    storyboard: TRUE
    source_code: embed
---
```{r setup, include=FALSE}
library(flexdashboard)
data=read.csv("C:\\Users\\NADEEM\\OneDrive\\Desktop\\R\\Practicle\\StudentsPerformance.csv")
attach(data)
library(lattice)
library(plotly)
library(ggplot2)
library(tidyverse)
library(corrplot)
library(DT)
```
# Introduction {.tabset}
The data set is Students Performance in Exams from Kaggle. It contains the math, reading, and writing scores of high school students in the United States, together with each student's gender, race/ethnicity, parental level of education, lunch type (standard or free/reduced), and test preparation course status.
## DD{.tabset}
### Structure
```{r}
str(data)
```
The `str()` function shows the structure of the data set: the type of each column (character, integer, or numeric). The column type determines which plots are appropriate, such as histograms, boxplots, and scatterplots.
### Attributes
```{r}
names(data)
```
The `names()` function returns the column names of the chosen data set.
### Summary
```{r}
summary(data)
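```
The `summary()` function returns descriptive statistics for each numeric column: minimum, 1st quartile, median, mean, 3rd quartile, and maximum. For character columns it reports only the length, class, and storage mode (not the statistical mode). These values are the basis for the box plots in the next section.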
# Score
## DD{.tabset}
### Math Score
```{r}
boxplot(data$math.score, col = "Yellow")
hist(data$math.score, col ="pink")
```
The math scores appear slightly skewed to the left: a few very low scores (outliers) form a long lower tail. A left skew normally pulls the mean below the median, but here the two are almost equal (mean 66.09, median 66).
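One way to check skew and outliers numerically rather than by eye is sketched below. The `scores` vector is hypothetical (invented for illustration, not taken from the data set), and the moment-based skewness formula is one common choice among several.

```r
# Hypothetical scores: one very low value, the rest clustered together.
scores <- c(10, 60, 62, 64, 66, 68, 70, 72)

score_mean <- mean(scores)
score_median <- median(scores)

# Moment-based skewness: a negative value means a longer left tail.
centered <- scores - score_mean
skewness <- mean(centered^3) / mean(centered^2)^1.5

# Values beyond 1.5 * IQR from the hinges, i.e. the points that
# boxplot() draws individually as outliers.
outliers <- boxplot.stats(scores)$out
```

With the real data, the same calls can be applied to `data$math.score` to see whether the mean actually falls below the median.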
### Reading Score
```{r}
boxplot(data$reading.score, col = "blue")
hist(data$reading.score, col ="cyan")
```
The reading scores also contain some outliers. The mean is about 69.
### Writing Score
```{r}
boxplot(data$writing.score, col = "maroon")
hist(data$writing.score, col ="green")
```
The writing scores also contain some outliers. The mean is about 68.
# Gender
## DD{.tabset}
### Gender Count
```{r}
# after_stat(count) replaces the deprecated ..count.. notation (ggplot2 >= 3.3.0)
ggplot(data, aes(x = factor(gender),
                 fill = factor(gender))) +
  geom_bar(width = 0.50) +
  geom_text(aes(label = after_stat(count)), stat = "count", vjust = 1.5, colour = "white")
```
The plot shows that there are more female students (518) than male students (482).
### Both Gender Math Score
```{r}
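# Subset by gender so each histogram shows one group only
# (conditioning on gender=="male" would draw both a FALSE and a TRUE panel)
histogram(~math.score, data = subset(data, gender == "male"), col = "green", main = "male")
histogram(~math.score, data = subset(data, gender == "female"), col = "cyan", main = "female")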
```
Male students tend to perform better in math than female students.
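Histograms alone make such a comparison hard to judge, so a group mean is a useful check. The sketch below uses `tapply()` on hypothetical toy scores (invented for illustration only, not taken from the data set); on the real data the same call would be `tapply(data$math.score, data$gender, mean)`.

```r
# Hypothetical toy scores, invented for illustration only.
toy <- data.frame(
  gender = c("female", "female", "female", "male", "male", "male"),
  math.score = c(60, 65, 70, 64, 69, 74)
)

# Mean math score per gender.
math_by_gender <- tapply(toy$math.score, toy$gender, mean)
print(math_by_gender)

# Difference between the group means (male minus female).
math_gap <- math_by_gender[["male"]] - math_by_gender[["female"]]
```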
### Groupwise Math Score
```{r}
# Rows are genders and columns are scores, so the two colors
# and the legend both refer to gender
counts <- table(data$gender, data$math.score)
barplot(counts, main="math score",
        xlab="math score", col=c("darkblue","red"),
        legend = rownames(counts), beside=TRUE)
counts1 <- table(data$gender, data$reading.score)
barplot(counts1, main="reading score",
        xlab="reading score", col=c("darkgreen","yellow"),
        legend = rownames(counts1), beside=TRUE)
counts2 <- table(data$gender, data$writing.score)
barplot(counts2, main="writing score",
        xlab="writing score", col=c("violet","grey"),
        legend = rownames(counts2), beside=TRUE)
```
* Most female and male students belong to group C and group D, and very few to group A.
* Most female and male students have parents with some college or an associate's degree, and very few have parents with a master's degree.
# Race/Ethnicity
## DD{.tabset}
### Count Race
```{r}
data %>% count(race.ethnicity)
# Rows are groups and columns are scores, so the five colors
# and the legend both refer to race/ethnicity
race_cols <- c("darkblue","red","green","yellow","pink")
countsr <- table(data$race.ethnicity, data$math.score)
barplot(countsr, main="math score",
        xlab="math score on race ethnicity", col=race_cols,
        legend = rownames(countsr), beside=TRUE)
countsr1 <- table(data$race.ethnicity, data$reading.score)
barplot(countsr1, main="reading score",
        xlab="reading score on race ethnicity", col=race_cols,
        legend = rownames(countsr1), beside=TRUE)
countsr2 <- table(data$race.ethnicity, data$writing.score)
barplot(countsr2, main="writing score",
        xlab="writing score on race ethnicity", col=race_cols,
        legend = rownames(countsr2), beside=TRUE)
```
### BoxPlot on Race
```{r}
# One box per race/ethnicity group; the y-axis label now matches the plotted score
boxplot(math.score~race.ethnicity,ylab = "math score",xlab="race ethnicity group",main = "Math score with respect to race/ethnicity",col=rainbow(length(unique(race.ethnicity))))
boxplot(reading.score~race.ethnicity,ylab = "reading score",xlab="race ethnicity group",main = "Reading score with respect to race/ethnicity",col=rainbow(length(unique(race.ethnicity))))
boxplot(writing.score~race.ethnicity,ylab = "writing score",xlab="race ethnicity group",main = "Writing score with respect to race/ethnicity",col=rainbow(length(unique(race.ethnicity))))
```
Most students belong to group C and group D, and the fewest belong to group A.
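The group sizes can also be expressed as shares of the whole. The sketch below starts from the counts returned by `count(race.ethnicity)` (89, 190, 319, 262, 140) rather than from the raw file; `race_counts` and `race_share` are names chosen for this example.

```r
# Group sizes as reported by count(race.ethnicity).
race_counts <- c("group A" = 89, "group B" = 190, "group C" = 319,
                 "group D" = 262, "group E" = 140)

# Percentage of students in each group, rounded to one decimal place.
race_share <- round(100 * race_counts / sum(race_counts), 1)

# Largest groups first.
print(sort(race_share, decreasing = TRUE))
largest_group <- names(which.max(race_counts))
smallest_group <- names(which.min(race_counts))
```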
# Parental level of education
## DD{.tabset}
### Parent Education Type
```{r}
unique(data$parental.level.of.education)
```
### Parent Education
```{r}
# geom_bar() takes width, not height; after_stat(count) replaces the deprecated ..count..
ggplot(data, aes(x = factor(parental.level.of.education), fill = factor(parental.level.of.education))) +
  geom_bar(width = 0.50) +
  geom_text(aes(label = after_stat(count)), stat = "count", vjust = 1.5, colour = "white")
```
The plot shows that "some college" is the most common parental level of education, followed by associate's degree; master's degree is the least common.
# Relation
## DD{.tabset}
### Correlation
```{r}
a=data[c(6,7,8)]
corrplot(cor(a),method="number")
```
Reading and writing scores are highly correlated.
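The numbers that `corrplot()` displays come from `cor()`. The sketch below computes the same kind of matrix on just the first ten rows printed by `str(data)`, so the values are not the full-data correlations; `scores10` is a name chosen for this example.

```r
# First ten rows of the three score columns, copied from the str(data) output.
scores10 <- data.frame(
  math.score = c(72, 69, 90, 47, 76, 71, 88, 40, 64, 38),
  reading.score = c(72, 90, 95, 57, 78, 83, 95, 43, 64, 60),
  writing.score = c(74, 88, 93, 44, 75, 78, 92, 39, 67, 50)
)

# Pearson correlation between every pair of score columns.
cor_matrix <- cor(scores10)
print(round(cor_matrix, 2))

# Pull out a single pair by row and column name.
reading_writing_r <- cor_matrix["reading.score", "writing.score"]
```

Selecting the columns by name, as done here, is less fragile than `data[c(6,7,8)]` if the column order ever changes.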
### Math and Writing Parental Education
```{r}
plot(x=math.score,y=writing.score,xlab = "Math Score",ylab = "Writing Score",
     main = "Math Score and Writing Score with Parental Level of Education",
     # index the palette by group so each point gets its group's color
     col=rainbow(length(unique(parental.level.of.education)))[factor(parental.level.of.education)])
```
Here math score is plotted against writing score, with points colored by parental level of education.
### Math Score with Race/Ethnicity
```{r}
plot(x=math.score,y=writing.score,xlab = "Math Score",ylab = "Writing Score",
     main = "Math Score and Writing Score with Race/Ethnicity",
     # index the palette by group so each point gets its group's color
     col=rainbow(length(unique(race.ethnicity)))[factor(race.ethnicity)])
```
Here math score is plotted against writing score, with points colored by race/ethnicity.
### Math Score with Lunch
```{r}
plot(x=math.score,y=writing.score,xlab = "Math Score",ylab = "Writing Score",
     main = "Math Score and Writing Score with Lunch",
     # index the palette by group so each point gets its group's color
     col=rainbow(length(unique(lunch)))[factor(lunch)])
```
Here math score is plotted against writing score, with points colored by type of lunch.
### Math Score with Test Preparation
```{r}
plot(x=math.score,y=writing.score,xlab = "Math Score",ylab = "Writing Score",
     main = "Math Score and Writing Score with Test Preparation",
     # index the palette by group so each point gets its group's color
     col=rainbow(length(unique(test.preparation.course)))[factor(test.preparation.course)])
```
Here math score is plotted against writing score, with points colored by test preparation course.
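For a scatterplot to show a grouping variable, each point needs the color of its own group. The sketch below shows one way to do that in base R, using the first four rows printed by `str(data)`; `toy`, `lunch_group`, `level_cols`, and `point_cols` are names chosen for this example.

```r
# First four rows of three columns, copied from the str(data) output.
toy <- data.frame(
  lunch = c("standard", "standard", "standard", "free/reduced"),
  math.score = c(72, 69, 90, 47),
  writing.score = c(74, 88, 93, 44)
)

# One color per level, then one color per point via the factor's integer codes.
lunch_group <- factor(toy$lunch)
level_cols <- rainbow(nlevels(lunch_group))
point_cols <- level_cols[as.integer(lunch_group)]

# Scatterplot with a legend that maps each color back to its group.
plot(toy$math.score, toy$writing.score, col = point_cols, pch = 19,
     xlab = "Math Score", ylab = "Writing Score")
legend("topleft", legend = levels(lunch_group), col = level_cols, pch = 19)
```

Passing only `rainbow(n)` as `col` would recycle the colors by row position, which is why the palette is indexed by the factor here.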
# Inference
* Students who completed the test preparation course score higher.
* Female students score higher than male students in reading and writing, while male students score higher in math.
* Students whose parents hold a master's or bachelor's degree score higher than the others.
* Students who received a standard lunch score higher than those who received a free/reduced lunch.
* The distribution of math scores appears slightly left-skewed (a long lower tail), although the mean (66.09) and the median (66) are almost equal.
# Download
```{r}
datatable(data,extensions='Buttons',options=list(dom="Bftrip",buttons=c('copy','print','csv','pdf')))
```