---
title: "MAT 143H - Final Project"
author: "Fall 2017 Semester"
output:
  html_document: default
  word_document: default
---
Question 1: Does having a high high school GPA correlate with having a high first-year college GPA?
Question 2: Does having a high SAT score give you a better chance of doing well in the first year of college?
Question 3: Is first-year college GPA higher for men or for women?
library(openintro)
# Copy the packaged dataset into the workspace
satGPA <- satGPA
summary(satGPA)
## sex SATV SATM SATSum
## Min. :1.000 Min. :24.00 Min. :29.0 Min. : 53.0
## 1st Qu.:1.000 1st Qu.:43.00 1st Qu.:49.0 1st Qu.: 93.0
## Median :1.000 Median :49.00 Median :55.0 Median :103.0
## Mean :1.484 Mean :48.93 Mean :54.4 Mean :103.3
## 3rd Qu.:2.000 3rd Qu.:54.00 3rd Qu.:60.0 3rd Qu.:113.0
## Max. :2.000 Max. :76.00 Max. :77.0 Max. :144.0
## HSGPA FYGPA
## Min. :1.800 Min. :0.000
## 1st Qu.:2.800 1st Qu.:1.980
## Median :3.200 Median :2.465
## Mean :3.198 Mean :2.468
## 3rd Qu.:3.700 3rd Qu.:3.020
## Max. :4.500 Max. :4.000
head(satGPA)
## sex SATV SATM SATSum HSGPA FYGPA
## 1 1 65 62 127 3.40 3.18
## 2 2 58 64 122 4.00 3.33
## 3 2 56 60 116 3.75 3.25
## 4 1 42 53 95 3.75 2.42
## 5 1 55 52 107 4.00 2.63
## 6 2 55 56 111 4.00 2.91
tail(satGPA)
## sex SATV SATM SATSum HSGPA FYGPA
## 995 1 49 55 104 3.0 2.42
## 996 2 50 50 100 3.7 2.19
## 997 1 54 54 108 3.3 1.50
## 998 1 56 58 114 3.5 3.17
## 999 1 55 65 120 2.3 1.94
## 1000 1 49 44 93 2.7 2.38
str(satGPA)
## 'data.frame': 1000 obs. of 6 variables:
## $ sex : int 1 2 2 1 1 2 1 1 2 1 ...
## $ SATV : int 65 58 56 42 55 55 57 53 67 41 ...
## $ SATM : int 62 64 60 53 52 56 65 62 77 44 ...
## $ SATSum: int 127 122 116 95 107 111 122 115 144 85 ...
## $ HSGPA : num 3.4 4 3.75 3.75 4 4 2.8 3.8 4 2.6 ...
## $ FYGPA : num 3.18 3.33 3.25 2.42 2.63 2.91 2.83 2.51 3.82 2.54 ...
# Scatterplot for question 2: SAT sum vs. first-year GPA
plot(satGPA$SATSum, satGPA$FYGPA)
# Scatterplot for question 1: high school GPA vs. first-year GPA
plot(satGPA$HSGPA, satGPA$FYGPA)
# Correlation for question 2: SAT sum and first-year GPA
cor(satGPA$SATSum, satGPA$FYGPA)
## [1] 0.460281
# Correlation for question 1: high school GPA and first-year GPA
cor(satGPA$HSGPA, satGPA$FYGPA)
## [1] 0.5433535
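To make the strength of these two correlations easier to compare, the printed values can be squared to get the share of variation in first-year GPA that a straight-line relationship accounts for. This is a minimal sketch in base R that only reuses the two numbers printed above; the variable names `r_sat`, `r_hs`, `r_squared`, and `pct_explained` are made up for this illustration.

```r
# Correlations copied from the cor() output printed in this report
r_sat <- 0.460281    # cor(satGPA$SATSum, satGPA$FYGPA)
r_hs  <- 0.5433535   # cor(satGPA$HSGPA, satGPA$FYGPA)

# Squaring r gives the share of variance in one variable that a
# straight-line relationship with the other accounts for
r_squared <- c(SATSum = r_sat^2, HSGPA = r_hs^2)

# Express as percentages, rounded to one decimal place
pct_explained <- round(100 * r_squared, 1)
pct_explained
```

Under these values, SAT sum accounts for roughly 21% of the variation in first-year GPA and high school GPA for roughly 30%, so either predictor alone leaves most of the variation unexplained.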
# Regression line for question 2: SAT sum modeled as a function of first-year GPA
lm1 <- lm(SATSum ~ FYGPA, data = satGPA)
lm1
##
## Call:
## lm(formula = SATSum ~ FYGPA, data = satGPA)
##
## Coefficients:
## (Intercept) FYGPA
## 81.421 8.877
# Regression line for question 1: high school GPA modeled as a function of first-year GPA
lm2 <- lm(HSGPA ~ FYGPA, data = satGPA)
lm2
##
## Call:
## lm(formula = HSGPA ~ FYGPA, data = satGPA)
##
## Coefficients:
## (Intercept) FYGPA
## 2.2176 0.3973
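Answer: For the SAT model, SATSum = 8.877 × FYGPA + 81.421. For the high school GPA model, HSGPA = 0.3973 × FYGPA + 2.2176. In both equations x is first-year GPA, because FYGPA is on the right-hand side of each formula.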
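Both fitted models put FYGPA on the right-hand side of the formula, so their slopes describe how SAT sum and high school GPA change with first-year GPA. The research questions run the other way, from SAT or high school GPA to first-year GPA. For a simple linear regression the two slopes multiply to r², so the reverse slopes can be recovered from numbers already printed in this report. The sketch below assumes only those printed values; the names `b_fy_on_sat` and `b_fy_on_hs` are invented for this example, and refitting with `FYGPA` as the response would be the direct way to get the same slopes.

```r
# Values copied from the cor() and lm() output printed in this report
r_sat       <- 0.460281   # cor(SATSum, FYGPA)
r_hs        <- 0.5433535  # cor(HSGPA, FYGPA)
b_sat_on_fy <- 8.877      # slope from lm(SATSum ~ FYGPA)
b_hs_on_fy  <- 0.3973     # slope from lm(HSGPA ~ FYGPA)

# In simple linear regression: slope(y ~ x) * slope(x ~ y) = r^2,
# so the slope with FYGPA as the response is r^2 divided by the printed slope
b_fy_on_sat <- r_sat^2 / b_sat_on_fy
b_fy_on_hs  <- r_hs^2 / b_hs_on_fy

round(c(FYGPA_per_SAT_point = b_fy_on_sat,
        FYGPA_per_HSGPA_point = b_fy_on_hs), 4)
```

Read this way, one extra point of SAT sum goes with about 0.024 points of first-year GPA, and one extra point of high school GPA goes with about 0.74 points. Because the inputs are rounded printouts, the last digit should be treated as approximate.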
table(satGPA$sex)
##
## 1 2
## 516 484
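# This report treats sex == 1 as men and sex == 2 as women
SATGPAmen <- subset(satGPA, sex == 1)
SATGPAwomen <- subset(satGPA, sex == 2)
boxplot(SATGPAmen$FYGPA, SATGPAwomen$FYGPA, names = c("Men", "Women"), ylab = "First-year GPA")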
Answer: The boxplot shows that the women have a higher median first-year college GPA than the men do.
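The comparison above depends on reading `sex == 1` as men and `sex == 2` as women. One way to make that reading explicit is to turn the code into a labeled factor and let the formula interface split the groups. The sketch below uses only the six rows shown in the `head()` output, so its numbers illustrate the mechanics and say nothing about the full dataset; `first_six` and `group_medians` are names invented for this example.

```r
# First six rows of satGPA, copied from the head() output in this report
first_six <- data.frame(
  sex   = c(1, 2, 2, 1, 1, 2),
  FYGPA = c(3.18, 3.33, 3.25, 2.42, 2.63, 2.91)
)

# Assumption carried over from the report: 1 = men, 2 = women
first_six$sex <- factor(first_six$sex, levels = c(1, 2),
                        labels = c("men", "women"))

# Group medians: the statistic a boxplot draws as its center line
group_medians <- tapply(first_six$FYGPA, first_six$sex, median)
group_medians

# One labeled box per group via the formula interface
boxplot(FYGPA ~ sex, data = first_six, ylab = "First-year GPA")
```

With a labeled factor, the plot axis and any later tables show "men" and "women" instead of 1 and 2, which makes it harder to mix up the groups.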
t.test(SATGPAmen$FYGPA, SATGPAwomen$FYGPA)
##
## Welch Two Sample t-test
##
## data: SATGPAmen$FYGPA and SATGPAwomen$FYGPA
## t = -3.1768, df = 983.3, p-value = 0.001535
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.24026433 -0.05677744
## sample estimates:
## mean of x mean of y
## 2.396066 2.544587
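The pieces of this output fit together through t = difference / standard error. As a check on how the p-value and confidence interval arise, the sketch below rebuilds them in base R from the printed means, t statistic, and degrees of freedom. It assumes nothing beyond those printed values, and since they are rounded, the results should match the output only approximately.

```r
# Values copied from the t.test() output printed in this report
mean_men   <- 2.396066
mean_women <- 2.544587
t_stat     <- -3.1768
df         <- 983.3

# Difference in sample means (men minus women) and the standard error
# implied by t = difference / SE
diff_means <- mean_men - mean_women
se_diff    <- diff_means / t_stat

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)

# 95% confidence interval: estimate +/- critical value * SE
ci <- diff_means + c(-1, 1) * qt(0.975, df) * se_diff

signif(p_value, 4)
round(ci, 4)
```

The interval lies entirely below zero, which is another way of seeing that the men's mean is lower than the women's at the 5% significance level.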
Surprisingly, the t-test shows a statistically significant difference in first-year college GPA between men and women (p-value = 0.001535): the sample mean is about 2.40 for men and 2.54 for women.
In conclusion, the dataset shows a correlation between both SAT score and high school GPA and first-year GPA at Dartmouth College. The plots for both questions show a moderate positive correlation: students with a higher high school GPA (r = 0.54) or a higher SAT score (r = 0.46) tended to have a higher first-year college GPA. For the third question, women had a higher mean first-year GPA than men. This doesn't necessarily mean that women in general have an easier time getting through the first year than men, but at this particular college there is a difference.
There are some limitations. The dataset could have included more variables, such as how many students took honors classes, which might be associated with a higher college GPA. Some information also wasn't clear, such as the SAT categories: we don't know whether the scores come from the new SAT or the old version. We also don't know how many classes each student at Dartmouth College took. Some students may have taken more classes and some fewer, which could affect how well they did in their first semester.
This document was produced as a final project for MAT 143H - Introduction to Statistics (Honors) at North Shore Community College.
The course was led by Professor Billy Jackson.
Student Name: Jon Gjorga Semester: First