---
title: "MAT 143H - Final Project"
author: "Fall 2017 Semester"
output:
  html_document: default
  word_document: default
---

Overview

Question 1: Does a higher high school GPA correlate with a higher first-year college GPA?

Question 2: Does a high SAT score give you a better chance of doing well in the first year of college?

Question 3: Is first-year college GPA higher for men or for women?

Introduction

In this section, you will describe in slightly more detail the question being researched. You may elaborate on why you find the question worth researching. You should describe the original source that your data came from, which often can be found in the dataset’s documentation.

Exploring the Data

Store the information in the R Environment

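# Load the openintro package, which provides the satGPA dataset
library(openintro)
# Copy the dataset into the global environment so it appears in the workspace
satGPA <- satGPA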

Summary of the satGPA

summary(satGPA)
##       sex             SATV            SATM          SATSum     
##  Min.   :1.000   Min.   :24.00   Min.   :29.0   Min.   : 53.0  
##  1st Qu.:1.000   1st Qu.:43.00   1st Qu.:49.0   1st Qu.: 93.0  
##  Median :1.000   Median :49.00   Median :55.0   Median :103.0  
##  Mean   :1.484   Mean   :48.93   Mean   :54.4   Mean   :103.3  
##  3rd Qu.:2.000   3rd Qu.:54.00   3rd Qu.:60.0   3rd Qu.:113.0  
##  Max.   :2.000   Max.   :76.00   Max.   :77.0   Max.   :144.0  
##      HSGPA           FYGPA      
##  Min.   :1.800   Min.   :0.000  
##  1st Qu.:2.800   1st Qu.:1.980  
##  Median :3.200   Median :2.465  
##  Mean   :3.198   Mean   :2.468  
##  3rd Qu.:3.700   3rd Qu.:3.020  
##  Max.   :4.500   Max.   :4.000

Head Function

head(satGPA)
##   sex SATV SATM SATSum HSGPA FYGPA
## 1   1   65   62    127  3.40  3.18
## 2   2   58   64    122  4.00  3.33
## 3   2   56   60    116  3.75  3.25
## 4   1   42   53     95  3.75  2.42
## 5   1   55   52    107  4.00  2.63
## 6   2   55   56    111  4.00  2.91

Tail Function

tail(satGPA)
##      sex SATV SATM SATSum HSGPA FYGPA
## 995    1   49   55    104   3.0  2.42
## 996    2   50   50    100   3.7  2.19
## 997    1   54   54    108   3.3  1.50
## 998    1   56   58    114   3.5  3.17
## 999    1   55   65    120   2.3  1.94
## 1000   1   49   44     93   2.7  2.38

Find the structure in this dataset

str(satGPA)
## 'data.frame':    1000 obs. of  6 variables:
##  $ sex   : int  1 2 2 1 1 2 1 1 2 1 ...
##  $ SATV  : int  65 58 56 42 55 55 57 53 67 41 ...
##  $ SATM  : int  62 64 60 53 52 56 65 62 77 44 ...
##  $ SATSum: int  127 122 116 95 107 111 122 115 144 85 ...
##  $ HSGPA : num  3.4 4 3.75 3.75 4 4 2.8 3.8 4 2.6 ...
##  $ FYGPA : num  3.18 3.33 3.25 2.42 2.63 2.91 2.83 2.51 3.82 2.54 ...
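
Because `str()` reports `sex` as an integer, R treats it as a number, which is why `summary()` shows a mean of 1.484 for it. A minimal sketch of recoding it as a labeled factor follows. It assumes 1 = men and 2 = women, the coding this report uses later when it subsets the data, and it rebuilds only the six rows shown by `head()` so that it runs without the openintro package. The name `sat_head` is hypothetical.

```r
# Six rows copied from the head(satGPA) output above (illustration only).
sat_head <- data.frame(
  sex   = c(1L, 2L, 2L, 1L, 1L, 2L),
  HSGPA = c(3.40, 4.00, 3.75, 3.75, 4.00, 4.00),
  FYGPA = c(3.18, 3.33, 3.25, 2.42, 2.63, 2.91)
)

# Assumption: 1 = men, 2 = women, matching the subsets used later in this report.
sat_head$sex <- factor(sat_head$sex, levels = c(1, 2), labels = c("Men", "Women"))

# summary() now counts each group instead of averaging the codes.
summary(sat_head$sex)
table(sat_head$sex)
```

The same `factor()` call could be applied to the full `satGPA` data frame if the coding assumption holds.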

Analysis

Question 1~Make a scatterplot for each pair of variables

# Scatterplot for research question 2: SAT total vs. first-year GPA
plot(satGPA$SATSum, satGPA$FYGPA)

# Scatterplot for research question 1: high school GPA vs. first-year GPA
plot(satGPA$HSGPA, satGPA$FYGPA)
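
The default plots label the axes with the raw expressions (`satGPA$HSGPA` and so on), which can be hard to read. The sketch below shows one way to add axis labels, a title, and a least-squares trend line. It uses only the six `head()` rows so that it runs on its own; `sat_head` and `trend` are hypothetical names, and the line it draws is not the full-data fit.

```r
# Six rows copied from the head(satGPA) output (illustration only).
sat_head <- data.frame(
  HSGPA = c(3.40, 4.00, 3.75, 3.75, 4.00, 4.00),
  FYGPA = c(3.18, 3.33, 3.25, 2.42, 2.63, 2.91)
)

# Scatterplot with readable axis labels and a title.
plot(FYGPA ~ HSGPA, data = sat_head,
     xlab = "High school GPA",
     ylab = "First-year college GPA",
     main = "First-year GPA vs. high school GPA")

# Overlay the least-squares line to make the trend easier to see.
trend <- lm(FYGPA ~ HSGPA, data = sat_head)
abline(trend, col = "blue")
```

Replacing `sat_head` with `satGPA` would draw the same kind of plot for all 1000 students.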

Question 2~Find the correlation between each pair of variables

# Correlation for research question 2: SAT total vs. first-year GPA
cor(satGPA$SATSum, satGPA$FYGPA)
## [1] 0.460281
# Correlation for research question 1: high school GPA vs. first-year GPA
cor(satGPA$HSGPA, satGPA$FYGPA)
## [1] 0.5433535
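
One way to read these two numbers is to square them: for a straight-line fit on a single predictor, the squared correlation is the share of the variation in first-year GPA that the predictor accounts for. The short sketch below does that arithmetic on the two values printed above; `r_sat`, `r_hsgpa`, and `r2` are hypothetical names.

```r
# Correlations reported above (copied from the output, not recomputed).
r_sat   <- 0.460281   # SATSum vs. FYGPA
r_hsgpa <- 0.5433535  # HSGPA vs. FYGPA

# Squaring r gives the share of variance in FYGPA that a straight-line
# fit on that single predictor accounts for.
r2 <- c(SATSum = r_sat^2, HSGPA = r_hsgpa^2)
round(r2, 3)
```

On these figures SAT total accounts for roughly 21% of the variation in first-year GPA and high school GPA for roughly 30%, so both relationships are moderate rather than strong.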

Question 3~Find the regression line for each pair of variables

# Regression line for research question 2 (here FYGPA is the predictor and SATSum the response)
lm1<- lm(SATSum~FYGPA, data=satGPA)
lm1
## 
## Call:
## lm(formula = SATSum ~ FYGPA, data = satGPA)
## 
## Coefficients:
## (Intercept)        FYGPA  
##      81.421        8.877
# Regression line for research question 1 (here FYGPA is the predictor and HSGPA the response)
lm2<- lm(HSGPA~FYGPA, data=satGPA)
lm2
## 
## Call:
## lm(formula = HSGPA ~ FYGPA, data = satGPA)
## 
## Coefficients:
## (Intercept)        FYGPA  
##      2.2176       0.3973

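Answer: SATSum = 81.421 + 8.877 * FYGPA and HSGPA = 2.2176 + 0.3973 * FYGPA. In both fitted lines FYGPA is the predictor (x), so each slope is the expected change in SAT total or in high school GPA for a one-point increase in first-year GPA.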
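
The research questions ask how well SAT score and high school GPA predict first-year GPA, which corresponds to putting FYGPA on the left of the `~` in the model formula. The sketch below shows that form next to the direction used in this report. It is fit on the six `head()` rows only, so its coefficients are not the full-data results; `sat_head`, the two `fit_` objects, and the student with a SAT total of 110 are all hypothetical.

```r
# Six rows copied from the head(satGPA) output (illustration only).
sat_head <- data.frame(
  SATSum = c(127, 122, 116, 95, 107, 111),
  HSGPA  = c(3.40, 4.00, 3.75, 3.75, 4.00, 4.00),
  FYGPA  = c(3.18, 3.33, 3.25, 2.42, 2.63, 2.91)
)

# Response on the left of ~, predictor on the right.
fit_fygpa_on_sat <- lm(FYGPA ~ SATSum, data = sat_head)  # predicts FYGPA
fit_sat_on_fygpa <- lm(SATSum ~ FYGPA, data = sat_head)  # the report's direction

# The two slopes differ, but their product equals the squared correlation.
slope_product <- coef(fit_fygpa_on_sat)[["SATSum"]] *
  coef(fit_sat_on_fygpa)[["FYGPA"]]
r_squared <- cor(sat_head$SATSum, sat_head$FYGPA)^2
all.equal(slope_product, r_squared)

# Predicted first-year GPA for a hypothetical student with SATSum = 110.
predict(fit_fygpa_on_sat, newdata = data.frame(SATSum = 110))
```

The correlation is the same in either direction, but the slope and intercept are not, so the choice of response variable matters whenever the line is used for prediction.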

Question 4~“Is the GPA better for men or women?”

table(satGPA$sex)
## 
##   1   2 
## 516 484
# Split the data by sex (coded 1 and 2); this report treats 1 as men and 2 as women
SATGPAmen <- subset(satGPA, sex == 1)
SATGPAwomen <- subset(satGPA, sex == 2)

Side-by-side boxplot

boxplot(SATGPAmen$FYGPA, SATGPAwomen$FYGPA,
        names = c("Men", "Women"), ylab = "First-year college GPA")

Answer: The boxplot suggests that women tend to have a higher first-year college GPA than men. The center line of a boxplot marks the median rather than the mean, so the group means are compared with a t-test.

t.test(SATGPAmen$FYGPA, SATGPAwomen$FYGPA)
## 
##  Welch Two Sample t-test
## 
## data:  SATGPAmen$FYGPA and SATGPAwomen$FYGPA
## t = -3.1768, df = 983.3, p-value = 0.001535
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.24026433 -0.05677744
## sample estimates:
## mean of x mean of y 
##  2.396066  2.544587

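The t-test shows a statistically significant difference in mean first-year college GPA between men (2.396) and women (2.545). The p-value is 0.0015, and the 95 percent confidence interval for the difference (men minus women) runs from about -0.240 to -0.057, which does not include 0.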
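
The same Welch test can also be run without building separate subsets, using the formula interface of `t.test()`. The sketch below compares the two calls on the six `head()` rows only, so its numbers are not the full-data results; `sat_head`, `by_formula`, and `by_vectors` are hypothetical names.

```r
# Six rows copied from the head(satGPA) output (illustration only).
sat_head <- data.frame(
  sex   = c(1, 2, 2, 1, 1, 2),
  FYGPA = c(3.18, 3.33, 3.25, 2.42, 2.63, 2.91)
)

# Formula interface: response ~ grouping variable (Welch test by default).
by_formula <- t.test(FYGPA ~ sex, data = sat_head)

# Equivalent two-vector call, in the style used in this report.
by_vectors <- t.test(sat_head$FYGPA[sat_head$sex == 1],
                     sat_head$FYGPA[sat_head$sex == 2])

# Both calls give the same test statistic and confidence interval.
all.equal(unname(by_formula$statistic), unname(by_vectors$statistic))
by_formula$conf.int
```

With the full dataset, `t.test(FYGPA ~ sex, data = satGPA)` should reproduce the output shown above, since group 1 is compared against group 2 in both forms.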

Conclusions

From this dataset, I conclude that both SAT score and high school GPA are correlated with first-year GPA at Dartmouth College. The plots and correlations for both questions show a moderate positive relationship (r = 0.46 for SAT total and r = 0.54 for high school GPA) between a good SAT score or high school GPA and a good first-year college GPA. The t-test also shows a difference between men and women. This does not necessarily mean that women in general breeze through the first year more easily than men, but at this particular college there is a difference.

Limitations

There are some limitations. More variables would have helped, such as how many students took honors classes, which may have affected their college GPA. Some information was also unclear, such as the SAT categories: we do not know whether the scores come from the new SAT or an older version. We also do not know how many classes the students at Dartmouth College took. Some students may have taken more classes and others fewer, which could contribute to some students having a great first semester.


This document was produced as a final project for MAT 143H - Introduction to Statistics (Honors) at North Shore Community College.
The course was led by Professor Billy Jackson.
Student Name: Jon Gjorga Semester: First