Macalester Students' Satisfaction with the Academic Credit Upper Limit (Responses)

Mon Dec 16 05:12:50 2013

Amy Tran, Adrian Liang Chang, Masataka Yui, Priit Paidla

Math 155 Survey Project

This was our original survey.

Read in the “Raw” Data from Google

myCSVLink = "https://docs.google.com/spreadsheet/pub?key=0AsXterp7HII8dF90VjZTOU1iQl9qV0JXV3RXd291UVE&output=csv"
d = fetchGoogle(myCSVLink)
### Put in NA for empty or blank answers
for (k in 1:length(d)) {
    temp = d[[k]]
    temp[temp %in% c("", " ", "  ", "   ")] = NA
    d[k] = temp
}
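
The loop above only catches blanks of up to three spaces. As a minimal sketch (not part of the original analysis), the same idea can be written with base R's trimws(), which treats a run of spaces of any length as blank. The toy data frame and its column values are made up for illustration.

```r
# Toy data standing in for the Google Forms export (hypothetical values)
toy = data.frame(
  Gender = c("Male", "", "Female", "     "),
  Year   = c("Junior", " ", "Senior", "Senior"),
  stringsAsFactors = FALSE
)

# Replace empty or whitespace-only answers with NA, column by column
for (k in seq_along(toy)) {
  temp = toy[[k]]
  temp[trimws(temp) == ""] = NA
  toy[[k]] = temp
}

colSums(is.na(toy))  # number of blank answers found in each column
```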

Fix the Names

The names of variables generated by Google Form are too verbose.

origNames = names(d)
origNames
##  [1] "Timestamp"                                                                                                                 
##  [2] "How.many.AP.IB.credits.did.you.have.coming.to.Macalester."                                                                 
##  [3] "Which.division.of.study.do.you.most.strongly.affiliate.yourself.with."                                                     
##  [4] "What.is.your.gender."                                                                                                      
##  [5] "How.many.hours.do.you.typically.spend.per.week.working.an.on.or.off.campus.job.during.the.academic.school.year."           
##  [6] "Where.are.you.from."                                                                                                       
##  [7] "What.year.student.are.you.currently.at.Macalester."                                                                        
##  [8] "How.many.hours.do.you.spend.on.extracurricular.activities..varsity.sports..clubs..choir..etc..per.week.."                  
##  [9] "How.many.credits.are.you.currently.taking.this.semester."                                                                  
## [10] "Are.you.interested.in.taking.more.than.18.credits.if.there.s.no.extra.charge..If.so..how.many.credits.do.you.want.to.take."
## [11] "Academically..how.qualified.did.you.feel.coming.to.Macalester."                                                            
## [12] "Would.you.be.interested.to.graduate.in.less.than.4.years."                                                                 
## [13] "Are.you.interested.in.taking.summer.or.J.term.courses.for.credits."                                                        
## [14] "If.you.are.interested.in.taking.more.than.18.credits.per.semester..which.reason.most.likely.applies.to.you."               
## [15] "Do.you.agree.with.the.current.academic.credit.limit...18.credits.per.semester."

As you can see, there are 15 different variables. You're going to rename each of them. Use the following statements to do so:

names(d)[2] = "APIB"
names(d)[3] = "Major"
## fix the bogus natural science level
d[[3]] = as.character(d[[3]])
hoo = grepl("^Natural.+", d[[3]])
d[hoo, 3] = "Natural Sciences"
names(d)[4] = "Gender"

Make sure to change the number in each line and that you choose an appropriate mnemonic short name.

TASK 1

Use this style to complete the change of names, e.g.

names(d)[5] = "Work"
names(d)[6] = "Origin"
names(d)[7] = "Year"
names(d)[8] = "Extracurr"
names(d)[9] = "Currentcreds"
names(d)[10] = "Interest"
names(d)[11] = "Qualified"
names(d)[12] = "Graduate"
names(d)[13] = "Summerjterm"
names(d)[14] = "Reason"
names(d)[15] = "Agree"
# Put the rest of your commands here.
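
As an alternative sketch, several names can be replaced in one assignment by indexing names() with a range. The toy data frame below is hypothetical and only mimics the verbose Google Forms column names.

```r
# Toy data frame with verbose, Google-Forms-style column names (hypothetical)
toy = data.frame(
  Timestamp = 1:2,
  What.is.your.gender. = c("Male", "Female"),
  Where.are.you.from. = c("Minnesota", "Minnesota")
)

# Rename columns 2 and 3 in a single statement
names(toy)[2:3] = c("Gender", "Origin")

names(toy)
```

The one-name-per-line style used above is longer, but it makes it easier to see which number goes with which short name.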

Categorical Variables

Often, the levels produced by Google Forms are too verbose for convenience. After all, they were designed for another purpose: to be informative to a human completing your survey. It's helpful to change the names to be more convenient for display. To do this, construct a vector that tells what should be the new level for each existing level. You need to be careful to get the spelling exactly right. Also, make sure to list every possible level from your form, even if there are some that nobody selected in your survey.
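
To illustrate why every possible level should be listed, here is a minimal base-R sketch on made-up answers. It uses a plain named-vector lookup in place of the plyr function revalue() that this report relies on, so it is an approximation of the technique rather than the report's own code.

```r
# Hypothetical answers; nobody chose "Other" here
gender = c("Male", "Female", "Female", "Male")

# Translation vector: names are the old levels, values are the new ones
newLevels = c("Male" = "M", "Female" = "F", "Other" = "O")

# Step 1: translate each answer through the named vector
short = unname(newLevels[gender])

# Step 2: declare the full set of levels, including unused ones
genderFactor = factor(short, levels = unname(newLevels))

table(genderFactor)  # "O" still appears, with a count of zero
```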

require(plyr) # just need to do once, like require(mosaic)
newLevels = c("Male"="M","Female"="F",
              "Other"="O")
originLevels = c("Minnesota"="MN","Domestic, but other US state"="US",
              "Out of US (international)"="INTL")
yearLevels = c("First Year"="Fir","Sophomore"="Sop",
              "Junior"="Jun","Senior"="Sen")
reasonLevels = c("I would like the option to graduate in less than 4 years"="grad","I want to be on track when I come back from study abroad"="abr","I find the current semester credit limit load too easy"="eas","The current semester credit limit doesn’t allow me to take all the classes I find interesting"="int","I need to take more credits to make up the previous semester(s)"="mak")
graduateLevels = c("Yes, I definitely want to obtain a degree in less than 4 years"="Yes","Maybe, but at least Macalester should give me the option"="Maybe","No, it takes away from my experience at Macalester as a liberal arts college"="No")
summerjtermLevels = c("Yes, I would already have money set aside for summer/J-term courses"="Yes","Maybe, I at least want the option"="Maybe","No, I plan to do other things like internships/other paid jobs or travel"="No")
majorLevels = c("Social sciences"="SS","Natural Sciences"="NS",
              "Fine arts"="FA","Humanities"="H","Don't Know Yet"="DKY")

Now you will assign these new levels to your variable:

d$Gender = revalue(d$Gender, newLevels)
d$Gender = factor(d$Gender, levels = newLevels)
d$Major = revalue(d$Major, majorLevels)
d$Major = factor(d$Major, levels = majorLevels)
d$Origin = revalue(d$Origin, originLevels)
d$Origin = factor(d$Origin, levels = originLevels)
d$Year = revalue(d$Year, yearLevels)
d$Year = factor(d$Year, levels = yearLevels)
d$Reason = revalue(d$Reason, reasonLevels)
d$Reason = factor(d$Reason, levels = reasonLevels)
d$Graduate = revalue(d$Graduate, graduateLevels)
d$Graduate = factor(d$Graduate, levels = graduateLevels)
d$Summerjterm = revalue(d$Summerjterm, summerjtermLevels)
d$Summerjterm = factor(d$Summerjterm, levels = summerjtermLevels)

This involves two commands for each variable. The first changes the names of the levels. The second does something a little more obscure. It makes sure that the full set of possible levels is available for graphics, models, etc.

You may also want to set the reference level explicitly. Note that relevel() returns a new factor rather than modifying d, so assign the result (for example, d$Gender = relevel(d$Gender, ref = "F")) if the new reference level should persist. The calls below only print the releveled factor:

relevel(d$Gender, ref = "F")
##   [1] F F F M F F F M F F M F F F F M F F F F M F F F F M F M F M F M F F F
##  [36] M M M M M O M M F F F M F F M M F F M F F F F F M M M F M F F F F M F
##  [71] F F F M F F F M F M F F F F F M F F F F M F F F M M F M F F F M F M M
## [106] M F F F F F M M M M F F F F F F F M M F F M F M F F F F F F F F M F F
## [141] O M F M M F F M M F M M M M M M M F F F F F F F M M F F F F F M F F F
## [176] F F M M F F F M M F F F F F F F F F F M F M F F M M F F F M
## Levels: F M O
relevel(d$Origin, ref = "MN")
##   [1] INTL US   US   US   US   MN   US   US   US   US   US   US   US   US  
##  [15] US   US   US   MN   US   US   INTL US   US   US   US   INTL US   US  
##  [29] US   US   INTL US   US   US   US   MN   US   US   US   US   US   INTL
##  [43] US   US   US   US   INTL INTL MN   MN   US   US   INTL MN   US   US  
##  [57] US   US   US   US   US   INTL US   MN   US   US   US   US   INTL INTL
##  [71] INTL MN   US   US   US   US   US   INTL US   INTL US   US   US   US  
##  [85] MN   INTL US   INTL US   US   INTL US   INTL INTL INTL MN   US   US  
##  [99] US   INTL US   US   INTL INTL US   INTL <NA> US   US   US   US   MN  
## [113] US   US   US   US   INTL US   MN   US   US   US   MN   US   MN   US  
## [127] INTL MN   MN   MN   US   US   US   US   US   US   US   US   US   INTL
## [141] US   US   US   US   US   US   INTL US   US   US   MN   MN   US   US  
## [155] INTL MN   US   US   US   US   INTL US   US   US   US   MN   US   MN  
## [169] MN   US   US   US   US   US   US   US   US   US   US   INTL US   US  
## [183] US   US   INTL US   US   MN   US   US   US   US   US   MN   US   INTL
## [197] US   INTL US   MN   US   MN   US   US   US  
## Levels: MN US INTL
relevel(d$Year, ref = "Fir")
##   [1] Sen  Sen  Sen  Sen  Sen  Sen  Sen  Sen  Sen  Sen  Sen  Sen  Sen  Sen 
##  [15] Sen  Sen  Sen  Sen  Sen  Sen  Sen  Sen  Sen  Sen  Sen  Sen  Sen  Sen 
##  [29] Sen  Sen  Sen  Sen  Sen  Sen  Jun  Sen  Sen  Sen  Sen  Sen  Sen  Sop 
##  [43] Sop  Sop  Sop  Sop  Sop  Sop  Sop  Sop  Sop  Fir  Sop  Sop  Sop  Sop 
##  [57] Sop  Sop  Sop  Sop  Fir  <NA> Sop  Sop  Sop  Sop  Sen  Fir  Sop  Sop 
##  [71] Sop  Fir  Jun  Fir  Sop  Sop  Sop  Sen  Fir  Jun  Jun  Fir  Sop  Fir 
##  [85] Fir  Sop  Jun  Fir  Sen  Fir  Sop  Sop  Fir  Fir  Fir  Sop  Sop  Sop 
##  [99] Fir  Sop  Sop  Sop  Sop  Sop  Fir  Sop  Fir  Sop  Jun  Sen  Sop  Sop 
## [113] Sop  Fir  Sop  Sop  Sop  Jun  Sop  Sop  Fir  Sop  Sop  Fir  Jun  Fir 
## [127] Jun  Sop  Sop  Sop  Jun  Fir  Sop  Sop  Jun  Fir  Sen  Fir  Sen  Sop 
## [141] Fir  Sop  Sop  Fir  Fir  Fir  Sop  Fir  Fir  Jun  Fir  Fir  Jun  Sop 
## [155] Jun  Sop  Jun  Sop  Fir  Jun  Sop  Jun  Jun  Sop  Sop  Sop  Sop  Jun 
## [169] Fir  Fir  Sop  Sop  Sop  Sop  Sop  Sop  Sop  Jun  Sop  Fir  Sop  Sop 
## [183] Fir  Sop  Sop  Sop  Sop  Jun  Sop  Fir  Sop  Fir  Sen  Fir  Fir  Fir 
## [197] Sop  Fir  Jun  Sen  Sop  Sop  Sop  Jun  Fir 
## Levels: Fir Sop Jun Sen
relevel(d$Reason, ref = "grad")
##   [1] <NA> int  int  int  int  int  abr  int  <NA> int  int  int  <NA> int 
##  [15] <NA> <NA> mak  grad int  <NA> grad int  grad abr  int  int  <NA> int 
##  [29] int  int  <NA> int  int  int  int  <NA> int  abr  <NA> grad int  eas 
##  [43] int  int  <NA> abr  int  int  grad int  int  int  int  int  abr  grad
##  [57] int  int  <NA> mak  int  <NA> <NA> mak  int  int  abr  int  int  mak 
##  [71] int  <NA> abr  <NA> int  <NA> abr  grad int  <NA> int  int  <NA> grad
##  [85] int  grad abr  int  <NA> int  grad int  int  int  int  int  int  <NA>
##  [99] int  int  int  int  int  int  int  int  int  int  int  int  abr  abr 
## [113] <NA> <NA> int  mak  int  eas  mak  int  int  int  int  <NA> <NA> <NA>
## [127] int  int  int  mak  int  mak  int  abr  int  abr  int  abr  <NA> <NA>
## [141] int  int  <NA> <NA> <NA> abr  int  <NA> grad int  abr  abr  int  abr 
## [155] int  int  int  int  int  int  int  <NA> int  int  <NA> mak  <NA> <NA>
## [169] int  <NA> eas  <NA> int  int  abr  <NA> abr  grad grad mak  abr  int 
## [183] abr  mak  int  <NA> int  abr  int  abr  int  abr  grad int  int  int 
## [197] mak  grad <NA> <NA> int  int  abr  int  grad
## Levels: grad abr eas int mak
relevel(d$Graduate, ref = "Yes")
##   [1] No    No    Maybe Yes   No    No    Maybe Maybe No    Maybe No   
##  [12] Maybe Maybe Maybe No    No    No    Maybe Maybe No    Yes   Maybe
##  [23] Maybe Maybe No    Maybe No    Maybe Maybe No    No    Maybe Maybe
##  [34] Maybe No    Maybe No    Maybe Maybe Yes   Maybe No    Maybe No   
##  [45] No    Maybe No    Maybe Yes   No    Maybe No    No    No    No   
##  [56] Yes   Maybe Maybe No    Maybe Maybe No    No    No    No    Maybe
##  [67] Maybe No    No    No    No    No    Yes   No    No    Maybe Maybe
##  [78] Yes   Maybe Maybe No    No    Maybe Maybe No    Yes   No    Maybe
##  [89] No    Maybe Yes   No    Maybe No    No    Maybe Maybe Maybe Maybe
## [100] Maybe No    Maybe No    No    No    No    Yes   No    Maybe Maybe
## [111] Maybe Maybe Maybe No    No    No    Maybe No    Yes   Maybe No   
## [122] Yes   No    Maybe No    Maybe Yes   Maybe Maybe Maybe Maybe No   
## [133] No    No    No    Maybe Maybe Maybe Maybe Maybe No    No    No   
## [144] No    Maybe No    No    No    Yes   Maybe No    No    Maybe Maybe
## [155] Maybe Maybe Maybe No    Maybe Maybe No    No    Maybe No    Maybe
## [166] Maybe Maybe Maybe Maybe No    Maybe Yes   No    No    Maybe No   
## [177] Maybe Maybe Maybe Maybe No    Maybe Maybe No    No    No    Maybe
## [188] Maybe Maybe No    No    No    Maybe No    Maybe No    No    Maybe
## [199] Maybe No    Yes   Maybe No    Maybe Yes  
## Levels: Yes Maybe No
relevel(d$Summerjterm, ref = "Yes")
##   [1] No    Maybe No    <NA>  Maybe No    Maybe No    Maybe <NA>  No   
##  [12] Maybe No    Maybe No    No    <NA>  No    Maybe Maybe <NA>  No   
##  [23] Maybe Maybe Maybe Maybe No    Maybe Maybe Maybe No    <NA>  <NA> 
##  [34] <NA>  <NA>  Maybe No    <NA>  No    Yes   Maybe Yes   Maybe No   
##  [45] Maybe Maybe No    Maybe Maybe No    Maybe Maybe Maybe No    No   
##  [56] Yes   Maybe Maybe Maybe No    Yes   No    No    Yes   Maybe Yes  
##  [67] No    No    No    No    No    Maybe Yes   Yes   Maybe Maybe Maybe
##  [78] Maybe Maybe Maybe Maybe Yes   Maybe Maybe Maybe Maybe Yes   Maybe
##  [89] Maybe Maybe Yes   Maybe Maybe Maybe No    Yes   Yes   Maybe Maybe
## [100] Maybe Maybe Yes   Maybe No    Maybe No    No    No    No    No   
## [111] No    Maybe No    Maybe No    Maybe Yes   Maybe Maybe Maybe Maybe
## [122] Yes   No    Maybe Maybe Maybe No    Maybe Yes   Yes   Maybe Maybe
## [133] Maybe Yes   Maybe No    Maybe Maybe Maybe Maybe No    Maybe Maybe
## [144] Maybe No    Maybe Maybe No    Maybe Maybe Maybe Maybe Maybe Maybe
## [155] No    No    Yes   Yes   Yes   No    No    Maybe Yes   Maybe Yes  
## [166] Maybe Maybe No    Maybe Maybe No    Maybe Maybe Maybe Maybe Maybe
## [177] No    Maybe No    Maybe Yes   Yes   Maybe Yes   No    No    Maybe
## [188] Yes   Maybe Maybe Maybe Yes   Maybe Maybe Yes   Maybe Maybe Yes  
## [199] <NA>  Maybe Maybe Yes   Maybe Maybe Yes  
## Levels: Yes Maybe No
relevel(d$Major, ref = "FA")
##   [1] SS  SS  H   NS  SS  SS  SS  NS  SS  SS  SS  SS  NS  NS  SS  NS  SS 
##  [18] SS  SS  SS  SS  SS  H   NS  FA  SS  SS  SS  SS  H   H   SS  SS  H  
##  [35] SS  SS  SS  SS  SS  NS  SS  NS  NS  NS  SS  NS  NS  NS  H   NS  SS 
##  [52] DKY SS  NS  SS  SS  FA  H   H   NS  DKY NS  NS  SS  NS  SS  SS  SS 
##  [69] SS  NS  SS  H   SS  H   NS  NS  SS  NS  DKY SS  NS  H   H   DKY NS 
##  [86] SS  NS  SS  H   DKY SS  H   SS  SS  SS  NS  SS  SS  NS  SS  H   H  
## [103] SS  SS  SS  SS  SS  SS  SS  H   H   SS  NS  NS  NS  NS  SS  FA  NS 
## [120] SS  NS  NS  H   NS  NS  DKY SS  H   NS  NS  SS  DKY NS  SS  H   NS 
## [137] H   SS  H   SS  DKY H   SS  SS  NS  DKY NS  NS  NS  H   NS  NS  SS 
## [154] DKY NS  NS  SS  NS  DKY H   SS  NS  NS  NS  NS  H   NS  NS  DKY DKY
## [171] SS  NS  SS  NS  NS  SS  SS  SS  SS  NS  SS  NS  SS  SS  SS  SS  NS 
## [188] SS  SS  H   NS  H   NS  H   NS  SS  NS  FA  H   FA  NS  NS  FA  SS 
## [205] SS 
## Levels: FA SS NS H DKY
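
A minimal sketch, on a made-up factor, of the difference between printing a releveled factor and storing it. The values below are hypothetical and not taken from the survey.

```r
# Toy factor with the same level order as Major (hypothetical values)
major = factor(c("SS", "NS", "FA", "H", "NS"),
               levels = c("SS", "NS", "FA", "H", "DKY"))

# Without assignment, relevel() only returns a releveled copy
levels(relevel(major, ref = "FA"))
levels(major)[1]   # the stored factor still starts with "SS"

# With assignment, the new reference level is kept
major = relevel(major, ref = "FA")
levels(major)[1]
```

The first level of a factor is what lm() uses as the reference level, so whether the result is stored determines which group the model coefficients are compared against.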

Ordinal Variables

Many of the survey questions are on a Likert scale. You will want to simplify the names and also to tell R that there is a natural order. For example, the Currentcreds variable in our survey has a natural ordering.

Here's the renaming step:

likertLevels = c("Overqualified"="Over" ,
                 "Qualified"="Qual", 
                 "It's my reach school"="Reach",
                 "Prefer not to say"="Notsay")
d$Qualified = revalue(d$Qualified, likertLevels)
d$Qualified = factor(d$Qualified, ordered=TRUE,levels=likertLevels)
likertLevels = c("< 12 credits"="<12" ,
                 "12-13 credits"="12-13", 
                 "14-15 credits"="14-15",
                 "16-17 credits"="16-17","18 credits"="18","> 18 credits"=">18")
d$Currentcreds = revalue(d$Currentcreds, likertLevels)
d$Currentcreds = factor(d$Currentcreds, ordered=TRUE,levels=likertLevels)
likertLevels = c("0"="0" ,
                 "1-2"="1-2", 
                 "3-4"="3-4",
                 "5+"="5+","Other:"="Other")
d$APIB = revalue(d$APIB, likertLevels)
## The following `from` values were not present in `x`: Other:
d$APIB = factor(d$APIB, ordered=TRUE,levels=likertLevels)
likertLevels = c("No, I'm good with 18 credits upper limit"="Good" ,
                 "Yes, 20-21 credits"="20-21", 
                 "Yes, 22-23 credits"="22-23",
                 "Yes, 24 credits"="24","Yes, >24 credits( Woo good luck~)"=">24")
d$Interest = revalue(d$Interest, likertLevels)
d$Interest = factor(d$Interest, ordered=TRUE,levels=likertLevels)
likertLevels = c("0 hours"="0" ,
                 "1- 4 hours"="1-4", 
                 "5-10 hours"="5-10",
                 "11-15 hours"="11-15","> 15 hours"=">15")
d$Work = revalue(d$Work, likertLevels)
d$Work = factor(d$Work, ordered=TRUE,levels=likertLevels)
likertLevels = c("0 hour (Not doing any activities)"="0" ,
                 "1-5 Hours"="1-5", 
                 "6-10 Hours"="6-10",
                 "11-15 Hours"="11-15","16-20 Hours"="16-20","More than 20 Hours"=">20")
d$Extracurr = revalue(d$Extracurr, likertLevels)
## The following `from` values were not present in `x`: 0 hour (Not doing any
## activities)
d$Extracurr = factor(d$Extracurr, ordered=TRUE,levels=likertLevels)

When you construct the translation (here called likertLevels), make sure to order it in the natural way, from one end to the other.

The factor(..., ordered=TRUE) calls above tell R that each variable is ordered. Check the result:

head(d$Agree)
## [1]  2 -2 -1  0  1  1
head(d$Currentcreds)
## [1] 12-13 12-13 18    16-17 16-17 14-15
## Levels: <12 < 12-13 < 14-15 < 16-17 < 18 < >18
head(d$Qualified)
## [1] Qual Qual Qual Qual Qual Qual
## Levels: Over < Qual < Reach < Notsay
head(d$APIB)
## [1] 0   5+  3-4 5+  5+  5+ 
## Levels: 0 < 1-2 < 3-4 < 5+ < Other
head(d$Interest)
## [1] <NA>  20-21 22-23 <NA>  20-21 20-21
## Levels: Good < 20-21 < 22-23 < 24 < >24
head(d$Work)
## [1] >15   11-15 5-10  0     0     11-15
## Levels: 0 < 1-4 < 5-10 < 11-15 < >15
head(d$Extracurr)
## [1] 1-5   1-5   1-5   16-20 6-10  1-5  
## Levels: 0 < 1-5 < 6-10 < 11-15 < 16-20 < >20
with(d, class(Agree))
## [1] "integer"
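
One practical benefit of an ordered factor is that comparisons between levels become meaningful. A minimal sketch on made-up values, reusing the short level names chosen for Currentcreds:

```r
# Same level order as Currentcreds; the three answers are hypothetical
creditLevels = c("<12", "12-13", "14-15", "16-17", "18", ">18")
creds = factor(c("12-13", "18", "16-17"),
               levels = creditLevels, ordered = TRUE)

# Comparison against a level name uses the declared order
atLeast16 = creds >= "16-17"
atLeast16

# max() is defined for ordered factors
heaviest = as.character(max(creds))
heaviest
```

On an unordered factor the same comparison would return NA with a warning, which is why the order of the translation vector matters.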

Background

As a liberal arts college, Macalester College caps class registration at 18 credits per semester. Students are allowed to participate in research, fellowships, and internships over the January term, but there are no formal classes available during this time except a summer physics program. There are also no summer classes available to students. This survey explores the attitudes and satisfaction of Macalester students with the 18-credit-per-semester registration limit. We want to learn what our peers think of this limit because, in our personal experience, students have said that they are taking 18 credits or want to take more. We hope to gain some insight into how satisfied students are and into their reasons for wanting to raise or keep the current limit. Ultimately, this could provide a foundation for future considerations and changes in credit policies to accommodate more students.

We had several hypotheses in designing the survey, with two of them listed here:

1) Students who want to graduate in less than 4 years have a negative attitude towards the current academic limit of 18 credits.

2) Natural science majors disagree the most with the current academic limit of 18 credits.

Don't be afraid to state hypotheses that you think are obvious. Even if it's obvious, you'll still want to try to demonstrate them from your data.

Methods

The survey consisted of 14 multiple-choice questions. We distributed the survey by posting a link to it on our personal Facebook pages and on the Facebook wall of each class group (class of 2014, class of 2015, etc.). We tried to be considerate and polite when asking for responses and said the survey should not take more than 5 minutes.

Description of the Variables

“Agree” is the response variable for both hypotheses. It shows the student’s attitude towards the current academic credit limit. It was originally a categorical variable with 5 levels: Strongly Agree, Agree, Neutral, Disagree and Strongly Disagree. We transformed it into a quantitative variable from -2 to 2, with -2 being “Strongly Disagree” and 2 being “Strongly Agree”.
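
The recoding step itself is not shown in this report. As a sketch of how such a mapping could be done in R, assuming the raw answers use exactly the five labels listed above (the sample answers below are made up):

```r
# Scores for the five labels (assumed spelling of the raw answers)
agreeScores = c("Strongly Disagree" = -2, "Disagree" = -1, "Neutral" = 0,
                "Agree" = 1, "Strongly Agree" = 2)

# Hypothetical raw answers
rawAgree = c("Strongly Agree", "Strongly Disagree", "Disagree",
             "Neutral", "Agree", "Agree")

# Look each answer up in the named vector
agreeNum = unname(agreeScores[rawAgree])
agreeNum
```

Treating the five labels as equally spaced numbers is a modeling choice; it is what allows Agree to be used as the response in lm() later on.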

“Graduate” is a categorical variable that shows students’ desire to graduate in less than 4 years. There are three levels for the Graduate variable: Yes, Maybe, No.

“Major” is a categorical variable denoting a student’s major. There are five levels: NS stands for Natural Sciences, SS for Social Sciences, FA for Fine Arts, H for Humanities and DKY for Don't Know Yet.

The largest group of students answering the survey were sophomores, followed by similar numbers of seniors and first years; relatively few juniors responded:

barchart(tally(~Year, data = d, margins = FALSE, format = "count"), auto.key = TRUE)

Figure: bar chart of respondent counts by Year.

Most of the students answering the survey were social science majors (SS in the graph) or natural science majors (NS in the graph).

barchart(tally(~Major, data = d, margins = FALSE, format = "count"), auto.key = TRUE)

Figure: bar chart of respondent counts by Major.

Graphical descriptions of relationships between variables

Hypothesis 1

From the two graphs below it seems that those who want to graduate early tend to disagree with the current academic policy of 18 credits.

mosaicplot(Agree ~ Graduate, data = d, las = 2, col = rainbow(5))

Figure: mosaic plot of Agree by Graduate.

bwplot(Agree ~ Graduate, data = d)

Figure: box-and-whisker plot of Agree by Graduate.

Hypothesis 2

From the following two graphs it seems that natural science majors disagree the most with the current academic limit of 18 credits.

mosaicplot(Agree ~ Major, data = d, las = 2, col = rainbow(5))

Figure: mosaic plot of Agree by Major.

bwplot(Agree ~ Major, data = d)

Figure: box-and-whisker plot of Agree by Major.

TASK 3:

Modeling Analysis

(1) Do students who want to graduate early tend to disagree with the credit upper limit? Here's a linear regression model of a student's agreement with this policy (2 = strongly agree, -2 = strongly disagree) as a function of whether they want to graduate early:

mod1=lm(Agree ~ Graduate,data=d)
summary(mod1)
## 
## Call:
## lm(formula = Agree ~ Graduate, data = d)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -2.044 -1.044 -0.044  0.956  2.765 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)  
## (Intercept)     -0.765      0.300   -2.55    0.012 *
## GraduateMaybe    0.538      0.326    1.65    0.100  
## GraduateNo       0.809      0.327    2.47    0.014 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.24 on 202 degrees of freedom
## Multiple R-squared:  0.0325, Adjusted R-squared:  0.0229 
## F-statistic:  3.4 on 2 and 202 DF,  p-value: 0.0354
anova(mod1)
## Analysis of Variance Table
## 
## Response: Agree
##            Df Sum Sq Mean Sq F value Pr(>F)  
## Graduate    2   10.4    5.21     3.4  0.035 *
## Residuals 202  309.9    1.53                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The intercept, -0.765, is the mean Agree score for the reference level of Graduate, which is Yes. This suggests that students who want to graduate early tend to disagree with the credit upper limit policy. The coefficient on GraduateNo is 0.809. It is a difference from the reference level, not a group mean: students who don’t want to graduate early score 0.809 points higher on average, which puts their mean at about 0.04, close to neutral. The p-value for the intercept (0.012) indicates that the mean for the Yes group differs from zero, and the p-value for GraduateNo (0.014) indicates that the No group differs from the Yes group; both are below 0.05. The ANOVA F-test gives p = 0.035, also below 0.05, so we reject the null hypothesis that graduation plans are unrelated to agreement with the credit policy. The relationship is weak, though: R-squared is only 0.0325.
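
Because Graduate is categorical, each coefficient other than the intercept is an offset from the reference level (Yes). A short sketch of the arithmetic that turns the coefficients reported by summary(mod1) into group means; the numbers are copied from the output above, and the variable names are only for illustration:

```r
# Coefficients as printed by summary(mod1)
coefs = c(Intercept = -0.765, GraduateMaybe = 0.538, GraduateNo = 0.809)

# Group mean = intercept (reference level) + offset for that level
groupMeans = c(
  Yes   = coefs[["Intercept"]],
  Maybe = coefs[["Intercept"]] + coefs[["GraduateMaybe"]],
  No    = coefs[["Intercept"]] + coefs[["GraduateNo"]]
)

round(groupMeans, 3)
```

The resulting means are -0.765 for Yes, -0.227 for Maybe and 0.044 for No, which is consistent with the extreme residuals in the output (2.765 and -2.044).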

(2) Do natural science majors tend to disagree with the current academic limit of 18 credits? Here's a linear regression model of a student's agreement with this policy (2 = strongly agree, -2 = strongly disagree) as a function of their major:

mod2=lm(Agree ~ Major,data=d)
summary(mod2)
## 
## Call:
## lm(formula = Agree ~ Major, data = d)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.1667 -0.9195  0.0805  1.0805  2.3824 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  -0.0805     0.1340   -0.60     0.55
## MajorNS      -0.3019     0.2023   -1.49     0.14
## MajorFA       0.2471     0.5274    0.47     0.64
## MajorH        0.2418     0.2614    0.92     0.36
## MajorDKY     -0.2272     0.3716   -0.61     0.54
## 
## Residual standard error: 1.25 on 200 degrees of freedom
## Multiple R-squared:  0.025,  Adjusted R-squared:  0.00554 
## F-statistic: 1.28 on 4 and 200 DF,  p-value: 0.277

The coefficient on MajorNS is -0.302, the largest in magnitude among the Major levels. Coefficients are measured relative to the reference level, social sciences (SS), so this indicates that natural science majors tend to disagree the most with the credit policy. However, its p-value is 0.14 and the overall F-test gives p = 0.277, so we fail to reject the null hypothesis.

Sample Size

If your p-values are too large to reject the null, it's helpful to give some guidance to future researchers. Select a sample size that will give you a p-value of 0.01 and report that. To do this, you'll need to vary the sample size until you find one that works reliably. You don't have to show the calculations you do, just give the result. (Your instructor can check it out by using that sample size!)

(1) For hypothesis 1: wanting to graduate early and disagreement

largerSample = resample(d,size=600)
mod3=lm(Agree ~ Graduate,data=largerSample)
summary(mod3)
## 
## Call:
## lm(formula = Agree ~ Graduate, data = largerSample)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.0591 -1.0591  0.0731  1.0731  2.5556 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)   
## (Intercept)     -0.556      0.183   -3.03   0.0025 **
## GraduateMaybe    0.482      0.196    2.46   0.0143 * 
## GraduateNo       0.615      0.199    3.09   0.0021 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.23 on 597 degrees of freedom
## Multiple R-squared:  0.016,  Adjusted R-squared:  0.0127 
## F-statistic: 4.86 on 2 and 597 DF,  p-value: 0.00807

A sample of 600 is enough to get a p-value smaller than 0.01 in the resample shown above (the F-test gives p = 0.008).

(2) For hypothesis 2: natural science majors and disagreement

largerSample = resample(d,size=1000)
mod4=lm(Agree ~ Major,data=largerSample)
summary(mod4)
## 
## Call:
## lm(formula = Agree ~ Major, data = largerSample)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -2.111 -0.845 -0.598  1.155  2.402 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept)  -0.1553     0.0597   -2.60   0.0095 **
## MajorNS      -0.2469     0.0884   -2.79   0.0053 **
## MajorFA       0.2664     0.2964    0.90   0.3690   
## MajorH        0.2546     0.1197    2.13   0.0337 * 
## MajorDKY     -0.2068     0.1724   -1.20   0.2307   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.23 on 995 degrees of freedom
## Multiple R-squared:  0.0202, Adjusted R-squared:  0.0163 
## F-statistic: 5.13 on 4 and 995 DF,  p-value: 0.000426

A sample of 1000 is enough to get a p-value smaller than 0.01 in the resample shown above (the MajorNS coefficient has p = 0.0053).
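
A single resample can fall below 0.01 by luck, so one way to check that a sample size works "reliably" is to repeat the resampling many times and see how often the p-value is below 0.01. The sketch below does this in base R, with sample() standing in for resample(). The data are simulated: the group means in shift, the noise level, and the helper name pValueAt are all assumptions made for illustration, not values from the survey.

```r
set.seed(155)

# Toy data standing in for d (simulated, not the survey responses)
n = 200
toy = data.frame(
  Graduate = factor(sample(c("Yes", "Maybe", "No"), n, replace = TRUE),
                    levels = c("Yes", "Maybe", "No"))
)
shift = c(Yes = -0.8, Maybe = -0.2, No = 0)   # assumed group means
toy$Agree = unname(shift[as.character(toy$Graduate)]) + rnorm(n, sd = 1.2)

# p-value of the overall F-test for one resample of the given size
pValueAt = function(data, size) {
  rows = sample(nrow(data), size, replace = TRUE)
  fit = lm(Agree ~ Graduate, data = data[rows, , drop = FALSE])
  f = summary(fit)$fstatistic
  pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)
}

# Repeat the resampling and report how often p < 0.01
pvals = replicate(100, pValueAt(toy, size = 600))
mean(pvals < 0.01)
```

With the real data, toy would be replaced by d and the size adjusted until the reported fraction is close to 1.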

Conclusions

(1) Hypothesis 1: students who want to graduate in less than 4 years have a negative attitude towards the current academic limit. We conclude that there is a statistically significant association between wanting to graduate early and disagreeing with the current academic limit. This conclusion is supported by the linear model of the quantitative Agree variable on the categorical Graduate variable: the mean for GraduateYes is negative, the mean for GraduateNo is significantly higher, and the overall F-test gives p < 0.05, so we reject the null hypothesis. That seems reasonable, because students who want to graduate early have more incentive to take more classes every semester and to exceed the credit limit.

(2) Hypothesis 2: natural science majors disagree the most with the current academic limit of 18 credits. In our sample, natural science majors have the lowest average agreement with the current academic limit: the MajorNS coefficient in the linear model of Agree on Major is negative and the largest in magnitude. However, the p-value is not small enough to reject the null hypothesis. Our resampling suggests that a sample of about 1000 would give a p-value small enough to reject the null hypothesis, assuming the pattern in our data holds. The direction of the effect makes sense, because science majors tend to have more major requirements, so they may want to take more classes to fulfill those requirements as early as possible, or to take some classes outside their major for fun.

Comments

We did not collect enough responses from Fine Arts students to assess the relationship between that major and students’ preferences about the credit limit. It is therefore possible that Fine Arts students, rather than natural science students, are the most likely to disagree with the 18-credit limit.