graddegrees<-read.file("/home/emesekennedy/Data/Ch1/graddegrees.txt")
## Reading data with read.table()
View the names of the variables in the data:
names(graddegrees)
## [1] "Degree" "PercentFemale"
Statistics and box plot for the time24 data:
time24<-read.file("/home/emesekennedy/Data/Ch1/timetostart24.txt")
## Reading data with read.table()
favstats(~TimeToStart,data=time24)
## min Q1 median Q3 max mean sd n missing
## 4 23 36.5 46.25 77 37.375 18.57491 24 0
bwplot(~TimeToStart,data=time24)
Statistics and box plot from the same data with an added outlier:
time25<-read.file("/home/emesekennedy/Data/Ch1/timetostart25.txt")
## Reading data with read.table()
favstats(~TimeToStart,data=time25)
## min Q1 median Q3 max mean sd n missing
## 4 23 40 47 694 63.64 132.5779 25 0
bwplot(~TimeToStart,data=time25)
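Note: bwplot() (not RStudio itself) recognized the outlier and drew a modified box plot, in which observations more than 1.5 times the IQR beyond the quartiles are plotted as individual points.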
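The effect of the added observation can be checked by hand from the summary values printed above. This is a minimal sketch that assumes the added value is the reported maximum of 694; the variable names are introduced here for illustration. bwplot() works from hinges rather than the favstats() quartiles, so its exact cutoff may differ slightly.

```r
# Summary values copied from the favstats() output above
n24    <- 24       # observations in time24
mean24 <- 37.375   # mean of time24
q1_25  <- 23       # Q1 of time25
q3_25  <- 47       # Q3 of time25
max25  <- 694      # max of time25 (assumed to be the added observation)

# Mean after adding one observation: (old total + new value) / new n
mean25 <- (n24 * mean24 + max25) / (n24 + 1)
mean25                      # 63.64, matching the favstats() output

# 1.5 * IQR rule: values above this fence are drawn as separate points
upper_fence <- q3_25 + 1.5 * (q3_25 - q1_25)
upper_fence                 # 83
max25 > upper_fence         # TRUE, so 694 is flagged as an outlier
```

The mean jumps from 37.375 to 63.64 while the median only moves from 36.5 to 40, which illustrates why the median is the more resistant measure of center.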
Box plot of the chicken weight data grouped by the different diets:
bwplot(feed~weight,data=chickwts)
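As a numeric companion to the plot, the five-number summary behind each box can be computed per diet. This is a minimal base R sketch (it does not use mosaic), and the name `by_feed` is introduced here for illustration. fivenum() reports hinges, which can differ slightly from the Q1 and Q3 values that favstats() reports.

```r
# chickwts ships with base R (datasets package): chick weights by feed type.
# Split the weights by diet, then take the five-number summary of each group.
by_feed <- sapply(split(chickwts$weight, chickwts$feed), fivenum)
rownames(by_feed) <- c("min", "lower hinge", "median", "upper hinge", "max")
by_feed
```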
Create a data set with grades:
grades<-c(60, 65, 75, 80)
Verify that the sum of the deviations from the mean is zero:
grades-mean(grades)
## [1] -10 -5 5 10
sum(grades-mean(grades))
## [1] 0
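The deviations from the mean sum to zero for any data set, not only for these grades, because the mean is the sum divided by n. The sketch below uses simulated values (not course data) to illustrate this. With non-integer data the result is only zero up to floating-point rounding, so it is compared against a small tolerance (an arbitrary choice here).

```r
set.seed(1)                            # arbitrary seed for reproducibility
x <- rnorm(100, mean = 70, sd = 10)    # simulated values, for illustration only

dev_sum <- sum(x - mean(x))            # sum of deviations from the mean
dev_sum                                # a tiny number, not always exactly 0

abs(dev_sum) < 1e-8                    # zero within rounding error
```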
Find the standard deviation using the sd() command and the favstats() command:
sd(grades)
## [1] 9.128709
favstats(grades)
## min Q1 median Q3 max mean sd n missing
## 60 63.75 70 76.25 80 70 9.128709 4 0
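To see where this value comes from, the sample standard deviation can be rebuilt from the deviations computed earlier: square them, add them up, divide by n - 1, and take the square root. The names `sq_dev` and `manual_sd` are introduced here for illustration.

```r
grades <- c(60, 65, 75, 80)
n <- length(grades)

sq_dev <- (grades - mean(grades))^2        # squared deviations: 100 25 25 100
manual_sd <- sqrt(sum(sq_dev) / (n - 1))   # sqrt(250 / 3)
manual_sd                                  # 9.128709

all.equal(manual_sd, sd(grades))           # TRUE
```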
Transform the grades using a linear transformation:
newgrades<-1.1*grades+5
Save the statistics for both the original grades and the new grades:
stats<-favstats(grades)
newstats<-favstats(newgrades)
Compute the IQR for both the original grades and the new grades:
IQR<-stats[1,4]-stats[1,2]
IQR
## [1] 12.5
newIQR<-newstats[1,4]-newstats[1,2]
newIQR
## [1] 13.75
Verify that 1.1 times the IQR of the original grades gives the IQR of the new grades:
IQR*1.1
## [1] 13.75
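The same pattern holds for the other summary statistics: under a linear transformation a*x + b, measures of center are multiplied by a and shifted by b, while measures of spread are only multiplied by |a|. The following base R sketch checks this for the grades; the names `a`, `b`, `center_ok` and `spread_ok` are introduced here for illustration.

```r
grades <- c(60, 65, 75, 80)
a <- 1.1
b <- 5
newgrades <- a * grades + b

# Measures of center follow the full transformation a * x + b
center_ok <- isTRUE(all.equal(mean(newgrades), a * mean(grades) + b)) &&
  isTRUE(all.equal(median(newgrades), a * median(grades) + b))

# Measures of spread are only scaled by |a|; the shift b drops out
spread_ok <- isTRUE(all.equal(sd(newgrades), abs(a) * sd(grades))) &&
  isTRUE(all.equal(stats::IQR(newgrades), abs(a) * stats::IQR(grades)))

c(center = center_ok, spread = spread_ok)
```

stats::IQR() is written out in full because a variable named IQR was created above; the prefix makes it clear that the built-in function is meant.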
Create a histogram of the time24 data with a density curve:
histogram(~TimeToStart,data=time24,density=T)
Note: for the density curve to be on the same scale as the bars, the vertical axis of the histogram must show densities (i.e. type="density"). This is the default option, which is why we did not have to specify the type.
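On a density scale, the height of each bar is chosen so that the bar areas (height times bin width) add up to 1, which is what lets a density curve be drawn over the bars. The sketch below checks this with base R's hist() on simulated values; the data, seed and variable names are assumptions made for illustration, not the time24 data.

```r
set.seed(2)                        # arbitrary seed for reproducibility
x <- rexp(200, rate = 1 / 40)      # simulated right-skewed values

h <- hist(x, plot = FALSE)         # compute the bins without drawing

# Area of each bar = density (height) * bin width
total_area <- sum(h$density * diff(h$breaks))
total_area                         # 1, up to rounding error
```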
Load a new data set that is formatted a little differently than our previous data sets:
state<-read.file("/home/emesekennedy/Data/Ch1/collegebystate.txt",sep="\t",header=T)
## Reading data with read.table()
Create a histogram with a density curve showing the distribution of the number of undergraduate students by state in the USA:
histogram(~Undergrads,data=state,density=T)
As both the histogram and the density curve show, the data is skewed to the right.