Harold Nelson
11/8/2016
Your task is to create a character variable densityCat in countyComplete with the categorical values Low, Medium, and High. After you create the character variable, create a factor version in a second variable, densityCatF. Low is the first quartile of density, Medium is the second and third quartiles, and High is the fourth quartile. You will need the quantile function. Use Google for help. It’s your best friend!
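As a starting point, here is a minimal sketch of what quantile returns, using a small made-up vector (toyDensity is a hypothetical stand-in, not the county data). The 25% and 75% values are the cut points that separate the three categories.

```r
# Hypothetical toy data, not the real county densities
toyDensity <- 1:9

# Ask for the first and third quartiles in one call
cuts <- quantile(toyDensity, c(.25, .75))
print(cuts)

# Values at or below the 25% cut point fall in the first quartile
lowValues <- toyDensity[toyDensity <= cuts["25%"]]
print(lowValues)
```

The result is a named vector, so a single cut point can be picked out by name, as in cuts["25%"].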
Get the data.
cC <- read.delim("~/Dropbox/RProjects/CSC 360 Module 2/countyComplete.txt")
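# Start every county at "Low", then overwrite upward
cC$densityCat = "Low"
# Above the first quartile: at least "Medium"
cC$densityCat[cC$density > quantile(cC$density,.25)] = "Medium"
# Above the third quartile: "High"
cC$densityCat[cC$density > quantile(cC$density,.75)] = "High"
# Factor version with the levels in their natural order
cC$densityCatF = factor(cC$densityCat,levels = c("Low","Medium","High"))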
#Check
table(cC$densityCat,cC$densityCatF)
##
##          Low Medium High
##   High     0      0  786
##   Low    787      0    0
##   Medium   0   1570    0
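The rows of the table above appear in the order High, Low, Medium because a character variable is tabulated in sorted order, while the factor columns follow the levels given in the factor call. A small sketch with made-up values (toyCat and toyCatF are hypothetical names) shows the difference:

```r
# Hypothetical category labels, not taken from the county data
toyCat <- c("Medium", "Low", "High", "Medium")

# Character values are tabulated in sorted (alphabetical) order
charOrder <- names(table(toyCat))
print(charOrder)

# A factor with explicit levels keeps the order we asked for
toyCatF <- factor(toyCat, levels = c("Low", "Medium", "High"))
factorOrder <- names(table(toyCatF))
print(factorOrder)
```

This is the main reason to prefer the factor version for tables and plots: the categories come out in their natural order.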
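The cut function builds the same factor directly from the quantile break points.
# Cut density at 0, the first quartile, the third quartile, and the maximum
cC$densityCatF2 = cut(cC$density,
                      breaks = c(0,quantile(cC$density,.25),
                                 quantile(cC$density,.75),
                                 max(cC$density)),
                      labels=c("Low","Medium","High"),
                      include.lowest=TRUE)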
# Check
table(cC$densityCatF,cC$densityCatF2)
##
##           Low Medium High
##   Low     787      0    0
##   Medium    0   1570    0
##   High      0      0  786
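To see why the two approaches agree, here is a minimal sketch on made-up data (toyDensity, byCompare, and byCut are hypothetical names). By default cut uses intervals that are open on the left and closed on the right, which matches the "greater than the quartile" comparisons used earlier.

```r
# Hypothetical toy data, not the real county densities
toyDensity <- c(0, 5, 10, 20, 40, 80, 160, 320, 640)
q1 <- quantile(toyDensity, .25)
q3 <- quantile(toyDensity, .75)

# Approach 1: start everything at "Low" and overwrite upward
byCompare <- rep("Low", length(toyDensity))
byCompare[toyDensity > q1] <- "Medium"
byCompare[toyDensity > q3] <- "High"

# Approach 2: cut() with right-closed intervals (the default)
byCut <- cut(toyDensity,
             breaks = c(min(toyDensity), q1, q3, max(toyDensity)),
             labels = c("Low", "Medium", "High"),
             include.lowest = TRUE)

# TRUE when both approaches assign every value the same category
sameResult <- all(byCompare == as.character(byCut))
print(sameResult)
```

Without include.lowest = TRUE, the value equal to the lowest break would fall outside the first interval and come back as NA.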
boxplot(cC$density~cC$densityCatF)
The graph is hard to read because density covers a very large range: a few extremely dense counties stretch the axis and squash the boxes together.
min(cC$density)
## [1] 0
max(cC$density)
## [1] 69467.5
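Plotting log10 values makes the graph usable. The .01 added to density keeps counties with a density of 0 from producing log10(0), which is -Inf.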
cC$logDensity = log10(cC$density+.01)
boxplot(cC$logDensity~cC$densityCatF)
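A minimal sketch of what the .01 offset in the code above does, using the minimum density of 0 reported earlier (the variable names are hypothetical):

```r
# log10 of zero is -Inf, which is not a finite number
logZero <- log10(0)
print(logZero)

# Adding the small offset (.01) gives a finite value instead
logShifted <- log10(0 + .01)
print(logShifted)

# Each step of 1 on the log10 scale is a factor of ten in density
logSteps <- log10(c(10, 100, 1000))
print(logSteps)
```

The size of the offset is a judgment call: it sets how far below the other values the zero-density counties will sit on the log scale.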
tapply(cC$density,cC$densityCatF,summary)
## $Low
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00 2.60 5.60 6.83 10.75 16.90
##
## $Medium
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 17.00 29.72 45.20 50.99 69.18 113.60
##
## $High
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 114.1 173.8 301.1 928.3 664.7 69470.0
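Note that the High maximum prints as 69470.0 even though max(cC$density) returned 69467.5. A likely explanation, assuming the R version used here rounds summary results to four significant digits (as older R releases did by default), can be sketched with signif:

```r
# The maximum density reported by max() earlier
maxDensity <- 69467.5

# Rounding to four significant digits reproduces the printed value
# (assumption: summary() rounded this way in the R version used)
roundedMax <- signif(maxDensity, 4)
print(roundedMax)
```

The underlying data are unchanged; only the summary display is rounded.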
tapply(cC$logDensity,cC$densityCatF,summary)
## $Low
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2.0000 0.4166 0.7490 0.6705 1.0320 1.2280
##
## $Medium
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.231 1.473 1.655 1.652 1.840 2.055
##
## $High
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.057 2.240 2.479 2.590 2.823 4.842
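The two tapply calls above apply summary to the values within each level of densityCatF. A minimal sketch on made-up data (toyValue and toyGroup are hypothetical) shows the same pattern with a simpler function:

```r
# Hypothetical toy data: six values in two groups
toyValue <- c(1, 2, 3, 10, 20, 30)
toyGroup <- factor(c("a", "a", "a", "b", "b", "b"))

# Apply mean() to the values within each level of the factor
groupMeans <- tapply(toyValue, toyGroup, mean)
print(groupMeans)

# With summary() as the function, the result holds
# one six-number summary per level
groupSummaries <- tapply(toyValue, toyGroup, summary)
print(names(groupSummaries))
```

The first argument is the variable to summarize, the second is the grouping factor, and the third is the function to apply to each group.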