Understanding your data is a critical step in analysis. Describe each of the following data types in words. Is it numeric? Continuous? Bounded at 0?
Don’t worry about the distributions yet; the point is to recognize the fundamental characteristics of the data.
- Draw 1000 random normal points with mean of 0 and sd of 1
x <- rnorm(1000,mean=0,sd=1)
head(x)
## [1] 1.1187685 -2.4280303 0.4584987 2.0480882 -1.6284340 -1.5402559
- Draw 1000 random Poisson points with lambda=1
j <- rpois(1000,lambda=1)
head(j)
## [1] 0 0 0 3 2 3
- Draw 1000 random binomial points with prob=0.5 (a coin flip, i.e., size=1)
g <- rbinom(1000,prob=.5,size=1)
head(g)
## [1] 0 1 1 1 1 1
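The three questions above (numeric, continuous, bounded at 0) can also be checked in code rather than by eye. The following is a minimal sketch: `describe_draws` is a hypothetical helper name, not part of base R, and the draws are regenerated with an arbitrary seed so that the block runs on its own.

```r
# Hypothetical helper: report basic characteristics of a sample.
describe_draws <- function(v) {
  c(
    numeric      = is.numeric(v),
    whole_number = all(v == round(v)),  # TRUE for counts, FALSE for continuous draws
    non_negative = all(v >= 0)          # TRUE when no value falls below 0
  )
}

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
x <- rnorm(1000, mean = 0, sd = 1)
j <- rpois(1000, lambda = 1)
g <- rbinom(1000, prob = 0.5, size = 1)

# One column per sample, one row per characteristic.
res <- sapply(list(normal = x, poisson = j, binomial = g), describe_draws)
print(res)
```

A sample check like this only describes the draws you happen to have; it suggests, but does not prove, a property of the underlying distribution.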
- Create a histogram of each distribution above.
hist(x, main="rnorm(1000,mean=0,sd=1)")
hist(j, main="rpois(1000,lambda=1)")
hist(g, main="rbinom(1000,prob=.5,size=1)")
- Bin the histogram into fewer sections (e.g., 5). See ?hist
hist(x,col="red",breaks=5)
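Note that a single number passed to `breaks` is only a suggestion: `hist()` rounds the break points to "pretty" values, so the number of bins drawn may differ from 5. A minimal sketch, assuming you want exactly five equal-width bins, passes a vector of break points instead. The seed and the rounding of the range to whole numbers are arbitrary choices for illustration.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
x <- rnorm(1000, mean = 0, sd = 1)

# A single number is a suggestion; check how many bins were really used.
h_suggested <- hist(x, breaks = 5, plot = FALSE)
length(h_suggested$counts)

# A vector of break points is used as given: 6 break points make 5 bins.
# floor()/ceiling() guarantee that the breaks span every value in x.
exact_breaks <- seq(floor(min(x)), ceiling(max(x)), length.out = 6)
h_exact <- hist(x, breaks = exact_breaks, col = "red",
                main = "Exactly 5 equal-width bins")
length(h_exact$counts)
```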
- Using curve(), overlay a distribution curve on a histogram of your data, following the example below.
x <- rnorm(100)
head(x)
## [1] -0.94226151 -0.67686102 -0.15420097 -1.58940222 0.33022652 -0.07324667
hist(x,prob=TRUE)
curve(dnorm(x,mean=mean(x),sd=sd(x)),add=TRUE,col="red")
What does the prob=TRUE argument do in hist()? Why is it needed?
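One way to explore this question is to inspect the object that `hist()` returns. The sketch below, using an arbitrary seed, compares the two scales: counts (the default y-axis) and densities (the y-axis when `prob=TRUE`, which is matched to the `probability` argument and is equivalent to `freq=FALSE`).

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
x <- rnorm(100)

h <- hist(x, plot = FALSE)  # compute the bins without drawing

# Counts are what the default y-axis shows; they add up to the sample size.
total_count <- sum(h$counts)
total_count

# Densities are what prob=TRUE draws: bar height times bar width adds up
# to 1, the same scale on which dnorm() is defined.
total_area <- sum(h$density * diff(h$breaks))
total_area
```

Because a density curve integrates to 1, it only lines up with the bars when the histogram is drawn on the density scale as well.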
- Fit the Poisson and binomial draws (questions 2 and 3) to their respective distributions.
hist(j,prob=TRUE)
# from=0, to=6, n=7 evaluates dpois() only at the integers 0 through 6
curve(dpois(x,lambda=mean(j)),col="red",from=0,to=6,n=7,add=TRUE)
#
hist(g,probability=TRUE)
curve(dbinom(x,prob=.5,size=1)*10,col="red",add=TRUE)
## Warning in dbinom(x, prob = 0.5, size = 1): non-integer x = 0.010000
## Warning in dbinom(x, prob = 0.5, size = 1): non-integer x = 0.020000
## Warning in dbinom(x, prob = 0.5, size = 1): non-integer x = 0.030000
## ... (the same warning repeats for each non-integer x from 0.040000 to 0.980000)
## Warning in dbinom(x, prob = 0.5, size = 1): non-integer x = 0.990000
# That's awkward. Do you know why?
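The warnings arise because `curve()` evaluates its expression on a fine grid of x values (101 points by default), while `dbinom()` is a probability mass function defined only at whole numbers. A sketch of one alternative, assuming a bar chart of observed proportions is an acceptable stand-in for the histogram, evaluates `dbinom()` only at the outcomes 0 and 1. The seed and the variable names `observed`, `expected`, and `bar_centres` are illustrative choices.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
g <- rbinom(1000, prob = 0.5, size = 1)

# Observed proportion of each outcome; factor() keeps both levels
# even if one of them never occurred.
observed <- table(factor(g, levels = 0:1)) / length(g)

# Theoretical probabilities, evaluated only at the integer outcomes.
expected <- dbinom(0:1, size = 1, prob = 0.5)

# barplot() returns the x positions of the bar centres, so the
# theoretical points can be placed on the matching bars.
bar_centres <- barplot(observed, ylim = c(0, 1), ylab = "Proportion",
                       main = "rbinom(1000,prob=.5,size=1)")
points(bar_centres, expected, col = "red", pch = 19)
```

Since `dbinom()` is never asked for a non-integer x, no warnings are produced.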
- Plot all distributions together using the add=TRUE argument. Color them separately and note which distribution is which color.
Red is normal, blue is Poisson, and green is binomial.
# All at once:
hist(x,col="red")
hist(j,col="blue",add=TRUE)
hist(g,col="green",add=TRUE)
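Opaque histograms drawn with `add=TRUE` cover one another, and the axes are fixed by the first `hist()` call. The following is a sketch of one way to keep all three visible: semi-transparent fills, shared bins on the density scale, and a legend that records the colors. The seed, the alpha value of 0.4, the bin width of 0.5, and the y-axis limit are arbitrary choices for illustration.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
x <- rnorm(1000, mean = 0, sd = 1)
j <- rpois(1000, lambda = 1)
g <- rbinom(1000, prob = 0.5, size = 1)

# Semi-transparent versions of the three colors used above.
fills <- c(normal   = rgb(1, 0, 0, alpha = 0.4),
           poisson  = rgb(0, 0, 1, alpha = 0.4),
           binomial = rgb(0, 1, 0, alpha = 0.4))

# Shared break points so that every histogram uses the same bins.
all_values <- c(x, j, g)
shared_breaks <- seq(floor(min(all_values)), ceiling(max(all_values)), by = 0.5)

# freq = FALSE puts all three samples on the same density scale.
hist(x, breaks = shared_breaks, freq = FALSE, col = fills["normal"],
     ylim = c(0, 1.5), xlab = "Value",
     main = "Normal, Poisson, and binomial draws")
hist(j, breaks = shared_breaks, freq = FALSE, col = fills["poisson"], add = TRUE)
hist(g, breaks = shared_breaks, freq = FALSE, col = fills["binomial"], add = TRUE)
legend("topright", legend = names(fills), fill = fills)
```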