R Markdown

##Question 1. This HTML document uses the state.x77 dataset to present information about the 50 U.S. states: population, income, illiteracy, life expectancy, murder rate, high school graduation rate, frost, and area. The first table displays the first few rows of the data to give a general idea of what the dataset contains.

head(states)
##            Population Income Illiteracy Life Exp Murder HS Grad Frost   Area
## Alabama          3615   3624        2.1    69.05   15.1    41.3    20  50708
## Alaska            365   6315        1.5    69.31   11.3    66.7   152 566432
## Arizona          2212   4530        1.8    70.55    7.8    58.1    15 113417
## Arkansas         2110   3378        1.9    70.66   10.1    39.9    65  51945
## California      21198   5114        1.1    71.71   10.3    62.6    20 156361
## Colorado         2541   4884        0.7    72.06    6.8    63.9   166 103766

##First part of Question 2. This is the mean population across all 50 states, in thousands of people. R adds up the population of every state and divides the total by 50 (the number of states).

mean(Population)
## [1] 4246.42
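
The same figure can be reproduced by hand as a quick check. This is only a sketch: it reads the column straight from the built-in `state.x77` matrix instead of the `states` object used above, and `pop` and `manual_mean` are illustrative names.

```r
# Pull the Population column (in thousands) from the built-in matrix
pop <- state.x77[, "Population"]

# Mean by hand: total population divided by the number of states
manual_mean <- sum(pop) / length(pop)

# Should agree with R's built-in mean()
all.equal(manual_mean, mean(pop))
```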

##Second part of Question 2. The number below is the median, the population of the middle state. Because there is an even number of states, R takes the value halfway between the populations of the 25th and 26th most populous states.

median(Population)
## [1] 2838.5
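
To see the "halfway between the two middle states" step directly, the populations can be sorted and the two middle values averaged. This is a minimal sketch that assumes an even number of observations, as is the case here. The names `sorted_pop` and `manual_median` are illustrative.

```r
# Population column (in thousands) from the built-in matrix
pop <- state.x77[, "Population"]

# Sort from smallest to largest and count the states
sorted_pop <- sort(pop)
n <- length(sorted_pop)

# With an even n, the median is the average of the two middle values
manual_median <- unname((sorted_pop[n / 2] + sorted_pop[n / 2 + 1]) / 2)

# Should agree with R's built-in median()
all.equal(manual_median, median(pop))
```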

##First part of Question 3. The next number is the variance. It measures spread by squaring each state's difference from the mean and averaging those squared differences (R's var() divides by n - 1 rather than n). The larger the variance, the more spread out the values in the data set are.

var(Population)
## [1] 19931684
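
The variance can also be worked out step by step. The sketch below uses the sample variance formula, which divides by n - 1 as `var()` does. The names `pop` and `manual_var` are illustrative.

```r
# Population column (in thousands) from the built-in matrix
pop <- state.x77[, "Population"]
n <- length(pop)

# Squared difference of each state from the mean, summed,
# then divided by n - 1 (the sample variance)
manual_var <- sum((pop - mean(pop))^2) / (n - 1)

# Should agree with R's built-in var()
all.equal(manual_var, var(pop))
```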

##Second part of Question 3. The next number is the standard deviation of the 50 state populations, which is the square root of the variance. Because the variance is built from squared differences, taking the square root returns the measure to the same units as the data (thousands of people) instead of squared units. The variance and standard deviation are so large because the states cover a wide range of populations: some have fewer than a million people and others have many millions.

sd(Population)
## [1] 4464.491
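
As a short check of the square-root relationship, the standard deviation can be rebuilt from the variance. This is a sketch only, with `manual_sd` as an illustrative name.

```r
# Population column (in thousands) from the built-in matrix
pop <- state.x77[, "Population"]

# Standard deviation as the square root of the variance
manual_sd <- sqrt(var(pop))

# Should agree with R's built-in sd()
all.equal(manual_sd, sd(pop))
```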

##Question 4. The next two numbers are the largest and the smallest population among the states.

max(Population)
## [1] 21198
min(Population)
## [1] 365

##Question 5. Below is a histogram. Most states have a small population, and only a few have a very large one. This helps explain why the most populous state has a population almost 5 times the mean and over 7 times the median, and why the variance is so large.

hist(Population, xlab="Population in Thousands")
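
The "almost 5 times" and "over 7 times" comparisons can be computed directly from the numbers reported earlier. A minimal sketch, with `ratio_to_mean` and `ratio_to_median` as illustrative names:

```r
# Population column (in thousands) from the built-in matrix
pop <- state.x77[, "Population"]

# How many times larger the most populous state is than the mean
ratio_to_mean <- max(pop) / mean(pop)

# How many times larger the most populous state is than the median
ratio_to_median <- max(pop) / median(pop)

ratio_to_mean
ratio_to_median
```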

##Question 6. The number below is the total population of all 50 states. Since the values are in thousands, this comes to about 212 million people.

sum(Population)
## [1] 212321

##Question 7. The table below shows how many states have a population over 5 million (TRUE) and how many do not (FALSE). Far more states fall below 5 million: 38 compared with 12. Given that the mean population is about 4.2 million, the table shows how right-skewed the population distribution is.

big <- Population > 5000
table(big)
## big
## FALSE  TRUE 
##    38    12
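
The same logical vector can answer two follow-up questions: what share of states is above 5 million, and which states they are. This sketch rebuilds `big` from the built-in `state.x77` matrix. The names `share_big` and `big_states` are illustrative.

```r
# Population column (in thousands) from the built-in matrix
pop <- state.x77[, "Population"]

# TRUE for states with more than 5 million people
big <- pop > 5000

# The mean of a logical vector is the proportion of TRUE values
share_big <- mean(big)

# Names of the states above the threshold
big_states <- names(pop)[big]

share_big
big_states
```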

##Question 8. Below is a plot of each state's population against its area. I think it is pretty easy to see where Alaska, Texas, and California are. Most of the other states fall within a similar range of sizes.

plot(Area, Population, xlab="Area in square miles", ylab="Population in thousands", main="Population vs. area")
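
To confirm which points are which instead of guessing from the plot, the states can be ranked by area. A minimal sketch using the built-in `state.x77` matrix, with `largest_by_area` as an illustrative name:

```r
# Area column (in square miles) from the built-in matrix
area <- state.x77[, "Area"]

# The three largest states by area, biggest first
largest_by_area <- head(sort(area, decreasing = TRUE), 3)

names(largest_by_area)
```

These names could then be passed to `text()` to label the matching points on the scatterplot.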

##Question 9. This is a histogram of population density, showing how many states have a high density and how many have a low one. Because population is in thousands and area is in square miles, density here is in thousands of people per square mile. Most states are not very dense at all compared with the few densest states. I think those very dense states are probably in the Northeast, where there is little land and large populations.

pop.density <- Population/Area
hist(pop.density)
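
One way to check the guess about the Northeast is to rank the states by density and look up their regions. This is a sketch under a few assumptions: it uses the built-in `state.x77` matrix and the companion `state.region` factor from the same datasets package, it converts density to people per square mile, and the names `density`, `top5`, and `top5_regions` are illustrative. The output is not reproduced here.

```r
# Population (in thousands) and area (in square miles)
pop <- state.x77[, "Population"]
area <- state.x77[, "Area"]

# Convert to people per square mile
density <- pop * 1000 / area

# Order the states from most to least dense
dense_order <- order(density, decreasing = TRUE)

# The five densest states and the region each belongs to
top5 <- density[dense_order][1:5]
top5_regions <- state.region[dense_order][1:5]

data.frame(density = round(top5, 1), region = top5_regions)
```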