US State Clustering Analysis

Roger Blake

Introduction

What regions make up the contiguous United States?

  • The Census Bureau defines 4 regions and 9 divisions. Do these match the groupings that emerge from state-level data?

Methodology:

  • Use k-means clustering to cluster similar states together
  • Once clusters are established, further statistical testing can be done
  • Variables to work with (from state dataset): Income, Illiteracy, Life Expectancy, Murder, HS Grad, Frost, Area, Coordinates of the center
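As a rough illustration of the clustering step, here is a minimal k-means sketch in plain Python. It is not the code behind these slides: the z-score standardization, the helper names (`standardize`, `sq_dist`, `kmeans`), and the toy rows are all assumptions made for the example, and the toy numbers are made up rather than taken from the state dataset.

```python
import random
from statistics import mean, pstdev

def standardize(rows):
    """Z-score each column so large-valued variables (e.g. Area) do not dominate."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(r, mus, sds)] for r in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate assignment and center updates until stable."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct observations
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = [mean(c) for c in zip(*members)]
    return labels

# Made-up toy rows (income, murder rate); NOT values from the state dataset.
toy = [[3600, 15.0], [3700, 14.0], [5100, 3.0], [5000, 2.5]]
labels = kmeans(standardize(toy), k=2)
print(labels)
```

Standardizing first matters here because the listed variables sit on very different scales (Area in square miles versus Illiteracy in percent). Whether the original analysis scaled its inputs is not stated on the slides.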

Following Census Regions (k=4)

Cluster  States Included
1        Arizona, California, Florida, New Mexico, Oklahoma, Texas
2        Colorado, Idaho, Iowa, Kansas, Minnesota, Montana, Nebraska, Nevada, North Dakota, Oregon, South Dakota, Utah, Washington, Wisconsin, Wyoming
3        Alabama, Arkansas, Georgia, Kentucky, Louisiana, Mississippi, North Carolina, South Carolina, Tennessee, West Virginia
4        Connecticut, Delaware, Illinois, Indiana, Maine, Maryland, Massachusetts, Michigan, Missouri, New Hampshire, New Jersey, New York, Ohio, Pennsylvania, Rhode Island, Vermont, Virginia

Comparison with Census Regions

Source: Census.gov

Takeaways:

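  • A new “Well-off Sun Belt” region (Cluster 1) appears
  • The South (Cluster 3) stays largely intact, minus the “Well-off Sun Belt” states (Florida, Oklahoma, Texas) and Delaware, Maryland, and Virginia, which join Cluster 4
  • Clusters 2 (west) and 4 (east) split the remaining states roughly along east/west lines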
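One way to make the comparison above concrete is to cross-tabulate each k=4 cluster against the official Census regions. The sketch below transcribes the cluster table from this slide deck using USPS abbreviations; the variable names (`clusters`, `regions`, `region_of`, `crosstab`) are illustrative choices, and the region lists are the Census Bureau's, restricted to the contiguous states.

```python
from collections import Counter

# k=4 cluster assignments, transcribed from the table above (USPS abbreviations).
clusters = {
    1: "AZ CA FL NM OK TX",
    2: "CO ID IA KS MN MT NE NV ND OR SD UT WA WI WY",
    3: "AL AR GA KY LA MS NC SC TN WV",
    4: "CT DE IL IN ME MD MA MI MO NH NJ NY OH PA RI VT VA",
}

# Census Bureau regions, contiguous states only (Alaska and Hawaii omitted).
regions = {
    "Northeast": "CT ME MA NH RI VT NJ NY PA",
    "Midwest": "IL IN MI OH WI IA KS MN MO NE ND SD",
    "South": "DE FL GA MD NC SC VA WV AL KY MS TN AR LA OK TX",
    "West": "AZ CO ID MT NV NM UT WY CA OR WA",
}

# Reverse lookup: state abbreviation -> Census region.
region_of = {s: r for r, states in regions.items() for s in states.split()}

# For each cluster, count how many of its states fall in each Census region.
crosstab = {c: Counter(region_of[s] for s in states.split())
            for c, states in clusters.items()}

for c, counts in crosstab.items():
    print(c, dict(counts))
```

A cluster whose counts fall in a single region (Cluster 3 here, entirely in the South) agrees with the Census grouping, while a cluster spread across several regions marks where the two disagree.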

Following Census Divisions (k=9)

Cluster  States Included
1        Alabama, Georgia, Louisiana, Mississippi, South Carolina
2        Arkansas, Kentucky, North Carolina, Tennessee, West Virginia
3        California, Oregon, Washington
4        Texas
5        Colorado, Montana, Nevada, Wyoming
6        Connecticut, Maine, Massachusetts, New Hampshire, Rhode Island, Vermont
7        Delaware, Illinois, Indiana, Maryland, Michigan, Missouri, New Jersey, New York, Ohio, Pennsylvania, Virginia
8        Arizona, Florida, New Mexico, Oklahoma
9        Idaho, Iowa, Kansas, Minnesota, Nebraska, North Dakota, South Dakota, Utah, Wisconsin

Comparison with Census Divisions

Source: Census.gov

Takeaways:

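  • The South is split along north/south lines (Deep South in Cluster 1, Upper South in Cluster 2) instead of the east/west lines used by the Census divisions
  • Texas gets its own cluster (Cluster 4)
  • New England (Cluster 6) stays perfectly intact
  • Florida is again grouped with Southwestern Sun Belt states (Cluster 8)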
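The "stays perfectly intact" check above can be done mechanically by testing which k=9 clusters contain exactly the same states as a Census division. This is a small sketch with illustrative names (`clusters9`, `divisions`, `exact`); the cluster lists are transcribed from the k=9 table, and the division lists are the Census Bureau's with Alaska and Hawaii left out.

```python
# k=9 cluster assignments, transcribed from the table above (USPS abbreviations).
clusters9 = {
    1: "AL GA LA MS SC",
    2: "AR KY NC TN WV",
    3: "CA OR WA",
    4: "TX",
    5: "CO MT NV WY",
    6: "CT ME MA NH RI VT",
    7: "DE IL IN MD MI MO NJ NY OH PA VA",
    8: "AZ FL NM OK",
    9: "ID IA KS MN NE ND SD UT WI",
}

# Census Bureau divisions, contiguous states only (Alaska and Hawaii omitted).
divisions = {
    "New England": "CT ME MA NH RI VT",
    "Middle Atlantic": "NJ NY PA",
    "East North Central": "IL IN MI OH WI",
    "West North Central": "IA KS MN MO NE ND SD",
    "South Atlantic": "DE FL GA MD NC SC VA WV",
    "East South Central": "AL KY MS TN",
    "West South Central": "AR LA OK TX",
    "Mountain": "AZ CO ID MT NV NM UT WY",
    "Pacific": "CA OR WA",
}

# A cluster matches a division exactly when the two state sets are equal.
exact = {
    c: d
    for c, c_states in clusters9.items()
    for d, d_states in divisions.items()
    if set(c_states.split()) == set(d_states.split())
}
print(exact)
```

With the assignments listed above, this reports New England (Cluster 6) and, once Alaska and Hawaii are set aside, the Pacific division (Cluster 3) as the only exact matches.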

Conclusions

  • The Midwestern states of MI, OH, IL, and IN were consistently grouped with the Northeastern states of NJ, NY, and PA, at both k=4 and k=9. Perhaps these regions are more similar than people realize.
  • Florida seems more similar to its Southwestern Sun Belt companions than to its neighbors in the Southeast.
  • Texas, Florida, and California were very difficult to cluster; perhaps manually assigning them a region would be the best course of action.
  • In my paper, I will examine statistical comparisons between these regions in more detail.