industry diversity (via Shannon entropy) per US county

Ecologists use Shannon entropy to measure species diversity in a given region. Here I apply the same equation to determine industry diversity in each US county. In the map below, darker color indicates greater industrial diversity:


I downloaded the 2009 County Business Patterns data from the US Census Bureau and extracted the business establishment counts for each six-digit NAICS code in each county. Then, for each county, I computed the Shannon entropy of the establishment counts across NAICS codes. Finally, I partitioned the entropy values into nine bins, each assigned a distinct color, to fill in the map above.
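For readers who want to see the computation spelled out, here is a minimal Python sketch of the two steps described above: entropy per county, then binning into nine color classes. It is not the code from the wiki. The county names and counts are made-up toy data, the function names `shannon_entropy` and `color_bin` are my own for illustration, and the choices of natural logarithm and equal-width bins are assumptions (the original may have used a different log base or binning scheme).

```python
import math

def shannon_entropy(counts):
    """Shannon entropy H = -sum(p_i * ln p_i) of a list of non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0  # no establishments: treat diversity as zero
    entropy = 0.0
    for c in counts:
        if c > 0:  # zero counts contribute nothing (0 * ln 0 is taken as 0)
            p = c / total
            entropy -= p * math.log(p)
    return entropy

def color_bin(value, lo, hi, n_bins=9):
    """Map a value in [lo, hi] to an equal-width bin index in 0..n_bins-1."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(max(idx, 0), n_bins - 1)  # clamp so value == hi lands in the top bin

# Hypothetical establishment counts per NAICS code, one list per county.
county_counts = {
    "County A": [10, 10, 10, 10],  # evenly spread across four industries
    "County B": [37, 1, 1, 1],     # dominated by one industry
    "County C": [40],              # a single industry only
}

entropies = {name: shannon_entropy(c) for name, c in county_counts.items()}
lo, hi = min(entropies.values()), max(entropies.values())
bins = {name: color_bin(h, lo, hi) for name, h in entropies.items()}

for name in county_counts:
    print(f"{name}: entropy={entropies[name]:.3f}, color bin={bins[name]}")
```

With this toy data the evenly spread county gets the highest entropy (ln 4) and the darkest bin, while the single-industry county gets entropy zero and the lightest bin.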

Code implementing the above-described computations is available on the Badass Data Science wiki.


I used the Python mapping method detailed here.


5 thoughts on "industry diversity (via Shannon entropy) per US county"

  1. Just ran a regression on the 2009 county unemployment rates vs. the 2009 county industry diversity indices shown above. Found no correlation between the two.

  2. Might be the case, though, that there is a correlation between industry diversity and unemployment variability over time. Probably a negative correlation, because having all your eggs in one basket would make you more prone to boom and bust? But you’d need to repeat this analysis annually for a few years, or else get data from further back in time.

