industry diversity (via Shannon entropy) per US county

Ecologists use Shannon entropy to measure species diversity in a given region. Here I apply the same equation to determine industry diversity in each US county. In the map below, darker color indicates greater industrial diversity:


I downloaded the 2009 County Business Patterns data from the US Census Bureau and extracted the business establishment counts for each six-digit NAICS code in each county. Then, for each county, I computed the Shannon entropy from the establishment counts across its NAICS codes. Finally, I partitioned the entropy values into nine bins, each assigned a distinct color, to fill in the map above.
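For readers who want to see the shape of the calculation, here is a minimal sketch of the steps just described. It is not the code from the wiki. It assumes the usual definition H = -sum(p_i * ln(p_i)), where p_i is the share of a county's establishments in NAICS code i, and it assumes natural logarithms and equal-width bins, neither of which is stated above. The function names and the toy rows are hypothetical and purely illustrative.

```python
import math
from collections import defaultdict


def shannon_entropy(counts):
    """Shannon entropy (in nats) of a list of non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    entropy = 0.0
    for c in counts:
        if c > 0:  # zero counts contribute nothing to the sum
            p = c / total
            entropy -= p * math.log(p)
    return entropy


def county_entropies(rows):
    """Group (county_id, naics_code, establishments) rows by county
    and return {county_id: entropy}."""
    by_county = defaultdict(list)
    for county_id, _naics_code, establishments in rows:
        by_county[county_id].append(establishments)
    return {cid: shannon_entropy(ns) for cid, ns in by_county.items()}


def color_bin(value, lo, hi, n_bins=9):
    """Equal-width bin index in 0..n_bins-1 (binning scheme is an assumption)."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(idx, n_bins - 1)  # the maximum value falls in the top bin


# Toy rows with made-up county IDs, NAICS codes, and counts (not real CBP data).
rows = [
    ("county_a", "111110", 10),
    ("county_a", "236115", 10),
    ("county_a", "445110", 10),
    ("county_a", "722511", 10),
    ("county_b", "211120", 40),
]
entropies = county_entropies(rows)
lo, hi = min(entropies.values()), max(entropies.values())
bins = {cid: color_bin(h, lo, hi) for cid, h in entropies.items()}
```

In this toy input, the county with establishments spread evenly over four codes gets the higher entropy and lands in the top bin, while the single-industry county has entropy zero and lands in the bottom bin.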

Code implementing the above-described computations is available on the Badass Data Science wiki.


I used the Python mapping method detailed here.



5 Responses to industry diversity (via Shannon entropy) per US county

  1. badassdatascience says:

    Just ran a regression on the 2009 county unemployment rates vs. the 2009 county industry diversity indices shown above. Found no correlation between the two.

  2. rossdavidh says:

    It might be the case, though, that there is a correlation between industry diversity and unemployment variability over time. Probably a negative correlation, because having all your eggs in one basket would make you more prone to boom and bust? But you’d need to repeat this analysis annually for a few years, or else get data from further back in time.

  3. Pingback: industrial diversity vs percent change in unemployment rate | badass data science

  4. Pingback: using Hadoop to examine county-level industrial diversity | badass data science

  5. Pingback: test driving Amazon Web Services’ Elastic MapReduce
