# industry diversity (via Shannon entropy) per US county

Ecologists use Shannon entropy to measure species diversity in a given region. Here I apply the same equation to measure the industry diversity of each US county. In the map below, darker colors indicate greater industry diversity:

## Method

I downloaded the 2009 County Business Patterns data from the US Census Bureau and extracted the number of business establishments under each six-digit NAICS code in each county. For each county, I then computed the Shannon entropy of the distribution of establishments across NAICS codes. Finally, I partitioned the entropy values into nine bins, each assigned a distinct color, to fill in the map above.
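As a rough illustration of the per-county computation described above, here is a minimal Python sketch. It is not the code from the wiki. It assumes the standard Shannon entropy H = -Σ pᵢ ln pᵢ, where pᵢ is the fraction of a county's establishments falling under NAICS code i. The natural logarithm, the equal-width binning scheme, the function names, and the sample counts are all assumptions made for the example; the post does not specify them.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (natural log, an assumption) of non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    entropy = 0.0
    for c in counts:
        if c > 0:  # zero counts contribute nothing to the sum
            p = c / total
            entropy -= p * math.log(p)
    return entropy


def equal_width_bin(value, lo, hi, n_bins=9):
    """Map value in [lo, hi] to a bin index 0..n_bins-1 (assumed scheme)."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(idx, n_bins - 1)  # keep the maximum value in the top bin


# Hypothetical establishment counts per NAICS code, for illustration only.
county_counts = {
    "county_a": [10, 10, 10, 10],  # evenly spread across four industries
    "county_b": [37, 1, 1, 1],     # dominated by one industry
    "county_c": [40],              # a single industry
}

entropies = {name: shannon_entropy(c) for name, c in county_counts.items()}
lo, hi = min(entropies.values()), max(entropies.values())
bins = {name: equal_width_bin(h, lo, hi) for name, h in entropies.items()}
```

In this toy data the evenly spread county gets the highest entropy and lands in the top (darkest) bin, while the single-industry county has zero entropy and lands in the lowest bin.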

Code implementing the above-described computations is available on the Badass Data Science wiki.

## Acknowledgments

I used the Python mapping method detailed here.

## 5 thoughts on “industry diversity (via Shannon entropy) per US county”

(February 22, 2012 - 12:37 am)

Just ran a regression on the 2009 county unemployment rates vs. the 2009 county industry diversity indices shown above. Found no correlation between the two.

#### rossdavidh

(June 14, 2012 - 7:59 pm)

Might be the case, though, that there is a correlation between industry diversity and unemployment variability over time. Probably a negative correlation, because having all your eggs in one basket would make you more prone to boom and bust? But you’d need to repeat this analysis annually for a few years, or else get data from further back in time.

#### industrial diversity vs percent change in unemployment rate | badass data science

(July 13, 2012 - 7:04 pm)

[…] Several months ago I reported using an ecological species diversity equation to calculate a US county’s “industrial diversity”, producing the map shown below illustrating the degree of diversity of industries operating in each county. The computation method is detailed here. […]

#### using Hadoop to examine county-level industrial diversity | badass data science

(December 5, 2012 - 2:51 pm)

[…] a previous post, I computed each U.S. county’s industrial diversity from the 2009 County Business Patterns […]

#### test driving Amazon Web Services’ Elastic MapReduce

(January 7, 2014 - 4:19 am)

[…] diversity” index from the U.S. Census Bureau’s County Business Patterns (CBP) data (see http://badassdatascience.com/2012/02/21/industry-diversity/). The CBP dataset provides, for each NAICS industry code, the number of business establishments […]