Colbert Report devotees recently witnessed a true miracle—Stephen Colbert spoke data science:
“Due to the massive volume [of suggestions received], we … used computers to crunch the data.”
Mr. Colbert had just received approximately 53,000 e-mailed suggestions from his minions proposing social issues for the newly formed Colbert Super PAC to address. Financial contributions to the Super PAC accompanied some of these recommendations. Colbert, though committed to personally reading every one of these suggestions, quickly realized he needed to scale up his ability to interpret the material. He therefore brought forth the powerful Persuadulux 6000 computer to assist.
The Persuadulux 6000 gained its computational prowess when its designers realized they could skip steps one through 5,999 in the product development cycle. Loaded with SPSS software, the Persuadulux enables the world’s most formidable social science. Mr. Colbert likely spent at least half of the Super PAC’s funds buying the thing.
Colbert then directed his top acolyte to input the 53,000 suggestions into the Persuadulux. After uttering the correct sequence of ancient SPSS incantations, the acolyte received a word cloud summarizing the primary suggestion themes:
Colbert then skillfully cherry-picked the signal from the noise: Marijuana. Great rejoicing erupted across the land.
But Stephen Colbert, in his wisdom, suspected improper data normalization. After days of meditation on the moral supremacy of pay-to-play democracy, he instructed the acolyte to weight each suggestion’s content by the suggestion-maker’s dollar contribution to the Super PAC.
The acolyte again recited the ancient SPSS canon and received new revelation:
With proper normalization, viewers now see how the larger inconveniences of government, people, and education dwarf marijuana in importance.
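For acolytes without a Persuadulux 6000, the two word clouds can be roughly approximated in a few lines of Python. This is only a minimal sketch, not the show's actual method: the `word_weights` helper and the sample suggestions are invented for illustration, and it assumes each suggestion arrives as a (text, dollars) pair. In the unweighted tally every word occurrence counts once; in the weighted tally every occurrence counts for the sender's dollar contribution.

```python
import re
from collections import Counter


def word_weights(suggestions, weighted=False):
    """Total up words across (text, dollars) pairs.

    Unweighted: each word occurrence adds 1.
    Weighted: each word occurrence adds the sender's contribution in dollars.
    """
    totals = Counter()
    for text, dollars in suggestions:
        weight = dollars if weighted else 1
        for word in re.findall(r"[a-z']+", text.lower()):
            totals[word] += weight
    return totals


# Made-up sample data (hypothetical, for illustration only).
suggestions = [
    ("legalize marijuana", 0),
    ("marijuana", 0),
    ("marijuana now", 5),
    ("fix education", 100),
    ("government and education", 250),
]

plain = word_weights(suggestions)
paid = word_weights(suggestions, weighted=True)

print(plain.most_common(1))  # top word when every suggestion counts equally
print(paid.most_common(1))   # top word when dollars do the talking
```

With this toy data, "marijuana" leads the unweighted tally while "education" leads the dollar-weighted one. Note that suggestions with no contribution vanish entirely under weighting, which is rather the point.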
Let Colbert’s unyielding commitment to correct data normalization inspire scientists everywhere!








