Excel mangles NCBI gene symbols

Using Microsoft Excel for bioinformatics work sucks, but sometimes a spreadsheet is the best format for communicating results to other scientists.

The program’s default behavior mangles some NCBI gene symbols when you import them from a text file. Here is how to deal with it. Suppose you have the following list of gene symbols,


and you import them into Excel. The middle three gene symbols get treated as dates:


This definitely ruins your day.
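Before importing, it can help to know which symbols are at risk. The following is a minimal Python sketch, not an exact model of Excel's date parser: it flags symbols made of a month name or abbreviation followed by one or two digits. The month list, the function names `looks_like_date` and `flag_date_like`, and the sample symbols are all assumptions made for illustration (the samples are not the list used in this post), and the heuristic is not exhaustive.

```python
import re

# Month names and abbreviations assumed to trigger Excel's date conversion.
# This list is a heuristic assumption, not Excel's documented behavior.
MONTH_PREFIXES = (
    "JAN", "FEB", "MAR", "MARCH", "APR", "APRIL", "MAY", "JUN", "JUNE",
    "JUL", "JULY", "AUG", "SEP", "SEPT", "OCT", "NOV", "DEC",
)

# A month prefix, an optional separator, then a one- or two-digit number.
DATE_LIKE = re.compile(
    r"(?:" + "|".join(MONTH_PREFIXES) + r")[-/ ]?\d{1,2}",
    re.IGNORECASE,
)


def looks_like_date(symbol):
    """Return True if the whole symbol matches the date-like pattern."""
    return DATE_LIKE.fullmatch(symbol.strip()) is not None


def flag_date_like(symbols):
    """Return the symbols that the heuristic considers at risk, in order."""
    return [s for s in symbols if looks_like_date(s)]


if __name__ == "__main__":
    # Illustrative symbols only.
    sample = ["TP53", "SEPT1", "MARCH1", "DEC1", "BRCA1"]
    for symbol in flag_date_like(sample):
        print(symbol)
```

If the script flags anything, that column needs to be imported as text rather than with Excel's default "General" format.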

The trick to avoiding this problem is to go to the third step of the Text Import Wizard, select the column that holds the gene symbols, and set the Column data format to “Text”:


Then the gene symbols import correctly:


