picking stocks by graph database (part one)

Historical stock price data comes readily available at daily resolution. So we calculated the Granger causality for each pair of stocks we hold data for, at one and two day lags (testing the question “does daily percent change in volume for stock X Granger cause daily percent change in adjusted close price for stock Y?”). […]

Bayesian method for filtering out mRNA turnover rate bias from siRNA knockdown measurements

Abstract siRNA performance prediction calculations for a given siRNA may be divided into two broad categories: functions of the siRNA’s sequence, hereafter referred to as “intrinsic” properties of the siRNA, and functions of the target mRNA, hereafter referred to as “extrinsic” properties of the siRNA. When training a statistical or machine learning model to select […]

how I make a living: what is bioinformatics? (part #1)

I’m constantly asked to explain what I do for a living. Here is an attempt to do so in laypersons’ terms. I’ll assume my readers are non-scientists and non-engineers, but that they’ve taken a high school biology class. “Bioinformatics” is the application of mathematics and computer science to biological data, particularly molecular biology data. By […]

DIY Twitter analytics (part 3: hashtag network)

I’ve been mathematically analyzing my Twitter feed to determine how best to position my tweets for maximum impact, and have been documenting the work on this blog. While I’ve not come to any brilliant conclusions yet, I’ve made progress. My first post on the subject described clustering my followers by their hashtag use to see […]

the science of gender identity (part 4: summary)

To prepare for a book I intend to write on the science of gender identity, I drafted the following three blog posts to collect my thoughts. They are highly technical; I need to recast the content for the layperson. I also assembled some of my own biological data to analyze. The first blog post, http://badassdatascience.com/2015/06/06/sci-gender-identity-01/, […]

DIY Twitter analytics (part 2: correlations)

I’ve been working with the Twitter API to develop my own Twitter analytics tool chain, and have been documenting the results on this blog. My last post on the subject described clustering my followers by their hashtag use to see whose tweets are most like mine. My goal of this project is to figure out best […]

DIY Twitter analytics (part 1: clustering related users)

I’ve started working with the Twitter API to develop my own Twitter analytics tool chain. My goals are to figure out who the influencers in my subjects are, figure out how best to position my tweets, etc. I could certainly pay for this service, but then I wouldn’t learn any new technical skills in the […]

the science of gender identity (part 3: psychology)

This is the third post in a multi-part series surveying the current science of gender identity, particularly with regard to the transgendered population. My first post on the subject covered proposed genetic associations and corresponding research. The second post on the matter discussed observed differences in brain anatomy between transgendered and cisgendered individuals. Here I […]

the science of gender identity (part 2: brain anatomy)

This is the second post in a mult-part series surveying the current science of gender identity, particularly with regard to the transgendered population. In my previous post I discussed the proposed genetic associations and corresponding research. A future post, if I can find sufficient data, will address neuropsychology research related to the transgender experience. Here […]

the science of gender identity (part 1: genetics)

This is the first in a multi-part series surveying the current science of gender identity, particularly with regard to the transgendered population. I intend to discuss the genetic, brain anatomic, and neuropsychological findings of recent studies on the matter. As always, I will incorporate my own statistical analysis of raw study data wherever possible. Here […]