toward a gene panel for psychiatric violence

I recently developed a method for specifying a comprehensive gene list for investigating genes related to psychiatric violence, which I describe below. First, though, here’s a cool picture from the analysis. Method: I started by extracting a list of diseases involving violence from [1], removing epilepsy, dementia, mental retardation (is there a better word for […]

rapidly identifying potential CRISPR/Cas9 off-target sites (part one)

Before we can score segments in the genome having a small number of mismatches to a CRISPR for their off-target risk, we must first find these segments. Searching for every possible mismatch permutation proves computationally expensive, so we apply the following heuristic: we search only for mismatches in the top positions relevant to CRISPR efficiency. […]
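As a rough illustration of why restricting the search helps, here is a minimal Python sketch that enumerates mismatch variants of a guide sequence at a chosen subset of positions only. The guide sequence, the position list, and the function name `restricted_mismatch_variants` are all hypothetical; the excerpt does not say which positions the original post used, and this is not the post's implementation.

```python
from itertools import combinations, product

BASES = "ACGT"

def restricted_mismatch_variants(guide, positions, max_mismatches):
    """Yield every sequence that differs from `guide` at between 1 and
    `max_mismatches` of the given (0-based) positions and nowhere else."""
    for k in range(1, max_mismatches + 1):
        for chosen in combinations(positions, k):
            # For each chosen position, the three bases that differ from the guide.
            alternatives = [[b for b in BASES if b != guide[i]] for i in chosen]
            for substitution in product(*alternatives):
                variant = list(guide)
                for i, base in zip(chosen, substitution):
                    variant[i] = base
                yield "".join(variant)

# Hypothetical 20-nt guide and a hypothetical set of five positions to vary.
guide = "GACGTTACCGGATTACAGCT"
positions = [15, 16, 17, 18, 19]
variants = list(restricted_mismatch_variants(guide, positions, max_mismatches=2))
print(len(variants))
```

With five candidate positions and up to two mismatches there are 15 + 90 = 105 variants to look up, versus 60 + 1710 = 1770 if all 20 positions were allowed to vary.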

Bayesian method for filtering out mRNA turnover rate bias from siRNA knockdown measurements

Abstract: siRNA performance prediction calculations for a given siRNA may be divided into two broad categories: functions of the siRNA’s sequence, hereafter referred to as “intrinsic” properties of the siRNA, and functions of the target mRNA, hereafter referred to as “extrinsic” properties of the siRNA. When training a statistical or machine learning model to select […]

RNAfold’s and RNAcofold’s predicted dG correlates with sequence length

This seems rather obvious, but I decided to double check before building a machine learning model based on RNAfold’s and RNAcofold’s predictions involving sequences of varying length. Method: I generated 30,000 random RNA sequences of random length between 15 and 30 bases. I ran RNAfold on this list, and RNAcofold on this same list where […]
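The sequence-generation step described here can be sketched in a few lines of Python. This is only an assumption about how such a list might be produced: the function names, the fixed seed, the uniform base composition, and the FASTA record naming are mine, not the post's.

```python
import random

def random_rna_sequences(n, min_len=15, max_len=30, seed=0):
    """Return n random RNA sequences, each with a length drawn uniformly
    from min_len..max_len (inclusive) and bases drawn uniformly from ACGU."""
    rng = random.Random(seed)  # fixed seed so the list is reproducible
    sequences = []
    for _ in range(n):
        length = rng.randint(min_len, max_len)
        sequences.append("".join(rng.choice("ACGU") for _ in range(length)))
    return sequences

def to_fasta(sequences):
    """Format the sequences as FASTA text with hypothetical record names."""
    return "".join(f">seq{i}\n{seq}\n" for i, seq in enumerate(sequences, start=1))

sequences = random_rna_sequences(30000)
fasta_text = to_fasta(sequences)
```

The resulting FASTA text could then be written to a file and supplied as input to the folding tools.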

how I make a living: what is bioinformatics? (part #1)

I’m constantly asked to explain what I do for a living. Here is an attempt to do so in laypersons’ terms. I’ll assume my readers are non-scientists and non-engineers, but that they’ve taken a high school biology class. “Bioinformatics” is the application of mathematics and computer science to biological data, particularly molecular biology data. By […]

graph database for heterogeneous biological data

To assist with a project I’m working on, I recently implemented a substantial portion of DisGeNET as a graph database. Furthermore, I added MeSH, OMIM, Entrez, and GO into the database to facilitate linking of data between these sources. Here I briefly describe these data sources, describe graph databases, and then show how use of […]

the science of gender identity (part 1: genetics)

This is the first in a multi-part series surveying the current science of gender identity, particularly with regard to the transgender population. I intend to discuss the genetic, brain anatomic, and neuropsychological findings of recent studies on the matter. As always, I will incorporate my own statistical analysis of raw study data wherever possible. Here […]

fast genomic coordinate comparison using PostgreSQL’s geometric operators

PostgreSQL provides operators for comparing geometric data types, for example for computing whether two boxes overlap or whether one box contains another. Such operators are quick compared to similar calculations implemented using normal comparison operators, which I’ll demonstrate below. Here I show use of such geometric data types and operators for determining whether one segment […]
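For reference, the two predicates in question (overlap and containment of genomic segments) look like this when written with ordinary comparison operators. This is a minimal Python sketch of the baseline logic only, assuming closed, 1-based intervals on a named chromosome; the `Segment` type and function names are hypothetical and nothing here reproduces the post's SQL.

```python
from typing import NamedTuple

class Segment(NamedTuple):
    chrom: str
    start: int  # inclusive
    end: int    # inclusive

def overlaps(a: Segment, b: Segment) -> bool:
    """True if the two segments share at least one position."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end

def contains(outer: Segment, inner: Segment) -> bool:
    """True if `inner` lies entirely within `outer`."""
    return (outer.chrom == inner.chrom
            and outer.start <= inner.start
            and inner.end <= outer.end)

gene = Segment("chr1", 1000, 5000)
exon = Segment("chr1", 1200, 1500)
read = Segment("chr1", 4900, 5100)

print(contains(gene, exon), overlaps(gene, read), contains(gene, read))
```

In PostgreSQL, the corresponding geometric operators on the `box` type are `&&` (overlaps) and `@>` (contains).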

gene annotation database with MongoDB

After reading Datanami’s recent post “9 Must-Have Skills to Land Top Big Data Jobs in 2015” [1], I decided to round out my NoSQL knowledge by learning MongoDB. I have previously reported NoSQL work with Neo4j on this blog, where I discussed building a gene annotation graph database [2]. Here I build a similar gene […]

iBioSim: a CAD package for genetic circuits

iBioSim is a CAD package for the design, analysis, and simulation of genetic circuits. It can also be used for modeling metabolic networks, pathways, and other biological/chemical processes [1]. The tool provides a graphical user interface (GUI) for specifying circuit design and parameters, and a GUI for running simulations on the resulting models and viewing […]