Abstract: siRNA performance prediction calculations for a given siRNA may be divided into two broad categories: functions of the siRNA’s sequence, hereafter referred to as “intrinsic” properties of the siRNA, and functions of the target mRNA, hereafter referred to as “extrinsic” properties of the siRNA. When training a statistical or machine learning model to select […]

## Bayesian network modeling stock price change

Taking a cue from the systems biology folks, I decided to model stock price change interactions using a dynamic Bayesian network. For this analysis I focused on the members of the Dow Jones Industrial Average (DJIA) that are listed on the New York Stock Exchange (NYSE). Bayesian Networks: A Bayesian network is an acyclic directed […]

## overfitting in statistics and machine learning (part one)

Overfitting is a common risk when designing statistical and machine-learning models. Here I give a brief demonstration of overfitting in action, using simple regression models. A later post will more rigorously address how to quantify and avoid overfitting. We start by sampling data from the process using the R code: Then we produce a linear […]
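The excerpt above cuts off before the post's own R code, so what follows is only a rough Python sketch of the same idea, not the author's demonstration. Everything specific here is an assumption made for illustration: the data-generating process (a straight line plus Gaussian noise), the sample sizes, and the choice of an interpolating polynomial as the deliberately overfit model. It compares a least-squares line against a polynomial that passes through every training point, averaging the errors over repeated simulated data sets.

```python
import random


def true_process(x, rng, noise_sd=0.5):
    # Hypothetical process (assumed, not from the post): line plus Gaussian noise.
    return 1.0 + 2.0 * x + rng.gauss(0.0, noise_sd)


def fit_line(xs, ys):
    # Ordinary least-squares fit of y = intercept + slope * x.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_interpolant(xs, ys):
    # Lagrange polynomial through every training point: the extreme overfit.
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            weight = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    weight *= (x - xj) / (xi - xj)
            total += yi * weight
        return total
    return predict


def mse(model, xs, ys):
    # Mean squared prediction error of a fitted model on (xs, ys).
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def run_experiment(n_trials=50, n_train=10, n_test=200, seed=1):
    # Average train and test error of both models over repeated data sets.
    rng = random.Random(seed)
    train_xs = [i / (n_train - 1) for i in range(n_train)]
    totals = {"line_train": 0.0, "line_test": 0.0,
              "interp_train": 0.0, "interp_test": 0.0}
    for _ in range(n_trials):
        train_ys = [true_process(x, rng) for x in train_xs]
        test_xs = [rng.random() for _ in range(n_test)]
        test_ys = [true_process(x, rng) for x in test_xs]
        for name, fit in (("line", fit_line), ("interp", fit_interpolant)):
            model = fit(train_xs, train_ys)
            totals[name + "_train"] += mse(model, train_xs, train_ys)
            totals[name + "_test"] += mse(model, test_xs, test_ys)
    return {key: value / n_trials for key, value in totals.items()}


if __name__ == "__main__":
    for key, value in run_experiment().items():
        print(f"{key}: {value:.3f}")
```

The interpolating polynomial reproduces the training data exactly, so its training error is zero, yet it chases the noise and does worse than the line on fresh samples from the same process. That gap between training and test error is the signature of overfitting.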

## simulated ROC curves

How receiver operating characteristic (ROC) curves vary with simulated data having stepped degrees of separation. Computational notes: these were created in R using the “ROCR” package. Be sure to say “ROCR” really fast! The simulated data are normally distributed within each group.
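The original figures were made in R with ROCR; as a rough stand-in, here is a standard-library Python sketch of the same kind of simulation. The group means, unit variances, sample sizes, and function names are all assumptions chosen for illustration, not the settings used for the post. Two normally distributed groups are drawn, the distance between their means is stepped, and the ROC curve and its area are computed by sweeping a threshold over the scores.

```python
import random


def simulate_scores(separation, n_per_group=500, seed=0):
    # Two normal groups with unit variance; means differ by `separation`.
    # (Assumed setup: the post does not state its means or sample sizes.)
    rng = random.Random(seed)
    negatives = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    positives = [rng.gauss(separation, 1.0) for _ in range(n_per_group)]
    return negatives, positives


def roc_points(negatives, positives):
    # Sweep the decision threshold from the highest score to the lowest and
    # record (false positive rate, true positive rate) after each score.
    # Tied scores are not grouped; ties are negligible for continuous draws.
    labeled = [(score, 0) for score in negatives]
    labeled += [(score, 1) for score in positives]
    labeled.sort(key=lambda pair: pair[0], reverse=True)
    true_pos = 0
    false_pos = 0
    points = [(0.0, 0.0)]
    for _, label in labeled:
        if label == 1:
            true_pos += 1
        else:
            false_pos += 1
        points.append((false_pos / len(negatives), true_pos / len(positives)))
    return points


def auc(points):
    # Area under the ROC curve by the trapezoidal rule.
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    for separation in (0.0, 1.0, 2.0, 3.0):
        negatives, positives = simulate_scores(separation)
        area = auc(roc_points(negatives, positives))
        print(f"separation={separation}: AUC={area:.3f}")
```

With no separation the curve should hug the diagonal (area near 0.5), and as the group means move apart the curve bows toward the top-left corner and the area approaches 1.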