In clinical trials and reliability studies, researchers often measure the time until an event occurs for each patient or object in the study. That event may be patient death in the case of clinical trials for a new cancer drug, or bridge failure in the case of a reliability study of bridges.
Sometimes, however, the event being measured cannot be observed for every subject in the study. For example, in a study measuring time to patient death, a patient may outlive the study period, or may quit the study before it ends. In these cases we know only that the subject’s true time until the event is greater than some value, but we don’t know by how much.
We call such data points “censored” data points.
Consider the following plot showing time to death in a 20-day clinical trial of a new drug:
Here we see that Bart outlived the study period, and therefore has a censored data point. Lisa quit the study after seven days, so her time before death is also a censored data point.
To be more technical, these data points are called “right censored”, to indicate that the censoring takes place on the right side of the graph: the true value exceeds a certain value, but it is unknown by how much. In contrast, “left censored” data occurs when the event had already happened before the subject was first observed, so the true value lies below a certain value, but it is unknown by how much.
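One common way to record right-censored data is as a pair: the time we actually observed, plus a flag saying whether the event was seen. The sketch below is only an illustration of that idea using Bart and Lisa from the example. The `Observation` class, the `observe` helper, and "Subject A" (a made-up subject who dies on day 12) are my own assumptions and do not come from the plot.

```python
from dataclasses import dataclass
from typing import Optional

STUDY_LENGTH_DAYS = 20  # the 20-day trial from the example


@dataclass
class Observation:
    subject: str
    days: int             # time we actually recorded
    event_observed: bool  # True: death seen; False: right censored


def observe(subject: str,
            death_day: Optional[int] = None,
            dropout_day: Optional[int] = None) -> Observation:
    """Turn what we saw for one subject into a (time, indicator) record.

    death_day is the day death was seen (None if it was never seen);
    dropout_day is the day the subject quit (None if they stayed).
    """
    # Follow-up ends at dropout or at the end of the study, whichever is first.
    if dropout_day is None:
        follow_up_end = STUDY_LENGTH_DAYS
    else:
        follow_up_end = min(dropout_day, STUDY_LENGTH_DAYS)

    if death_day is not None and death_day <= follow_up_end:
        return Observation(subject, death_day, True)
    # Otherwise we only know the true time exceeds follow_up_end.
    return Observation(subject, follow_up_end, False)


observations = [
    observe("Bart"),                     # outlived the study
    observe("Lisa", dropout_day=7),      # quit after seven days
    observe("Subject A", death_day=12),  # hypothetical, not from the plot
]

for obs in observations:
    status = "event observed" if obs.event_observed else "right censored"
    print(f"{obs.subject}: {obs.days} days ({status})")
```

The recorded time for a censored subject is a lower bound on the true time, not the true time itself, which is why the flag has to travel with the number.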
Censored data must be treated carefully in analyses. In future posts I’ll illustrate some of the ways this is done.