Danger in Machine Learning: Human Error

by Stephen Nawara, PhD.

Recently a landmark study[1] was published that claimed to have identified a danger of "black box" machine learning methods. The main message of the article is that there is a trade-off between intelligibility and accuracy.

They write that, while models like neural networks are often more accurate, it is difficult to understand what they are doing. Instead, they advocate that researchers use simpler models (such as logistic regression) when it is more important that the end user be able to understand how the model works.

This message seems fair enough. It really does seem to be the case that more complicated models often perform better on complicated problems. Also, the human mind really does seem to have limits to the degree of complexity it can comprehend. Finally, there is value in being able to sanity check a model. However, when the authors go on to provide their motivation for this realization, we find some problems.


The project in question was actually run in the 1990s by an organization called Cost-Effective HealthCare (CEHC). The overall goal was to determine how applicable machine learning methods could be to various problems faced by the healthcare industry.

Specifically, the researchers wanted to predict which pneumonia patients had a high probability of death (POD) upon presentation at a hospital. If these patients could be accurately split into "high-risk" and "low-risk" groups, it was thought that efficiency gains could be made by only admitting the high-risk patients for intensive monitoring and treatment.


To achieve this, they collected some data and plugged it into various algorithms (e.g., neural networks and logistic regression) for comparison. A neural network model had the best predictive skill in the end, but was rejected for other reasons.

They decided it was too risky to place the health and lives of patients in the hands of a model that was so difficult to understand. Instead, they decided to use a much simpler logistic regression model.

The reason was that a few algorithms (including the neural network) had learned that patients with asthma actually had a lower POD. This was considered an unacceptable conclusion since it went against all medical thought:

On one of the pneumonia datasets, the rule-based system learned the rule "HasAsthama(x) => LowerRisk(x)", i.e., that patients with pneumonia who have a history of asthma have lower risk of dying from pneumonia than the general population. Needless to say, this rule is counterintuitive.

What happened?

A few possibilities immediately come to mind:

  • Was there a bug in the neural network code?
  • Was the data so noisy that the network used some spurious correlation between asthma and mortality?
  • Has the medical dogma about the relationship between pneumonia, asthma, and mortality been wrong this whole time?

It turns out to be none of the above. The code was fine, the data was clean, and the medical dogma remains unchallenged. Instead, the network had picked up that asthmatic patients were monitored/treated more aggressively upon admission to the hospital.

Specifically, the pneumonia patients with asthma were admitted directly to the Intensive Care Unit (ICU). It turned out the care they received in the ICU actually improved their prognosis relative to asthma-free patients, who received only the usual level of attention. The models seemed to have learned something wrong:

...models trained on the data incorrectly learn that asthma lowers risk, when in fact asthmatics have much higher risk (if not hospitalized).
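To make the mechanism concrete, here is a minimal simulation sketch. Every number in it (the share of asthmatics, the baseline risks, the size of the ICU benefit) is made up for illustration and is not taken from the study; the function name `observed_mortality` is likewise hypothetical. It assumes asthmatics have a higher risk under usual care but are always sent to the ICU, where their risk is cut sharply.

```python
import random

# All values below are illustrative assumptions, not figures from the study.
P_ASTHMA = 0.15                              # share of pneumonia patients with asthma
USUAL_CARE_RISK = {False: 0.10, True: 0.20}  # risk of death under usual care
ICU_RISK_MULTIPLIER = 0.25                   # assumed benefit of ICU care


def observed_mortality(n=100_000, seed=0):
    """Simulate n hospitalized patients and return the observed death rate
    for each asthma status, as a dict {has_asthma: rate}."""
    rng = random.Random(seed)
    deaths = {False: 0, True: 0}
    patients = {False: 0, True: 0}
    for _ in range(n):
        has_asthma = rng.random() < P_ASTHMA
        in_icu = has_asthma  # hospital policy: asthmatics go straight to the ICU
        risk = USUAL_CARE_RISK[has_asthma]
        if in_icu:
            risk *= ICU_RISK_MULTIPLIER
        died = rng.random() < risk
        patients[has_asthma] += 1
        deaths[has_asthma] += int(died)
    return {status: deaths[status] / patients[status] for status in patients}


if __name__ == "__main__":
    rates = observed_mortality()
    print("observed death rate, asthma:   ", round(rates[True], 3))
    print("observed death rate, no asthma:", round(rates[False], 3))
```

Under these assumptions the recorded death rate for asthmatics comes out lower than for everyone else, even though their risk under usual care was set to be twice as high. Any model fit to the recorded outcomes alone would see asthma as protective, because the treatment never appears in the data as a separate cause.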

Human Error


However, let us reiterate the goal of the study (emphasis added):

One of the goals of the study was to perform a clinical trial to determine if machine learning could be used to predict risk prior to hospitalization so that a more informed decision about hospitalization could be made.

The researchers wanted to predict what would happen from information available prior to admission, but used data that contained information about what happened subsequent to admission. This is a great example of data leakage, specifically predicting the past with the future.

There is another way to describe this error. The researchers wanted to know the POD if the patient was not hospitalized, but collected data from patients who were hospitalized. The latter type of data is obviously more easily available, but this does not justify their attempt to use it. The data used by the researchers simply could not answer the actual question they wanted to ask.

Cascading Failure

This is human error already, but the researchers go on to make a second mistake. Instead of recognizing the true problem, they attribute the issue to a problem with neural networks. They even go on to consider several ways of manually adjusting the data or the network to remove the "counterintuitive" result:

  • leaving asthmatics out of the study
  • removing the asthma feature from the training data
  • modifying the outcome of the asthmatics to reflect the care

Eventually, they decided to abandon the neural network approach altogether because such models may learn any number of "counterintuitive" relationships like this. Obviously, this is a solution to the wrong problem. The correct solution would be either to:

  • collect data on patient mortality from those who did not get admitted to the hospital
  • clean the features so that they contain information only available prior to admission
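The second option can be sketched in a few lines. The feature names and the `recorded` tag below are hypothetical, chosen only to illustrate the idea of tagging each feature with the point in time at which it becomes known and then keeping only what is available before the admission decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    recorded: str  # "pre_admission" or "post_admission"


# Hypothetical feature catalog; none of these names come from the study.
CATALOG = [
    Feature("age", "pre_admission"),
    Feature("has_asthma", "pre_admission"),
    Feature("respiration_rate_at_presentation", "pre_admission"),
    Feature("admitted_to_icu", "post_admission"),
    Feature("days_in_hospital", "post_admission"),
]


def pre_admission_features(catalog):
    """Return the names of features known before the admission decision."""
    return [f.name for f in catalog if f.recorded == "pre_admission"]


def select_columns(record, allowed):
    """Drop every field of a patient record that is not in `allowed`."""
    return {key: value for key, value in record.items() if key in allowed}


if __name__ == "__main__":
    allowed = pre_admission_features(CATALOG)
    record = {"age": 70, "has_asthma": True, "admitted_to_icu": True}
    print(select_columns(record, allowed))
```

A filter like this only cleans the inputs. The outcome being predicted is still recorded after the patient was treated, so it works alongside the first option rather than replacing it.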


In an earlier post it was shown to be quite easy for the data scientist to introduce undesirable information into a machine learning model. Now we have also seen that a cascading series of human errors can result if this problem is not recognized for what it is. Specifically, the researcher may have a bias to blame the model rather than themselves for any strange results, and attempt to fix the wrong problem. Often, this is not a bad heuristic, since software does contain bugs. In this case, however, the error occurred before the model ever saw the data. The humans collected the wrong data for their question.

[1] Caruana, et al. Intelligible Models for HealthCare: Predicting Pneumonia Risk and Hospital 30-day Readmission. 2015. KDD '15 Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. http://dx.doi.org/10.1145/2783258.2788613