In the previous post we looked at nuclear cross sections as the probability of a nuclear interaction and noted that the cross section is a statistic derived through experimentation. If the statistic were measured incorrectly, the nuclear reactor would fizzle and a mess would result. But what if the measurement dealt with people and was “sort of” wrong? What if the test was usually right, but sometimes gave a false result? And what if the test was applied to an entire population or a representative sample to draw an inference?
Thomas Bayes was a Presbyterian minister born in England and raised in a nonconformist family. In 1763, two years after his death, his essay about probability was read to the Royal Society and eventually led to the theorem that bears his name. Bayes’ Theorem is very important to statistics because it lets us account for the accuracy of the measurement technique or device when we interpret a result. For simplicity, let us consider a sample test of a small population of people and the likelihood they may get the sniffles on National Cry Day (Super Bowl Sunday?).
We know (an absolute fact) that exactly ten percent of the population will cry and that the population of 100 is an exact representation of the absolute cry statistic. Of the 100 people, 10 will cry and 90 will not cry on National Cry Day. We have exactly 10 packages of tissue that must be handed out to people who will cry. How do we identify those to whom we supply the tissue?
Lancelot McSnoyd (the leprechaun) has invented a sensing machine for testing the propensity for crying on National Cry Day. The subject places his or her hand on the sensing panel and the machine signals cry or no cry. The machine has a 90 percent accuracy rating: if a person is going to cry (has the crying gene) on National Cry Day, it correctly identifies him or her 90 percent of the time. In our population, 9 of the 10 criers will be successfully identified and get the tissue they need. We get a 90 percent success rating and an “A” for our efforts. Or do we?
McSnoyd’s machine is used (must be used) to test all 100 members of the population. If we only tested the criers, why would we be testing? Just hand out the tissue. The machine is also wrong 10 percent of the time for people who will not cry, so when we test everyone, 10 percent of the non-criers will test as being criers. That is 10 percent of 90, or 9 more positive results. The testing will show that 18 people will cry on National Cry Day.
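The head count above can be checked with a few lines of Python. This is only a sketch of the arithmetic in the example; the variable names are made up for illustration, and the 10 percent false-alarm rate for non-criers is taken from the example itself rather than from any real device.

```python
# Numbers from the National Cry Day example (all assumed exact, as in the post).
population = 100
criers = 10                      # 10 percent of the population will cry
non_criers = population - criers

# Integer arithmetic keeps the head counts exact.
true_positives = criers * 90 // 100         # criers the machine catches
false_negatives = criers - true_positives   # criers the machine misses
false_positives = non_criers * 10 // 100    # non-criers flagged as criers
true_negatives = non_criers - false_positives

total_positives = true_positives + false_positives

print("Flagged as criers:", total_positives)
print("  really criers:  ", true_positives)
print("  not criers:     ", false_positives)
print("Criers missed:    ", false_negatives)
```

Changing the two percentages shows how quickly the false alarms pile up when the non-criers far outnumber the criers.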
Of the 18 people who tested positive, 9 will cry and need tissues, and 9 will not need anything. In other words, a positive result from a “90 percent accurate” machine is right only half the time. And one unidentified person will not get the necessary tissue. How important is it that the one unidentified crier does not get the help needed? And of the 18 identified as criers, how do you hand out the 10 packages of tissue? Obviously, this is a silly and simplistic example. But what if it is a contagious disease, automobiles with potential brake failure, or an entrance or final exam for public safety control? Bayes’ Theorem deals with the concept of false positives and false negatives in testing results. For the most part, it has little meaning in our everyday lives, but it could be significant if it affects our health.
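To put a number on the crying example, Bayes' Theorem gives the chance that a person who tests positive really is a crier. The sketch below is one way to write it in Python; the function name `posterior` and its parameter names are illustrative choices, and the inputs are the example's own figures (10 percent criers, 90 percent detection, 10 percent false alarms).

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Bayes' Theorem: probability of the condition given a positive test.

    prior               -- share of the population that has the condition
    sensitivity         -- chance the test flags someone who has it
    false_positive_rate -- chance the test flags someone who does not
    """
    true_pos = prior * sensitivity                  # has it and is flagged
    false_pos = (1 - prior) * false_positive_rate   # does not have it but is flagged
    return true_pos / (true_pos + false_pos)


# Values from the National Cry Day example.
p_cry_given_positive = posterior(prior=0.10, sensitivity=0.90,
                                 false_positive_rate=0.10)
print(f"Chance a flagged person really cries: {p_cry_given_positive:.0%}")
```

Lowering `prior` to 0.01 with the same machine drops the answer well below one half, which is why a rare condition produces mostly false alarms even with a good test.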
Your annual physical date is rapidly approaching and you dutifully report to the blood testing lab to have a sample extracted from your body. The sample is labelled and sent to the analysis lab for evaluation, the results written down, and the report forwarded to your doctor. You feel great, just finished running the Boston Marathon, and have a “perfect” body mass index. You expect a glowing report from your doctor and enter the examining room. Then you find out the results of your blood test.
Are the results right or wrong? How do you know? How does your doctor know? Because your doctor knows your athletic prowess, her or his judgment is probably pretty good, but you may (probably will) need more tests. This means more costs, but you want the tests to be right.
The simplistic Bayes’ Theorem example highlighted two types of error, false positives and false negatives, in a “scientific” setting. Blood tests are highly regulated and clinical laboratories operate under stringent quality regulations. Yet errors do occur. Some may be operational and some statistical. But they do occur. According to the Scientific American Guest Blog of May 9, 2016, the rate of outpatient diagnostic errors is about five percent per year, or around 12 million adults per year. This is a statistic based on statistics.
Shown below are two charts of medical data or statistics about deaths in the United States in a specific year. What do they tell you?
Till next time….