Scientists love statistical significance. It offers a way to test hypotheses. It’s a ticket to publishing, to media coverage, to tenure.
It’s also a crock — statistically speaking, anyway.
You know the idea. When scientists perform an experiment and their data suggest an important result — say, that watching TV causes influenza — there’s always the nagging concern that the finding was a fluke. Maybe some of the college sophomores selected for the study had been recently exposed to the flu via some other medium. By dividing the students into two groups at random, though — one to watch TV and the other not — scientists try to make such preexposure equally likely in either group. Of course, there’s still a chance that the luck of the draw put more flu-prone people in the TV group. Tests of statistical significance offer a way to calculate just how likely such a fluke should be.
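One common way to compute such a fluke probability is a permutation test: repeatedly reshuffle the subjects between the two groups at random and count how often chance alone produces a difference as large as the one observed. The sketch below uses made-up flu counts (the TV/flu study is the article's hypothetical, not real data) and pure Python, just to show the shape of the calculation.

```python
import random

def permutation_p_value(group_a, group_b, trials=10_000, seed=0):
    """Estimate the probability that random assignment alone yields a
    difference in flu rates at least as large as the observed one."""
    rng = random.Random(seed)
    observed = sum(group_a) / len(group_a) - sum(group_b) / len(group_b)
    pooled = group_a + group_b
    n = len(group_a)
    extreme = 0
    for _ in range(trials):
        rng.shuffle(pooled)  # re-deal subjects into two groups at random
        diff = sum(pooled[:n]) / n - sum(pooled[n:]) / (len(pooled) - n)
        if diff >= observed:
            extreme += 1
    return extreme / trials

# Hypothetical numbers: 1 = caught the flu, 0 = did not
tv_watchers = [1] * 12 + [0] * 8   # 12 of 20 got the flu
controls    = [1] * 5 + [0] * 15   # 5 of 20 got the flu
p = permutation_p_value(tv_watchers, controls)
```

If `p` comes out below the conventional 0.05 cutoff, the result is declared "statistically significant" — which, as the rest of this piece argues, says less than most people think.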
Even when such tests are performed correctly, it’s a challenge to draw sensible conclusions. And analyzing statistical data presents many opportunities for making logical errors.