Such a calculation of "significance" takes account only of the numerical data of this one experiment. An estimate of σ is not to be regarded as a number that can be used in place of σ unless the observations have exhibited randomness, and not unless the number of degrees of freedom amounts to 15 or 20, and preferably more. A broad background of experience is necessary before one can say whether his experiment is carried out by demonstrably random methods. Moreover, even in the state of randomness, it must be borne in mind that unless the number of degrees of freedom is very large, a new experiment will give new values of both σ(ext) and σ(int), also of P(χ) and P(s). Ordinarily, there will be a series of experiments, and a corresponding series of P values.
It is the consistency of the P values in the series, under a wide variety of conditions, and not the smallness of any one P value by itself that determines a basis for action, particularly when we are dealing with a cause system underlying a scientific law. In the absence of a large number of experiments, related knowledge of the subject and scientific judgment must be relied on to a great extent in framing a course of action.
Statistical "significance" by itself is not a rational basis for action.
-- W. Edwards Deming, Statistical Adjustment of Data (Wiley, 1943), p. 30.
Boldface added
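To make the quoted point concrete, here is a minimal simulation sketch; it is not anything of Deming's. It assumes normally distributed observations, and the sample sizes, replication count, and the helper one_experiment are invented for illustration. It shows how much the estimated σ and the p-value swing from one small experiment to the next, and how the swings shrink only as the degrees of freedom grow.

# Minimal sketch (assumptions noted above): repeat the "same" experiment several
# times and watch the spread of the estimated sigma (s) and of the p-value.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def one_experiment(n, true_mean=0.1, true_sigma=1.0):
    """Draw n observations and return (s, p) for a test of mean == 0."""
    x = rng.normal(true_mean, true_sigma, size=n)
    s = x.std(ddof=1)                      # estimate of sigma, n - 1 degrees of freedom
    t_stat, p = stats.ttest_1samp(x, 0.0)  # two-sided p-value
    return s, p

for n in (5, 20, 200):
    results = [one_experiment(n) for _ in range(10)]
    s_vals = [s for s, _ in results]
    p_vals = [p for _, p in results]
    print(f"n={n:>3}  s ranges over {min(s_vals):.2f}-{max(s_vals):.2f}, "
          f"p ranges over {min(p_vals):.3f}-{max(p_vals):.3f}")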
Only a series of experiments under a wide range of conditions will establish whether a relationship is robust and holds in a variety of circumstances. Alas, no one wants to publish papers that say, "Yeah, I found the same relationship, too." Or to re-run the same experiment under different conditions. Feynman was once critical of an experiment using heavy hydrogen because the comparison was made to another, published experiment using light hydrogen on another apparatus. The comparison should have been made using light hydrogen on the same apparatus as the heavy hydrogen; but time was not made available because "we already know the answer for light hydrogen."
Feynman knew quite well that the same experiment might give different results when run on different apparatus. And if this is an issue for physics, how much more so for social "science"?
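As a toy illustration of the apparatus point -- my own construction, not Feynman's hydrogen experiment, with the offsets and noise level invented for the purpose -- two instruments measuring the same quantity can each report a tiny internal error and still disagree with one another by many times that error:

# Sketch (assumed numbers): each apparatus carries its own systematic offset,
# so the scatter *between* apparatus dwarfs the internal error quoted by either.
import numpy as np

rng = np.random.default_rng(0)
true_value = 10.0
statistical_error = 0.05           # assumed within-apparatus noise
apparatus_offsets = [0.30, -0.25]  # assumed (unknown to the experimenter) biases

for i, offset in enumerate(apparatus_offsets, start=1):
    readings = rng.normal(true_value + offset, statistical_error, size=25)
    internal = readings.std(ddof=1) / np.sqrt(len(readings))
    print(f"apparatus {i}: mean = {readings.mean():.3f}, internal error of mean = {internal:.3f}")
# The two means differ by roughly 0.55 while each quotes an internal error near 0.01:
# sigma(ext) across apparatus swamps sigma(int) within either one.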
Such warped priorities. One would think from this post that publishing was less important than finding out the truth.
Mary! This post is but one datum point.
Don't be rash!
:>)
JJB
Ah, well. At least since the Positivists gained control, 'science' hasn't been about truth, anyway.
Delete"Feynman knew quite well that the same experiment might give different results when run on different apparatus. And if this is an issue for physics, how much more so for social "science"?"
The good news is that the inquisition--er, "inquiry"--against Prof. Regnerus was dropped. The bad news is that the university started said inquiry in the first place.
" Only a series of experiments under a wide range of conditions will establish whether a relationship is robust and holds in a variety of circumstances. Alas, no one wants to publish papers that say, "Yeah, I found the same relationship, too." Or to re-run the same experiment under different conditions. Feynman was once critical of an experiment using heavy hydrogen because the comparison was made to another, published experiment using light hydrogen on another apparatus. The comparison should have been made using light hydrogen on the same apparatus as the heavy hydrogen; but time was not made available because "we already know the answer for light hydrogen." "
I've long held that a Ph.D. should not so much require that the candidate produce new research--only to go on to become a post-doc and prove that he can produce new research--but rather that he demonstrate mastery of the "well-known" aspects of his field. "Well-known" these days means "some other group did that experiment once."
Long live the p-value!
John Tukey once said, as I recall it, that no one ever really has more than 30 degrees of freedom to estimate a variance: if you have more data points than that, you start to be able to see the inhomogeneities.
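Tukey's remark is easy to see in a toy simulation -- the slow drift, its rate, and the sample sizes below are my own arbitrary choices, not anything Tukey specified. With about 30 points the sample standard deviation looks settled near the true value; pile on more points and the inhomogeneity (here, the drift) takes over, so the extra "degrees of freedom" stop buying a better estimate of a single σ:

# Sketch (assumed drift): homogeneous noise with sigma = 1 plus a slow linear drift.
import numpy as np

rng = np.random.default_rng(1)
n = 1000
noise = rng.normal(0.0, 1.0, size=n)   # homogeneous part, true sigma = 1
drift = 0.01 * np.arange(n)            # assumed slow instrumental drift
data = noise + drift

for k in (30, 100, 300, 1000):
    print(f"first {k:>4} points: s = {data[:k].std(ddof=1):.3f}")
# s stays near 1.0 for the first few dozen points, then inflates steadily
# (to roughly 3 by n = 1000) as the drift, not the noise, dominates the spread.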
Heh, rather timely... someone over at Ricochet.com posted a review of a book called The Black Swan that's kind of what you mention at the end; when folks "already know the answer," they don't check, and get surprised by things that were outside their frame of reference.
Or, as many folks say:
it ain't what you know, it's what you know that ain't so.
Nassim Taleb, the author of The Black Swan, has extended the ideas presented therein into a concept of anti-fragility -- see, for example, http://www.youtube.com/watch?v=33kET2YPWls
Note especially his distinction between "anti-fragile" (aka "philo-stochastic") and "robust".