Friday, February 24, 2017

Friends Don't Let Friends Do Stats

A recent story by NBC News [sic]

Female Doctors Outperform Male Doctors, According to Study

Patients treated by women are less likely to die of what ails them and less likely to have to come back to the hospital for more treatment, researchers reported Monday.
If all doctors performed as well as the female physicians in the study, it would save 32,000 lives every year, the team at the Harvard School of Public Health estimated.
Thus does NBC summarize the paper "Comparison of Hospital Mortality and Readmission Rates for Medicare Patients Treated by Male vs Female Physicians" by Yusuke Tsugawa, MD, MPH, PhD; Anupam B. Jena, MD, PhD; Jose F. Figueroa, MD, MPH; et al.

(I wish I knew who Et Al. was. Gets his name on a lot of papers, it seems.)

NBC "News" tells us
“The data out there says that women physicians tend to be a little bit better at sticking to the evidence and doing the things that we know work better.”
Apparently male doctors practice medicine regardless of what the evidence dictates. Worse, they are paid more for their foolish and dangerous behavior.

Alas, Tsugawa and his co-authors did not actually measure how doctors practiced. So even if the 30-day mortality and readmission rates differed between male and female doctors, the researchers had no way of knowing why they differed; and ipso facto neither would NBC. Someone was blowing smoke because the newsmog cannot abide a story that doesn't give a paradigm-conforming reason. Sticking to the evidence? Forsooth.
Design, Setting, and Participants: We analyzed a 20% random sample of Medicare fee-for-service beneficiaries 65 years or older hospitalized with a medical condition and treated by general internists from January 1, 2011, to December 31, 2014. We examined the association between physician sex and 30-day mortality and readmission rates, adjusted for patient and physician characteristics and hospital fixed effects (effectively comparing female and male physicians within the same hospital). As a sensitivity analysis, we examined only physicians focusing on hospital care (hospitalists), among whom patients are plausibly quasi-randomized to physicians based on the physician’s specific work schedules. We also investigated whether differences in patient outcomes varied by specific condition or by underlying severity of illness.
That just sounds scientificalistic as all hell, dunnit? And hospitalists? Who knew? TOF always thought they were "doctors."
Conclusion: Patients treated by female physicians had lower 30-day mortality (adjusted mortality, 11.07% vs 11.49% …) and lower 30-day readmissions (adjusted readmissions, 15.02% vs 15.57% …) than patients cared for by male physicians, after accounting for potential confounders.
It is important to realize that the researchers were not as bold as the newsreaders. They wrote under Key Points:
Differences in practice patterns between male and female physicians, as suggested in previous studies, may have important clinical implications for patient outcomes.
TOF notes a few cautions:

Caution #1. The mortality difference was 11.1% vs. 11.5%. We caution you not to clutch your pearls too tightly over this chasm-like gap.

Caution #2. These were "adjusted percentages." That means the actual percentages were something else and the researchers tweaked the numbers to make them "all else being equal." That is, the reported %'s are the result of a model. The uncertainty of the model was not mentioned. One suspects it may have been more than 0.4 percentage points.

Caution #3. There were about 1,000 "all elses" that were equalized. These are called co-variates in stat-lingo. That's a heckuva lot of co-variates, leading to the possibility of multi-collinearity. This is when two or more covariates are themselves correlated with each other. If this happens, the model is ill-conditioned and the coefficient estimates become unstable. One is more likely to obtain an uninterpretable swamp. Did they check the Variance Inflation Factors and eliminate superfluous covariates? Inquiring minds want to know.

Caution #4. NBC simply said "patients," but the mean age of these patients was about 80 years old and the first rule of sampling is that the results cannot be generalized to populations that were not subjected to the sampling.

Caution #5. Female doctors were about 5 years younger on average, and female docs also treated many fewer patients on average than men. This implies women docs had more time per patient.

Caution #6. The report says, “female physicians treated slightly higher proportions of female patients than male physicians did.” Since females tend to live longer than males, especially at advanced age, this would present as a higher survival rate for the patients of female doctors, not because of the doctors' skills but because of the patients' longevity. Is that the reason? TOF does not know, and neither do you or NBC.
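The question in Caution #3 can be made concrete. What follows is a minimal sketch, in plain Python, of how Variance Inflation Factors are computed: the VIF for covariate j is the j-th diagonal entry of the inverse of the covariates' correlation matrix, which equals 1/(1 - R_j^2), where R_j^2 comes from regressing covariate j on all the others. The three covariates (physician age, years in practice, patient volume) and every number below are invented for illustration; nothing here comes from the paper's data or its model. A common rule of thumb treats a VIF above about 10 as a warning sign.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [M | I] becomes [I | M^-1].
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: covariates are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vifs(columns):
    """Return one VIF per covariate; `columns` is a list of data columns."""
    k = len(columns)
    corr = [[1.0 if i == j else pearson(columns[i], columns[j])
             for j in range(k)] for i in range(k)]
    inverse = invert(corr)
    return [inverse[j][j] for j in range(k)]


# Hypothetical covariates, invented for illustration only.
random.seed(0)
age = [random.gauss(50, 10) for _ in range(500)]
# Years in practice tracks age closely, so these two are nearly collinear.
years = [a - 28 + random.gauss(0, 2) for a in age]
# Patient volume is generated independently of the other two.
volume = [random.gauss(200, 50) for _ in range(500)]

for name, v in zip(("age", "years_in_practice", "patient_volume"),
                   vifs([age, years, volume])):
    print(f"{name:18s} VIF = {v:.2f}")
```

In this toy setup the two entangled covariates get large VIFs while the independent one stays near 1. With a thousand covariates the same diagnostic applies, only the correlation matrix is a good deal bigger.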
One species of "Fake News" is when the reporter doesn't know what he's writing about, which is often the case, especially in technical subjects. The problem is that most scientific papers are wrong.
Most published scientific research papers are wrong, according to a new analysis. Assuming that the new paper is itself correct, problems with experimental and statistical methods mean that there is less than a 50% chance that the results of any randomly chosen scientific paper are true. -- New Scientist
Most clinical researchers, while experts in their fields, are not experts in statistics and tend to find significant results that aren't there, especially if they want to find them. What is almost as bad, if not worse, is when they took Intro to Stats back in college and learned a cook-book approach to "number crunching." The on-going obsession with p-values and tests of significance often overlooks two things:
  1. Statistical significance is not causal significance.
  2. Statistical significance applies to the parameters of the model, not to the actual data.
Therefore, findings from such models tend to be overconfident.
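A small sketch of why a p-value by itself says so little. It runs a textbook two-proportion z-test on the two adjusted mortality figures quoted above (11.49% vs 11.07%) at several group sizes. The group sizes are hypothetical, and a z-test on raw proportions is not the adjusted regression model the researchers actually fit; the only point is that the same 0.42-point gap swings from "not significant" to "wildly significant" as n grows, while telling us nothing about why the gap exists.

```python
import math


def two_proportion_z(p1, p2, n1, n2):
    """Two-sided z-test for the difference between two proportions.

    Returns (z, p_value), using the pooled standard error.
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of a standard normal beyond |z|.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Proportions quoted in the post; group sizes are hypothetical.
for n in (1_000, 10_000, 100_000, 1_000_000):
    z, p = two_proportion_z(0.1149, 0.1107, n, n)
    print(f"n per group = {n:>9,}  z = {z:5.2f}  p = {p:.3g}")
```

The effect size never changes from row to row; only the sample size does. A tiny p-value here is a statement about sampling noise in the model's parameters, not about cause.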


  1. The great and powerful Et al:

  2. There's unfortunately a lot of garbage like this published as medical "research". For example, a highly reputable surgeon at a major teaching hospital may have a higher mortality rate for his patients than the guy at the local hospital. This is not because he is incompetent or is just having his residents do the procedure, but most likely it is because he is getting the more complex and harder cases. Likewise, whenever you hear someone making comparisons between the US medical system and those of other countries, they are nearly always forgetting that the US population and those of other countries are *DIFFERENT*. They have different pre-existing conditions, different lifestyles, different cultural patterns, etc. The US has a higher mortality rate than other countries from car accidents because we drive more than other countries - but should this be counted against our health care system? Likewise, health statistics are not counted the same in every country (famously infant mortality) and some countries may fudge their numbers to make themselves look better. If you don't automatically trust numbers coming from the US government, why assume that other governments are more trustworthy? Long story short, be very skeptical of "research" like this.

  3. I warn my children that doctors for the most part get into their profession not because they love science, but because they want to help people. A noble thing - until they start trying to do science. So, when doctor A says something that doesn't seem quite right, really good idea to check with doctor B and C before doing anything too extreme.

    In the bits I've perused, some - but not a lot - seems OK, but the majority of what passes for medical research is hooey. Is fat good or bad for you? How about salt? Is letting your baby sleep in the same bed as you a death sentence ("consensus" in the US up until a few years ago) or, as near universal practice and logic would indicate, a good idea (newly discovered finding of a few years ago)? Some of this is just the "news" digging for clickbait, but it seems the doctors themselves are culpable more often than not.

    The bad news: by pushing so many poorly understood or validated theories and practices, the medical profession damages its credibility for things that actually are important. I have to wonder how much this whole anti-immunization thing is fed by all the studies that pronounce TRVTH one day only to be contradicted by the next study. Can't help.

  4. TOF, thanks for this and other examples! I use them in teaching. Are you familiar with, or sympathetic to, Henry Kyburg's work?

  5. Re et-al: