Wednesday, April 6, 2011

The Dreaded Red Squamish

The aforesaid disease, called DRS (sometimes DRSq), afflicts one in ten citizens.  It results in the decay of the reasoning process, causing sufferers to utter strings of complete gibberish.  But as you can see when the incidents are mapped geographically, DRS is not evenly distributed.  The geography is simulated by a 10x10 grid, each square of which is like a congressional district: roughly equal in population.

As you can see, there are two townships in which DRSq is concentrated.  The township at (A2:B4), known as Appletown, has an occurrence rate which is five times the national average of c-bar = 0.1!  The other cluster is just west of the Interstate (column I) at (H3:I5), which is called Barkleyville.  The occurrence rate here is even higher: nearly seven times the national average.  There are only three scattered incidents in the remainder of the region.

Incidents of Dreaded Red Squamish (DRS)

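     A  B  C  D  E  F  G  H  I  J
 1   0  0  0  0  0  0  0  0  0  0
 2   0  1  0  0  0  0  0  0  0  0
 3   1  0  0  0  0  0  0  1  0  0
 4   1  0  0  0  0  0  0  1  1  0
 5   0  0  0  0  0  0  0  1  0  0
 6   0  0  0  0  0  0  0  0  0  0
 7   0  0  0  0  0  0  0  0  0  0
 8   0  0  0  0  0  0  0  0  0  0
 9   0  0  1  0  0  0  0  0  0  1
10   0  0  0  0  0  1  0  0  0  0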
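
For anyone who wants to check the arithmetic behind "five times" and "nearly seven times," here is a minimal Python sketch.  It assumes the grid above is read as ten rows (1 to 10) by ten columns (A to J); the name block_rate is made up for this illustration.

```python
# Grid transcribed from the table above: one string per row, columns A..J.
GRID = [
    "0000000000",  # row 1
    "0100000000",  # row 2
    "1000000100",  # row 3
    "1000000110",  # row 4
    "0000000100",  # row 5
    "0000000000",  # row 6
    "0000000000",  # row 7
    "0000000000",  # row 8
    "0010000001",  # row 9
    "0000010000",  # row 10
]
NATIONAL_RATE = 0.1  # c-bar from the text


def block_rate(grid, top_left, bottom_right):
    """Incidents per square inside a spreadsheet-style range such as ("A2", "B4")."""
    c0, r0 = ord(top_left[0]) - ord("A"), int(top_left[1:]) - 1
    c1, r1 = ord(bottom_right[0]) - ord("A"), int(bottom_right[1:]) - 1
    cells = [int(grid[r][c])
             for r in range(r0, r1 + 1)
             for c in range(c0, c1 + 1)]
    return sum(cells) / len(cells)


appletown = block_rate(GRID, "A2", "B4")      # 3 incidents in 6 squares
barkleyville = block_rate(GRID, "H3", "I5")   # 4 incidents in 6 squares
overall = block_rate(GRID, "A1", "J10")       # 10 incidents in 100 squares

for name, rate in [("Appletown", appletown), ("Barkleyville", barkleyville)]:
    print(f"{name}: rate {rate:.3f}, {rate / NATIONAL_RATE:.1f} times the national average")
print(f"Whole region: rate {overall:.3f}")
```

Note how small the numerators are: each township covers only six squares, so three or four incidents are enough to produce those ratios.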

Something Must Be Done! 

What is the cause of DRS?  Clearly, the Interstate (col. I) has something to do with it.  Perhaps deadly toxins in the exhausts of Personal Automobiles and other non-mass transit vehicles.  But this does not explain Appletown!  Closer examination reveals that Appletown is the site of a pharmaceutical plant, so it is probably toxins emitted by the plant as they manufacture deadly drugs in an effort to kill all their customers.

Enough!  I say, Enough!  Here is proof positive that DRS causes sufferers to babble nonsense.  See the comm box here: 42 Disease Clusters In 13 U.S. States Identified.  Come to think of it, read the press release, er, news article.  Better yet: read the tendentious propaganda, er, scientific study.  Only a few people in the comm box appear to be immune to the brain-rot of DRSq.

In fact, the major cause of clusters is simply to select clustery-looking areas.  The Scientific Study does not specify any criteria, only "An unusually large number of people sickened by a disease in a certain place and time is known as a ‘disease cluster’."  In order to identify numerous clusters, the disease of choice varied from place to place or time to time, and "unusually large" was left unusually vague. 

The cure for DRSq is at hand: statistical literacy.  No process is ever perfectly uniform over geography, nor perfectly steady over time.  Years ago, Rosalyn Yalow had a wonderful paper about apparent clusters among low-probability events.  They are, of course, virtually guaranteed.  She gave an example of, IIRC, childhood leukemia cases in a certain county.  The data was nicely random, then there were several consecutive years of double-digit incidents, then they disappeared.  Nothing unusual happened during those years or stopped happening immediately after.  Remember: a cause must explain not only what happens, but what does not happen.  If, for example, a highway is the cause of DRSq in Barkleyville, then we must explain why the incidence rate is not high anywhere else along the highway.  If the incidence rate in Appletown is due to the pharmaceutical plant, then we must explain why there were no elevated numbers in the past when the plant was already there and already making the same line of products; or why there is no similar rate at other sites where there are similar plants.

In the present case, the mystery is solved.  The cause of the DRSq clusters is.... me. 

I used Minitab 16 to generate 100 random Poisson variables in a single column, then I destacked the column into ten columns of ten rows, and inspected for "clusters."  The mean of the Poisson distribution in all cases was c-bar = 0.1.  IOW, both clusters were optical illusions created by random number generation, even though one was five times and the other nearly seven times the overall average.  Relative risk ratios over a small base will almost always appear dramatically high simply because the denominator is dramatically low.
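
To put a rough number on "small base," here is a hedged back-of-the-envelope sketch in Python.  It assumes each square is an independent Poisson count with mean 0.1, so a six-square block has a Poisson total with mean 0.6; the helper name poisson_tail and the choice of a block three rows tall and two columns wide are mine, made for illustration.

```python
import math


def poisson_tail(mean, k):
    """P(X >= k) for X ~ Poisson(mean), by summing the first k terms and subtracting."""
    below = sum(math.exp(-mean) * mean ** i / math.factorial(i) for i in range(k))
    return 1.0 - below


C_BAR = 0.1                            # mean count per square
BLOCK_SQUARES = 6                      # squares in a township-sized block
block_mean = C_BAR * BLOCK_SQUARES     # sum of independent Poissons is Poisson

p_three = poisson_tail(block_mean, 3)  # a block at 5x the average, or worse
p_four = poisson_tail(block_mean, 4)   # a block at nearly 7x the average, or worse

# Places to put a block 3 rows tall and 2 columns wide on a 10x10 grid.
positions = (10 - 3 + 1) * (10 - 2 + 1)

print(f"P(3 or more incidents in one fixed block) = {p_three:.4f}")
print(f"P(4 or more incidents in one fixed block) = {p_four:.4f}")
print(f"Block positions to choose from: {positions}")
print(f"Expected number of blocks with 3 or more: {positions * p_three:.2f}")
```

Any one block picked in advance is unlikely to light up (about 2% for three or more incidents), but nobody picks in advance.  The expected count sums over all positions; the blocks overlap, so it is an expectation, not a probability.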

100 Random Poisson Digits, c-bar = 0.1

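     A  B  C  D  E  F  G  H  I  J
 1   0  0  0  0  0  0  0  0  0  0
 2   0  1  0  0  0  0  0  0  0  0
 3   1  0  0  0  0  0  0  1  0  0
 4   1  0  0  0  0  0  0  1  1  0
 5   0  0  0  0  0  0  0  1  0  0
 6   0  0  0  0  0  0  0  0  0  0
 7   0  0  0  0  0  0  0  0  0  0
 8   0  0  0  0  0  0  0  0  0  0
 9   0  0  1  0  0  0  0  0  0  1
10   0  0  0  0  0  1  0  0  0  0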
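
The procedure described above can be imitated without Minitab.  What follows is a minimal sketch in plain Python, not the original worksheet: the Poisson draws use Knuth's multiplication method as a stand-in for Minitab's generator, and the function names, the seed, the trial count, and the "3 or more incidents in a block three rows by two columns" definition of a cluster are all assumptions made for illustration.

```python
import math
import random


def poisson_draw(rng, mean):
    """One Poisson variate by Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def random_grid(rng, mean=0.1, rows=10, cols=10):
    """Draw rows*cols values in one column, then destack into a rows x cols grid."""
    column = [poisson_draw(rng, mean) for _ in range(rows * cols)]
    return [[column[c * rows + r] for c in range(cols)] for r in range(rows)]


def densest_block(grid, height=3, width=2):
    """Largest incident total over every height x width block in the grid."""
    rows, cols = len(grid), len(grid[0])
    best = 0
    for r in range(rows - height + 1):
        for c in range(cols - width + 1):
            total = sum(grid[i][j]
                        for i in range(r, r + height)
                        for j in range(c, c + width))
            best = max(best, total)
    return best


def share_with_cluster(trials=2000, threshold=3, seed=2011):
    """Fraction of purely random grids containing at least one 'cluster'."""
    rng = random.Random(seed)
    hits = sum(densest_block(random_grid(rng)) >= threshold for _ in range(trials))
    return hits / trials


share = share_with_cluster()
print(f"Share of random grids with a block at 5x the average or worse: {share:.2f}")
```

The scan checks only one block shape and orientation.  A human eye hunting for clusters is far more flexible about shape, so this if anything understates how often clustery-looking areas turn up.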

Forsooth.

2 comments:

  1. This is kind of like the "400% increase" in suicides among X group, isn't it?
    (Popular in military hit pieces; sometimes the variation of "has now exceeded their age-group" is used, without mention that they're ignoring the sex differences.)

  2. Yes, that's an especially egregious case. There was a USA Today piece a while back about police suicides being so much greater than "the general population." But making no mention that the general population is more female and includes children.
