Wednesday, June 18, 2014

Some Statistical Fun

The Armies of the Homeless

  • "One of every 50 American children experiences homelessness, according to a new report that says most states have inadequate plans to address the worsening and often-overlooked problem,"
    – Associated Press (10 Mar 2009)
  • "These kids are the innocent victims, yet it seems somehow or other they get left out,"
    – Dr. Ellen Bassuk, National Center on Family Homelessness.
  • The report estimates 1.5 million children experienced homelessness at least once that year (2005-2006)
But, what do you mean "homeless"? According to the National Center on Family Homelessness, living conditions of homeless children, 2005-2006 broke down as follows:

  • 56% “Doubled Up”
  • 13% in Transitional Housing: 35,799 units
  • 11% in Emergency Shelters: 29,949 units
  • 7% in Hotels, incl. motels, trailer parks and camping grounds
  • 3% Unsheltered, incl. "abandoned in hospitals," "living in cars, parks, public spaces, abandoned buildings, substandard housing, bus or train stations."
  • 10% Unknown
That is, over half the homeless children were living in the home of a relative or other family. This may be crowded, but it is not "homeless." By this definition, TOF was a homeless child for his first five years, for his parents were doubled up with his mother's parents. But this is clearly not what most people mean when they speak of "homeless children." It's not clear whether "transitional housing" or even "substandard housing" meets that criterion, even if the first is temporary and the latter distressing. Can those living in hotels, motels, or trailer parks be called "homeless"? In some ways, yes. But then why not those living on assistance in public housing projects? In any case, about two thirds of the children counted as homeless are arguably not what readers have in mind when they hear that phrase.
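
How much the headline figure depends on the definition can be sketched in a few lines of Python. The percentages and the 1.5 million total are the ones quoted above; the three candidate definitions of "homeless" are illustrative assumptions for this sketch, not anything proposed by the report or the AP.

```python
# Living-condition shares quoted above (percent of the estimated total).
TOTAL_CHILDREN = 1_500_000
SHARES = {
    "doubled_up": 56,
    "transitional_housing": 13,
    "emergency_shelter": 11,
    "hotels_motels_trailers": 7,
    "unsheltered": 3,
    "unknown": 10,
}

# Hypothetical operational definitions: which categories count as "homeless".
DEFINITIONS = {
    "report (everything)": set(SHARES),
    "excluding doubled up": set(SHARES) - {"doubled_up"},
    "shelters and unsheltered only": {"emergency_shelter", "unsheltered"},
}


def count_under(definition):
    """Number of children counted when only the given categories qualify."""
    percent = sum(SHARES[name] for name in definition)
    return TOTAL_CHILDREN * percent // 100  # integer arithmetic, no rounding drift


for label, categories in DEFINITIONS.items():
    print(f"{label:32s} {count_under(categories):>9,d}")
```

Under these assumed definitions the same data yield anything from 1.5 million children down to 210,000, which is the whole point: the count is a function of the definition.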

Tammany Stats

This is the equivalent of ballot box stuffing as exemplified by Tammany Hall, the Chicago Machine, and other paragons of politics who, since the decline of the paper ballot, have been reduced to endless recounts and the fortuitous discovery of overlooked vote repositories should the election be in the balance. That is, those who feel strongly about topic X are often inclined to exaggerate the extent of X in order to excite emissions of donations through the impact of verbal neutrons. The usual method is to define X more broadly than the extreme cases the general public has in mind. This helps ensure a maximal estimate of the problem.

Another Example

Some press releases from a while back:
3.9 million women living in a couple with a man were physically abused last year
-- Washington Post (15 July 1993)
The figure came from a 1993 Commonwealth Fund telephone survey of 2500 women, using questions from the 1975/1985 Straus and Gelles Surveys. But what do you mean by "physical abuse"? The questionnaire ran:
In the past year, has your spouse/partner:
  1. Insulted or swore at you
  2. Stomped out of the room, house, etc.
  3. Threatened to hit you/throw something
  4. Threw, smashed, hit, or kicked something
  5. Threw something at you
  6. Pushed, grabbed, shoved or slapped you
  7. Kicked, bit, or hit you with a fist or object
  8. Beat you up
  9. Choked you
  10. Threatened you with a knife or gun
  11. Used a knife or gun on you
The scale is not too bad. Apparently if your siggy bludgeons you with a baseball bat, it only rates a 7 while a mere threat with a knife or gun merits a 10. Not certain where poison falls in the scale. But other than these and similar quibbles, the scale does seem to capture some kind of escalation. The question is where on the scale does physical abuse kick in? Only an academic would call #1 physical abuse; only a fool would say that #11 is not. So where between 1 and 11 do we draw the line when we count "the number who suffered physical abuse"? (BTW, TOF does not know if "none of the above" was a permitted choice.)

The Commonwealth Fund drew the line between #5 and #6. But while #7 and higher is definitely physical abuse (unless in self-defense!), #6 is capable of ambiguity. Able might shove past Baker in order to get out the door to cool off. A great deal depends on the magnitude of the pushing, grabbing, or shoving. Slapping is more definite: a woman might slap her siggy across the chops if she discovers he has been stepping out on her, for example; whereas a push/shove might be more in the way of separating one disputant from another. Get away from me.

Of the 2500 women surveyed, not one answered #8, 9, 10, or 11. Only 84 (3.4%) responded #7, which, assuming a sample taken randomly from the population (a big assumption), the Commonwealth Fund projected to 1.8 million in the country, a more mobilization-worthy figure than 84. A further 125 (5.0%) responded #6, projected to 2.6 million. These 2.6 + 1.8 = 4.4 million became the 3.9 million reported in the newspapers, likely because some respondents who checked off #7 checked off #6 as well and were counted only once (4.4 - 3.9 = 0.5 million). Escalation tends to be inclusive of lesser assaults.
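
The projection arithmetic can be checked with a short sketch. The counts, projections, and newspaper total are the figures quoted above; the function names are made up for illustration, and the "implied population" is simply back-calculated from those figures, not a number the Commonwealth Fund published.

```python
SAMPLE_SIZE = 2500

# Survey counts and published projections (in millions), as quoted above.
RESPONSES = {
    "#6": {"count": 125, "projected_millions": 2.6},
    "#7": {"count": 84, "projected_millions": 1.8},
}
REPORTED_TOTAL_MILLIONS = 3.9


def sample_share(count, n=SAMPLE_SIZE):
    """Fraction of the sample giving a particular answer."""
    return count / n


def implied_population_millions(count, projected_millions, n=SAMPLE_SIZE):
    """Population (millions) that turns the sample share into the projection."""
    return projected_millions / sample_share(count, n)


def implied_overlap_millions(a, b, union):
    """Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|."""
    return a + b - union


for item, row in RESPONSES.items():
    share = sample_share(row["count"])
    base = implied_population_millions(row["count"], row["projected_millions"])
    print(f"{item}: {share:.2%} of sample, implied base of about {base:.1f} million")

overlap = implied_overlap_millions(
    RESPONSES["#6"]["projected_millions"],
    RESPONSES["#7"]["projected_millions"],
    REPORTED_TOTAL_MILLIONS,
)
print(f"implied overlap between #6 and #7: about {overlap:.1f} million")
```

Taking the published figures at face value, both projections imply a base of a little over 50 million women, and the two groups must overlap by about half a million for the total to come to 3.9 million.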

This is not as egregious a case of padding as the homeless children, because you can make a good case for #6 being a reasonable cut-off. But the cut-off was set as low as possible because no one at all answered #8 or higher, where there can be no doubt. For a simple random sample of 2500 to harvest zero responses means the rate in the population is probably no greater than 0.4%. (A relative frequency greater than this is unlikely to result in zero incidents captured in a sample of 2500.)
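
The zero-response bound can be sketched directly from the binomial model. This assumes a simple random sample with independent responses; the 95% level in the second function is an assumption of this sketch, since the post does not say what confidence level lies behind the 0.4% figure.

```python
def prob_zero_hits(rate, n):
    """Probability that a random sample of size n contains no cases
    when the true population rate is `rate`."""
    return (1.0 - rate) ** n


def upper_bound_given_zero(n, alpha=0.05):
    """Largest rate p for which seeing zero cases still has probability
    of at least alpha, i.e. the solution of (1 - p)**n = alpha."""
    return 1.0 - alpha ** (1.0 / n)


n = 2500
print(f"P(zero cases | rate = 0.4%) = {prob_zero_hits(0.004, n):.1e}")
print(f"95% upper bound on the rate = {upper_bound_given_zero(n):.3%}")
print(f"'rule of three' shortcut    = {3 / n:.3%}")
```

At a true rate of 0.4%, an empty sample of 2500 would be a very rare event, so the 0.4% ceiling is a conservative one; the conventional 95% bound (close to the "rule of three", 3/n) is lower still, near 0.12%.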

Listen Up

How a question gets asked will affect the way it is answered. Here are two questions polling Americans about a then-nascent NSA program to study patterns in phone calls:
  • USA Today: "As you may know, as part of its efforts to investigate terrorism, a federal government agency obtained records from three of the largest U.S. telephone companies in order to create a database of billions of telephone numbers dialed by Americans. . . . Based on what you have heard or read about this program to collect phone records, would you say you approve or disapprove of this government program?" 
    51% replied that they disapproved.
  • WashPost/ABC: "It's been reported that the National Security Agency has been collecting the phone call records of tens of millions of Americans. It then analyzes calling patterns in an effort to identify possible terrorism suspects, without listening to or recording the conversations. Would you consider this an acceptable or unacceptable way for the federal government to investigate terrorism?"
    63% replied that they thought it was acceptable.
So did the majority of Americans approve or disapprove of the NSA program? Anyone who thinks a poll or questionnaire is an "instrument" should think again.

Tallest Building

The following picture is long obsolete, but TOF is passing fond of it.  At one time these were the six tallest buildings in the world. TOF's Perceptive Reader will note at once a peculiarity regarding the then-tallest (Petronas) and -second tallest (Sears) buildings:
That's right: the Sears Tower was taller than the Petronas Twin Towers. This is because the definition of building height in North America was to the roofline of the structure, while in East Asia it was to the tops of any masts or antennae that were integral to the building. The Petronas was taller than the Sears only because the East Asian definition was applied to it while the North American definition was applied to the Sears. Had the same definition been applied to both, the Sears would have won under either definition. And the Empire State would have come in second.

School Shootings

Darwin Catholic has posted an interesting comment on Tammany statistics, in which you flog the little darlings until they confess to whatever they are supposed to confess. Torture yields unreliable results from data as well as from people.

TOF has seen this map on network TV news. It purports to show "the 74 school shootings since the assault on Sandy Hook Elementary School" in Newtown, CT, Dec. 14, 2012.

The data was compiled by Everytown for Gun Safety, a recently-formed gun-control advocacy group, and put in map form by Huffington Post editor Mark Gongloff. So there was clearly no bias or anything like that, right?

But it raises the question: What is a school shooting? Everytown for Gun Safety defines school shooting as "incidents ... when a firearm was discharged inside a school building or on school or campus grounds" including "assaults, homicides, suicides, and accidental shootings." This is much more general than what most people think of when they hear "school shooting." The term conjures up maladjusted teens or disturbed adults opening fire in a spree killing. The Everytown definition would include spurned boyfriends bursting into the girlfriend's workplace to murder her, if that workplace happened to be a school. It also includes the following incidents:
  • Joe Gibbs was shot and killed inside a car in southwest Atlanta, not far from the Morehouse campus, apparently in a drug deal gone bad. "Not far from" is apparently enough to qualify as a "school shooting."
  • Travion Foster was shot and killed just before 9 p.m. Wednesday in the field behind Hillside Elementary School.... The Alameda County Sheriff's Department say it appears Foster was involved in a game of dice with several others, when gunfire erupted. So a shooting at a late-night dice game qualifies as a "school shooting" if the game is played "in a field behind a school."
  • A shooting at Indian River State College in Florida, which resulted from police chasing a man brandishing a gun in a pickup truck around town until cornering him in a parking structure on the college campus, where a shootout ensued which injured a college-student bystander before the suspect was successfully arrested. So school shootings include any that happen when a police chase coincidentally ends at a school parking building.
  • University police responded to a large crowd outside Williams Hall gym just before 8:00 p.m. Thursday. Investigators determined a student was chased by the four men, who aren't ECSU students, to a public street, Hoffler Street, and was shot there. This seems to have been gang-related and took place off-campus. Is "a public street" a "school shooting" provided the victim was a student?
  • Morgan Tukes, 17, accidentally shot herself in the courtyard area near the parking lots for her school. She ought not have been packing a gun to school, especially if she did not know how to handle it safely, but again this is not what people normally think of when they think "school shooting."
  • Almost two weeks before school starts, the assistant principal at a Montgomery County high school was found shot to death. Pamela Cooper was found shot to death in her ex-husband's pickup truck Monday night. There is no indication in the news story that this even took place at the school. Is this what we think of when we think "school shooting"?
  • Police say Amado Contreras, 25, and his 24-year old brother, Landyer, attacked Landrick Hamilton, 24, with a pool cue in front of the college's main building. During the fight, Hamilton got a gun from his car at about 1:50 p.m. ET and fired one shot, hitting Amado Contreras.  Apparently, self-defense also qualifies as a "school shooting."
Enough. It is clear that the Bloombergians are deliberately inflating the number of "school shootings" in order to magnify the anxiety.  When a terrible thing is also terribly rare, those who want to eradicate it fall into the temptation of exaggeration by lumping other sorts of things (even other terrible things) into the terrible bucket. Otherwise, other folks might not be anxious enough to support their campaign.

The effect of the Huffington Post map is to make the incidents seem more ubiquitous. That there is almost one "school shooting" each week must be taken in the perspective that there are some 130,000 schools and colleges in the country.  So the likelihood of such a thing happening at your school this week is vanishingly small.
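
For a sense of scale, the base rate can be worked out in a few lines. The 74 incidents and 130,000 schools are the figures quoted above. The end date (taken here as the date of this post) and the idea that every school is equally likely to be the site are simplifying assumptions of this sketch.

```python
from datetime import date

INCIDENTS = 74               # Everytown's count, under its own broad definition
SCHOOLS = 130_000            # schools and colleges, as quoted above
START = date(2012, 12, 14)   # Sandy Hook
END = date(2014, 6, 18)      # assumption: the date of this post

weeks = (END - START).days / 7
per_week = INCIDENTS / weeks
per_school_week = per_week / SCHOOLS  # assumes all schools are equally likely

print(f"{weeks:.1f} weeks, {per_week:.2f} incidents per week nationwide")
print(f"chance for a given school in a given week: about 1 in {1 / per_school_week:,.0f}")
```

Even with the inflated count, the weekly chance for any one school comes out at well under one in a hundred thousand.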

Why is it reprehensible to include student suicides under "school shootings"? Simply because the location was accidental, not essential. Would the advocates have been happier if the kids had shot themselves at home? In an abandoned warehouse? IOW, classifying them in this way obscures the real issues. Where a kid commits suicide may be useful, but only in the context of all child suicides, not all school shootings. It would be like studying "car violence" by combining deaths due to speeding, texting, etc. with heart attacks while in a car.

Dead Babies

In 1998 infant mortality was reported as:
  • Switzerland: 4.8 per 1,000 live births
  • United States: 7.2 per 1,000 live births
Another example of the poor health system in the US. But what do you mean "live birth"?
In Switzerland, infants born less than 30 cm long are not counted as live births unless they survive on their own.
The US uses weight rather than length, but infants under 2.2 lbs are typically also under 30 cm long. Often heroic measures are used to preserve the lives of these extreme preemies. Despite this, they often die, and in fact about ⅓ of US infant deaths are of those under 2.2 lbs. These deaths would not even have been counted as births in Switzerland. If they had been counted as "stillborn" rather than "live births," the US rate would have been about ⅔(7.2) = 4.8, essentially the same as Switzerland.

IOW, the difference was an artifact of the different definitions, no less than were the tallest buildings.
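
The adjustment can be written out as a one-line calculation. The rates and the one-third share are the figures given above. Treating the reclassified infants as leaving only the numerator (deaths) and ignoring the tiny change in the denominator (live births) is a simplifying assumption of this sketch.

```python
from fractions import Fraction

US_RATE = Fraction(72, 10)               # deaths per 1,000 live births, 1998
SWISS_RATE = Fraction(48, 10)            # deaths per 1,000 live births, 1998
SHARE_UNDER_THRESHOLD = Fraction(1, 3)   # share of US infant deaths under 2.2 lbs


def adjusted_rate(rate, excluded_share):
    """Rate after dropping the deaths that the other definition would not count.
    Simplification: the denominator (live births) is left unchanged."""
    return rate * (1 - excluded_share)


us_adjusted = adjusted_rate(US_RATE, SHARE_UNDER_THRESHOLD)
print(f"US rate under the Swiss-style definition: {float(us_adjusted):.1f} per 1,000")
print(f"Swiss rate:                               {float(SWISS_RATE):.1f} per 1,000")
```

Exact fractions are used so that two thirds of 7.2 comes out as exactly 4.8 rather than a floating-point near miss.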


TOF! (I hear you say) Surely, infant deaths, homeless children, school shootings, and domestic abuse are Important! How can you denigrate such well meaning folks and their well-intentioned paving stones?

It is precisely because it is important that operational definitions are wanted. The first step in solving any problem is to define the problem precisely, with its "size and shape." Is the plight of homeless children any less demanding on our attention if the numbers are half what is advertised? Are school shootings any less horrendous if we don't dilute spree killings by disturbed individuals by mixing them in with police chases that just happen to end in a college parking garage? Is an infant mortality of 4.8 less of a problem than if it were 7.2? Indeed, by mixing disparate phenomena as they often do, inflated definitions can make it more difficult to address the underlying problems.
“The missing children’s field is littered with definitional disputes.  These disputes are not minor and arbitrary…  It is not possible to count instances of a phenomenon until the phenomenon is clearly defined.”
-- National Incidence Studies of Missing, Abducted, Runaway,
and Thrownaway Children, OJJDP, Dept. of Justice (1990)


  1. May I propose a new title for this post? How TOF reduced the infant mortality rate by 33% in 1998

  2. A few years ago I had a student learning to write research papers, a senior education major, with an interest in homeless children because she encountered them in her student teaching. There was at the time (about 8 yrs. ago) no standardized way of defining or counting the kids even within a single school district, much less a state. A child who changed "addresses" 3 times during a school year might be reported as homeless up to three times, or not at all. It was a big mess, so even with good definitions, where are the numbers coming from?

    What my student found was that teenagers were the most vulnerable in our area--seemingly a lot more people will let a single mom and her little kids sleep on their couch a few months than will do the same for a 16 yr. old fleeing a parent's substance abuse or mental illness. She wrote a good paper; she also started a supply drive for homeless teenagers for stuff like deodorant, toothpaste, etc. in collaboration with the local high schools. I encouraged her to go into administration because she actually noticed the info. problem on her own AND arranged for the supply drive to continue to be done by the college after she graduated.

  3. "[N]o one has ever failed to find the facts he is looking for. The good statistician knows this and distrusts all figures -- he either knows the fellow who found them or he does not know him; in either case he is suspicious." -- Peter F. Drucker, The Effective Executive

  4. You may need to pursue the matter of stability of kinship housing.

    In most instances, folks will be displaced within 6 months, but you go find the data yourself.

    1. I suppose it depends on the kin. We were doubled up for five years before we had a house of our own. But that was a German family in the 1950s, and we were related to most folks in the neighborhood. Things are likely different in other milieux. But once they are displaced, then they may be homeless.


