This Meme Needs Puncturing.

I’m encountering a statement with increasing frequency online, and my irritation with it has reached critical mass. Pedantry threat level is at maximum.

The statement is this: “The plural of anecdote is not data.” In my experience, it’s trotted out when one wishes to denigrate another’s observations as not being scientific enough—not rigorously collected. Even though I am pretty old school about the definition of “data”—or perhaps because I am old school about it—this meme rankles me like few others can.

A datum is simply a single piece of information—nothing more, nothing less. Data, therefore, are pieces of information—in both scientific and general usage it is implicit that those pieces are somehow related in the context in which they’re being considered. From these basic observations, then, it obviously follows that anecdotes can and do become data.

That alone probably isn’t sufficient to stop this stupid meme, if for no other reason than it doesn’t sound very scientific. But the scientific process isn’t as cleanly delineated from ordinary behavior as some would apparently prefer to think. Still, let’s go ahead and pursue this exploration using that tighter perspective.

The “gold standard” of the scientific method is The Experiment. Many individuals know the drill when they hear that phrase: random assignment into treatment and control groups (or random order of exposure to treatment and no treatment), double–blind conditions (neither participant nor researcher knows whether one is in the control or treatment group; or whether a specific trial is control or treatment); precise measurement of the dependent variable(s); and rigorous statistical analysis of those numbers—that all–important data—to directly test the hypothesis. Most research questions cannot be assessed by these means, however—not ethically.

Randomness is a crucially important aspect of research, in large part because the statistical tests that are done make assumptions about it. The mathematical manipulations of an inferential statistical test numerically tease apart effects of treatment and lack of treatment—the independent variable(s)—on the dependent variable; that allows the researcher to test the null hypothesis of no treatment effect. The significance level of the test describes the probability that the obtained results could occur by chance alone: in other words, by random happenstance. Unfortunately, not nearly enough randomness is possible in experimentation.
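For readers who like to see the gears turn, here is a minimal sketch of that logic in Python. It is only an illustration, not how any particular study was analyzed: it uses a permutation test (one simple inferential test among many), the function names are my own, and the scores in the usage note are invented.

```python
import random


def random_assignment(participants, rng):
    """Shuffle the participants and split them into treatment and control halves."""
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(treatment, control, rng, n_shuffles=10_000):
    """Estimate how often chance alone produces a difference in group means
    at least as large as the one actually observed (the null hypothesis)."""
    observed = abs(mean(treatment) - mean(control))
    pooled = list(treatment) + list(control)
    n_treatment = len(treatment)
    hits = 0
    for _ in range(n_shuffles):
        # Re-deal the same scores into two arbitrary groups of the original sizes.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_treatment]) - mean(pooled[n_treatment:]))
        if diff >= observed:
            hits += 1
    return hits / n_shuffles


# Invented scores, purely for illustration.
rng = random.Random(0)
treatment_scores = [12, 15, 14, 16, 13, 17]
control_scores = [10, 11, 13, 9, 12, 10]
p = permutation_p_value(treatment_scores, control_scores, rng)
print(f"Estimated probability of a difference this large by chance: {p:.3f}")
```

The re-dealing step only makes sense if the original groups were themselves formed at random, as random_assignment does. If something else decided who landed in which group, the shuffled comparison no longer describes "chance alone."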

Arguably, the biggest challenge in human research is forming random groups. For many important variables, individuals cannot be randomly assigned to groups—we are pre–assigned by virtue of genetics, history, values, etc. Sex and race are two excellent examples of these kinds of groups; people are what they are and cannot ethically be made otherwise. In case the example needs to be more clear, let’s consider another group variable: individuals who have been raped. Few people would be willing to participate in a study in which, so that every participant could be randomly assigned to a group, one might be raped. Instead, researchers use existing classifications of individuals. The problem with that, though, is that it isn’t as random—there are probably several things (variables) that make some individuals more likely to be a rapist’s target than others. The loss of randomness means a less rigorous test.

So, that gold standard is actually very difficult to attain. In strict research terms, experiments that have to make use of existing groups rather than random assignment to groups are called quasi–experiments—because the experimenter has less control over important variables. The vast majority of experimental research is therefore quasi–experimental—and must be, in order to be ethical research—but this complicated truth is virtually never discussed. And, as my examples should illustrate, in matters of human behavior—upon which much public policy and regulation rest—it is ignored at our peril.

Moving further down the research methodology ladder, many other ways of testing hypotheses exist. They include: observational studies—in which, as the name implies, the researcher simply observes conditions and behavior, rather than directly controlling some things, and manipulating others to see the effects on behavior; and case studies, wherein extensive study of one or a few exceptional individuals is made in order to advance our understanding. One of the best known examples of case study research is that of H.M., the man who lost the ability to form long–term memories after brain surgery. These methods have much less rigor, because the researcher has much less control over many things that might be important; however, they can also be much more naturalistic—and thereby perhaps be more informative than an experiment run in a lab. (This is another very important element of good research that is sadly ignored in practice.) In essence, all these data fall into the category of anecdote ... yet it is clear we would be worse off if they were dismissed as worthless. Moreover, anecdotes gleaned from casual observation, or from special situations like H.M.’s, can be (and often are) refined into more rigorously testable hypotheses.

One anecdote alone may mean something ... or it may mean nothing outside of the person who experienced it. Two or more similar anecdotes can suggest a pattern; and it would be a shabby scientist indeed who turned away from the interesting possibilities because of the informal nature of the observations. Similarly, educated lay people would do themselves a service by rejecting the false dichotomy some would insist exists between anecdotes and data.

Funny thing...

Most of the judgments we make as individuals are based on these "anecdotes," collected from our own observations and informal comparisons of people and events. We observe the coincidences and draw conclusions based on our perception of the potential for increasing or decreasing coincidence if certain actions or choices are made.

I believe the process starts about the time a baby is able to put their fingers into their mouth... or maybe sooner? :)


Zing! You’re right, of course, Mama. It’s how we work.

The plural of anecdote is internet

Good rant, Sunni. :-)

Just for fun I tried tracking down the source of this meme. The opposite version ("The plural of anecdote is data") seems to go back to one Raymond Wolfinger, who said it while teaching at Stanford in 1969-70. The "is not data" version allegedly appears in a 1996 book, The Clinical Evaluation of a Food Additive: Assessment of Aspartame, although the well-known author Someguy Onthenet claims to have invented it in 1965 when he was in college.

I wonder if the popularity of the "is not" version is due to the internet, and the massive availability of anecdotes posted by dimwitted and/or poorly educated people. Maybe we're witnessing inflation and devaluation of the anecdote supply. Pretty soon you'll need wheelbarrows full of anectodes to justify even the simplest scientific investigation.

How are they related to nematodes?

Yeah, I did that too, but decided the results weren’t worth mentioning (I get off track easily enough as it is).

I do think you’re right about the internet’s influence here, Mr. Bill ... it’s the basic phenomenon Mama Liberty described, but without common sense applied. Of course most individuals don’t intend to imply that their personal experiences apply to every reader, but absent explicit disclaimers, many dimwits and some smartasses will make that baseless extrapolation.

Pretty soon you'll need wheelbarrows full of anectodes to justify even the simplest scientific investigation.

At least the anectodes won’t be as wiggly as nematodes. :-D

Anecdotal Memetodes

The internet does not promise that anecdotes proposed by all dumbasses and smartasses are even real, let alone those of narcissists and histrionics (possibly/probably inclusive). So then the question becomes what qualifies an anecdote as data? Does it have to actually be experienced or can it be imagined? What happens to your data then? Does it simply become hypothesis?

Trust, but verify

Regardless of the source, all data must be verified. That process is not always easy, of course.

Not all data is worth the effort either. Much needs simply to be ignored.

A long time ago, I was taught that the plural of anecdote is the start of a hypothesis, or essentially the first step in a scientific inquiry.