Thursday, February 3, 2011

Ethnographic Science and Bayesian Statistics

Is anthropology a science?  As I suggested below, a lot rests on how one uses the word “science,” and I follow Deirdre McCloskey in an ecumenical, expansive definition.

But, even taking a stricter understanding of science, can ethnography based on participant observation claim to be as rigorous as random sample surveys and rigid statistical analysis?  I think so, and to understand why, we need to look back to Thomas Bayes’ (1702-1761) approach to statistics.

Bayes was a pioneer of statistics, but in the pre-probabilistic era of the field.  Bayesian statistics combines prior knowledge with new evidence, so it can work off fewer and less random data points.

Thomas Griffiths and Joshua Tenenbaum (2006, an article in Psychological Science) have taken a surprising look at how folks actually make predictions and found support for a Bayesian model.

Griffiths and Tenenbaum came up with experiments in which they presented participants with an isolated piece of information and asked them to make a generalized inference from it:

Movie grosses: Imagine you hear about a movie that has taken in 10 million dollars at the box office, but don’t know how long it has been running. What would you predict for the total amount of box office intake for that movie?

Poem lengths: If your friend read you her favorite line of poetry, and told you it was line 5 of a poem, what would you predict for the total length of the poem?

Terms of U.S. representatives: If you heard a member of the House of Representatives had served for 15 years, what would you predict his total term in the House would be?

Baking times for cakes: Imagine you are in somebody’s kitchen and notice that a cake is in the oven. The timer shows that it has been baking for 35 minutes. What would you predict for the total amount of time the cake needs to bake?

Waiting times: If you were calling a telephone box office to book tickets and had been on hold for 3 minutes, what would you predict for the total time you would be on hold?

It turns out that people’s predictions about such things were extremely accurate.  Significantly, all of these items have well-established probability distributions (normal, Erlang, power-law, Poisson, etc.).  People know these distributions intuitively, and so are able to place a lone piece of data within the distribution they have internalized.  This allows accurate predictions from a single data point.
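To make the logic concrete, here is a minimal Python sketch of this kind of one-data-point prediction. It assumes the observed value is equally likely to fall anywhere between zero and the eventual total (so the likelihood of seeing t given a total T is 1/T), and it reports the posterior median as the prediction. The function names and the cake prior (baking times roughly normal, mean 50 minutes, standard deviation 15) are illustrative assumptions of mine, not figures from the study.

```python
import math


def predict_total(t_observed, prior_pdf, t_max, step=0.1):
    """Return the posterior median of the total, given one observed value.

    Assumes the observation is uniformly likely anywhere in [0, total],
    so the likelihood of t_observed given a total T is 1/T for T >= t_observed.
    """
    # Candidate totals: the total can never be smaller than what we observed.
    n = int(round((t_max - t_observed) / step)) + 1
    grid = [t_observed + i * step for i in range(n)]

    # Unnormalized posterior weight = prior(T) * likelihood(t | T) = prior(T) / T.
    weights = [prior_pdf(t) / t for t in grid]
    half = sum(weights) / 2.0

    # Walk up the grid until half of the posterior mass is behind us.
    cumulative = 0.0
    for t, w in zip(grid, weights):
        cumulative += w
        if cumulative >= half:
            return t
    return grid[-1]


def normal_pdf(mean, sd):
    """Build a normal density function with the given mean and sd."""
    def pdf(x):
        z = (x - mean) / sd
        return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))
    return pdf


# Hypothetical prior for cake baking times, in minutes (illustrative only).
cake_prior = normal_pdf(mean=50.0, sd=15.0)

# The cake has been in the oven for 35 minutes; predict the total time.
print(predict_total(35.0, cake_prior, t_max=200.0))
```

The shape of the prior drives the answer. With a bell-shaped prior like the cake's, the prediction stays near the typical total until the observation exceeds it. With a power-law prior, as for movie grosses, the prediction is a fixed multiple of whatever has been observed so far.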

Bayesian inference, working from good priors, is at the heart of ethnography.

Ethnographers who spend lots of time in a place, and a lot of time talking to people there, internalize distributions and probabilities. For certain domains these become almost intuitive. Turns out we are pretty good at estimating those things with just one data point BECAUSE we have a pretty good distribution curve in our minds already.

1 comment:

  1. Excellent post. I'd love to read more of your thoughts on this. I've intuited a kinship between Bayesian reasoning and the way I think ethnography can be understood, though I've yet to explicate my thoughts on this. This is a great start, thank you for the inspiration!