Andrew Gelman


Humans are so terrible at hiring other humans for jobs that it seems plausible software couldn’t do much worse. I think that will certainly be true eventually, if it isn’t already, though algorithms likely won’t be much better at identifying non-traditional candidates with deeply embedded talents. Perhaps a human-machine hybrid à la freestyle chess would work best for the foreseeable future?

In arguing that journalists aren’t being rigorous enough when reporting on HR software systems, Andrew Gelman and Kaiser Fung, writing in the Daily Beast, point out that data doesn’t necessarily mitigate bias. An excerpt:

Software is said to be “free of human biases.” This is a false statement. Every statistical model is a composite of data and assumptions; and both data and assumptions carry biases.

The fact that data itself is biased may be shocking to some. Occasionally, the bias is so potent that it could invalidate entire projects. Consider those startups that are building models to predict who should be hired. The data to build such machines typically come from recruiting databases, including the characteristics of past applicants, and indicators of which applicants were successful. But this historical database is tainted by past hiring practices, which reflected a lack of diversity. If these employers never had diverse applicants, or never made many minority hires, there is scant data available to create a predictive model that can increase diversity! Ironically, to accomplish this goal, the scientists should code human bias into the software.•
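To make the point concrete, here is a minimal sketch of how a model trained on historically biased hiring decisions reproduces that bias. It is my own illustration, not any vendor’s actual system: the group flag, the skill score, the numbers, and the use of a scikit-learn logistic regression are all assumptions made up for the example.

```python
# Toy illustration (not any real hiring system): a classifier trained on
# historically biased hiring labels learns to reproduce that bias.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000

# Hypothetical historical records: a group flag and a skill score.
group = rng.integers(0, 2, n)      # 0 = majority, 1 = minority (made-up labels)
skill = rng.normal(0, 1, n)

# Past hiring decisions favored the majority group at equal skill levels.
hired = (skill + 1.0 * (group == 0) + rng.normal(0, 1, n)) > 1.0

model = LogisticRegression().fit(np.column_stack([group, skill]), hired)

# Score two equally skilled applicants who differ only in group membership.
applicants = np.array([[0, 0.5], [1, 0.5]])
print(model.predict_proba(applicants)[:, 1])
# The minority applicant gets a lower predicted "hireability" despite
# identical skill -- the bias in the historical labels carries straight over.
```

The takeaway matches the excerpt: with no intervention, the model simply learns whatever pattern the past data encodes, diverse or not.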


Bayesian statistics ("Monty Hall Math," you might call it) is a nontraditional method for interpreting probability and improving the odds of being right. It’s not a sure thing, but a surer one. From F.D. Flam in the New York Times:

“Take, for instance, a study concluding that single women who were ovulating were 20 percent more likely to vote for President Obama in 2012 than those who were not. (In married women, the effect was reversed.)

Dr. [Andrew] Gelman re-evaluated the study using Bayesian statistics. That allowed him to look at probability not simply as a matter of results and sample sizes, but in the light of other information that could affect those results.

He factored in data showing that people rarely change their voting preference over an election cycle, let alone a menstrual cycle. When he did, the study’s statistical significance evaporated. (The paper’s lead author, Kristina M. Durante of the University of Texas, San Antonio, said she stood by the finding.)

Dr. Gelman said the results would not have been considered statistically significant had the researchers used the frequentist method properly. He suggests using Bayesian calculations not necessarily to replace classical statistics but to flag spurious results.
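For a rough sense of how that kind of reanalysis works, here is a toy normal-normal Bayesian update. The prior, the estimate, and the standard error below are hypothetical numbers chosen for illustration, not figures from Gelman’s actual calculation: a skeptical prior that vote intentions barely move within a cycle shrinks a noisy 20-point estimate to nearly nothing.

```python
# Hypothetical numbers, not Gelman's actual reanalysis: a normal-normal
# Bayesian update showing how a skeptical prior shrinks a noisy estimate.
prior_mean, prior_sd = 0.0, 2.0   # prior: within-cycle vote swings are tiny (pct points)
est, se = 20.0, 8.0               # reported effect and an assumed standard error

prior_prec = 1 / prior_sd**2      # precision = 1 / variance
data_prec = 1 / se**2

post_prec = prior_prec + data_prec
post_mean = (prior_prec * prior_mean + data_prec * est) / post_prec
post_sd = post_prec ** -0.5

print(f"posterior: {post_mean:.1f} +/- {post_sd:.1f} pct points")
# posterior: ~1.2 +/- ~1.9 -- the headline 20-point effect all but vanishes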

A famously counterintuitive puzzle that lends itself to a Bayesian approach is the Monty Hall problem, in which Mr. Hall, longtime host of the game show Let’s Make a Deal, hides a car behind one of three doors and a goat behind each of the other two. The contestant picks Door No. 1, but before opening it, Mr. Hall opens Door No. 2 to reveal a goat. Should the contestant stick with No. 1 or switch to No. 3, or does it matter?

A Bayesian calculation would start with one-third odds that any given door hides the car, then update that knowledge with the new data: Door No. 2 had a goat. The odds that the contestant guessed right — that the car is behind No. 1 — remain one in three. Thus, the odds that she guessed wrong are two in three. And if she guessed wrong, the car must be behind Door No. 3. So she should indeed switch.”
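The one-third-versus-two-thirds answer is easy to check by brute force. Here is a short simulation, my own sketch rather than anything from the article, that plays the game many times and tallies how often sticking and switching win.

```python
# Quick check of the Monty Hall answer by simulation: sticking wins about
# 1/3 of the time, switching about 2/3, matching the Bayesian calculation.
import random

def monty_hall(trials=100_000):
    stick_wins = switch_wins = 0
    for _ in range(trials):
        car = random.randrange(3)    # door hiding the car
        pick = random.randrange(3)   # contestant's initial choice
        # Host opens a door that is neither the pick nor the car.
        opened = next(d for d in range(3) if d != pick and d != car)
        # Switching means taking the remaining unopened door.
        switched = next(d for d in range(3) if d != pick and d != opened)
        stick_wins += (pick == car)
        switch_wins += (switched == car)
    return stick_wins / trials, switch_wins / trials

print(monty_hall())   # roughly (0.333, 0.667)
```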
