April 28, 2005

Odd odds.

I'm very glad they caught this serial rapist. I remember how it felt to be a woman living in NYC at the time these attacks were occurring. But can the NYT aim a little copy editing at this?
The near-perfect match between Mr. Worrell's DNA and the sample recovered from the Manhattan victim's underwear means that the odds that he is the assailant are more than a trillion to one...

Wouldn't that make Worrell the least likely suspect?

UPDATE: Now, based on the discussion in the comments, I'm thinking I'm wrong. Or the odds that I'm wrong are... high.

ANOTHER UPDATE: A "trained probabilist" has arrived in the comments and is defending my original post, making odds of my being right higher. Or at least different.

MORE: Let me make an updated copy-editing point. Statistics should be stated in a very clear way that doesn't cause readers to stop and puzzle about which way the numbers work. So they could have written, for example: "There is only a one in a trillion chance that he is not the assailant."

By the way, does this statistic bother you because one trillion is far MORE than the total number of human beings who have ever lived? [Sorry, I had "less" there for a while-- and, me, talking about copy-editing....]


sp0t said...

Not at all.... They are quoting the odds in favor of a match, so that there is actually a 1 in (a trillion + 1) chance of the guy not being the rapist.

You are thinking of horse races, where they typically quote the odds against the horse winning, so the higher the odds, the worse the horse is.

Ann Althouse said...

I understand what they think they are doing, which is why they need the word "not" in there.

Dave said...

One of my pet peeves about journalists is they are ignorant about statistics and basic math. Often they are quoted a statistic, and either they interpret the statistic incorrectly, are given the wrong information, or don't understand the information given.

Then, when the fact-checker checks the document (I'm guessing here) he/she is too timid to question the journalist about what precisely the statistic means.

One thing I've learned from many a job in which statistics have been bandied about, is that people get very riled up when you ask them what a particular statistic means. Likely, most people view a statistic as an arbiter of truth, while the true understanding of a statistic is anything but.

Understanding statistics requires critical thought, after all, and too few people, if any, are willing to engage in critical thought about anything.

Therefore, lies, damn lies, and statistics.

Sorry, as I say: a pet peeve.

Stephen said...

I'm no language expert, but I've got to agree with sp0t. As it reads, the guy being the assailant is a trillion times as likely as the guy not being the assailant.

Ann, if you put a not in there, you'd have to also use the full wording with "against", as sp0t indicated, which would lead to the double negative:

"The odds against the man not being the assailant are a trillion to one." Literally true, but awkward.

sp0t said...

I agree completely, Dave....

I actually think the way this journalist is using the phrase may actually be more intuitive.

I have always thought the horse-odds-thing is kinda backwards, unless you are used to it. That is, playing a horse with bigger odds is a bigger risk because it means the probability of him losing is greater.

Anyway, to be completely correct, you should probably use the words "in favor" or "against".

But to concur, I will bet with odds that the journalist is unclear on what he/she is actually saying, as opposed to what he/she means to say.

Ann Althouse said...

Dave: I'm very interested in this sort of thing too. I highly recommend the book "A Mathematician Reads the Newspaper."

Sp0t, stephen: I see your point. It read completely wrong to me, but when I thought about it again, I found it puzzling. As you note, the "odds against" or the "odds in favor." But I guess the "odds that" is the same as the "odds in favor." I think I need to take back what I wrote.

Stephen said...

It does go against common usage. It would be better to say: "The odds the man is not the assailant are 1 in a trillion."

We usually talk about things in terms of how improbable they are, not how probable they are.

PMM said...

Ann, you're right. Don't take it back!

Odds are a very well-defined probabilistic term. Odds of X to Y imply that in X+Y trials, the trial will succeed Y times and fail X times. The probability of success is Y / (X+Y), and the probability of failure is X / (X+Y). The odds are the ratio of P(fail) to P(succeed).

In this case, AS WRITTEN, odds of a trillion to one mean that the probability of the DNA matching is 1 / (10^12 + 1). So small as to be dispositive. What they really mean is that the odds AGAINST the DNA matching are a trillion to one.
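To make the arithmetic concrete, here is a minimal Python sketch. It assumes the convention that "X to Y" means X failures for every Y successes, which is the reading that produces the 1 / (10^12 + 1) figure; the function name is hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def prob_from_odds_against(x: int, y: int) -> Fraction:
    """Probability of success when 'x to y' is read as odds AGAINST:
    x failures for every y successes (assumed convention)."""
    return Fraction(y, x + y)

TRILLION = 10**12

# "A trillion to one", read as odds against the event.
p_success = prob_from_odds_against(TRILLION, 1)
p_failure = 1 - p_success

print(p_success)  # exact fraction, no floating-point rounding
print(p_failure)
```

Exact fractions are used because a probability this close to zero is easy to lose in floating-point rounding.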

(Anything smaller than 1 in 6 billion means that even a random selection of any person in the world would be more likely than that specific person. Obviously the complementary probability would overwhelmingly disqualify any other individual, which is why this statistic is so damning.)

The best case would be, since (as this thread has proved) people don't understand odds, DON'T USE ODDS. Especially don't use odds when the probability of the event is so close to zero that the odds and the probability are indistinguishable to 11 decimal places.

Dave said...

Ann: I've read that book, too. It's a good one.

sp0t said...

One more post on this, for pmm....

The ratio P(fail) over P(succeed) is defined as the odds against. Flip the fraction to get odds in favor. So "trillion-to-one" odds could mean either, and without the words "in favor" or "against", who knows outside of context.
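A small Python sketch of this point, under the assumption that the article meant a match probability of 10^12 / (10^12 + 1) (the variable `p_match` and the function names are hypothetical): the two ratios are reciprocals of each other, so the same probability can be called "a trillion to one" in favor or "one to a trillion" against.

```python
from fractions import Fraction

def odds_against(p: Fraction) -> Fraction:
    """Ratio P(fail) / P(succeed)."""
    return (1 - p) / p

def odds_in_favor(p: Fraction) -> Fraction:
    """Ratio P(succeed) / P(fail): the same fraction flipped."""
    return p / (1 - p)

# Assumed for illustration: the probability the article seems to intend.
p_match = Fraction(10**12, 10**12 + 1)

against = odds_against(p_match)
in_favor = odds_in_favor(p_match)

print(in_favor)            # a trillion (to one) in favor
print(against)             # one (to a trillion) against
print(against * in_favor)  # the two ratios always multiply to 1
```

Without "in favor" or "against", the bare phrase "a trillion to one" cannot tell a reader which of these two ratios is meant.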

Here in context, it seems clear what the journalist was trying to say.

Thanx, Ann, for the "math class".

PMM said...

Spot, I just totally disagree with your last comment. I know that that is what the journalist meant, but it's not what she said. I read the story the way Ann did, and I'm a trained probabilist, which means I am unusually sensitive to the horrible damage that media does to mathematical concepts. Odds means something specific. The odds against an event are a totally different number than the odds in favor of an event. Yes, they're related, but they are by no means interchangeable, which is what you and dave seem to be arguing.

Stephen said...

pmm: Are you SURE? Don't recall ever learning that in Prob & Stats, nor does a cursory internet search turn up any definition matching yours.

Besides, common usage trumps specialist usage when it comes to what should be printed in an article intended for common people.

I think this wikipedia article states clearly common usage, going so far as to indicate that odds of 0.25 would equate to odds of 4:1 against as spoken by a British bookmaker:

PMM said...

The fact that American books don't quote their odds the way that Wiki says is problematic for you, Stephen. Consider:

There was a guy who put $1000 down on Carolina at 60:1 to win the Super Bowl before the 2003 season started. The casino that took his bet flew him down to Jacksonville for the game and publicized the possibility that he would win $60,000 if the Panthers won.

According to that Wiki, he should have received $17. If you wish to defend that proposition, I will take your bets every day of the week. After all, what usage of odds is more common than gambling?
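The two payouts being contrasted can be checked with a short Python sketch. It assumes the usual betting convention that odds of x to y pay x units of profit for every y units staked; the function name is hypothetical.

```python
from fractions import Fraction

def profit_on_win(stake: int, x: int, y: int) -> Fraction:
    """Profit on a winning bet at odds of x to y,
    assuming x units are won for every y units staked."""
    return Fraction(stake) * Fraction(x, y)

stake = 1000

# 60:1 read as odds against, the way the casino advertised it.
as_quoted = profit_on_win(stake, 60, 1)

# The same numbers read the other way around, as 1:60.
flipped = profit_on_win(stake, 1, 60)

print(as_quoted)                # 60000
print(round(float(flipped), 2)) # 16.67
```

The flipped reading gives roughly $17, which is the figure the comment above attributes to the Wiki's convention.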

sp0t said...

Sorry, PMM....

I definitely did not intend to imply that odds for and odds against are interchangeable. That part is a misunderstanding on your part.

My intent was to say that without the adjectives "for" or "against" there is no way to assign which of the ratios you mean unless from context.

This is also the gist of stephen's comments.

The journalist is definitely NOT wrong here. The journalist is just ambiguous, or unclear. Although, you are right that this can be just as damaging as being outright wrong.

Ann's original comment reflects the fact that she took the odds with an implied "against", which arguably is a more common tack. But within the context of the article, I read the intent of the author immediately.

sp0t said...

One more thing also....

60:1 against is precisely 1:60 for.

In that sense, and WITH proper adjectives of "for" and "against", the order of the odds is interchangeable.

Ann Althouse said...

Doesn't all this show that it should have been written differently? Is there a standardized way to express odds? I'm no longer sure of what I thought I knew, which indicates to me that the form is not standardized enough to use in writing for laypersons.

sp0t said...

It is common usage in gambling, as PMM points out, that odds are typically stated as against, which means that the higher first number indicates higher risk of losing, and hence a higher payoff. I think most people accept odds only in this fashion, especially without the adjective.

Yes, Ann, the author definitely should have worded that part differently, and yes, it is a mistake. It is easy to fly by a spoken mistake since people who chat are looking for meaning rather than phrasing. But in print, it should be accurate.

As far as a universal standard, there is none, really, and the adjective "for" or "against" should always be used. You leave it out at the risk of being ambiguous.

It is like saying that I am now walking forward.... If you do not know which way I am facing, then how would you know which direction I am heading in?

Maybe that is a dumb example, but it is to a journalist's advantage to remove ambiguity from the article.