April 29, 2020

"It’s a Bayesian thing. Part of Bayesian reasoning is to think like a Bayesian; another part is to assess other people’s conclusions as if they are Bayesians..."

"... and use this to deduce their priors. I’m not saying that other researchers are Bayesian—indeed I’m not always so Bayesian myself—rather, I’m arguing that looking at inferences from this implicit Bayesian perspective can be helpful, in the same way that economists can look at people’s decisions and deduce their implicit utilities. It’s a Neumann thing: again, you won’t learn people’s 'true priors' any more than you’ll learn their 'true utilities'—or, for that matter, any more than a test will reveal students’ 'true abilities'—but it’s a baseline."

From "Reverse-engineering priors in coronavirus discourse" by Andrew (at Statistical Modeling, Causal Inference, and Social Science,) via "What’s the Deal With Bayesian Statistics?" by Kevin Drum (at Mother Jones).

Both of these posts went up yesterday, that is, 2 days after I said, "Shouldn't we talk about Bayes' theorem?" I'm not saying I caused that. I'm just saying maybe you should use Bayesian reasoning to figure out if I did. I will stand back and say: this is not my field. I'm only here to encourage it.

36 comments:

daskol said...

An explicitly Bayesian approach--inference based on clearly stated priors, with priors and hypotheses updated as data come in--is particularly useful under conditions of uncertainty where action may need to be taken. It's a helpful approach for inferring whether doing something is better than doing nothing. The most explicitly Bayesian analysis that needs to be conducted right now concerns the interpretation of the various COVID-19 tests. But the approach is also useful for bigger-picture analysis.
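A minimal worked example of that test-interpretation point, with made-up sensitivity, specificity, and prevalence numbers (none of these are estimates for any real COVID-19 test):

```python
# Posterior probability of infection given a positive test, via Bayes' theorem.
# All three inputs below are illustrative assumptions, not real test figures.

sensitivity = 0.90   # P(test positive | infected)
specificity = 0.95   # P(test negative | not infected)
prevalence  = 0.02   # prior P(infected) in the tested population

p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
posterior  = sensitivity * prevalence / p_positive

print(f"P(infected | positive test) = {posterior:.2f}")  # ~0.27
```

With a 2% prior prevalence, roughly three out of four positives are false, which is exactly why the prior matters so much in reading these tests.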

Ficta said...

Whenever somebody starts talking about Bayesian statistics, I pat my pockets to make sure I've still got my wallet. It's not that Bayesian statistics is a fraud or anything like that, though I freely admit that it makes my head spin. It can be legitimate: I once, for instance, saw some very impressive signal processing that used it. But it's such an opportunity for the less scrupulous or less competent practitioner to use "priors" pulled from the orifice of their choice in order to prove anything at all.

Nonapod said...

"I regret saying that the authors of the Santa Clara study 'owe us all an apology.' I stand by my reactions to that paper, but I don't think that politicization is good. In an ideal world, I would not have made that inflammatory statement and the authors would've updated their paper to include more information and recognize the ambiguity of their results."

At least he admits he's just as susceptible to overly hyperbolic reactions as anyone else. That's an improvement.

As he points out, this topic is currently too political. People seem far too invested in the virus being either more or less infectious and deadly, so much so that they're willing to ignore any and all contravening information. I myself suspect and hope that the true percentage of people with antibodies is much higher than the current confirmed numbers indicate, perhaps on the order of 5-30% of the population. That would obviously lower the true fatality rate, and it would also get us much closer to the much-discussed "herd immunity" levels. And we could reopen the economy sooner and start to pick up the pieces of the damage we've wrought.

chickelit said...

Should “Neumann” be “von Neumann”?

DavidUW said...

Here's all you need to know about Bayes stuff:

Consider the turkey.
Consider his experience.
For 729 days he waddles out to the yard and the farmer feeds him every time.

The probability of the farmer coming out to feed him on day 730 is very very very very high, according to Bayes.
Day 730 is Thanksgiving.
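For the record, the turkey's arithmetic checks out; it's the model, not the math, that fails him. A minimal sketch, assuming a uniform Beta(1,1) prior and Laplace's rule of succession:

```python
# The turkey's Bayesian update: a Beta(1,1) prior on "the farmer feeds me today,"
# updated after 729 straight feedings (Laplace's rule of succession).
# The inference is sound; the turkey's hypothesis space just omits Thanksgiving.

feedings = 729
alpha, beta = 1 + feedings, 1 + 0       # Beta posterior: 729 successes, 0 failures
p_fed_day_730 = alpha / (alpha + beta)  # posterior predictive probability

print(f"P(fed on day 730) = {p_fed_day_730:.4f}")  # ~0.9986
```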

Sebastian said...

"I’m arguing that looking at inferences from this implicit Bayesian perspective can be helpful, in the same way that economists can look at people’s decisions and deduce their implicit utilities."

Hey, can I play?

So, we have shutdowns that produce no gain in health, destroy actual health care, and harm the economy.

So, what can we infer about the "priors" (?) of people wanting such a clearly foreseeable result?

Obviously, health was not their priority. Obviously, economic growth was not their priority. Obviously, preserving actual health care was not their priority.

So, what was? Two possibilities: the priorless prior of pure panic, leading to a suspension of any rational thought whatsoever; or the scheming exploitation of a crisis that should not go to waste, using bad science to justify tanking the economy as part of a calculated political assault on Trump, the deplorables, and America generally.

What say you, fellow Bayesians?

daskol said...

This is a really cool website that presents a simple Bayesian "AI" for medical diagnosis. As you continue to input symptoms, it narrows down the diagnosis. If you want to see how Bayesian inference works, this is a terrific example.
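A toy sketch of that kind of sequential updating, a naive-Bayes filter over a handful of conditions. Every disease, symptom, and probability below is invented purely for illustration:

```python
# Sequential naive-Bayes diagnosis: the posterior over (made-up) conditions
# narrows as each (made-up) symptom is entered.

priors = {"flu": 0.5, "cold": 0.4, "strep": 0.1}
likelihood = {  # P(symptom | disease), assumed conditionally independent
    "fever":       {"flu": 0.9, "cold": 0.2, "strep": 0.7},
    "sore_throat": {"flu": 0.4, "cold": 0.5, "strep": 0.95},
}

posterior = dict(priors)
for symptom in ["fever", "sore_throat"]:
    for d in posterior:
        posterior[d] *= likelihood[symptom][d]   # multiply in the new evidence
    total = sum(posterior.values())
    posterior = {d: p / total for d, p in posterior.items()}  # renormalize
    print(symptom, {d: round(p, 3) for d, p in posterior.items()})
```

After "fever" the flu dominates; adding "sore_throat" pulls probability toward strep, which is the narrowing-down behavior the site demonstrates.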

Yancey Ward said...

The author reveals his priors with this:

"There was a study reporting 20% with antibodies in New York City. Nobody thinks that is all or even mostly false positives; obviously a lot of people in the city have been exposed to the virus. There are sampling issues so maybe underlying rate was really only 10% or 15% at the time the study was done . . . we can’t really know!"

Do you see what he did there? Kudos to the first person to mention it.

rhhardin said...

Nobody thinks Kalman filtering anymore. Much better than Bayes. You get the complete insides of the other guy's head.

ccscientist said...

There are two aspects to the Bayesian thing here. The first is your priors regarding epidemics and models thereof: How much should one believe them? Which data can be relied on as input? China turned out to be lying, and Italy was maybe an anomaly. The second regards the cost and risk of the shutdown. It is quite possible that the negative impact of the shutdown (losing your home or business, depression, divorce, suicide) could be worse than the deaths (in some non-computable way). No one really considered this in the heat of the moment of panic. In addition, nuances like not treating ND the same as NYC, or banning large gatherings but not restaurants or retail (as in 1918), were not considered. The choice set was so narrow (shutdown or mass mortality) that more damage was done to the economy (which is not an abstraction, but consists of millions of jobs and lives) than necessary.

Yancey Ward said...

Neumann!!!! 😠

Michael K said...

It's interesting that when I studied it for my medical statistics course, it was referred to as "Bayesian algebra."

Yancey Ward said...

"What say you, fellow Bayesians?"

I would say it ran the range from pure panic to Machiavellian malice.

Josephbleau said...

"Part of Bayesian reasoning is to think like a Bayesian”

And the first rule of the Bayesian club is the first rule of the Bayesian club.

Roger Sweeny said...

The Andrew is Andrew Gelman, a statistics professor at Columbia and one of the major analysts and publicists of the replication crisis in science and sort-of science.

Bay Area Guy said...

@DavidUW,

Here's all you need to know about Bayes stuff:

Consider the turkey.
Consider his experience.
For 729 days he waddles out to the yard and the farmer feeds him every time.

The probability of the farmer coming out to feed him on day 730 is very very very very high, according to Bayes.
Day 730 is Thanksgiving.


Outstanding. I was gonna say some bullshit about a priori expectations, but yours is better.

daskol said...

Andrew Gelman's blog is a treasure.

Josephbleau said...

"P.S. One reason I'd like to shift the discussion from 'bias' or 'motivated reasoning' to 'priors' is that talking about priors is less inflammatory than talking about bias or motivated reasoning. Bad priors give bias, and motivated reasoning can lead to bad priors, but if researchers can identify their priors and their assumptions more specifically, I think this would help, if for no other reason than that some direct reflection can help clarify internal inconsistencies."

This is why I have a problem with this multi-purposing of "Bayesian." If the purpose is only to be non-inflammatory or to encourage introspection, how does that relate to the execution of a mathematical procedure on data?

“Asking people to specify their priors in a statistical analysis is comparable to the thing that people sometimes say in debate: “What evidence would it take you to change your mind on this issue?””

This is very, very astute, and it points out the elephant in the room of Bayesian inference: How much am I going to allow new data to change my previous beliefs on this hypothesis? In my world, new data either confirm or reject my prior beliefs; that is why I am not much of a Bayesian. But I did take a three-day SAS class last November on Bayesian MCMC general linear modeling, just because it is mathematically amusing.

tim maguire said...

Sebastian said...Two possibilities: the priorless prior of pure panic, leading to a suspension of any rational thought whatsoever; or the scheming exploitation of a crisis that should not go to waste, using bad science to justify tanking the economy as part of a calculated political assault on Trump, the deplorables, and America generally

Problem: the people who know the most mostly support the approach. The people arguing that it's all unnecessary, a plot by America-haters, are for the most part armchair quarterbacks on blogs who don't know what they're talking about.

chuck said...

I'll just throw out that Bayesian and frequentist statistics understand probability differently. A Bayesian regards probability as a measure of subjective knowledge; a frequentist regards it as something objective. For instance, a Bayesian might start with a prior that a tossed coin has equal probabilities of landing heads or tails because that represents his complete lack of knowledge. A frequentist, OTOH, will maintain that that is just how coin tosses work.

chuck said...

Nobody thinks Kalman filtering anymore

Of course they do; it is just Bayesian statistics with multivariate Gaussian probabilities :)
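The equivalence is easy to see in one dimension: the Kalman measurement update is exactly Bayes' rule with a Gaussian prior and a Gaussian likelihood. A minimal sketch with arbitrary numbers:

```python
# 1-D Kalman measurement update == Bayesian fusion of two Gaussians.

def kalman_update(mu_prior, var_prior, z, var_meas):
    """Fuse a Gaussian prior N(mu_prior, var_prior) with a measurement
    z ~ N(true state, var_meas); returns the Gaussian posterior (mean, var)."""
    k = var_prior / (var_prior + var_meas)  # Kalman gain
    mu_post = mu_prior + k * (z - mu_prior)
    var_post = (1 - k) * var_prior
    return mu_post, var_post

print(kalman_update(mu_prior=0.0, var_prior=4.0, z=1.0, var_meas=1.0))
# (0.8, 0.8) -- identical to multiplying the two Gaussian densities and renormalizing
```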

gilbar said...

20% with antibodies in New York City. Nobody thinks that is all or even mostly false positives

Mostly false positives would mean MORE than half were false; half of 20% is 10%.

There are sampling issues so maybe underlying rate was really only 10% or 15%


So,
he's saying that (while NOBODY thinks it's mostly false) it's between 25% false and 50% false.
That is between CRAP and MASSIVE CRAP... which is "assuming facts not in evidence."
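Spelling out that arithmetic, taking the quoted 10% and 15% figures at face value against the 20% measured rate:

```latex
\frac{20\% - 15\%}{20\%} = 25\% \ \text{false positives}, \qquad
\frac{20\% - 10\%}{20\%} = 50\% \ \text{false positives}
```

(This reading treats the entire sampling-issue gap as false positives, which is the strongest version of the claim.)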

gilbar said...

isn't it TRUE?
THAT MOST people that use Bayesian reasoning... have PRIOR experience with Bayesian reasoning?

William said...

I went into the comments section here fully expecting that the comments would be above my head. This expectation has proven correct. Another example of Bayesian reasoning?... I contributed something useful to the Wordsworth discussion. We all have our areas of interest.

n.n said...

Bayes' theorem is only as good as the chosen distribution and/or population sample, and assumptions/assertions.

Gabriel said...

@Josephbleau: How much am I going to allow new data to change my previous beliefs on this hypothesis?

If you are doing Bayesian analysis, you are declaring your priors and putting them into the analysis for others to examine. If your prior is a delta function on the point you already decided on in advance, that will be very apparent to your reviewers.

Yes, Bayesian statistics lets you do calculations that only reflect your priors, if you choose, but the whole point is you are declaring your priors and others will reject your inference.

@n.n: Bayes' theorem is only as good as the chosen distribution and/or population sample, and assumptions/assertions.

No different from frequentist statistics. If you don't draw to an inside straight because you know it's a sucker bet, you've made assumptions about how poker works that are not supported by your experience alone, whether or not you are a Bayesian.

Fernandinande said...

I posted here about Gelman's Apr 19 post (h/t Sailer); the "Bayesian" discussion applies to test inaccuracy, mostly false positives.

Fernandinande said...

I'm not saying I caused that.

Gelman and Rushton were talking Bayesian on Apr 19, a week or so before you mentioned it.

Josephbleau said...

“Yes, Bayesian statistics lets you do calculations that only reflect your priors, if you choose, but the whole point is you are declaring your priors and others will reject your inference. “

OK, but really convoluted. So if I do a week's work on an industrial experiment for a lab, they have to look really closely at my priors, because I might have fooled them. Why not just tell them what their experiment demonstrated and make it easy for them? Bayesian analysis is a solution in search of a problem, in my opinion.

gbarto said...

If everyone over 50 were required to take a cyanide capsule, we would see a dramatic decline in COVID-19 deaths. Therefore, everyone who opposes giving cyanide capsules to everyone over 50 thinks it's okay to let COVID-19 claim more lives than necessary.

It's a stupid idea, of course, but if you plug it into the equation, you'll find the likelihood of dying from COVID-19 is greatly reduced if everyone over 50 is given a cyanide capsule.

And this is the problem with the word Bayesian. Ninety-five percent of the time, when you see the word Bayesian, what follows is an attempt to argue that adopting the author's thinking is just logical and scientific. Specifically, it is an attempt to pretend that there are no value judgments to be made, just logical dictates to be followed.

As for the author in this case, it sounds like an attempt to argue that a chain of inferences about others' thinking will reveal why those who disagree with him are intellectually broken, but don't sweat it, because we're using nicey-nicey terminology to say that their reasoning process is defective while the author's is logical. What about Bayes' theorem itself? It can be very useful when you have solid data sets with binary outcomes to compare. As others have noted, regression analysis with a large number of other variables thrown in would do better at evaluating whether a certain group of factors is relevant.

gbarto said...

Okay, I went back and read the articles underlying the post.

1) I agree with Kevin Drum (did I say that?) about Bayesian analysis.

2) In the main piece, the author did a satisfactory job of explaining why, statistically, it was hard to come up with the same numeric estimates the study authors did without some underlying assumptions/information not documented in their study.

3) Calling the assumption that people's assessment of information is affected by their biases "implicit Bayesian perspective" is grandiloquent BS of the highest order. The whole bit about Bayesian perspective could easily be said by anyone with basic knowledge of human psychology without invoking Bayes.

Elder Nerd said...

Bayes' theorem is an algebraic relationship that describes how new data can be combined with prior data. In the best case, the prior data is extensive and specific, so the effect of new observations will be small. In public policy matters, the prior data may be a matter of expert opinion or even personal opinion, and the effect of new data can be quite variable. Since we are often inclined to give more weight to data that confirms our prior opinions, it may be possible to explain diametrically opposed opinions as the result of "Bayesian" reasoning.

In DavidUW's earlier turkey example, the prior data is incomplete and the Thanksgiving update would cause a Bayesian to make a large revision in the mortality curve.

Beaumont said...

Not a scientist, mathematician, or economist, but from what I gather, Bayesian reasoning embodies a set of procedures for trying to understand what is going on: you start with your best current understanding (which might suck, but you've got to start somewhere) and make improvements as new (and better) data come in. You've got to be flexible enough to revise your thinking when called for by the data. Ultimately, this approach should lead to better predictions.

For example, I'll start with this initial premise: "Bayesian reasoning is never used by politicians, pundits, and conspiracy theorists," and I will make adjustments as needed when new and improved information presents itself.

narciso said...

So what value is one assuming to plug into a Bayesian analysis?

Josephbleau said...

narciso,

The right way to assign a Bayesian prior is to account for what you already believe about an effect. For example, if I believe an effect is non-negative, I might use a gamma distribution as its prior, because a gamma can't go negative. The stronger I make the prior, by making its variance small, the more it influences the final parameter; if I make the prior very strong, the new data from my experiment have less and less influence. In my example, if the experimental data violate my non-negative assumption and the value is really negative, my choice of a non-negative prior distribution will force the parameter estimate to be positive anyway, just because that is how I defined it, not because the data showed it. This is why Bayesian analysis must be watched closely. It works in situations where there is a large history of past experiments that indicate a certain result, and you want your new data to be influenced by the old. In a way, it's like a meta-study.

In a statistics department, the Bayesian guy is the cool kid who wants to be edgy, so he will spend a lot more computer time doing Bayesian analysis that ends up giving nearly the same answer everyone else gets by frequentist work. Except that in a few cases Bayesian analysis has no closed-form solution, and you need to do a huge numerical integral to proceed; this limits your analysis to smaller problems with today's computers. Sorry, I think I have now said more than anyone wants to hear about Bayesian statistics.
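A grid-based sketch of the gamma-prior effect described above. Every number here (prior shape, noise scale, the four observations) is made up; the only point is that a prior with support on [0, ∞) cannot yield a negative estimate no matter what the data say:

```python
# A Gamma prior on an effect forces the posterior to stay non-negative,
# even when the observations hint that the true effect is negative.
import numpy as np
from scipy import stats

theta = np.linspace(1e-6, 5, 2000)               # candidate effect sizes, >= 0
prior = stats.gamma.pdf(theta, a=2, scale=0.5)   # Gamma prior with mean 1.0

data = np.array([-0.3, -0.1, 0.2, -0.4])         # made-up data, sample mean -0.15
likelihood = np.ones_like(theta)
for y in data:
    likelihood *= stats.norm.pdf(y, loc=theta, scale=0.5)  # Gaussian noise model

posterior = prior * likelihood
posterior /= posterior.sum()                      # normalize on the uniform grid

print(f"posterior mean effect: {(theta * posterior).sum():.3f}")  # positive by construction
```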

Paul Snively said...

There's quite a lot of misapprehension about Bayesian probability out there, apparently.

First, there are indeed several schools of thought about probability, beyond just the usual "frequentist" vs. "Bayesian" camps. This review of Jaynes' "Probability Theory: The Logic of Science" breaks them down very well.

While what the reviewer says is true, the review neglects to point out that Jaynes goes well beyond this description. Fully the second half of the book consists of Jaynes' elucidation of three means of constructing prior probability distributions given nothing but the structure of the problem at hand: transformation groups, marginalization, and the maximum entropy principle, the last of which he introduced in 1957.

Some have heard the message. Chapter 5 of Data Analysis: A Bayesian Tutorial concerns how to assign probabilities at the beginning of an analysis—that is, how best to assess prior probabilities given the data and background information at hand. It's worthwhile if for no other reason than providing a more foundational, logic-based explanation of why some common distributions (Gaussian, Poisson, binomial, gamma...) are appropriate, to whatever degree, for certain classes of problems. But the real point is to be able to construct your own prior distribution from scratch. Chapter 7 deals with experimental design: how should one design an experiment that will yield the most additional information to guide inferences?

It's worth adding that the book includes several sidebar discussions on writing software to perform the kinds of analysis under discussion, including dealing with computational complexity issues.

Finally, Statistical Rethinking: A Bayesian Course with Examples in R and Stan, the 2nd edition of which was just published in March, does a spectacular job of explaining all of this in even more nuts-and-bolts terms. Practically everything is explained, justified, and computed, again with an eye toward allowing you to develop your own models and draw your own inferences with confidence in the underlying principles, and, crucially, with reproducibility guaranteed merely by running the same program on the same data.

Honestly, it's a bit depressing to me to see so many people so cavalierly dismissive of Bayesian probability, because there's quite frankly no other game in town: this is how sound reasoning in the presence of uncertainty is done. It's true that if you claim the prior probability of a proposition is 0.0 or 1.0, Bayes' theorem will never give you a posterior probability of anything different, but that's as it should be: it means that, in the continuum limit, Bayes' theorem goes over into good ol' modus ponens.
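That last point can be read straight off the theorem. Writing Bayes' rule with the denominator expanded:

```latex
P(H \mid D) = \frac{P(D \mid H)\,P(H)}{P(D \mid H)\,P(H) + P(D \mid \neg H)\,\bigl(1 - P(H)\bigr)}
```

With P(H) = 0 the numerator vanishes and the posterior is 0 for any data; with P(H) = 1 the denominator collapses to P(D | H) and the posterior is 1. Evidence can only move a prior that leaves room for doubt.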

What I really hope is that more intelligent and curious amateurs take the opportunity the above works offer to become fluent in probability, take the data that's publicly available, and construct and share their own models, to help with the reproducibility crisis as well as the overreliance on "expert" opinion and credentialism we find in the media. You do not even need a degree—in anything—to learn this material and put it to good use.