Feb 01, 2022
acx

Motivated Reasoning As Mis-applied Reinforcement Learning

Scott analyzes motivated reasoning as misapplied reinforcement learning, explaining how it might arise from the brain's mixture of reinforceable and non-reinforceable architectures.
Scott explores the concept of motivated reasoning as misapplied reinforcement learning in the brain. He contrasts behavioral brain regions that benefit from hedonic reinforcement learning with epistemic regions where such learning would be detrimental. The post discusses how this distinction might explain phenomena like 'ugh fields' and motivated reasoning, especially in novel situations like taxes or politics where brain networks might be placed on a mix of reinforceable and non-reinforceable architectures. Scott suggests this model could explain why people often confuse what is true with what they want to be true.

Here’s something else I got from the first Yudkowsky-Ngo dialogue:

Suppose you go to Lion Country and get mauled by lions. You want the part of your brain that generates plans like “go to Lion Country” to get downgraded in your decision-making algorithms. This is basic reinforcement learning: plan → lower-than-expected hedonic state → do plan less. Plan → higher-than-expected hedonic state → do plan more. Lots of brain modules have this basic architecture; if you have a foot injury and walking normally causes pain, that will downweight some basic areas of the motor cortex and make you start walking funny (potentially without conscious awareness).
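As a minimal sketch of that update rule (not a claim about how any brain region actually computes it), the plan → surprise → reweighting loop might look like this. The function name, the plan names, the numbers, and the learning rate are all made up for illustration.

```python
def update_plan_weight(weight, expected, actual, learning_rate=0.5):
    """Nudge a plan's weight by the hedonic prediction error (actual - expected)."""
    prediction_error = actual - expected
    return weight + learning_rate * prediction_error


# Hypothetical starting weights for two plans.
plan_weights = {"go_to_lion_country": 1.0, "stay_home": 1.0}

# Getting mauled: the outcome is far worse than expected, so the plan is downgraded.
plan_weights["go_to_lion_country"] = update_plan_weight(
    plan_weights["go_to_lion_country"], expected=0.5, actual=-1.0
)

# A quiet evening that goes slightly better than expected: the plan is upgraded a little.
plan_weights["stay_home"] = update_plan_weight(
    plan_weights["stay_home"], expected=0.2, actual=0.4
)
```

The only thing that matters here is the sign of the prediction error: worse than expected pushes the plan down, better than expected pushes it up.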

But suppose you see a lion, and your visual cortex processes the sensory signals and decides “Yup, that’s a lion”. Then you have to freak out and run away, and it ruins your whole day. That’s a lower-than-expected hedonic state! If your visual cortex were fundamentally a reinforcement learner, it would learn not to recognize lions (and then the lion would eat you). So the visual cortex (and presumably lots of other sensory regions) doesn’t do hedonic reinforcement learning in the same way.
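To make the failure mode concrete, here is a toy sketch of what would happen if a lion detector were trained on how its reports feel rather than on whether they are accurate. Everything in it (the function, the propensity variable, the fear cost, the learning rate) is a hypothetical illustration, not a model of real visual cortex.

```python
def train_detector_on_hedonics(report_propensity, encounters,
                               fear_cost=1.0, learning_rate=0.2):
    """Return the propensity to report "lion" after hedonic-only training.

    Every report of a lion is followed by fear, so each encounter
    punishes the act of reporting in proportion to how often it happens.
    Accuracy never enters the update at all.
    """
    for _ in range(encounters):
        punishment = learning_rate * fear_cost * report_propensity
        report_propensity = max(0.0, report_propensity - punishment)
    return report_propensity


# Start as a nearly perfect detector, then meet twenty real lions.
after_training = train_detector_on_hedonics(report_propensity=0.99, encounters=20)
```

Under these assumptions the detector ends up reporting almost none of the lions it sees, even though every one of its earlier reports was correct.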

So there are two types of brain region: basically behavioral (which hedonic reinforcement learning makes better), and basically epistemic (which hedonic reinforcement learning would make worse, so they don’t do it).

But it’s a fuzzy distinction. Suppose that out of the corner of your eye, you see a big yellowish blob. Is it a lion? To find out, you’d have to turn your head. Turning your head is a good idea and you should do it. But it’s going to involve a pretty decent chance that you see a lion and then your day is ruined. Turning your head is a behavior and not a theory, but it’s a pretty epistemic behavior. Do you do it or not? I think in this situation most people would head-turn. But it looks a lot like a class of problems people actually have trouble with - e.g. they’re pretty sure they’re behind on their taxes, so they dread opening their budgeting program to check, and then their finances just get worse and worse (Roko Mijic calls this an “ugh field”).

Speculatively, maybe taxes are such a novel situation that they get spread across different brain architecture types: some of them end up on nonreinforceable architecture, other parts on reinforceable architecture. It can’t be 100% reinforceable, or else you could train yourself into thinking your taxes were completely done and no IRS nastygram could ever convince you otherwise. But if it’s 5% reinforceable, it could at least teach you the behavior of not checking.
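Here is one speculative way to sketch the "5% reinforceable" idea. The belief itself updates only on evidence, but the behavior of checking is punished by dread, scaled by the reinforceable fraction. The function name, the weekly debt figure, and the update rules are all assumptions made up for the example.

```python
def run_tax_season(reinforceable_fraction, weeks, dread=1.0):
    """Return (check_propensity, gap between actual and believed debt)."""
    check_propensity = 1.0  # behavioral: how readily you open the budgeting program
    believed_debt = 0.0     # epistemic: what you think you owe
    actual_debt = 0.0
    for _ in range(weeks):
        actual_debt += 100.0  # hypothetical: things get a bit worse every week
        # Epistemic part: belief moves toward the truth only as far as you look.
        believed_debt += check_propensity * (actual_debt - believed_debt)
        # Behavioral part: the dread of looking trains checking down,
        # but only through the reinforceable share of the system.
        check_propensity *= 1.0 - reinforceable_fraction * dread
    return check_propensity, actual_debt - believed_debt


fully_epistemic = run_tax_season(reinforceable_fraction=0.0, weeks=60)
slightly_reinforceable = run_tax_season(reinforceable_fraction=0.05, weeks=60)
```

In this toy version the belief update is never corrupted directly. A fully epistemic system keeps checking and stays accurate, while the 5% case mostly stops looking, and its picture of the debt drifts away from reality as a side effect.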

Motivated reasoning is the tendency for people to believe comfortable lies, like “my wife isn’t cheating on me” or “I’m totally right about politics, the only reason my program failed was that wreckers from the other party sabotaged it”. In this model, it’s got to be what happens when you try to run epistemics on partly-reinforceable architecture. Checking whether your political program worked or not involves a lot of behaviors analogous to head-turning: what sources to check, how much attention to pay to each. It also involves purely epistemic behaviors, like deciding how hard to update on each contrary fact, or whether or not to make excuses.

Maybe thinking about politics - like doing your taxes - is such a novel modality that the relevant brain networks get placed kind of randomly on a bunch of different architectures, and some of them are reinforceable and others aren’t. Or maybe evolution deliberately put some of this stuff on reinforceable architecture in order to keep people happy and conformist and politically savvy.

This question - why does the brain so often confuse what is true with what I want to be true? - has been bothering me for years. I think this explanation is obvious, almost tautological. I get the impression that Eliezer and Roko have both known it for ages, but it was new to me. If there’s other research on which parts of the brain are / aren’t reinforceable, or how to run your thoughts on one kind of architecture vs. the other, please let me know.
