Cooperation Un-Veiled

Contractualism tries to derive morality from an agreement that even selfish agents would willingly sign if they knew about it. In theory, you would gain from such an agreement, since the costs of not being able to behave unethically towards others would be at least balanced by the benefits of other people not behaving unethically to you.

Such attempts crash into the brick wall that not everybody would, in fact, sign such an agreement. For example, the King might reasonably argue that he is able to reap the benefits of oppressing lots of people, but almost nobody can oppress him. To give another example, rich people might feel no need to give to charity, since they don’t need anyone else to give charity to them.

One classic solution to the problem is Rawls’ “veil of ignorance”. Rawls asks: what if we have to make the agreement before we know who exactly we’re going to be? The future King, not knowing he will be born a King, will agree oppression is bad along with everyone else; the future rich, not knowing they will be rich, will want to create a strong social safety net and tradition of charitable giving.

The great thing about this thought experiment is that it works pretty well to get us what we want – assuming a veil at just the right spot, we end up with something like utilitarianism being in everyone’s best interests.

The bad thing about the thought experiment is that there is not, in fact, a veil of ignorance. There’s just a King, who when asked will tell you he knows perfectly well he’s a King and would like to keep on oppressing people. So what can we do with the universe we actually have?

Here’s a model I have been playing around with recently.

Suppose there is a society of one hundred men, conveniently named Mr. 1, Mr. 2, and so on to Mr. 100. Higher-numbered people are stronger than lower-numbered people, such that a higher-numbered person can always win fights against a lower-numbered person at no danger to themselves. Further, suppose this society has a god who enforces all oaths and agreements, but who otherwise stays out of the picture.

(In order to avoid finicky math distinctions between choosing with replacement and choosing without replacement, it might help to think of these as arbitrarily large clans of people with specified strengths instead. Whatever.)

This society is marked by interactions where two randomly selected people meet each other. Sometimes the people nod at each other and pass each other by. Other times, the stronger of the two people overpowers the weaker one and oppresses them in some way, where an oppression is an interaction where the stronger person gains and the weaker person loses some utility.

One person proposes a rule: “no oppressing anyone else.” How much support does the rule get?

Well, that depends on the character of the oppression. Some oppression can give the oppressor exactly as much utility as it costs the victim – for example, I steal $10 from you, making me $10 richer and you $10 poorer. Other oppression can cost the victim more than it benefits the oppressor – for example, I steal your wallet, which gives me only whatever small change you have in there, but you have to replace all your credit cards and licenses and so on. Still other oppression could help the oppressor more than it hurts the victim – for example, starving Jean Valjean steals a loaf of bread from a rich man.

So let’s be more specific. One person proposes a rule: “No zero-sum oppression.” Who agrees?

Naively – and I’ll challenge this later – Mr. 1 through Mr. 50 agree, but Mr. 51 through Mr. 100 refuse. Analyzing Mr. 25’s thought process should explain: “In 25% of interactions, I will be the oppressor. In 75%, I will be oppressed. Assuming one of my utils for one of their utils, that means in a hundred interactions I will on average lose fifty utils. Therefore, I should ban this type of interaction.”

Mr. 99, on the other hand, likes this kind of oppression. He thinks “In 99% of interactions, I will gain. In 1%, I will lose. So in a hundred zero-sum interactions, I will on average gain 98 utils. Therefore, I like this type of interaction.”
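The arithmetic in these two monologues can be checked with a minimal sketch. It assumes the same approximation the two men use (Mr. k is the oppressor in k% of interactions and the victim in the rest); the function name `net_utils_per_100` is made up for illustration.

```python
def net_utils_per_100(k, gain=1.0, loss=1.0):
    """Mr. k's expected net utils over 100 interactions if the oppression stays legal.

    Assumption (taken from the monologues above): Mr. k is the oppressor in
    k% of interactions and the victim in the other (100 - k)%.
    """
    return k * gain - (100 - k) * loss


# Zero-sum oppression: one util gained by the oppressor per util lost by the victim.
for k in (25, 50, 99):
    print(f"Mr. {k}: {net_utils_per_100(k):+.0f} utils per 100 interactions")

# Everyone who expects a net loss has a reason to support the ban.
supporters_of_ban = [k for k in range(1, 101) if net_utils_per_100(k) < 0]
print(f"Supporters: Mr. {supporters_of_ban[0]} through Mr. {supporters_of_ban[-1]}")
```

Under this approximation Mr. 50 breaks even exactly, so whether he counts as a signer is a tie-break rather than a strict preference.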

But Mr. 99 might have a different rule he would agree to. He might say “No oppression so bad that it hurts the victim >100x as much as it helps the oppressor.”

It’s easy to think of examples of this kind of oppression. For example, if I’m having a really bad day and just want to beat someone up, breaking your ribs might make me feel a little bit better, but probably not even one percent as much as it makes you feel worse.

Mr. 99 thinks “In 99% of interactions I will be the oppressor; in 1% I will be the victim. Each time I am the oppressor, I gain one util; each time I am the victim, I lose 100. Therefore, in 100 interactions I will lose on average one util. Therefore, I don’t like this kind of oppression.”

And it’s easy to see that Mr. 1 through Mr. 98 will agree with him and be able to sign this contract.

The logical conclusion is a hierarchy of agreements. Mr. 1 signs an agreement banning all oppression, Mr. 1 and 2 together sign an agreement banning oppression that helps the oppressor less than 50 times as much as it hurts the victim, Mr. 1 and 2 and 3 together sign an agreement banning oppression that helps the oppressor less than 33 times as much as it hurts the victim, and so on all the way to everyone except Mr. 100 signing an agreement banning oppression that helps the oppressor less than 1/100 as much as it hurts the victim. Mr. 100 signs no agreements – why would he?
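Here is a rough sketch of where each rung of that hierarchy sits. It assumes, as above, that Mr. k wins k% of his interactions, so he breaks even on an oppression when k × gain = (100 − k) × loss. The helper names `break_even_ratio` and `signs_ban` are hypothetical.

```python
from fractions import Fraction


def break_even_ratio(k, n=100):
    """Gain/loss ratio at which Mr. k neither gains nor loses from an oppression.

    Assumption: Mr. k is the oppressor in k of every n interactions.
    Solving k*gain - (n - k)*loss = 0 gives gain/loss = (n - k)/k.
    Mr. k wants to ban any oppression whose ratio falls below this.
    """
    return Fraction(n - k, k)


def signs_ban(k, gain, loss, n=100):
    """True if Mr. k expects a net loss from this kind of oppression."""
    return k * gain - (n - k) * loss < 0


for k in (2, 3, 50, 99, 100):
    print(f"Mr. {k} bans oppression with gain/loss below {break_even_ratio(k)}")

# Mr. 99's rule: oppression that costs the victim 100 utils per util gained.
signers = [k for k in range(1, 101) if signs_ban(k, gain=1, loss=100)]
print(f"1:100 ban signed by Mr. {signers[0]} through Mr. {signers[-1]}")
```

The exact thresholds under this approximation are (100 − k)/k, which is close to the round figures of 50, 33 and 1/100 used in the text. Mr. 100's threshold is zero, which is another way of saying he signs nothing.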

Before I explain why this doesn’t work, I want to think about what it means in real world terms.

It would replace the one-size-fits-all principle of utilitarianism with the idea of power-based utility ratios. This seems to kind of map on to real life experience. For example, the King may order his servant to spend hours getting the floor polished absolutely spotlessly. Having a perfectly spotless floor (rather than a very clean floor with exactly one spot) gives the king only a tiny utility gain, but may require many more hours of the servant’s time and labor. That the King can command a large amount of the servant’s utility to improve his own utility only a tiny bit seems a lot like what it means to say there’s a power differential between the King and the servant. If the servant tried to reduce the King’s utility by a large amount in order to improve his own utility by a tiny amount, he would be in big trouble.

I notice this in my own life as well. Last year I worked under a doctor who was consistently late. The way it would work was that he would say “I have a meeting at 8 AM every morning, so you should be in by 9 so we can start work together.” Then his meeting would invariably run to 10, and I would be left sitting around for an hour doing nothing. It might seem that the smart choice would have been for me to just sleep late and arrive at 10 anyway, but suppose one day a week, my boss’ meeting finishes exactly on time. Then if I’m not there, he has to wait for me, and he considers this unacceptable. So if my boss and I value an hour of our times the same amount, it would seem this arrangement implies my boss’ utility is worth at least seven times as much as my own.

There are some features of this power-ratio utilitarianism that are repugnant: the rich seem to be held to a very low standard, whereas the poorer you are, the more exacting a moral standard you’ve got to live up to. That seems like if anything the opposite of how it should be. But other features actually seem better than our current morality – if giving charity to the poor improves their utility 100x as much as it decreases yours, then the 1% have to donate, probably quite a lot.

Enough of that. The reason this doesn’t work is simple. Mr. 1 through Mr. 50 would want to sign the zero-sum agreement. But if Mr. 50 knows the rules of the thought experiment, he can predict that Mr. 51 through Mr. 100 won’t sign it. None of the people who could conceivably oppress him will consider themselves bound by the rule. So he’s not trading his right to oppress others in exchange for others giving up their right to oppress him; he’s giving up his right to oppress others while still expecting exactly the same amount of oppression as before. Therefore, he does not sign.

But now Mr. 49 is in the same position. He knows nobody stronger than he is, including Mr. 50, will sign the agreement. Thus the agreement is useless to him.

And so on by induction all the way to Mr. 2 refusing to sign (it doesn’t matter much for poor Mr. 1 either way).

This produces some weird results. Mr. 99 is no longer willing to accept his “No breaking people’s ribs just to let out some stress” agreement that banned utility exchanges worse than 1:100, because the only person whose help he wants, Mr. 100, isn’t going to sign. That means Mr. 98 won’t sign, Mr. 97 won’t sign, and again, so on all the way down to Mr. 2.
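The unraveling can be sketched as a loop. The sketch assumes one simplified decision rule: a would-be signer stays in only if at least one person stronger than him is also still in, since otherwise the agreement protects him from nobody. The function name `unravel` is made up for illustration.

```python
def unravel(naive_signers):
    """Repeatedly drop every signer who has no stronger co-signer.

    Assumption: an agreement is worthless to Mr. k unless someone who could
    oppress him (a higher-numbered man) is bound by it too.
    Returns the surviving signers and the number of rounds it took.
    """
    signers = set(naive_signers)
    rounds = 0
    while True:
        unprotected = {k for k in signers if not any(j > k for j in signers)}
        if not unprotected:
            break
        signers -= unprotected
        rounds += 1
    return signers, rounds


# Zero-sum ban: naively signed by Mr. 1 through Mr. 50.
print(unravel(range(1, 51)))

# Mr. 99's 1:100 ban: naively signed by Mr. 1 through Mr. 99.
print(unravel(range(1, 100)))
```

Each round only the strongest remaining signer drops out, so the collapse takes as many rounds as there were signers. The sketch also drops Mr. 1 at the end, although as noted above it doesn't matter much for him either way, since he has nobody to oppress.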

In other words, even the second weakest person in a society has no interest in signing an agreement not to punch people weaker than himself when he’s having a bad day.

But this is a stupid result!

It reminds me of a problem noticed in Iterated Prisoner’s Dilemma. Conventional wisdom says the best thing to do is to cooperate on a tit-for-tat basis – that is, we both keep cooperating, because if we don’t the other person will punish us next turn by defecting.

But it has been pointed out there’s a flaw here. Suppose we are iterating for one hundred games. On Turn 100, you might as well defect, because there’s no way your opponent can punish you later. That means both sides should always play (D,D) on Turn 100. But since you know on Turn 99 that your opponent will defect next turn whatever you do, they can’t punish you any worse if you defect now. So both sides should always play (D,D) on Turn 99. And so on by induction to everyone defecting the entire game. I don’t know of any good way to solve this problem, although it often doesn’t turn up in the real world because no one knows exactly how many interactions they will have with another person. Which suggests one possible solution to the original problem is for nobody to know the exact number of people.
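For anyone who wants to see the induction run, here is a minimal sketch. The payoff numbers (5 for defecting against a cooperator, 3 for mutual cooperation, 1 for mutual defection, 0 for being the sucker) are the conventional textbook values and are an assumption, as are the function names. The key step is that once every later turn is already pinned down, the future payoff is the same whatever I do now, so only the current turn matters.

```python
# Assumed textbook payoffs: PAYOFF[(my_move, their_move)] is my payoff.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def dominant_action(continuation):
    """Return the move that is best against every opposing move, if one exists.

    `continuation` is the payoff from all later turns. It is added to both
    options equally because later play is already fixed by the induction.
    """
    best = {
        their: max("CD", key=lambda mine: PAYOFF[(mine, their)] + continuation)
        for their in "CD"
    }
    choices = set(best.values())
    return choices.pop() if len(choices) == 1 else None


def backward_induction(turns):
    """Work from the last turn back to the first, fixing each turn's play."""
    continuation = 0
    plan = {}
    for turn in range(turns, 0, -1):
        action = dominant_action(continuation)
        plan[turn] = (action, action)
        continuation += PAYOFF[(action, action)]
    return plan, continuation


plan, total = backward_induction(100)
print(plan[100], plan[99], plan[1], total)
```

With these assumed payoffs each player ends up with 100 points over the hundred turns, against the 300 they would each get from cooperating throughout, which is what makes the result feel so wrong.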

(now I want to write a science fiction novel about a planet full of aliens who are perfect game theorists, but who always behave kindly and respectfully to one another. Then some idiot performs a census, and the whole place collapses into apocalyptic total war.)

It seems like there ought to be some kind of superrational basis on which the two sides in the iterated-100 Prisoner’s Dilemma can cooperate. And along the same lines there ought to be some kind of superrational basis upon which everyone in the society of 100 people should stick to some basic utility-ratio principles. But I’m not sure what it would be.

Some other variations of this problem might be more interesting, but I don’t think I’ve got the math ability or the time to think about them as carefully as they deserve:

1. What if all fights contained a random element? For example, suppose your chance of overpowering someone else (and thus being able to oppress them) was your_strength/(your_strength + opponent_strength)? In societies of this type, agreements to ban strongly negative-sum interactions would be more salient for everyone, since even Mr. 100 would have some chance of being beaten in a typical interaction.

2. How about a meta-agreement, in which people say “I agree to sign the agreements requested by people weaker than myself if and only if the people above me agree to sign the agreements benefitting people weaker than they?” Such an agreement wouldn’t make sense for Mr. 100, and so Mr. 99 would not sign, and so on down, but is there a superrational solution?

3. What if one type of agreement people were allowed to make was a coalition to gang up against opponents? This seems one of the most important real-world considerations – one of the things that does make Kings behave at least somewhat morally is the knowledge that they will be overthrown if they do not; likewise, some countries implement social welfare systems with the explicit goal of decreasing the poor’s incentive to overthrow the rich (I think Bismarck tried this). On the other hand, it also gives the powerful an incentive to band together to better oppress the weak. I’m pretty sure the effects of this would be impossible to really calculate, but might we lump them together into saying “This is so nondeterministic that no one can ever be sure they’ll end up in the winning as opposed to the losing coalition, therefore they are less certain of victory, therefore they should be more likely to agree to rules against oppression”?
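I haven’t worked through any of these properly, but the first variation is at least easy to poke at numerically. The following is a rough sketch, not a real analysis. It assumes Mr. k’s strength is simply the number k (the setup only says higher numbers are stronger), that opponents are drawn uniformly from the other 99 men, and that the function names are made up for illustration.

```python
def win_probability(mine, theirs):
    """Chance of overpowering an opponent: my strength over combined strength."""
    return mine / (mine + theirs)


def average_win_probability(k, n=100):
    """Mr. k's chance of winning against a uniformly random other person.

    Assumption: Mr. j's strength is just the number j.
    """
    others = [j for j in range(1, n + 1) if j != k]
    return sum(win_probability(k, j) for j in others) / len(others)


def expected_net(k, gain, loss, n=100):
    """Mr. k's expected utils per interaction if this oppression stays legal."""
    p = average_win_probability(k, n)
    return p * gain - (1 - p) * loss


def break_even_ratio(k, n=100):
    """Gain/loss ratio below which Mr. k would rather ban the oppression."""
    p = average_win_probability(k, n)
    return (1 - p) / p


print(f"Mr. 100 wins {average_win_probability(100):.1%} of fights")
print(f"Mr. 100 on 1:100 oppression: {expected_net(100, gain=1, loss=100):+.2f} utils")
print(f"Mr. 100 break-even gain/loss ratio: {break_even_ratio(100):.3f}")
```

Under these assumptions even Mr. 100 loses a meaningful share of his fights, so his break-even ratio is well above zero and he has a selfish reason to sign bans on strongly negative-sum oppression. The unraveling argument therefore has no obvious place to start.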