99LDT x 1CDT oneshot PD tournament as arguable counterexample to LDT doing better than CDT

[summary: As a counterexample to the assertion that Logical decision theories handle a strictly larger fair problem class than CDT, [wei_dai Wei Dai] proposed the example of 1 CDT agent playing 99 LDT agents in a oneshot Prisoner's Dilemma without knowledge of the other agents' code. The LDT agents must cooperate uniformly to avoid accidentally defecting against each other; the CDT agent can defect against all of them.

Counterargument: Imagine a CDT agent and an LDT agent see a $20 bill lying in the street. Two days previously, Omega has already bombed an orphanage if the LDT algorithm's output is to pick up the $20 bill. The LDT agent refuses the money, and the CDT agent happily picks it up.

From the standpoint of a CDT agent, this situation seems fair because the CDT agent thinks that Omega has already left, and so believes the LDT agent would get the same payoff for the same physical act. But from the LDT agent's perspective, a behavior from one algorithm is being penalized while the identical behavior from another algorithm is being rewarded. Similarly, on the LDT view, an LDT agent and a CDT agent that each face 98 LDT agents in a PD tournament are facing an 'unfair' asymmetric correlation that the CDT agent doesn't care about.]

One of the arguments supporting Logical decision theories over Causal decision theories is the assertion that LDT handles a strictly wider problem class and therefore [ dominates]: Arguendo, we lose nothing by moving from CDT to LDT, and instead gain the ability to handle a strictly wider class of problems.

[wei_dai Wei Dai] posed the following counterexample: Suppose that 99 LDT agents and 1 CDT agent are playing a tournament of oneshot Prisoner's Dilemma. Then the LDT agents calculate that they will lose a great deal by defecting (since this would make each of them defect against the 98 other LDT agents), while the CDT agent cheerfully defects (and does not move in unison with the LDT agents).

To make this example more precise, we might instead compare an LDT agent and a CDT agent that each play the game "Me and 98 LDT agents in a PD tournament." (In the original example, the CDT agent is facing 99 LDT agents, while each LDT agent is facing 98 LDT agents and 1 CDT agent, which technically breaks the fairness of the problem.)
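As a rough illustration of the original 99-versus-1 tournament, the totals can be tallied in a few lines of Python. This is only a sketch under stated assumptions: the article does not fix a payoff matrix, so the values below are borrowed from the comment thread at the end of this page (CC: 2/2, CD: 0/3, DD: 1/1), and the names `PAYOFF` and `round_robin_scores` are hypothetical. It also assumes every agent plays one oneshot PD against every other agent.

```python
# Hypothetical payoffs (my score, keyed by (my move, their move)),
# borrowed from the comment thread: CC -> 2, CD -> 0, DC -> 3, DD -> 1.
PAYOFF = {("C", "C"): 2, ("C", "D"): 0, ("D", "C"): 3, ("D", "D"): 1}


def round_robin_scores(moves):
    """Each agent plays one oneshot PD against every other agent.

    `moves` holds one fixed move ("C" or "D") per agent.
    Returns the total score of each agent, in the same order.
    """
    return [
        sum(PAYOFF[(mine, theirs)] for j, theirs in enumerate(moves) if j != i)
        for i, mine in enumerate(moves)
    ]


# 99 LDT agents cooperate uniformly; the lone CDT agent defects.
scores = round_robin_scores(["C"] * 99 + ["D"])
ldt_score, cdt_score = scores[0], scores[-1]

# Counterfactual: the LDT agents all defect in lockstep instead.
ldt_if_defecting = round_robin_scores(["D"] * 100)[0]

print(ldt_score, cdt_score, ldt_if_defecting)  # 196 297 99
```

Under these assumed payoffs, uniform cooperation is still the better lockstep policy for the LDT agents (196 rather than 99), yet the CDT agent ends up with 297, which is the apparent advantage the counterexample points at.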

There are two counterarguments to the claim that this example favors CDT as a decision theory:

First, imagine inserting a single logical-style agent with some other slightly variant algorithm, LDT-Prime, such that it no longer sees itself as moving in lockstep with the other LDT agents in the PD tournament. Then LDT-Prime will do as well as the CDT agent in the tournament, and do better on other Newcomblike problems. This argues that the particulars of the CDT algorithm were not what gave the CDT agent its apparent advantage.
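The LDT-Prime point can be sketched numerically for the game "Me and 98 LDT agents in a PD tournament". The sketch below rests on assumptions not stated in the article: the payoff values are taken from the comment thread at the end of this page, the 98 LDT agents are assumed to cooperate unless the entrant's own choice moves them in lockstep, and `slot_score` is a hypothetical helper name. The only thing that varies between entrants is whether the field mirrors the entrant's move.

```python
# Hypothetical payoffs (my score, keyed by (my move, their move)).
PAYOFF = {("C", "C"): 2, ("C", "D"): 0, ("D", "C"): 3, ("D", "D"): 1}

N_FIELD = 98  # the 98 LDT agents already in the tournament


def slot_score(in_lockstep):
    """Best total score for an entrant in "me and 98 LDT agents".

    in_lockstep=True  : the entrant runs the field's algorithm, so whatever
                        move it picks, the field picks the same move.
    in_lockstep=False : the field's move is independent of the entrant's
                        choice and is assumed to be "C".
    """
    if in_lockstep:
        options = {m: N_FIELD * PAYOFF[(m, m)] for m in "CD"}
    else:
        options = {m: N_FIELD * PAYOFF[(m, "C")] for m in "CD"}
    return max(options.values())


results = {
    "LDT": slot_score(in_lockstep=True),
    "LDT-Prime": slot_score(in_lockstep=False),
    "CDT": slot_score(in_lockstep=False),
}
print(results)  # {'LDT': 196, 'LDT-Prime': 294, 'CDT': 294}
```

In this toy model the score depends only on the `in_lockstep` flag, not on whether the entrant reasons causally or logically, which is one way of reading the claim that the particulars of the CDT algorithm are not what produce the advantage.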

Second, an LDT agent would argue against the fairness of the PD tournament. Since the LDT agent is facing off against 98 other agents moving in logical lockstep with itself, it is being faced with an environmental challenge unlike the one an LDT-Prime agent or CDT agent sees. Arguendo, the LDT agent is being presented with different options or consequences for its decision algorithm, compared to the consequences for the LDT-Prime or CDT algorithm.

Suppose that a CDT agent and an LDT agent both encounter a $20 bill in the street. Two days previously, Omega has already bombed an orphanage if the output of the LDT algorithm is to pick up the $20 bill. The LDT agent naturally refuses to pick up the $20. The CDT agent laughs and remarks that Omega is already gone, and chides the LDT agent for not taking the $20 reward that was equally available to any agent passing by.

Since the CDT agent doesn't think the past correlation can affect outcomes now, the CDT agent believes that the CDT agent and the LDT agent would receive just the same payoff for picking up the $20 bill, and thus that this scenario is a fair challenge. The LDT agent thinks that the CDT agent and LDT agent have been presented with different payoff matrices for the same outputs, and thus that this is an unfair challenge. On the LDT view, CDT agents are blind to Newcomblike dependencies, so the CDT agent may believe that a scenario is a fair CDT problem when it is actually an unfair Newcomblike problem.

On the LDT view, something very similar happens when an LDT agent and an LDT-Prime agent, or an LDT agent and a CDT agent, are each presented with a PD tournament against 98 LDT agents. Facing 98 other contestants that will mirror you, but will not mirror the competing agent, doesn't seem very 'fair'.

However, even the orphanage-bombing example taken at face value seems sufficient to technically refute the general statement that the fair problem class on which LDT agents end up rich is strictly larger than the corresponding 'fair' problem class for CDT agents. And the 99LDT/1CDT tournament does seem in some sense like it's posing a natural or a realistic scenario in which a CDT agent could get a higher payoff than an LDT agent; on the CDT view, this is exactly the sort of problem that CDT is good for.

Comments

Jaime Sevilla Molina

I fail to see how this setup is not fair - but more importantly, I fail to see how LDT is losing in this situation. If the payoff matrix is CC: 2/2, CD: 0/3, DD: 1/1, then if LDT cooperates in every round it will get $99\cdot 2=198$ utilons, while if it defects it gets $100$ utilons.

Thus $LDT$ wins $198$ utilons in this situation, while a CDT agent in his shoes would win $100$ utilons by defecting each round.

The situation changes if the payoff becomes "having a higher score than CDT": $1$, and "having an equal or lower score than CDT": $0$.

Then the game is clearly rigged, as there is no deterministic strategy that LDT could follow that would lead to a win. But neither could CDT win if it was pitted in the same situation.

Eliezer Yudkowsky

I'll edit to be more precise: A CDT agent thinks "me and an LDT agent facing off against 99 other LDT agents in a oneshot PD tournament" is a fair test of it versus the LDT agent. A CDT agent does not think that "Me facing off against 99 CDT agents and 1 LDT agent, versus an LDT agent facing 99 LDT agents and 1 CDT agent" is a fair and symmetrical test. On the CDT view, the LDT agent is being presented with an entirely different payoff matrix for its options in the second test.

I do not think that the "Me facing off against 99 CDT agents and 1 LDT agent, versus an LDT agent facing 99 LDT agents and 1 CDT agent" is fair either.

The thing that confuses me is that you are changing the universe in which you put the agents to compete.

To my understanding, the universe should be something of the form "{blank} against a CDT agent and 100 LDT agents in a one-shot Prisoner's Dilemma tournament", and then you fill the "blank" with agents and compare their scores.

If you are using different universe templates for different agents then you are violating extensionality, and I can hardly consider that a fair test.

Makes sense (though the versus you quote wasn't being advocated as a fair example by either agent). I'll rewrite again.