Transitivity and its failures.

In life, we tend to expect transitivity. In other words: if A > B, and B > C, then A > C.

A jackal is heavier than a cobra. A cobra is heavier than a mongoose. So a jackal had better outweigh a mongoose, or else some weight-conscious animal has been editing brazen lies onto Wikipedia.

But weight is simple. A single measurement. Complex traits—like, say, fighting ability—can’t be so easily summarized. You’ve got to consider speed, strength, strategy, tooth sharpness, poison resistance, endorsement deals… with so many interacting factors, transitivity fails. In this case, a mongoose can defeat a cobra, which can defeat a jackal, which can defeat a mongoose.

The same is true of human combat: in a trio of famous fights, Joe Frazier beat Muhammad Ali, who beat George Foreman, who beat Joe Frazier.

You find a similar dynamic in another facet of everyday life. That’s right: the mating strategies of the male side-blotched lizard.

Some males (“monogamists”) stick close to a single mate. But they’re outcompeted by another kind (“aggressors”) who conquer a large territory, building a harem of many mates. Aggressors, too, have a weakness: a third kind of male (“sneakers”) who wait until the aggressor is away, then get busy with his unprotected mates. Yet the sneaker, in turn, cannot succeed against the watchful monogamist. Aggressor conquers Monogamist, who defends against Sneaker, who gets the better of Aggressor.

Next time someone proposes a game of Rock, Paper, Scissors, I urge you to counter-propose with a game of Lizard, Lizard, Lizard.

Political scientists boast their own version of non-transitivity: the Condorcet Paradox. In an election with multiple choices, it’s possible that the electorate will prefer Taft to Wilson, Wilson to Roosevelt, and Roosevelt to Taft. More than just another great Rock, Paper, Scissors replacement, this is a vexing challenge for political theorists. It means that seemingly innocent changes to the structure of an election may have dramatic effects on its outcome.
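To see how such a cycle can arise, here’s a minimal sketch (the three voter blocs and their ballot counts below are invented for illustration, not taken from any real election):

```python
# Three hypothetical voter blocs, each with a full ranking (best first).
# The candidates are from the post; the ballot counts are invented.
ballots = [
    (["Taft", "Wilson", "Roosevelt"], 34),
    (["Wilson", "Roosevelt", "Taft"], 33),
    (["Roosevelt", "Taft", "Wilson"], 33),
]

def head_to_head(x, y):
    """Number of voters who rank candidate x above candidate y."""
    return sum(count for ranking, count in ballots
               if ranking.index(x) < ranking.index(y))

# Each candidate wins one head-to-head matchup by a clear majority:
# Taft beats Wilson 67-33, Wilson beats Roosevelt 67-33,
# and Roosevelt beats Taft 66-34. A cycle, with no overall winner.
for x, y in [("Taft", "Wilson"), ("Wilson", "Roosevelt"), ("Roosevelt", "Taft")]:
    print(f"{x} beats {y}, {head_to_head(x, y)} to {head_to_head(y, x)}")
```

Each bloc’s ranking is perfectly sane on its own; the cycle only appears when the three are aggregated, which is exactly why a seemingly innocent change (say, which pairwise runoff happens first) can decide the outcome.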

Indulge me one more example, a favorite of mathematicians. You place three special dice on the table for inspection, and allow your opponent to pick whichever one they want. You then pick one of the remaining two. Both dice are rolled, and the highest number wins.

The trick? Their strength is non-transitive. A usually beats B, which usually beats C, which usually beats A.

Whoever picks their die second can always seize the advantage.

Cool fact about this particular trio: If you bring out a second die of each kind, and compete by rolling a pair of same-color dice, then the cycle reverses direction.
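The post doesn’t list the numbers on its three dice, so the sketch below uses one well-known trio (the rows of the 3×3 magic square, each repeated to fill six faces), which has exactly this reversal property; the code computes the exact win probabilities rather than taking my word for it:

```python
from itertools import product

# One classic non-transitive trio: the rows of the 3x3 magic square,
# doubled onto six faces. (The post doesn't specify its dice, so these
# particular numbers are an assumption.)
dice = {
    "A": [2, 2, 6, 6, 7, 7],
    "B": [1, 1, 5, 5, 9, 9],
    "C": [3, 3, 4, 4, 8, 8],
}

def win_prob(x, y, copies=1):
    """Exact probability that the total of `copies` rolls of die x
    exceeds the total of `copies` rolls of die y (ties count as losses)."""
    wins = total = 0
    for xs in product(dice[x], repeat=copies):
        for ys in product(dice[y], repeat=copies):
            total += 1
            wins += sum(xs) > sum(ys)
    return wins / total

for x, y in [("A", "B"), ("B", "C"), ("C", "A")]:
    # With one die each, x beats y with probability 5/9 (about 0.556)...
    print(f"P({x} beats {y}) = {win_prob(x, y):.3f}")
    # ...but with a pair of each die, the matchup flips in y's favor.
    print(f"P({x}{x} beats {y}{y}) = {win_prob(x, y, 2):.3f}")
```

With pairs, the loop runs B over A, C over B, and A over C. The duplicated faces make ties possible, which is why the flipped probabilities sit only a little above one half for the new winner.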

As we’ve seen, transitivity holds in the simplest cases (6 > 5, and 5 > 4, so 6 > 4) but wilts under the breath of complexity. I’m afraid to report that real life is rather complex. Every decision we make could lead to a dizzying array of outcomes: some good, some bad, some likely, some not, and all of them contingent on forces beyond our control.

In one psychology study, students were asked to choose between pairs of fictional job applicants. Their preferences formed a non-transitive loop: A beat B, who beat C, who beat D, who beat E, who beat A. “I must have made a mistake somewhere,” one student fretted, when shown the non-transitivity of his choices. He hadn’t.

It’s just that transitivity is simple, and making decisions under uncertainty is not.

These thoughts circle my head any time I’m asked to rank anything. Sure, our world permits occasional clarity. The best gymnast is Simone Biles. The best Billy Joel album is “The Stranger.” The best squash to eat—not to mention to pronounce—is “butternut.”

But usually it’s not so simple. What’s your favorite LaCroix flavor? Who is the strongest student in your class? What writer has inspired and/or depressed you most? In such cases, there may be no right answer, just a non-transitive mess.

11 thoughts on “Transitivity and its failures.”

  1. Perhaps it can be of interest that mechanical intransitive toys (e.g. monkeys, combs, gears, etc.) are possible as well. They are related to Condorcet’s paradox of intransitive voting (https://en.wikipedia.org/wiki/Condorcet_paradox) and to cellular automata:
    https://nkj-ru.translate.goog/archive/articles/43663/?_x_tr_sl=ru&_x_tr_tl=en&_x_tr_hl=en-US&_x_tr_pto=wapp
    (Original text: https://www.nkj.ru/archive/articles/43663/)

  2. I love the moral here of humility in the face of complexity. I do think it’s worth emphasizing, though, that an individual acting rationally cannot have cyclic preferences (see https://www.wikiwand.com/en/Money_pump). So when there’s ranking to be done and intransitivity pops up, I think that’s best taken as a cue to look out for moving goalposts; that is, it may be that the options were not all evaluated under the same criterion, and transitivity would still make sense if a common criterion were being used. For example, if Frazier, Ali, and Foreman all beat each other in a 3-cycle, the thing to realize is that Frazier is strictly better than Foreman at the task of “boxing against Muhammad Ali,” while Ali is better than Frazier at “boxing against George Foreman” and Foreman is better than Ali at “boxing against Joe Frazier;” these are three distinct tasks.

    Of course, the lesson when this pattern emerges does not have to be “darn, let’s throw out that information and hunt for a well-specified criterion now” – it could be that there’s just more complexity to your original question than you realized, and it’s best to embrace that rather than try to work around it. If you’re putting together the U.S. Olympic boxing team, to keep up the same example, the takeaway probably shouldn’t be to ignore head-to-head results and have each athlete’s punching force measured so you have an easy, quantitative, transitive way of ranking them. Instead, it’s probably that there are many different kinds of boxing match-ups, a good thing to keep in mind as you try to come up with the much more difficult “all things considered” way to rank your choices. Ultimately, though, if you have to choose, you will need to rank them, and that ranking can’t be non-transitive.

    So, basically, I DO think the student in the job application study made a mistake, because his head-to-head comparisons of candidate A to candidate B, and candidate B to candidate C, etc. clearly weren’t based on a standard metric of applicant quality that would allow him to rank them in order. I’m not saying there aren’t real head-to-head scenarios you could put job applicants in where the winners form a non-transitive loop – maybe, if you asked them to debate each other, this would happen, which would be a good prompt for realizing how complex and varied the task of debating is. But I AM saying that using those head-to-head results to do a ranking, and ending up with a non-transitive loop, is nonsensical (and in fact would make it impossible to actually choose an applicant).

    1. Thanks for the thoughtful reply! I love your decomposition of the boxing triangle into three distinct tasks. This was a good challenge to my thinking.

      Having chewed on it a little, I’ve got a thought on why, in complex scenarios, goalpost-moving may be unavoidable.

      Say we’re choosing an assistant professor at a liberal arts college. A is an award-winning researcher whose undergrads often publish; B is an award-winning teacher whose textbook is used nationwide; and C is an award-winning mentor with a track record of helping first-generation college students. Who do we pick?

      The goal of a liberal arts college is something like “help our students thrive academically.” But which students? Those eyeing grad school (i.e., our top performers) may benefit most from A. The most vulnerable ones, at risk of not finishing their degrees, may benefit most from C. And the students in between, likely to graduate but unlikely to become academics, may benefit most from the rigorous instruction of B. That’s not to mention all the different things that “academic thriving” could entail: independent thought, disciplinary knowledge, intellectual curiosity, etc. In other words, our goal is not singular or simple; it’s highly multi-dimensional. In a strict sense, rank-ordering is impossible.

      Of course, “A, B, and C cannot be ranked” is different from “A > B > C > A.” The latter seems clearly irrational. So how do we wind up with non-transitive loops?

      To compare a pair of candidates, we need to project our complex goals down into a single dimension. There is no correct or best way to do this. Different pairs of candidates may induce different projections – especially because we will try to pick projections that make each decision clearer and easier. For example, we may say “B > A, because B will help more of our students.” Then we may say, “C > B, because helping our first-gen students is an urgent priority right now.” And then we may say, “A > C, because at an institution of higher learning, we’d be crazy to pass on a top-tier researcher.” In each case, we select a criterion that makes the pairwise comparison easy. After all, it’d be silly to choose a criterion on which they come out tied!

      Anyway, that’s why I see goalpost-moving as not necessarily irrational. Instead, it’s a consequence of repeatedly projecting multi-dimensional goals down to a single-dimensional criterion, and rationally choosing the most convenient projection each time.

      (Why not pick the projection in advance, at the beginning of the hiring process? This would avoid non-transitive loops, but might also make the decision harder than necessary, or lead to a choice that doesn’t “feel” right or satisfying. So again, whether this is a good idea depends on how you boil down the multi-dimensional goals of a hiring process into a single criterion!)

      1. Thanks for taking the time to respond! Your framing of projecting multi-dimensional goals onto a single-dimensional criterion is, I think, a really useful way to conceptualize what’s often going on when we do pairwise comparisons. Especially because of how that model can generalize. In a 2-D space, for example (say an x-axis for teaching ability and a y-axis for research ability), to project onto either axis would be to decide solely based on that ability and ignore the other. But we could also choose to project onto an arbitrarily sloped line, say y=2x if we wanted to weight a unit of research ability — whatever that is — twice as much as a unit of teaching ability.

        I think that’s basically the main thought I’d want to add in response to what you wrote above (and to what most humans, myself included, tend to do instinctively): looking for a single criterion to project onto doesn’t have to mean [deciding which one of several dimensions is most important and ignoring the rest of them]. It’s tempting to do that, because it’s often simpler, but I’d say the best 1-dimensional line with which to rank our applicants will almost always be a composite criterion, neither completely vertical nor horizontal, which gives at least some weight to every relevant dimension.
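        To make that concrete, here’s a tiny sketch with two hypothetical candidates and invented scores; the only thing that changes between the two rankings is the slope of the line we project onto, i.e. the weights:

```python
# Invented scores for two hypothetical candidates, for illustration only.
candidates = {
    "A": {"research": 9, "teaching": 5},
    "B": {"research": 4, "teaching": 9},
}

def composite_score(name, weights):
    """Project a multi-dimensional profile onto one axis: a weighted sum."""
    return sum(weights[dim] * score for dim, score in candidates[name].items())

def rank(weights):
    return sorted(candidates, key=lambda n: composite_score(n, weights), reverse=True)

print(rank({"research": 2, "teaching": 1}))  # research counts double: ['A', 'B']
print(rank({"research": 1, "teaching": 2}))  # teaching counts double: ['B', 'A']
```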

        I’d even go a step further and say that we implicitly ARE using composite criteria, even when we simplify our rationale down to less nuanced statements like “B > A, because B will help more of our students.” If we were sure B would only help 2% more students — maybe the quality of instruction will be the exact same, but A will cap enrollment in their classes 2% lower — and yet A has won “Undergrad Research Mentor of the Year” and B doesn’t do research at all, the choice would probably feel obvious. So really, what we viscerally feel is “B > A, because B will help more of our students (by a wide enough margin that it seems to outweigh the extent to which A is better at research).” And I think it’s worth saying that more clunky version explicitly, and in general getting used to the idea that we are almost always tracking multiple variables, if only in the back of our minds.

      2. As for whether choosing the most convenient projection each time is rational: if you’ll indulge another too-long comment, I think there’s a pretty convincing intuition pump for why it’s not. Imagine that A suddenly became as good a first-gen mentor as C. In that case, this new and improved A — let’s call them A+ — is the clear winner among the three: we already established that C > B by virtue of C’s mentorship ability, and now hiring A+ means we get that same mentorship AND all of A’s research talent (talent which, after all, was enough to make even the original A better than C head-to-head).

        Now, the only difference between A, who we deemed worse than B, and A+, who is necessarily better than B, is their mentorship ability. So (by the intermediate value theorem, if we want to be mathy about it) there must be some version of A — let’s call them A* — with mentorship skills somewhere between A’s and A+’s, for which we’d be indifferent to hiring A* versus hiring B. We can gut check this by actually just imagining versions of A* with increasing mentorship ability and repeatedly asking ourselves whether the statement “B > A*, because B will help more of our students” still rings true.

        We can do a similar exercise to find C*, a version of C but with worse mentorship ability, such that we’d be equally happy hiring B instead of C*. Both A* and C* are equivalently good hires to B, and therefore to each other. But hold on: we said A > C, and yet somehow A* — an enhanced version of A — is no better than C*, a worse version of C. That violates something that should probably be an axiom of having preferences: a worse version of your second choice shouldn’t beat an improved version of your first choice.

        I think playing around with different hypothetical combinations of traits like this isn’t just a good way to expose what’s problematic about circular preferences; it’s also a great way to get traction on the thorny task of arriving at well-ordered preferences. Maybe we start by doing head-to-head comparisons, and come out thinking A < B < C < A, as you laid out. We know that can’t be right, so we start probing by doing the above thought experiment, knowing one of the statements along the way is going to feel wrong. And sure enough, when we stop to really imagine A+ next to B in our minds, we don’t actually prefer A+, even though that should necessarily be true if A+ has everything that made us prefer C over B AND MORE. This prompts us to realize that A+ actually doesn’t have everything we liked about C: when we ask “what’s missing? Why can’t I pull the trigger on A+ over B even though I would with C?”, maybe we realize A+ is missing C’s public speaking ability, and would be a worse spokesperson for the department.

        Perhaps we failed to consider this because we did the B vs. C head-to-head first, and C’s mentorship skills were so strong that we didn’t even have to consider anything else to know they were better than B – so much so that we never consciously noted that public speaking was something we cared about, and did our A vs. C consideration with an overly simplified “C-is-the-candidate-that’s-all-about-first-gen” stereotype in mind. Or maybe we once heard our boss say “At an institution of higher learning, it’s crazy to pass on a top-tier researcher”, and so when we considered A vs. C we just sort of deferred to that idea and didn’t take the time to really imagine which hire would feel more satisfying overall. But now that we have, we realize that we actually do prefer C over A, as well as over B; we’ve homed in on a cognitive error we made, and after correcting for it, found our top choice.

      3. One last thing and then I promise I’ll hush: ranking preferences in complex situations is, for me at least, often difficult and uncomfortable, and so I’m totally content to have an intransitive mess on questions like “What writer has inspired me the most?” As is probably clear from my being so long-winded about this, though, I really do think it’s important sometimes to lean into the discomfort and earnestly try to figure out what’s best, especially on things like ethical reasoning. Too often, I fear, complexity can be used as an excuse for inaction, instead of a prompt to think hard about our core values and their implications.

        We might see an ad for our local food bank and think “I should donate instead of keeping my money for my family, because my food-insecure neighbors need it more than we do.” And then, “If I’m trying to help the needy, I should actually donate to malaria prevention in Tanzania, a much poorer part of the world.” And then, “Actually, animals on factory farms have no way of helping themselves and suffer tremendously, so maybe they are the most needy.” But then, “When I think about it, I just actually can’t bring myself to emotionally care about the well-being of a chicken, and I’d rather spend the money on my own family.” When the stakes are as high as preventing real suffering, I think it’s wrong to throw up our hands and say “there’s no best option here,” especially if we end up defaulting to the easiest choice as a result (e.g. not donating at all). We should be bothered by the intransitive loop, because it indicates we might be shying away from a difficult conclusion about an important choice.

        This post from one of my favorite blogs (up there with Math With Bad Drawings, which is high praise) has an interesting argument, motivated by transitivity, that I instinctively flinched away from at first. But I think the world would be a better place if more people earnestly wrestled with arguments like this, and taking seriously the idea that rational preferences should be transitive helped me to do that: https://www.cold-takes.com/defending-one-dimensional-ethics/
