Monte Carlo and the Quantification of Effort

by jasonmjordan

In 2005, Dave Graham climbed The Story of Two Worlds, proposing it as the new standard of Font. 8c. Graham’s proposal was met with some incredulity at the time, since the boulder problem added an 8b sit-start to The Dagger, then considered 8b+ in its own right. The ultimate effect, however, was pronounced, occasioning a raft of downgrades among many of the world’s top boulder problems. Throughout the 1990s there had been a steady progression in difficulty, driven largely by Fred Nicole, who climbed the world’s first 8b, La Danse des Balrogs, in 1992; the first 8b+, Radja, in 1996; and the first 8c, Dreamtime, in 2001. There are a few proposed 8c+ boulder problems, but most have been downgraded after a second ascent (e.g. The Game), or by the original first ascensionist (e.g. Lucid Dreaming). Adam Ondra has proposed 8c+ for two problems, and spoken about the need for a truly open-ended grading system; Daniel Woods has recently expressed similar thoughts; yet, over twelve years after Nicole’s ascent of Dreamtime, there is still no consensus 8c+.

The response typically given to such disagreements regarding the difficulty of a particular climb holds that climbing grades are fundamentally “subjective” in nature. As Nalle Hukkataival recently remarked: “One man’s V15 is another’s V13, quite literally.”

Yet, “subjective” is a vexed concept, with several distinct permutations. The particular problem climbers have in mind when they describe climbing grades and the perception of difficulty as “subjective” is not, I think, a metaphysical one. That is to say, the question of subjectivity in climbing grades is fundamentally different than:


Which is to say, the problem of subjectivity in climbing grades is not the same as the problem of subjectivity itself. Rather, the problem concerns the inter-subjective disagreement between two or more climbers concerning the perceived difficulty of a single particular route.

Differences in the perception of difficulty in sport are often regarded as little more than differences in skill. That is to say, Adam Ondra would probably find my current project to be “easy,” while I would find his to be “difficult”—in the same manner as I would find chasing Usain Bolt or deadlifting a cement truck “difficult.” But again, this difference concerns only the purely subjective aspect of the experience as such.

While I utterly lack the ability to distinguish between a 5.15a and a 5.15c, I do have the ability to distinguish between a 5.12a and a 5.12c. The remarkable thing is that Ondra has both. That is to say, while a 12a and a 12c are sure to be “easy” for him relative to me, they’re not equally and indistinguishably so. He can still tell the difference in difficulty even between the two relatively easy routes, just as I can tell the difference between a 10a and a 10c. Climbing grades are thus altogether more interesting than the merely subjective difference between ‘that felt easy to me’ and ‘that felt hard to me.’

What this means is that, even though climbing grades involve essentially arbitrary numerical boundaries and an irreducibly subjective component, they also involve an objective component: they are ‘put out there’ into the world and may be shared between individuals of vastly different skill levels. They are in a way analogous to a language: the word ‘apple’ is an arbitrary linguistic construct, but it also refers to something real in the world, which is different from what we call ‘onion’. The same holds true for climbing grades: they are, in a certain sense, arbitrary and subjective, but they are also, in a certain sense, real features of the world: they have reality, they exist.

Climbing difficulty is experienced as exertion, as a subjective feeling, but the factors that underlie the perception of difficulty are primarily physical—and it is these objective physical factors that form the basis for agreement and discrimination concerning grades, as well as the inherent potential for disagreement.

As far as the latter is concerned, the perception of climbing difficulty is riven by many known factors beyond mere fitness: temperature, humidity, beta, personal morphology (e.g. height, armspan), personal style, past experience, fear, equipment, etc. Moreover, some factors influencing the perception of difficulty seem to elude easy description. Chris Sharma has said that he switched from bouldering to endurance-oriented sport-climbing after he began packing on muscle (and hence weight), which made the tiny holds on cutting-edge boulder problems odious. Klem Loskot, another sturdy climber, made similar comments years ago about his unsuccessful efforts to repeat the mono-ridden Action Directe. Yet Wolfgang Güllich was himself a big man. So too is Nicole, well known for first ascents of fingery micro-crimp problems like Amandla and Oliphant’s Dawn.

Loskot and Sharma are two of the strongest climbers in history; it would be absurd to dismiss their complaint as simple excuse-making. Rather, it would seem that the perception of difficulty is simply too complex a phenomenon to reduce to single factors that apply equally to all. This, properly speaking, is the “subjective” problem faced by climbing grades—the perception of difficulty is subjective by virtue of the intractable complexity underlying its objectivity.

The perception of climbing difficulty may thus be attributed to the intrinsically ‘stochastic’ nature of climbing itself. A stochastic system is one whose state or process is non-deterministic and driven—in part or in whole—by deep complexity and/or elements of chance, all of which conspire against simple outcome prediction.

The Monte Carlo method is a statistical-computational strategy used to predict outcomes in inherently stochastic systems, and I should like to propose it as a conceptual model to understand climbing grades. Indeed, mine is not so much a proposal as a description of what is actually happening on web databases where users log their own ascents and offer their own ‘personal grades’ regarding them.

A prominent application of Monte Carlo is in modern weather forecasting. In the past, such forecasts began from a set of initial measurements and assumptions, and then attempted to compute a single future course of events presumed to follow deterministically from them. The problem with this approach is that measurements can never be perfectly accurate, and in complex systems even slight initial differences can produce vastly different results. The Monte Carlo method works by embracing these differences: instead of computing one weather prediction from a single set of initial conditions, it computes dozens, each with slightly different starting data, which are then aggregated and statistically compared. If 18 out of 60 simulations predict rain, that itself becomes the forecast: ‘30-percent chance of rain tomorrow’. Regardless of whether chance truly exists or is merely a product of human ignorance when faced with complex deterministic systems, the Monte Carlo method is useful insofar as it accepts chance-qua-variation as an irreducible element of predicting outcomes, and thus offers its predictions not as absolutes (i.e. ‘it will rain tomorrow’ or ‘it will not rain tomorrow’) but as probabilities.
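To make the ensemble idea concrete, here is a minimal sketch in Python. It is not a real weather model: the logistic map stands in as a toy chaotic system, and the function names, the noise level, and the "rain" threshold are all assumptions chosen for illustration. What it shows is only the mechanism described above: perturb the starting measurement slightly, run the model many times, and report the fraction of runs that end in "rain" as the forecast probability.

```python
import random


def toy_weather(x0, steps=50, r=3.9):
    """Iterate the logistic map, a toy chaotic system (NOT a real weather model)."""
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x


def rain_probability(rainy_runs, total_runs):
    """Turn a count of 'rainy' simulations into a forecast probability."""
    return rainy_runs / total_runs


def ensemble_forecast(measured_x0, n_runs=60, noise=0.001,
                      rain_threshold=0.7, seed=0):
    """Run the toy model n_runs times from slightly perturbed starting values.

    The noise level and rain_threshold are hypothetical illustration values.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    rainy = 0
    for _ in range(n_runs):
        # Each run starts from the measurement plus a small random error.
        x0 = measured_x0 + rng.uniform(-noise, noise)
        if toy_weather(x0) > rain_threshold:
            rainy += 1
    return rain_probability(rainy, n_runs)


if __name__ == "__main__":
    # The essay's example: 18 rainy runs out of 60.
    print(rain_probability(18, 60))
    # A full toy ensemble; the result depends on the seed and parameters.
    print(ensemble_forecast(0.4))
```

The forecast here is a fraction rather than a yes-or-no answer, which is the feature of the method that matters for the argument that follows.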

This, I would suggest, is ultimately what a climbing grade is. When a climber makes a first ascent and proposes a grade, he or she is essentially offering a prediction concerning the perception of difficulty likely to be experienced by the second ascensionist. As Hukkataival notes: “grades are only estimates, personal opinions of the difficulty of a climb.” These “personal opinions” are analogous to a computed weather forecast, while the aforementioned factors underlying perception of difficulty are analogous to variations in initial conditions. What the Monte Carlo approach shows is that, when these personal opinions are aggregated together, one arrives at less absolute but much better predictions.
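The aggregation of personal opinions can be sketched in a few lines of Python. The votes below are invented for illustration and do not describe any real problem, and the function names are hypothetical. The sketch simply tallies logged personal grades into a probability for each grade, and takes the (upper) median vote as one possible summary.

```python
from collections import Counter

# A slice of the Fontainebleau scale, ordered from easier to harder.
FONT_SCALE = ["8a", "8a+", "8b", "8b+", "8c", "8c+"]


def grade_distribution(personal_grades):
    """Return the share of votes each grade received, in scale order."""
    counts = Counter(personal_grades)
    total = len(personal_grades)
    return {g: counts[g] / total for g in FONT_SCALE if counts[g]}


def consensus_grade(personal_grades):
    """Return the median vote (the upper median when the count is even)."""
    indices = sorted(FONT_SCALE.index(g) for g in personal_grades)
    return FONT_SCALE[indices[len(indices) // 2]]


if __name__ == "__main__":
    # Hypothetical personal grades logged by ten repeaters of one problem.
    votes = ["8c", "8b+", "8b+", "8c", "8b+", "8b+", "8c", "8b+", "8b", "8b+"]
    print(grade_distribution(votes))
    print(consensus_grade(votes))
```

On this invented sample the problem is neither "8b+" nor "8c" in any absolute sense: most repeaters felt 8b+, a sizeable minority felt 8c, and one felt 8b. The distribution says more than any single grade could.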

Thus, as much as one might respect Graham’s intentions and integrity (and one should do both), I would suggest that the notion of a “standard” is a relic of the old ‘one-size-fits-all’ grading method, that of a single non-probabilistic weather forecast based on a single set of initial conditions. Action Directe is considered the standard of 9a. Why a short overhang on monos originally given ‘XI’ should be the standard by which to judge the difficulty of a 50-meter endurance 9a like Era Vella, or a 9a slab like The Meltdown, is most unclear. If it is because Action Directe was the world’s first 9a, one might as well declare Akira to be the standard of 9b!

Indeed, one notable feature of the ‘downgrading wars’ of the past eight years is that they have been restricted largely to bouldering and have not affected sport-climbing to nearly the same degree. There are several possible explanations for this, although the Monte Carlo approach suggests one in particular. Given differences in initial conditions, the divergence of weather forecasts increases the farther into the future they project. However, in climbing grades the opposite seems to be the case. The short, intense nature of bouldering entails that problems often boil down to a single decisive move, which exacerbates the effect of differences in initial conditions between climbers (e.g. armspan). Longer sport routes typically lack any one decisive move, and thus present a larger movement ‘sample size’ which smooths out the initial differences such that the range of probability is decreased.
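The ‘sample size’ point can be illustrated with a small simulation, under a strong simplifying assumption that is mine and not a claim about real climbing: each move feels harder or easier to a given climber by an independent random amount, and the difficulty of the whole climb is perceived as the average over its moves. The names and numbers are hypothetical. Under that assumption, the disagreement between climbers shrinks as the number of moves grows.

```python
import random
import statistics


def perceived_difficulty(n_moves, rng, per_move_spread=1.0):
    """One climber's perceived deviation from the 'true' grade.

    ASSUMPTION: each move deviates independently (Gaussian noise) and the
    climb as a whole is perceived as the average of its moves.
    """
    deviations = [rng.gauss(0.0, per_move_spread) for _ in range(n_moves)]
    return sum(deviations) / n_moves


def spread_across_climbers(n_moves, n_climbers=2000, seed=0):
    """Standard deviation of perceived difficulty over many simulated climbers."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    samples = [perceived_difficulty(n_moves, rng) for _ in range(n_climbers)]
    return statistics.pstdev(samples)


if __name__ == "__main__":
    # A boulder problem decided by one move versus a 50-move sport route.
    print("one decisive move:", spread_across_climbers(1))
    print("fifty moves:      ", spread_across_climbers(50))
```

With independent deviations the spread falls roughly as one over the square root of the number of moves, so a fifty-move route shows about a seventh of the disagreement of a single crux move. Real climbs are not averages of independent moves, so this is a picture of the argument rather than evidence for it.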

The point is, if grades were understood as probability-ranges rather than discrete pegs, the acrimony of grading controversy would be considerably dampened. The ironic fact underlying Monte Carlo is: by accepting factors of chance as well as skill in climbing grades, by acknowledging them to be a range of probability rather than a definitive standard, they become more rather than less accurate.