Akrasia vs willpower

Willpower and akrasia are puzzling. Existing theories do not explain common subjective experiences:

Preference reversal without new information, eg from one week away we prefer to go to the gym, but from ten minutes away we prefer to stay at home and eat icecream.

People go so far as to deliberately constrain their future actions in order to commit to current preferences, eg throwing out all the icecream to avoid the temptation.

Not always just immediate temptations either, eg buying icecream on the way home to eat later, all the while thinking ‘this is a bad idea’.

Substance addictions are usually explained by chemical properties, but addictions like gambling and overspending follow the same behavioral patterns, even to the point of creating physical withdrawal symptoms. Unwanted habits like unhealthy eating or oversleeping exist on the same scale.

Relapse and pre-commitment in particular are hard for utility theory to explain. The relapsing addict is already familiar with the cost of the habit and had previously chosen to quit, but reverses that decision when presented with temptation close in time. The alcoholic who takes disulfiram presumably does not want to drink, but predicts that they will do so anyway unless they take some additional pre-committing action.


Utility theory treats people as utility maximizers who discount the utility of future events. The only reasonable discount curve is exponential - any other curve allows for situations where the passing of time can change your preferences without any new information arriving.
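The claim about exponential curves can be checked numerically. This is a minimal sketch, assuming two hypothetical rewards (a smaller one that arrives first and a larger one three days later) and illustrative discount parameters `rate` and `k` that are not taken from any experiment:

```python
import math


def exponential(delay, rate=0.3):
    """Exponential discount factor: shrinks by a constant ratio per day of delay."""
    return math.exp(-rate * delay)


def hyperbolic(delay, k=1.0):
    """Hyperbolic discount factor: steep for short delays, shallow for long ones."""
    return 1.0 / (1.0 + k * delay)


def choice(discount, lead_time, small=5.0, large=10.0, gap=3.0):
    """Choose between a small reward `lead_time` days away and a large one `gap` days later.

    All amounts and delays are hypothetical illustration values.
    """
    value_small = small * discount(lead_time)
    value_large = large * discount(lead_time + gap)
    return "smaller-sooner" if value_small > value_large else "larger-later"


# Walk towards the decision point and report what each kind of discounter prefers.
for lead_time in (7.0, 2.5, 1.0, 0.0):
    print(lead_time, choice(exponential, lead_time), choice(hyperbolic, lead_time))
```

With these numbers the exponential chooser gives the same answer at every lead time, because the ratio between the two discounted values does not depend on the lead time. The hyperbolic chooser switches from the larger-later reward to the smaller-sooner one as the rewards get close, with nothing changing except the passage of time.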

The actual discount curve in humans and animals seems to be exponential in some cases and hyperbolic in others. There is still much debate about this. Experiments tend to find exponential curves where rewards are easy to measure and obviously interchangeable (eg money). They tend to find hyperbolic curves where rewards are hard to measure and are immediately consumed (eg relief from noise, access to games), or where the timing of the money is varied rather than the amount.

In the latter case, given a series of choices, people tend to prefer options that pay off faster but have less total reward over options that are slower but more rewarding, eg repeatedly choosing to stay home and eat icecream even though we would be happier in the long run if we were buff.

What if willpower is exactly the phenomenon that produces exponential utility curves, allowing us to overcome this innate preference for fast payoffs?

One typical feature ascribed to willpower across the existing literature is deciding on principle rather than on immediate utility - that is, grouping decisions together into a single class in which we follow a consistent rule. So the decision changes from icecream-today vs gym-today to icecream-every-day vs gym-every-day. Summed hyperbolic curves approximate an exponential curve, so our gym preference no longer gets reversed close to the event.
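The effect of grouping decisions can be sketched with the same kind of toy numbers. This is a rough illustration only, assuming a hypothetical icecream reward of 5 available immediately each day, a hypothetical gym payoff of 10 that arrives three days after each workout, and a hyperbolic discount parameter `k` of 1:

```python
def hyperbolic(delay, k=1.0):
    """Hyperbolic discount factor for a reward `delay` days away."""
    return 1.0 / (1.0 + k * delay)


def bundle_value(amount, first_delay, days, k=1.0):
    """Sum the discounted value of one reward per day for `days` consecutive days."""
    return sum(amount * hyperbolic(first_delay + day, k) for day in range(days))


def preferred(days):
    """Compare icecream-every-day with gym-every-day over a rule lasting `days` days.

    Evaluated at the moment of the first choice, when icecream is immediately available.
    """
    icecream = bundle_value(5.0, first_delay=0.0, days=days)
    gym = bundle_value(10.0, first_delay=3.0, days=days)
    return "icecream" if icecream > gym else "gym"


print(preferred(1))   # today's choice considered on its own
print(preferred(30))  # the same choice treated as a rule covering 30 days
```

Considered alone, today's icecream wins. Considered as a 30-day rule, the gym wins even at the moment of temptation, because the later rewards in each bundle are all far enough away that the larger amounts outweigh the extra delay.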

But a third option exists - icecream-today-and-then-gym-every-other-day. This dominates both the previous options once we get close to the decision point. But if we took that option today, we would be tempted to take it again every day. So the payoff is actually U(icecream) + P(gym-every-day)U(gym-every-day) + P(icecream-every-day)U(icecream-every-day).

This becomes really recursive. The third option only dominates if we actually believe that we will go to the gym every other day, but what is to stop us from making the same decision again?
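The dependence on belief can be made concrete. This is a sketch under stated assumptions: the last term of the payoff is read as P(icecream-every-day)U(icecream-every-day), the two probabilities are assumed to sum to one, and every utility value is a made-up number chosen for illustration:

```python
def exception_payoff(u_icecream_today, u_gym_habit, u_icecream_habit, p_gym_habit):
    """Expected payoff of 'icecream today, gym every other day'.

    p_gym_habit is our own estimate that we will keep going to the gym afterwards.
    Assumption: the only alternative outcome is sliding into icecream-every-day.
    """
    p_icecream_habit = 1.0 - p_gym_habit
    return (
        u_icecream_today
        + p_gym_habit * u_gym_habit
        + p_icecream_habit * u_icecream_habit
    )


# Hypothetical utilities: one icecream, the gym habit, the icecream habit.
U_ICECREAM_TODAY, U_GYM_HABIT, U_ICECREAM_HABIT = 5.0, 30.0, 10.0

for belief in (0.9, 0.5):
    payoff = exception_payoff(U_ICECREAM_TODAY, U_GYM_HABIT, U_ICECREAM_HABIT, belief)
    print(belief, payoff, payoff > U_GYM_HABIT)
```

With these numbers the exception only beats plain gym-every-day when the estimated probability of sticking to the gym is above 0.75. Each exception we actually take is evidence that pushes that estimate down.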

Prisoner’s dilemma. Eating the icecream is defecting against our future selves, hoping that they will still cooperate and go to the gym. But since all selves are using the same decision-making machinery, this has aspects of an iterated prisoner’s dilemma. If I defect now, it raises my estimated probability that my future selves will defect, which reduces the expected value of me defecting and reduces the expected value of me cooperating even more.

Willpower is a blunt instrument. Side-effects may include:

The focus on measuring and optimizing in modern society may exacerbate these effects - with grades, credit scores, work evaluations, online reviews etc. Doing well in modern society requires much more willpower which leads to more legalistic and systematized lives.


Timescales of reward:

We can create positive emotions at will eg make up a happy daydream. Why do we ever do anything else? Possibly because we can’t restrain ourselves from early satiation. Need to be externally constrained in order to draw out the enjoyment. So we need reliable but unpredictable sources of emotions - people, stories etc.

Understanding of this process exists, so people place artificial barriers in their own way. But consciously acknowledging it would make the barriers less useful, so they tend to look like unexplainable superstitions or traditions. Examples?

Also explains difference between belief and fantasy - belief is constrained by some external factor. Favorite sports team winning is pleasurable, much better than just imagining that they won. Need not be true, just needs to be a bright line so we can’t satiate early. ‘Uniquely well-established social constructions’.

I am confused by the argument that willpower leads to early satiation, when willpower is all about choosing larger long-term rewards.

…the will only works in tasks that have regular, clear-cut steps. This clarity fosters anticipation, which increasingly wastes available appetite through premature satiation and which the will is powerless to prevent in any direct way.


Why discount rewards by time at all? Why not discount directly for perceived expectation of reward?

Why is willpower experienced as effortful? Can gain utility in the long-run by making the curve more exponential. So to the extent that this is easy, we should expect that people are already doing it. We only notice the cases where it is effortful and the easy cases just feel like doing what you want to do.

Why would hyperbolic curves be selected for? Other perceptions are mostly hyperbolic, so possible physiological reasons. Irrationality may just not have mattered for animals which had limited ability to manipulate their environment. Existing long-range behaviors like hoarding food or hibernating evolved matching short-range rewards. Akrasia is a problem caused by our ability to target our rewards with behaviors that they did not evolve to encourage eg we created icecream to target rewards that evolved to motivate eating fatty animals and now we have to struggle to override those rewards.

Do we have any understanding of when decisions are made vs going with the default?


Core model is really interesting and seems to dovetail neatly with subjective experience. Extension to pain-scale compulsions is plausible. I didn’t follow the extension to appetites and satiation at all and I’ve heavily abridged the notes on that section.

The author seems to be content with the idea that the recursive nature makes it difficult to test experimentally. Has anyone else taken up the challenge?