The explore/exploit trade-off

How many people should you date before you choose a partner to spend the rest of your life with?

The hidden difficulty in answering this question goes like this: if you settle for the first person you ever date, there’s a good chance you’re not making a good choice. You don’t have enough experience of who’s out there to judge whether this person is the one. Maybe there’s someone better just around the corner? But that could always be the case. Even as you date your hundredth person, you can’t know for sure that the hundred-and-first won’t be your soul mate.

So, when can you say you have enough experience of the market to make a good choice?

There is a mathematical answer to this question, and even an online calculator that lets you work it out for yourself.
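As a rough illustration, the dating question can be modelled as the classic "secretary problem" from optimal stopping theory. That model is an assumption here, and it is a simplification: you meet a known number of candidates one at a time, cannot go back, and only count it a success if you end up with the very best. Under those assumptions the well-known rule is to pass on roughly the first n/e candidates (about 37%) and then commit to the next one who beats everyone seen so far. The sketch below simulates that rule; the function names, the pool of 100 candidates and the trial count are all illustrative choices.

```python
import math
import random


def choose_with_cutoff(scores, cutoff):
    """Skip the first `cutoff` candidates, then take the first who beats them all."""
    best_seen = max(scores[:cutoff], default=float("-inf"))
    for score in scores[cutoff:]:
        if score > best_seen:
            return score
    # Nobody beat the benchmark: we are left with the last candidate.
    return scores[-1]


def success_rate(n, cutoff, trials=20000, seed=0):
    """Estimate how often the rule ends up with the single best candidate."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        scores = list(range(n))  # n - 1 is the best candidate
        rng.shuffle(scores)      # candidates arrive in random order
        if choose_with_cutoff(scores, cutoff) == n - 1:
            wins += 1
    return wins / trials


n = 100                      # hypothetical size of the dating pool
cutoff = round(n / math.e)   # the "look, then leap" threshold
rate = success_rate(n, cutoff)
print(f"Skip the first {cutoff} of {n}; best partner found in {rate:.0%} of runs")
```

Under these assumptions the rule finds the best candidate in a little over a third of simulated runs, far more often than committing to the first person you meet.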

But this is just one example of a wider set of problems that behavioural ecologists and computer scientists call the explore/exploit trade-off: when is the optimum time to stop looking for new options and start making use of the ones you already have?

A classic example is foraging for food: how should we choose between a familiar option with a known reward and unfamiliar options with unknown rewards? Choosing what to watch on Netflix might be a more contemporary one: rewatch an old favourite or risk a new series?
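Computer scientists often study this kind of choice as a "multi-armed bandit". One of the simplest strategies, offered here only as a sketch and not as the only or best approach, is epsilon-greedy: most of the time pick the option with the best average reward so far (exploit), but with a small probability `epsilon` try an option at random (explore). The reward values, the noise model and the function name below are hypothetical, chosen purely for illustration.

```python
import random


def epsilon_greedy(true_means, epsilon, steps=5000, seed=0):
    """Simulate an epsilon-greedy chooser over options with hidden average rewards."""
    rng = random.Random(seed)
    counts = [0] * len(true_means)       # how often each option was chosen
    estimates = [0.0] * len(true_means)  # running average reward per option
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try any option at random.
            arm = rng.randrange(len(true_means))
        else:
            # Exploit: pick the option that looks best so far.
            arm = max(range(len(true_means)), key=lambda i: estimates[i])
        reward = rng.gauss(true_means[arm], 1.0)  # noisy reward (assumed model)
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return total / steps, counts


# Hypothetical example: three series on a streaming service, the third is best.
hidden_means = [1.0, 1.5, 2.0]
avg_reward, counts = epsilon_greedy(hidden_means, epsilon=0.1)
print(f"Average reward {avg_reward:.2f}, choices per option {counts}")
```

In this toy setting a small amount of exploration is enough for the chooser to discover the best option and spend most of its time on it, while still occasionally sampling the alternatives.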

This challenge should be starting to sound familiar to anyone involved in marketing, research and strategy because it shows up at all levels: should we spend time and effort to enter new spaces or deepen our strengths in existing ones? Should we spend research budget answering the immediate questions that will help us converge on our next action, or should we fund divergent exploration of unfamiliar territory?

There is a great deal of academic literature that can help unpack these questions and put some quantified rigour around them. This paper does a good job of summarising the most important places to start for inspiration.

And there are plenty of less-formal heuristics you can use to guide your thinking at a strategic level. Jeff Bezos’ ‘regret minimisation framework’ is one way of solving the problem.

A more nuanced way we help our clients is to contextualise the choice. Time-sensitive choices, or those critical to survival, favour ‘exploit’, where a known, acceptable outcome is good enough. More open-ended or speculative choices may be better suited to a strategy that favours ‘explore’.

We find this framework can also be valuable in understanding consumer, patient and customer behaviour. When analysing category dynamics:

  • Consider people’s willingness to try new options versus sticking with what they know
  • Investigate what can trigger ‘exploit’ responses that drive loyalty amongst your users. How can you bring to life the risk of switching and the predictable reward of staying loyal?
  • Identify what can nudge people towards ‘explore’ responses that encourage experimentation amongst your competitors’ users. How can you make tangible the benefits of exploring a new brand?

If you want to learn more about how we help solve these problems for our clients and how we can do it for your brand, get in touch.