Exploration vs Exploitation. Which way to go?

We are all trying to keep our lives in place. As we grow older, it becomes more important to cultivate stability and take less risk, partly because of the extra responsibilities we pick up from our surroundings and from the system itself. There is an observable transition from a child’s irresponsible wandering to an adult’s tediously planned next steps.

We actually have a very extended childhood compared to our closest primate relatives. Chimpanzees become self-sufficient at around 7 years old, whereas for humans it happens at around 15 at the earliest (nowadays many countries set the full-time working age at about 16). Homo sapiens adults are obliged to feed and protect their children until the children can do it themselves. They therefore need to make sure their child’s every next step is planned to be safe, and thanks to the parents the child is free to do what it wants. As that protection and feeding contract with age, the young one learns to limit the risk of dying from random and dangerous acts such as petting a crocodile or going with a stranger to buy an ice cream.

We learn safe ways of living, having fun and working, and we try to avoid anything risky. However, this extreme caution sometimes traps us in a local optimum, and our growth stops. The alternative to focusing on short-term stability and continuous gains is to maximize total rewards in the long term while accepting short-term volatility. It is a choice between exploration and exploitation.

When we talk about decision-making, we usually focus just on the immediate payoff of a single decision — and if you treat every decision as if it were your last, then indeed only exploitation makes sense. But over a lifetime, you’re going to make a lot of decisions. And it’s actually rational to emphasize exploration — the new rather than the best, the exciting rather than the safe, the random rather than the considered — for many of those choices, particularly earlier in life.

Algorithms to Live By

Exploitation is the mode to use when we wish to maximize gains within the near future. In exploration mode, by contrast, we focus on maximizing long-run rewards under an uncertain and potentially risky future. So which one should we adopt: the safest, fastest-rewarding mode, or the one with open-ended potential for gains and losses in the long run? This is also an attractive topic in mathematics and AI research, known as the “multi-armed bandit” problem. It asks for the best strategy for playing a row of slot machines (“pokies”) when we must choose between trying different machines to test them out (exploration) and staying loyal to the most promising machine we have seen so far (exploitation).
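To make the trade-off concrete, here is a minimal sketch of one simple and well-known heuristic for this problem, usually called epsilon-greedy. It is not the optimal strategy, just an easy one to read: with a small probability `epsilon` we try a random machine, otherwise we play the machine with the best average payout so far. The function name, the payout probabilities and the default parameters are all made up for illustration.

```python
import random


def epsilon_greedy(payout_probs, rounds=10_000, epsilon=0.1, seed=0):
    """Play `rounds` pulls on hypothetical slot machines.

    payout_probs: assumed chance that each machine pays 1 coin per pull
                  (unknown to the player, used only to simulate rewards).
    epsilon:      fraction of pulls spent exploring a random machine.
    """
    rng = random.Random(seed)
    n = len(payout_probs)
    pulls = [0] * n        # how often each machine was played
    estimates = [0.0] * n  # running average reward per machine
    total = 0

    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: pick any machine at random.
            arm = rng.randrange(n)
        else:
            # Exploit: pick the machine that looks best so far.
            arm = max(range(n), key=lambda i: estimates[i])

        reward = 1 if rng.random() < payout_probs[arm] else 0
        pulls[arm] += 1
        # Incremental update of the average reward for this machine.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
        total += reward

    return total, pulls, estimates


if __name__ == "__main__":
    total, pulls, estimates = epsilon_greedy([0.2, 0.5, 0.8])
    print("total reward:", total)
    print("pulls per machine:", pulls)
    print("estimated payouts:", [round(e, 2) for e in estimates])
```

Setting `epsilon=0` gives pure exploitation, which can lock onto the first machine that happens to pay out, while `epsilon=1` is pure exploration that never cashes in on what it has learned. Anything in between is a crude version of the balance discussed here.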

For me, looking at life, there are many things that are uncertain, and there is no running away from that. Therefore, we should not avoid experimenting with new things, in as many variations as we can. As we learn more about the gains and risks of an option by trying it out, and as we grow more certain about it, we can gradually increase the weight we give to that choice. However, we should still keep looking for new things, because we might be stuck in a local optimum without knowing it, where further exploration, at the risk of lower gains or higher costs in the short term, could have put us in a more favorable position.

A playful mind is inquisitive, and learning is fun. If you indulge your natural curiosity and retain a sense of fun in new experience, I think you’ll find it functions as a sort of shock absorber for the bumpy road ahead.

Bill Watterson

Fatigue and ennui are probably evolved triggers that help us shift from exploit mode to explore mode. Experimenting helps us learn more, gather information and knowledge, and increase certainty by uncovering more of the variables involved, which gives us a better chance of pursuing the best outcome given what we have found.

To be successful in an environment with such dynamic shifts, organisations and individuals should make experimentation, exploration and continuous innovation part of their beliefs, cultures and mindsets. Instead of focusing on return on investment (ROI) during the exploration phase, we should focus on reducing the uncertainty of our path towards the long-term vision.

We all face our own dilemmas in every part of our lives, both personal and professional. Desires, needs, constraints, opportunities and obstacles are randomly distributed across all stages of our lives, and we have no knowledge of them in advance. However, if we do not explore and discover what we want to be and what we want to do, our lives will be dull and dissatisfying. We should always look for new, innovative and creative ways to live our lives.
