So, you want to optimise something?
Then, you probably think that your fitness landscape looks like this:
… and the only task is to roll down from your starting position to the optimum, like this:
Yet, you are probably wrong.
The real fitness landscape looks more like this:
… and what you thought was this:
…is actually this:
What can you do about it? Many things, obviously. If this is an everyday real-life choice, you can just ignore the problem and think of some catchy justification for the random choice you have made; or maybe wait for a better solution to emerge. Especially for software engineering problems, you can use the "premature optimisation is the root of all evil" rule to postpone them until after the product ships. If this is a scientific problem, you can use some better, cleverer approach, something in the spirit of Quantum Anthill Stochastic Conjugate Hessian Nonlinear Programming – there is a whole science around that, with tons of methods and algorithms, and books and papers about them. Yet, with full respect, I have never seen a real-world optimisation problem that could be efficiently and exactly solved by such a clever, generic optimisation approach. In my experience, they tend to be either trivial enough for some hill-climbing (valley-rolling?) approach or impossible, because the available cost function is generally useless for locating the optimum.
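For the trivial case, a minimal sketch of such valley rolling is all it takes; everything here is made up for illustration (the convex cost in particular is hypothetical, not anything from the plots above):

```python
import random

def valley_roll(cost, x0, step=0.1, iters=1000):
    """Greedy local descent: move to a random neighbour only if it is cheaper."""
    x, best = x0, cost(x0)
    for _ in range(iters):
        candidate = x + random.uniform(-step, step)
        c = cost(candidate)
        if c < best:
            x, best = candidate, c
    return x, best

# Works fine on a hypothetical, nicely convex cost; real landscapes rarely are.
x, c = valley_roll(lambda x: (x - 3.0) ** 2, x0=0.0)
```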
Thus, my second-favourite approach to optimisation is to look for a lower bound of the cost function and optimise that instead; the bound should be simple enough to make the search very easy, even analytical. On our plot it looks like this:
The optimum found there will obviously not coincide with the minimum of the original function, but it is usually not substantially worse than what some optimisation heuristic brings you after hours of bouncing around the whole choice space, and it is often a good starting point for an exact search, if one is really needed.
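A toy version of this trick, assuming a hypothetical ragged cost whose smooth part yields an obvious analytic bound:

```python
import math

# A hypothetical ragged cost: a smooth bowl plus a fast oscillation.
def cost(x):
    return (x - 2.0) ** 2 + math.sin(10.0 * x)

# Since sin(.) >= -1, lower(x) = (x - 2)^2 - 1 bounds cost(x) from below,
# and being a plain quadratic its minimum is analytical: x = 2.
x_bound = 2.0

# Optionally refine with a cheap local search seeded at the bound's optimum.
x_best = min((x_bound + d / 100.0 for d in range(-50, 51)), key=cost)
```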
Also, as I mentioned earlier, the problem with real-world optimisation is that substantial areas of the choice space give no hint about where the optimum is, because they are either flat or fluctuating vigorously.
What's more, the cost is often practically non-deterministic, either because it is impossible to determine exactly or because it depends on hidden factors we cannot control; worse still, it is also usually expensive to evaluate, so an attempt to average the noise out by oversampling is infeasible.
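A quick way to feel that wall, with a made-up noisy cost standing in for the real thing: averaging n evaluations shrinks the noise only like 1/sqrt(n), so every extra digit of precision costs a hundred times more evaluations.

```python
import random
from statistics import fmean, stdev

def noisy_cost(x):
    """Hypothetical expensive cost: a smooth part plus uncontrollable noise."""
    return (x - 2.0) ** 2 + random.gauss(0.0, 1.0)

# Spread of the n-sample average across repeated runs, at a fixed point:
for n in (1, 100, 10000):
    means = [fmean(noisy_cost(1.0) for _ in range(n)) for _ in range(30)]
    print(n, round(stdev(means), 3))  # roughly 1.0, 0.1, 0.01
```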
The big selling point of lower bounds here is that they are often robust to those fluctuations, like so:
If it is so good, why is it only my second-favourite choice, you could ask… Basically because it is not always possible, and it requires some insight into the nature of the cost and some clever ideas. I.e., work. Brute-force random sampling of the choice space often gives a result faster and requires zero thinking, so it is usually running while I'm thinking about lower bounds.
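For completeness, the zero-thinking baseline, again just a sketch assuming a one-dimensional choice space sampled uniformly:

```python
import random

def random_search(cost, lo, hi, budget=10_000):
    """Brute force: sample the choice space uniformly, keep the cheapest point."""
    return min((random.uniform(lo, hi) for _ in range(budget)), key=cost)

x = random_search(lambda x: (x - 3.0) ** 2, -10.0, 10.0)
```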