ROSOMAXA

ROSOMAXA stands for Routing Optimizations with Self-Organizing Maps And EXtrAs - a custom evolutionary algorithm which tries to address the problem of population diversity: ability to retain different individuals in the population and use them as an input for the search procedure. Additionally, it utilizes reinforcement learning technics to dynamically pick suitable meta-heuristics for given problem formulation to avoid premature convergence.

Key ideas

The rosomaxa algorithm is based on the following key ideas:

use Growing Self-Organizing Map(GSOM) to cluster discovered solutions and retain good, but different ones
choice clustering characteristics which are specific to solution geometry rather to objectives
utilize reinforcement learning technics in dynamic hyper-heuristic to choose one of pre-defined meta-heuristics on each solution refinement step.
use 2D visualization to analyze and understand algorithm behavior. See an interactive demo here

Clustering

Solution clustering is preformed by custom implementation of GSOM which is a growing variant of a self-organizing map. In rosomaxa, it has the following characteristics:

each node maintains a small population which keeps track of a few solutions selected by elitism approach
nodes are created and split based on selected solution characteristics. For VRP domain, they are such as:
- vehicle max load variance
- standard deviation of the number of customer per tour
- mean of route durations
- mean of route distances
- mean of route waiting times
- average distance between route medoids
- amount of routes
periodically, the network is compacted and rebalanced to keep search analyzing most prominent local optimum. Compaction is done using a "decimation" approach: remove every second-third (configurable) column/row and move survived cells towards the center (a node with (0, 0) as a coordinate). Overall, this approach seems help to maintain a good exploration-exploitation ratio.

Visualization

This old animation shows some insights how algorithms performs over time:

Visualization example

Here:

u_matrix is unified distance matrix calculated using solution characteristics
t_matrix and l_matrix shows how often nodes are updated
objective_0, objective_1, objective_2: objective values such as amount of unassigned jobs, tours, and cost

The new experimental visualization tool is part of the repo: experiments/heuristic-research. Online version is available here: https://reinterpretcat.github.io/heuristics/www/vector.html

Dynamic hyper-heuristic

Essentially, a built-in dynamic hyper-heuristic uses Multi-Armed Bandit with Thompson sampling approach to pick meta-heuristic for the list. This helps to address exploration-exploitation dilemma in applying a strategy of picking heuristics.

Implementation can be found here

Additional used techniques

TODO: describe additional explorative techniques:

tabu list usage in ruin methods
alternative objectives manipulation
..

Further research

experiment with different solution characteristics
rebalance GSOM parameters based on search progression
analyze "heat" map dynamically to adjust GSOM parameters
more fine-grained control of exploration vs exploitation ratio
try to calculate gradients based on GSOM nodes

A Vehicle Routing Problem Solver Documentation