I was thinking about the Travelling Salesman
problem this morning. I came up with an algorithm that permits a few
nice optimizations. My guess is that Knuth probably already came up
with this algorithm, formally analyzed it, and then came up with 15
others that were much better. Nonetheless, I figured I'd jot it down.Warning
This is a rough draft of an algorithm. You shouldn't bother reading
this if you want real rigor. I'm just writing it down to get it off my
Rather than picking the first segment of the journey and
working my way to the end, I want to pick a middle segment of the
journey and work my way outwards recursively.
Given a list of
distances (or times, or costs, or whatever) between cities, sort the
list from shortest distance to longest distance regardless of which
cities are involved. Now, loop over this list and use each element of
the list as a middle segment of your journey. This gives us a list of
"first steps" (or rather, first middle steps). Looping over the list
from shortest distance to longest distance is an important optimization
because it increases the likelihood that we'll find an optimal path
early, allowing us to skip over a lot of potential paths later.
sort the list of distances so that for each city, we have a list of
other cities in ascending distance order. By sorting all of the city
pairs in order of (first city, distance), you can use one sort for all
of the pairs.
Now, here comes the recursion. We have a bunch of
middle parts of the journey. Use recursion to extend the journey either
by adding to the beginning of the journey or by adding to the end of
the journey. Keep recursing until you have a complete path or a partial
path that is already longer than the best path seen so far. Now, we
can either extend the journey at the beginning or at the end.
Recursively try extending the journey by either adding to the beginning
or the end. However, do it in order so that you try adding the shorter
distances first. There's an implicit merge sort going on in this
algorithm. This, again, is an optimization to allow you to skip work
While we were recursing, we had a function that took two
things, a middle chunk of the journey and a set of cities not yet
visited. Apply memoization so that anytime we get the same parameters,
we return the same answer (by using a cache, of course). This is an
Last of all, using the above algorithm,
we'll quickly come up with the first complete path that has a decently
minimal distance. Keep track of this as the best complete path seen so
far. Anytime we are recursing and we come up with a partial path that
is longer than the best complete path seen so far, we can stop
recursing, give up, and try something else. This is an important
optimization to save work "at the leaves of the tree".
even wrap my head around the big O of this algorithm. I suspect it's
one of those cases where you use words like "amortized cost".
algorithm does have a weakness. If every distance between cities is
exactly the same, it'll try every possibility. Similarly, if every
distance between cities is exactly the same except one pair of cities
which has a longer distance, it'll still try every possibility. I'm not
sure of the degree to which the memoization fixes this problem. It'd
be good to extend the algorithm somehow to recognize this situation and
short circuit, i.e. automatically throw out paths of duplicate length.