**Warning**: This is a rough draft of an algorithm. You shouldn't bother reading this if you want real rigor. I'm just writing it down to get it off my chest.

Rather than picking the first segment of the journey and working my way to the end, I want to pick a middle segment of the journey and work my way outwards recursively.

Given a list of distances (or times, or costs, or whatever) between cities, sort the list from shortest distance to longest distance regardless of which cities are involved. Now, loop over this list and use each element of the list as a middle segment of your journey. This gives us a list of "first steps" (or rather, first middle steps). Looping over the list from shortest distance to longest distance is an important optimization because it increases the likelihood that we'll find an optimal path early, allowing us to skip over a lot of potential paths later.

Also sort the list of distances so that for each city, we have a list of other cities in ascending distance order. By sorting all of the city pairs in order of (first city, distance), you can use one sort for all of the pairs.

Now, here comes the recursion. We have a bunch of middle parts of the journey. Use recursion to extend the journey either by adding to the beginning of the journey or by adding to the end of the journey. Keep recursing until you have a complete path or a partial path that is already longer than the best path seen so far. Now, we can either extend the journey at the beginning or at the end. Recursively try extending the journey by either adding to the beginning or the end. However, do it in order so that you try adding the shorter distances first. There's an implicit merge sort going on in this algorithm. This, again, is an optimization to allow you to skip work later.

While we were recursing, we had a function that took two things, a middle chunk of the journey and a set of cities not yet visited. Apply memoization so that anytime we get the same parameters, we return the same answer (by using a cache, of course). This is an important optimization.

Last of all, using the above algorithm, we'll quickly come up with the first complete path that has a decently minimal distance. Keep track of this as the best complete path seen so far. Anytime we are recursing and we come up with a partial path that is longer than the best complete path seen so far, we can stop recursing, give up, and try something else. This is an important optimization to save work "at the leaves of the tree".

I can't even wrap my head around the big O of this algorithm. I suspect it's one of those cases where you use words like "amortized cost".

This algorithm does have a weakness. If every distance between cities is exactly the same, it'll try every possibility. Similarly, if every distance between cities is exactly the same except one pair of cities which has a longer distance, it'll still try every possibility. I'm not sure of the degree to which the memoization fixes this problem. It'd be good to extend the algorithm somehow to recognize this situation and short circuit, i.e. automatically throw out paths of duplicate length.

## Comments

## Geoffrey De Smet replied on Sun, 2012/07/29 - 1:29am

This algorithm is an exhaustive search of each solution of a variant of the TSP Nearest-Neighbor construction heuristic (or a variant on the TSP Greedy Algorithm construction heurstic). It's an intresting beginner's attempt, but it has several problems:

It will not scale.On first estimation, I would guess the Big0 is n^(n-k). So when scaling out, the running time will quickly go from seconds to billions of years (presuming you'd not terminate it in reasonable time). Have you tried this on a larger TSP dataset, such as this 980 city tour or even this 40 city tour of Europe?It's solution it finds in reasonable time is far from optimal.Don't kid yourself. The TSP Nearest-Neighbor construction heurstic is proven to do no worse than 1 + log(n)/2 times the optimal tour. So a 50 city tour will not be worse than 4 times the optimal tour. The TSP Greedy Algorithm construction heurisitic does no worse than 1/2 + log(n)/2. Both of these construction heuristics will preform similar to the initial solution found by your algorithm. Did you try your algorithm on a several datasets for which the optimal solution is known? If not, how can you be sure the solutions you are seeing are any good?The solution to these problems are:

If you want to learn more about the Traveling Salesman Problem, I recommend these 2 resources: