Performance Zone is brought to you in partnership with:

JJ is a developer advocate for YouTube APIs. His goal is to foster a rich set of third-party applications built on YouTube APIs. He's a well-known member of the Python community. He blogs at jjinux.blogspot.com on topics such as Python, Ruby, Linux, open source software, the Web, and lesser-known programming languages. Shannon is a DZone MVB and is not an employee of DZone and has posted 18 posts at DZone. You can read more from them at their website. View Full User Profile

# My Algorithm for the Travelling Salesman Problem

07.27.2012
| 11694 views |
I was thinking about the Travelling Salesman problem this morning. I came up with an algorithm that permits a few nice optimizations. My guess is that Knuth probably already came up with this algorithm, formally analyzed it, and then came up with 15 others that were much better. Nonetheless, I figured I'd jot it down.

Warning: This is a rough draft of an algorithm. You shouldn't bother reading this if you want real rigor. I'm just writing it down to get it off my chest.

Rather than picking the first segment of the journey and working my way to the end, I want to pick a middle segment of the journey and work my way outwards recursively.

Given a list of distances (or times, or costs, or whatever) between cities, sort the list from shortest distance to longest distance regardless of which cities are involved. Now, loop over this list and use each element of the list as a middle segment of your journey. This gives us a list of "first steps" (or rather, first middle steps). Looping over the list from shortest distance to longest distance is an important optimization because it increases the likelihood that we'll find an optimal path early, allowing us to skip over a lot of potential paths later.

Also sort the list of distances so that for each city, we have a list of other cities in ascending distance order. By sorting all of the city pairs in order of (first city, distance), you can use one sort for all of the pairs.

Now, here comes the recursion. We have a bunch of middle parts of the journey. Use recursion to extend the journey either by adding to the beginning of the journey or by adding to the end of the journey. Keep recursing until you have a complete path or a partial path that is already longer than the best path seen so far. Now, we can either extend the journey at the beginning or at the end. Recursively try extending the journey by either adding to the beginning or the end. However, do it in order so that you try adding the shorter distances first. There's an implicit merge sort going on in this algorithm. This, again, is an optimization to allow you to skip work later.

While we were recursing, we had a function that took two things, a middle chunk of the journey and a set of cities not yet visited. Apply memoization so that anytime we get the same parameters, we return the same answer (by using a cache, of course). This is an important optimization.

Last of all, using the above algorithm, we'll quickly come up with the first complete path that has a decently minimal distance. Keep track of this as the best complete path seen so far. Anytime we are recursing and we come up with a partial path that is longer than the best complete path seen so far, we can stop recursing, give up, and try something else. This is an important optimization to save work "at the leaves of the tree".

I can't even wrap my head around the big O of this algorithm. I suspect it's one of those cases where you use words like "amortized cost".

This algorithm does have a weakness. If every distance between cities is exactly the same, it'll try every possibility. Similarly, if every distance between cities is exactly the same except one pair of cities which has a longer distance, it'll still try every possibility. I'm not sure of the degree to which the memoization fixes this problem. It'd be good to extend the algorithm somehow to recognize this situation and short circuit, i.e. automatically throw out paths of duplicate length.
Published at DZone with permission of Shannon Behrens, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

### Geoffrey De Smet replied on Sun, 2012/07/29 - 1:29am

This algorithm is an exhaustive search of each solution of a variant of the TSP Nearest-Neighbor construction heuristic (or a variant on the TSP Greedy Algorithm construction heurstic). It's an intresting beginner's attempt, but it has several problems:

• It will not scale. On first estimation, I would guess the Big0 is n^(n-k). So when scaling out, the running time will quickly go from seconds to billions of years (presuming you'd not terminate it in reasonable time). Have you tried this on a larger TSP dataset, such as this 980 city tour or even this 40 city tour of Europe?
• It's solution it finds in reasonable time is far from optimal. Don't kid yourself. The TSP Nearest-Neighbor construction heurstic is proven to do no worse than 1 + log(n)/2 times the optimal tour. So a 50 city tour will not be worse than 4 times the optimal tour. The TSP Greedy Algorithm construction heurisitic does no worse than 1/2 + log(n)/2. Both of these construction heuristics will preform similar to the initial solution found by your algorithm. Did you try your algorithm on a several datasets for which the optimal solution is known? If not, how can you be sure the solutions you are seeing are any good?

The solution to these problems are:

• Your algorithm for finding your initial solution is good and intresting. It's probably a good construction heuristic. Depending on the additional constraints that the business comes up with, it could be the best construction heuristic.
• Your algorithm to exhaustively find better solutions from that initial solution is bad (even with the pruning optimizations you mentioned). Instead, use metaheuristics, such as Tabu Search or Simulated Annealing.