We introduce an experimental procedure for evaluating and comparing optimization algorithms based on the Traveling Salesman Problem (TSP). We argue that end-of-run results alone do not give sufficient information about an algorithm's performance, so our approach analyzes the algorithm's progress over time. Comparisons of performance curves in diagrams can be formalized by comparing the areas under them. Algorithms can then be ranked according to each performance metric, and the rankings based on different metrics can be aggregated into a global ranking, which provides a quick overview of how the algorithms compare in quality. An open-source software framework, the TSP Suite, applies this experimental procedure to the TSP. The framework supports researchers in implementing TSP solvers, unit testing them, and running experiments in a parallel and distributed fashion. It also has an evaluator component, which implements the proposed evaluation process and produces detailed reports. We test the approach by using the TSP Suite to benchmark several local search and evolutionary computation methods. This results in a large set of baseline data, which will be made available to the research community. Our experiments show that the tested pure global optimization algorithms are outperformed by local search, but the best results come from hybrid algorithms. © 2014 IEEE.
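
The evaluation idea summarized above (area under a progress curve, per-metric rankings, one aggregated ranking) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the TSP Suite: the function names, the step-shaped curve model, the mean-rank aggregation rule, and the toy data for the hypothetical algorithms `A`, `B`, and `C`.

```python
from typing import Dict, List, Tuple

# A progress curve: (time, best-so-far relative error) points, sorted by time.
Curve = List[Tuple[float, float]]


def area_under_curve(curve: Curve, t_end: float) -> float:
    """Area under a step-shaped progress curve from its first point to t_end.

    Assumption: the best-so-far value stays constant until the next improvement.
    """
    area = 0.0
    for (t0, q0), (t1, _) in zip(curve, curve[1:]):
        if t0 >= t_end:
            break
        area += q0 * (min(t1, t_end) - t0)
    last_t, last_q = curve[-1]
    if last_t < t_end:
        area += last_q * (t_end - last_t)
    return area


def rank(scores: Dict[str, float]) -> Dict[str, float]:
    """Rank algorithms by score (smaller is better); ties share the average rank."""
    ordered = sorted(scores, key=lambda name: scores[name])
    ranks: Dict[str, float] = {}
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for name in ordered[i:j + 1]:
            ranks[name] = shared
        i = j + 1
    return ranks


def aggregate(per_metric: List[Dict[str, float]]) -> Dict[str, float]:
    """Aggregate per-metric rankings by ranking the mean rank (an assumed rule)."""
    rankings = [rank(metric) for metric in per_metric]
    mean_rank = {
        name: sum(r[name] for r in rankings) / len(rankings)
        for name in rankings[0]
    }
    return rank(mean_rank)


if __name__ == "__main__":
    # Toy curves for three hypothetical algorithms; not measured data.
    curves: Dict[str, Curve] = {
        "A": [(0, 1.0), (5, 0.6), (9, 0.3)],
        "B": [(0, 1.0), (1, 0.2)],
        "C": [(0, 1.0), (2, 0.3), (6, 0.05)],
    }
    auc = {name: area_under_curve(c, t_end=10) for name, c in curves.items()}
    final = {name: c[-1][1] for name, c in curves.items()}
    print(rank(auc))
    print(rank(final))
    print(aggregate([auc, final]))
```

In this toy data, `B` has the smallest area under its curve while `C` reaches the best final error, so the two metrics rank them differently and the aggregated ranking places them in a tie. This is the kind of distinction that an end-of-run comparison alone would miss.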