On the other hand, there are also sorting algorithms that perform better: the family of integer sorting algorithms. These algorithms use properties of integers to sort them without comparing them, so they can only be used to sort integers. Nevertheless, a hash function can be used to assign an integer to any value, and so to sort any kind of value. All these algorithms use extra space. In this article, we will see three of them, and I will present an implementation of each in C++. At the end of the article, I will compare them to *std::sort*.

In this article, I will use *n* for the size of the array to sort and *m* for the maximum number permitted in the array.

## Bin Sort

Bin Sort, or Bucket Sort, is a very simple algorithm that partitions the input numbers into a number of buckets. Then the buckets are emptied in order back into the array, resulting in a sorted array. I decided to implement the simplest case of Bin Sort, where each number goes into its own bucket, so there are *m* buckets.

The implementation is pretty straightforward:

```cpp
void binsort(std::vector<std::size_t>& A){
    std::vector<std::vector<std::size_t>> B(MAX + 1);

    for(std::size_t i = 0; i < SIZE; ++i){
        B[A[i]].push_back(A[i]);
    }

    std::size_t current = 0;
    for(std::size_t i = 0; i <= MAX; ++i){
        for(auto item : B[i]){
            A[current++] = item;
        }
    }
}
```

B is the array of buckets; each bucket is implemented as a std::vector. The algorithm starts by filling each bucket with the numbers from the input array, then outputs the buckets in order back into the array.

This algorithm works in *O(n + m)* and requires *O(n + m)* extra memory (the buckets hold a copy of every element). These properties make it a very limited algorithm: if you don't know the maximum number in advance, you have to assume the maximum of the array's type, which means allocating, for instance, 2^32 buckets. That is not practical.

## Counting Sort

An interesting fact about Bin Sort is that each bucket contains only copies of the same number, so storing the size of each bucket would be enough. That is exactly what Counting Sort does: it counts the number of times each element is present instead of storing the elements themselves. I will present two versions: the first uses a secondary array and then copies it back into the input array; the second is an in-place sort.

```cpp
void counting_sort(std::vector<std::size_t>& A){
    std::vector<std::size_t> B(SIZE);
    std::vector<std::size_t> C(MAX + 1);

    for (std::size_t i = 0; i < SIZE; ++i){
        ++C[A[i]];
    }

    for (std::size_t i = 1; i <= MAX; ++i){
        C[i] += C[i - 1];
    }

    for (long i = SIZE - 1; i >= 0; --i) {
        B[C[A[i]] - 1] = A[i];
        --C[A[i]];
    }

    for (std::size_t i = 0; i < SIZE; ++i){
        A[i] = B[i];
    }
}
```

The algorithm is also simple. It starts by counting the number of elements in each bucket. Then it sums the counts as prefix sums to obtain the final position of each element in the sorted array. Next, all the elements are copied into the temporary array at those positions. Finally, the temporary array is copied back into the input array. This algorithm works in *O(m + n)* and requires *O(m + n)* extra memory. This version is presented only because it is the one found in the literature. We can do much better by avoiding the temporary array and optimizing it a bit:

```cpp
void in_place_counting_sort(std::vector<std::size_t>& A){
    std::vector<std::size_t> C(MAX + 1);

    for (std::size_t i = 0; i < SIZE; ++i){
        ++C[A[i]];
    }

    std::size_t current = 0;
    for (std::size_t i = 0; i <= MAX; ++i){
        for(std::size_t j = 0; j < C[i]; ++j){
            A[current++] = i;
        }
    }
}
```

The temporary array is removed and the elements are written directly into the sorted array. The counts are not used directly as positions, so there is no need to sum them. This version still works in *O(m + n)* but requires only *O(m)* extra memory. It is much faster than the previous version.

## Radix Sort

The last algorithm that I will discuss here is Radix Sort. This algorithm sorts the numbers digit by digit in a chosen radix. It is a form of bucket sort with one bucket per digit value, and like Counting Sort, only the counts are necessary. For example, in base 10 it first sorts all the numbers by their first digit, then by the second, and so on. It can work in any base, and that is its strength: with a well-chosen base, it can be very powerful. Here, we will focus on radices of the form 2^r. These radices have good properties: we can use shifts and masks to perform division and modulo, making the algorithm much faster.

The implementation is a bit more complex than the other implementations:

```cpp
static const std::size_t digits = 2;       //Digits
static const std::size_t r = 16;           //Bits per digit
static const std::size_t radix = 1 << r;   //Bins
static const std::size_t mask = radix - 1;

void radix_sort(std::vector<std::size_t>& A){
    std::vector<std::size_t> B(SIZE);
    std::vector<std::size_t> cnt(radix);

    for(std::size_t i = 0, shift = 0; i < digits; i++, shift += r){
        for(std::size_t j = 0; j < radix; ++j){
            cnt[j] = 0;
        }

        for(std::size_t j = 0; j < SIZE; ++j){
            ++cnt[(A[j] >> shift) & mask];
        }

        for(std::size_t j = 1; j < radix; ++j){
            cnt[j] += cnt[j - 1];
        }

        for(long j = SIZE - 1; j >= 0; --j){
            B[--cnt[(A[j] >> shift) & mask]] = A[j];
        }

        for(std::size_t j = 0; j < SIZE; ++j){
            A[j] = B[j];
        }
    }
}
```

*r* indicates the power of two used as the radix (2^r). The mask is used to compute the modulo faster. The algorithm repeats the same steps for each digit. Here *digits* equals 2, which means we support 2^32 values: a 32-bit value is sorted in two passes. The steps are very similar to Counting Sort. Each digit value is counted, then the counts are summed to give the position of each number. Finally, the numbers are placed in order in the temporary array and copied back into A.

This algorithm works in *O(digits · (n + radix))* and requires *O(n + radix)* extra memory. A very good property is that the algorithm does not require space based on the maximum value, only based on the radix.

## Results

It's time to compare the different implementations in terms of runtime. For each size, each version is tested 25 times on different random arrays; the arrays are the same for each algorithm. The reported number is the time necessary to sort the 25 arrays. The benchmark has been compiled with GCC 4.7.

The first test is made with very few duplicates (m = 10n).

*Normal Scale*

*Logarithmic Scale*

Radix Sort comes out the fastest in this case, **twice as fast as std::sort**. In-place counting sort has almost the same performance as *std::sort*. The others perform worse.

The second test is made with few duplicates (m ~= n).

*Normal Scale*

*Logarithmic Scale*

The numbers are impressive. **In-place counting sort is between 3 and 4 times faster than std::sort**, and **radix sort is twice as fast**! Bin Sort does not perform very well, and Counting Sort, even if generally faster than *std::sort*, does not scale very well.

Let’s test with more duplicates (m = n / 2).

*Normal Scale*

*Logarithmic Scale*

*std::sort* and radix sort performance do not change a lot, but the other sorts perform better. In-place counting sort is still the leader, with a higher margin.

Finally, with a lot of duplicates (m = n / 10).

*Normal Scale*

*Logarithmic Scale*

Again, *std::sort* and radix sort performance are stable, but in-place counting sort is now **ten times faster than std::sort**!

## Conclusion

To conclude, we have seen that these algorithms can outperform *std::sort* by a high factor (10 times for in-place counting sort when m << n). If you have to sort integers, you should consider these two cases:

- m > n, or m is unknown: use radix sort, which is about twice as fast as *std::sort*.
- m << n: use in-place counting sort, which can be much faster than *std::sort*.

I hope you found this article interesting. The implementation can be found on Github: https://github.com/wichtounet/articles/tree/master/src/linear_sorting

## Comments

## Philippe Lhoste replied on Wed, 2012/12/19 - 5:13am

The article was strange: the charts show algorithms not present in the text (and the truncated names in the legends are clumsy).

I had to go to the original site (fortunately you point to it) to see all the algorithms / implementations.

Interesting article, nonetheless.

## Baptiste Wicht replied on Wed, 2012/12/19 - 12:10pm

For the truncated names in the legends, it is my fault :( I didn't find a way to avoid that with the library I'm using to generate the JavaScript graph.

For the rest, what do you mean by the charts not being present? I see them.

## Philippe Lhoste replied on Thu, 2012/12/20 - 1:54am

My sentence was clumsy... I meant that not all the algorithms presented in the charts are shown in the body of the article here, while they are present in the original article.

## Baptiste Wicht replied on Thu, 2012/12/20 - 12:05pm in response to: Philippe Lhoste

:O

You're right... There probably was an error during the import. I made the fix myself. Thank you for reporting that. The article was not very useful in its previous form.

## Baptiste Wicht replied on Sun, 2014/09/14 - 6:50am in response to: Philippe Lhoste

Hi,

Indeed, I guess that the import is made automatically, which unfortunately destroys some of the dynamic graphs.

Thanks anyway ;)

Baptiste