Performance Zone is brought to you in partnership with:

I’m a swiss Master student in Computer Science. I’m very interested in C++, open source projects, Linux, Intel Assembly and Agile. I'm currently working on Eddi, a new programming language that I created to improve my skills in C++. I've also worked a lot on Java technologies (Sprint, Osgi, Play!, ...), but I'm not currently working with Java anymore. Baptiste is a DZone MVB and is not an employee of DZone and has posted 51 posts at DZone. You can read more from them at their website. View Full User Profile

Algorithm of the Week: Integer Linear Time Sorting Algorithms

12.18.2012
| 9019 views |
  • submit to reddit
Most of the sorting algorithms that are used are generally comparison sort. It means that each element of the collection being sorted will be compared to see which one is the first one. A comparison must have a lower bound of Ω(n log n) comparisons. That is why there are no comparison-based sorting algorithm better than O(n log n).

On the other hand, there are also sorting algorithms that are performing better. This is the family of the integer sorting algorithms. These algorithms are using properties of integer to sort them without comparing them. They can be only be used to sort integers. Nevertheless, a hash function can be used to assign a unique integer to any value and so sort any value. All these algorithms are using extra space. There are several of these algorithms. In this article, we will see three of them and I will present an implementation in C++. At the end of the article, I will compare them to std::sort.

In the article, I will use n as the size of the array to sort and m as the max number that is permitted in the array.

Bin Sort

Bin Sort, or Bucket Sort, is a very simple algorithm that partition all the input numbers into a number of buckets. Then, all the buckets are outputted in order in the array, resulting in a sorting array. I decided to implement the simplest case of Bin Sort where each number goes in its own bucket, so there are m buckets.

The implementation is pretty straightforward:

void binsort(std::vector<std::size_t>& A){
    std::vector<std::vector<std::size_t>> B(MAX + 1);

    for(std::size_t i = 0; i < SIZE; ++i){
        B[A[i]].push_back(A[i]);
    }

    std::size_t current = 0;
    for(std::size_t i = 0; i < MAX; ++i){
        for(auto item : B[i]){
            A[current++] = item;
        }
    }
}

B is the array of buckets. Each bucket is implemented as a std::vector. The algorithm starts by filling each buckets with the numbers from the input array. Then, it outputs them in order in the array.

This algorithm works in O(n + m) and requires O(m) extra memory. With these properties, it makes a very limited algorithm, because if you don’t know the maximum number and you have to use the maximum number of the array type, you will have to allocate for instance 2^32 buckets. That won’t be possible.

Couting Sort

An interesting fact about binsort is that each bucket contains only the same numbers. The size of the bucket would be enough. That is exactly what Counting Sort. It counts the number of times an element is present instead of the elements themselves. I will present two versions. The first one is a version using a secondary array and then copying again into the input array and the second one is an in-place sort.

void counting_sort(std::vector<std::size_t>& A){
    std::vector<std::size_t> B(SIZE);
    std::vector<std::size_t> C(MAX);
    for (std::size_t i = 0; i < SIZE; ++i){
        ++C[A[i]];
    }
    for (std::size_t i = 1; i <= MAX; ++i){
        C[i] += C[i - 1];
    }
    for (long i = SIZE - 1; i >= 0; --i) {
        B[C[A[i]] - 1] = A[i];
        --C[A[i]];
    }
    for (std::size_t i = 0; i < SIZE; ++i){
        A[i] = B[i];
    }
}

The algorithm is also simple. It starts by counting the number of elements in each bucket. Then, it aggregates the number by summing them to obtain the position of the element in the final sorted array. Then, all the elements are copied in the temporary array. Finally, the temporary array is copied in the final array. This algorithms works in O(m + n) and requires O(m + n). This version is presented only because it is present in the literature. We can do much better by avoiding the temporary array and optimizing it a bit:

void in_place_counting_sort(std::vector<std::size_t>& A){
    std::vector<std::size_t> C(MAX + 1);

    for (std::size_t i = 0; i < SIZE; ++i){
        ++C[A[i]];
    }

    int current = 0;
    for (std::size_t i = 0; i < MAX; ++i){
        for(std::size_t j =0; j < C[i]; ++j){
            A[current++] = i;
        }
    }
}

The temporary array is removed and the elements are directly written in the sorted array. The counts are not used directly as position, so there is no need to sum them. This version still works in O(m + n) but requires only O(m) extra memory. It is much faster than the previous version.

Radix Sort

The last version that I will discuss here is a Radix Sort. This algorithm sorts the number digit after digit in a specific radix. It is a form of bucket sort, where there is a bucket by digit. Like Counting Sort, only the counts are necessary. For example, if you use radix sort in base 10. It will first sort all the numbers by their first digit, then the second, …. It can work in any base and that is its force. With a well chosen base, it can be very powerful. Here, we will focus on radix that are in the form 2^r. These radix have good properties, we can use shifts and mask to perform division and modulo, making the algorithm much faster.

The implementation is a bit more complex than the other implementations:

static const std::size_t digits = 2;        //Digits
static const std::size_t r = 16;                 //Bits
static const std::size_t radix = 1 << r;         //Bins
static const std::size_t mask = radix - 1;

void radix_sort(std::vector<std::size_t>& A){
    std::vector<std::size_t> B(SIZE);
    std::vector<std::size_t> cnt(radix);

    for(std::size_t i = 0, shift = 0; i < digits; i++, shift += r){
        for(std::size_t j = 0; j < radix; ++j){
            cnt[j] = 0;
        }

        for(std::size_t j = 0; j < SIZE; ++j){
            ++cnt[(A[j] >> shift) & mask];
        }

        for(std::size_t j = 1; j < radix; ++j){
            cnt[j] += cnt[j - 1];
        }

        for(long j = SIZE - 1; j >= 0; --j){
            B[--cnt[(A[j] >> shift) & mask]] = A[j];
        }

        for(std::size_t j = 0; j < SIZE; ++j){
           A[j] = B[j];
        }
    }
}

r indicates the power of two used as the radix (2^r). The mask is used to compute modulo faster. The algorithm repeats the steps for each digit. Here digits equals 2. It means that we support 2^32 values. A 32 bits value is sorted in two pass. The steps are very similar to counting sort. Each value of the digit is counted and then the counts are summed to give the position of the number. Finally, the numbers are put in order in the temporary array and copied into A.

This algorithm works in O(digits (m + radix)) and requires O(n + radix) extra memory. A very good thing is that the algorithm does not require space based on the maximum value, only based on the radix.

Results

It’s time to compare the different implementations in terms of runtime. For each size, each version is tested 25 times on different random arrays. The arrays are the same for each algorithm. The number is the time necessary to sort the 25 arrays. The benchmark has been compiler with GCC 4.7.

The first test is made with very few duplicates (m = 10n).

Normal Scale


Logarithmic Scale


Radix Sort comes to be the fastest in this case, twice faster as std::sort. In place counting sort has almost the same performance as std::sort. The other are performing worse.

The second test is made with few duplicates (m ~= n).

Normal Scale


Logarithmic Scale



The numbers are impressive. In place counting sort is between 3-4 times faster than std::sort and radix sort is twice faster than std::sort ! Bin Sort does not performs very well and counting sort even if generally faster than std::sort does not scale very well.

Let’s test with more duplicates (m = n / 2).

Normal Scale


Logarithmic Scale


std::sort and radix sort performance does not change a lot but the other sort are performing better. In-place counting sort is still the leader with a higher margin.

Finally, with a lot of duplicates (m = n / 10).

Normal Scale


Logarithmic Scale


Again, std::sort and radix sort performance are stable, but in-place counting is now ten times faster than std::sort !

Conclusion

To conclude, we have seen that these algorithms can outperforms std::sort by a high factor (10 times for In place Counting Sort when there m << n). If you have to sort integers, you should consider these two cases:

  • m > n or m is unknown : Use radix sort that is about twice faster than std::sort.
  • m << n : Use in place counting sort that can be much faster than std::sort.

I hope you found this article interesting. The implementation can be found on Github: https://github.com/wichtounet/articles/tree/master/src/linear_sorting



Published at DZone with permission of Baptiste Wicht, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Philippe Lhoste replied on Wed, 2012/12/19 - 5:13am

The article was strange, the charts showing algorithms not present in the text (and the truncated names in the legends are clumsy).

I had to go to the original site (fortunately you point to it) to see all the algorithms / implementations.

Interesting article, none less.

Baptiste Wicht replied on Wed, 2012/12/19 - 12:10pm

For the truncated names in the legends, it is my fault :( I didn't find a way to avoid that with the library I'm using the generate the JavaScript graph. 

For the remaining, what do you mean that the charts not present ? I see them. 

Philippe Lhoste replied on Thu, 2012/12/20 - 1:54am

My sentence was clumsy... I meant that not all algorithms presented in the charts are shown in the body of the article here, while there are present in the original article.

Baptiste Wicht replied on Thu, 2012/12/20 - 12:05pm in response to: Philippe Lhoste

:O

You're right... There probably was an error during the import. I made the fix myself. Thank you for reporting that. The article was no very useful in its form. 

Baptiste Wicht replied on Sun, 2014/09/14 - 6:50am in response to: Philippe Lhoste

 Hi,

Indeed, I guess that the import is made automatically, which unfortunately destroys some of the dynamic graphs. 

Thanks anyway ;)

Baptiste

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.