Home

[This page is auto-generated; please prefer to read the article’s PDF Form.]



[Prev]   [Up]   

Performance Comparison

The outer loop of both the methods runs (n1) times, and their inner loop runs in total almost (n1)(n + 4)4 and n(n1)2 times (using CI and CS from (1) and (2)). If the work done by an iteration of the inner loop were to be roughly the same for both methods (and similarly for the outer loop excluding the inner loop’s work), we could say that insertion_sort(), having lesser iterations count, should outperform selection_sort().

But the factors affecting their relative performance can be many, in addition to the visible difference in their set of operations per iteration: compiler optimizations, hardware performance factors related to branches and locality of reference, and so on. (For example, in each iteration of the outer loop, selection_sort() accesses the complete subarray a[i : n 1], but insertion_sort() accesses only a part of the subarray a[0 : i]; this gives the former a lower locality of reference.) In fact, for any algorithm in general, performance related conclusion should be made very carefully with the underlying conditions specified, and often there is no simple answer.

Here we will focus on two factors which may become significant for the performance of any comparison based sorting algorithm: the cost of array elements’ transfer and the cost of comparing them. The transfer cost becomes significant, for example, when the underlying storage has slow read/write speed, or when each array element has large size. The comparison cost becomes significant, for example, when the keys are of a complex type, or the comparison involves many steps instead of a simple ‘<’ operator.

We note from (1) and (2) that the number of transfers in selection_sort() is always linear (in n), but it is quadratic for insertion_sort() (assuming f = 12). This much difference is not present in the growth of their number of comparisons; it is quadratic for both. So, as the transfer cost gets sufficiently high, this difference between TI and TS will start becoming visible in the total costs, and selection_sort() will start outperforming insertion_sort(). As mentioned earlier, one way the transfer cost would increase is by increasing the size of each array element.

For example, with a randomly generated array a[] of n = 30000 elements, I find that selection_sort() starts to outperform insertion_sort() when object’s more_fields[] array has 7 or more elements (resulting in larger sized object). These observations are for a particular system and will vary across hardware and compiler options.

We also know from (1) and (2) that the number of comparisons required by selection_sort() is almost double than that of insertion_sort() (assuming f = 12). So an increase in comparison cost should increase the total cost of selection_sort() more than that of insertion_sort(). While keeping the more_fields[] array at 7 elements, if the comparison function is now modified as below, I notice insertion_sort() outperforming selection_sort() (again, this is system specific). Note that this function is equivalent to the earlier one.

#define less_than(o1, o2) \  
  ((o1.key/10)*10 + o1.key%10 < (o2.key/10)*10 + o2.key%10)

[Prev]   [Up]