| advertise add site services publishers database health videos | ![]() | about toolbar stats live show health store more stuff JOIN/LOGIN |
Smoothsort[1][2] (method) is a comparison-based sorting algorithm. It is a variation of heapsort developed by Edsger Dijkstra in 1981. Like heapsort, smoothsort's upper bound is O(n log n). The advantage of smoothsort is that it comes closer to O(n) time if the input is already sorted to some degree, whereas heapsort averages O(n log n) regardless of the initial sorted state.
[edit] OverviewThe array to be sorted is divided up into a string of heaps, each heap having a size equal to one of the Leonardo numbers L(n). The division process is simple - the leftmost nodes of the array are made into the largest heap possible, and the remainder is likewise divided up. It can be proved that
Each heap, having a size of size L(x), is structured from left to right as a sub-heap of size L(x-1), a sub-heap of size L(x-2), and a root node, with the exception of heaps with a size of L(1) and L(0) (which by definition are 1). Each heap maintains the heap property that a root node is always >= the root nodes of its child heaps (and therefore >= all nodes in its child heaps), and the string of heaps as a whole maintains the string property that the root node of each heap is >= the root node of the heap to the left. The consequence of this is that the rightmost node in the string will always be >= all of the other nodes, and importantly - an array that is already sorted needs no rearrangement to be made into a valid series of heaps. This is the source of the adaptive qualities of the algorithm. The algorithm is simple. We start by dividing up our unsorted array into a single heap of one element, followed by an unsorted portion. A one-element array is trivially a valid sequence of heaps. This sequence is then grown by adding one element at a time to the right, performing swaps to keep the sequence property and the heap property, until it fills the entire original array. From this point on, the rightmost element of the sequence of heaps will be the largest element in any of the heaps, and will therefore be in its correct, final position. We then reduce the series of heaps back down to a single heap of one element by removing the rightmost node (which stays in place) and performing re-arrangements to restore the heap condition. When we are back down to a single heap of one element, the array is sorted. [edit] OperationsIgnoring (for the moment) Dijkstra's optimisations, two operations are necessary - increase the string by adding one element to the right, and decrease the string by removing the right most element (the root of the last heap), preserving the heap and string conditions. [edit] Grow the string by adding an element to the right
After this, the heap and string properties must be restored. To do this,
The filter operation is greatly simplified by the use of leonardo numbers, as a heap will always either be a single node, or will have two child heaps. One does not need to manage the condition of one of the child heaps no being present. [edit] Optimisation
[edit] Shrink the string by removing the rightmost elementIf the rightmost element has a size of 1 (ie, L(1) or L(0)), then nothing needs to be done. Simply remove that rightmost heap. If the rightmost heap does not have a size of 1, then remove the root, exposing the two sub-heaps as members of the string. Restore the string property first on the left one and then on the right one. [edit] Optimisation
[edit] Java implementationThis code uses lo and hi as the bounds of the array inclusive. Note that this is not the usual convention. // by keeping these constants, we can avoid the tiresome business // of keeping track of Dijkstra's b and c. Instead of keeping // b and c, I will keep an index into this array. static final int LP[] = new int[] { 1, 1, 3, 5, 9, 15, 25, 41, 67, 109, 177, 287, 465, 753, 1219, 1973, 3193, 5167, 8361, 13529, 21891, 35421, 57313, 92735, 150049, 242785, 392835, 635621, 1028457, 1664079, 2692537, 4356617, 7049155, 11405773, 18454929, 29860703, 48315633, 78176337, 126491971, 204668309, 331160281, 535828591, 866988873 // the next number is > 31 bits. }; public static <C extends Comparable<C>> void sort(final C[] m, final int lo, final int hi) { int head = lo; // the offset of the first element of the prefix into m // These variables need a little explaining. If our string of heaps // is of length 38, then the heaps will be of size 25+9+3+1, which are // Leonardo numbers 6, 4, 2, 1. // Turning this int a binary number, we get b01010110 = x56. We represent // this number as a pair of numbers by right-shifting all the zeros and // storing the mantissa and exponent as "p" and "pshift". // This is handy, because the exponent is the index into L[] giving the // size of the rightmost heap, and because we can instantly find out if // the rightmost two heaps are consecutive leonardo numbers by checking // (p&3)==3 int p = 1; // the bitmap of the current standard concatenation >> pshift int pshift = 1; while (head < hi) { if ((p & 3) == 3) { // Add 1 by merging the first two blocks into a larger one. // The next Leonardo num is one bigger. sift(m, pshift, head); p >>>= 2; pshift += 2; p |= 1; head++; } else { // adding a new block of length 1 if (LP[pshift - 1] >= hi - head) { // this block is its final size. trinkle(m, p, pshift, head, false); } else { // this block will get merged. Just make it trusty. sift(m, pshift, head); } if (pshift == 1) { // LP[1] is being used, so we add use LP[0] p <<= 1; pshift--; } else { // shift out to position 1, add LP[1] p <<= (pshift - 1); pshift = 1; } p |= 1; head++; } } trinkle(m, p, pshift, head, false); while (pshift != 1 || p != 1) { if (pshift <= 1) { // block of length 1. No fiddling needed final int trail = Integer.numberOfTrailingZeros(p & ~1); p >>>= trail; pshift += trail; } else { p <<= 2; p ^= 7; pshift -= 2; // ok. This block gets broken into three bits. The rightmost // bit is a block of length 1. The left hand part is split into // two, a block of length LP[pshift+1] and one of LP[pshift]. // Both these two are appropriately heapified, but the root // nodes are not nessesarily in order. We therefore semitrinkle // both // of them trinkle(m, p >>> 1, pshift + 1, head - LP[pshift] - 1, true); trinkle(m, p, pshift, head - 1, true); } head--; } } private static <C extends Comparable<C>> void sift(final C[] m, int pshift, int head) { // we do not use Floyd's improvements to the heapsort sift, because we // are not doing what heapsort does - always moving nodes from near // the bottom of the tree to the root. final C val = m[head]; while (pshift > 1) { final int rt = head - 1; final int lf = head - 1 - LP[pshift - 2]; if (val.compareTo(m[lf]) >= 0 && val.compareTo(m[rt]) >= 0) break; if (m[lf].compareTo(m[rt]) >= 0) { m[head] = m[lf]; head = lf; pshift -= 1; } else { m[head] = m[rt]; head = rt; pshift -= 2; } } m[head] = val; } private static <C extends Comparable<C>> void trinkle(final C[] m, int p, int pshift, int head, boolean isTrusty) { final C val = m[head]; while (p != 1) { final int stepson = head - LP[pshift]; if (m[stepson].compareTo(val) <= 0) break; // current node is greater than head. Sift. // no need to check this if we know the current node is trusty, // because we just checked the head (which is val, in the first // iteration) if (!isTrusty && pshift > 1) { final int rt = head - 1; final int lf = head - 1 - LP[pshift - 2]; if (m[rt].compareTo(m[stepson]) >= 0 || m[lf].compareTo(m[stepson]) >= 0) break; } m[head] = m[stepson]; head = stepson; final int trail = Integer.numberOfTrailingZeros(p & ~1); p >>>= trail; pshift += trail; isTrusty = false; } if (!isTrusty) { m[head] = val; sift(m, pshift, head); } } [edit] Notes
| ||||||||||||||||||||||||||||||||||||||||||
| ↑ top of page ↑ | about thumbshots |