QuickMedian Algorithm in Spin

Here's an implementation of the QuickMedian algorithm, invented by C.A.R. Hoare, who also came up with QuickSort. You can read about it here. I adapted the C++ code presented there to Spin and made some corrections to the end conditions that kept the original code from terminating.
The Spin code is attached below. It makes use of my '.debug framework to display the median-finding progress as it goes along. You can download the '.debug background .exe here:
What makes QuickMedian interesting is that it doesn't do a complete sort of the array -- just enough to isolate the median value. I haven't come across any errors in my implementation yet, but I haven't tested the code exhaustively, either.
-Phil
The Spin code is attached below. It makes use of my '.debug framework to display the median-finding progress as it goes along. You can download the '.debug background .exe here:
What makes QuickMedian interesting is that it doesn't do a complete sort of the array -- just enough to isolate the median value. I haven't come across any errors in my implementation yet, but I haven't tested the code exhaustively, either.
-Phil
Comments
Having the median, apply it again to the halves, to find the quartiles. Combine that with the extrema and you have all you need for a quick alternative to mean/variance/skew.
Thanks for sharing your work.
I'm trying to figure out if I'm doing something wrong or if there's an error in the algorithm which when certain combinations of numbers are being sorted, the program gets caught in an endless loop.
I've attached the program I used to generate the following output.
LRF Median Demo |165 164 164 165[166]166 165 165| 165 164 164 165[165]166 165|166 ------------ |165 164 164 165[165]165 166|166 165|164 164 165[165]165 166 166 ------------ 165 164 164 165[165]165 166 166 Filtered Reading = 165 cm Array size: 8 Comparisons: 90 Swaps: 33 Press any key to continue.
In the above example the median is quickly found.
Most of the output from the attached program look this way. However every once in a while the program hangs.
LRF Median Demo |165 164 164 163[164]164 164 165| 164|164 164 163[164]164|165 165 164 164|164 163[164]164 165 165 ------------ 164 164 164|163[164]164 165 165| ------------ 164 164 164|163[164]164|165 165 ------------ 164 164 164|163[164]164|165 165
The sequence shown in the last two lines just keeps repeating.
I changed your original code to use PST and to display three digit numbers rather than two digit numbers and I also changed it to sort longs rather than bytes. Character 149 was placing asterisks in the terminal so I replace "149" characters with "|" characters. I hope my formatting is close enough to the original, you can make sense of the numbers.
I'm hoping there's some small error in the original code which causes this endless loop. I don't see any reason why sorting longs would be different from sorting bytes but just to be sure, I'll try to replicate this with the byte sorting code.
Quick Median Demo |65 64 64 63[64]64 64 65| 64|64 64 63[64]64|65 65 64 64|64 63[64]64 65 65 ------------ 64 64 64|63[64]64 65 65| ------------ 64 64 64|63[64]64|65 65 ------------ 64 64 64|63[64]64|65 65 ------------ 64 64 64|63[64]64|65 65
Here's the program used to generate this output. I moved the "array" to the DAT section so I could fill it with the offending numbers.
CON _clkmode = xtal1 + pll16x _xinfreq = 5_000_000 SIZE = 8'21 OBJ 'rnd : "RealRandom" Pst : "Parallax Serial Terminal" VAR 'byte array[SIZE] long seed long comparisons, swaps PUB start | i 'rnd.start 'seed := 255 'rnd.random 'rnd.stop Pst.Start(115_200) repeat Pst.Str(string(11, 13, "Press any key to begin.")) result := Pst.RxCount waitcnt(clkfreq / 2 + cnt) until result Pst.RxFlush Pst.Clear Pst.Home Pst.Str(string("Quick Median Demo", 13, 13)) {repeat i from 0 to SIZE - 1 'array[i] := (?seed >> 16) // 90 + 10 array[i] := (?seed >> 16) // 18 + 10 } i := byte_median(@array, SIZE) Pst.Str(string(13, "Median: ")) Pst.Dec(i) Pst.Str(string(" Array size: ")) Pst.Dec(SIZE) Pst.Str(string(" Comparisions: ")) Pst.Dec(comparisons) Pst.Str(string(" Swaps: ")) Pst.Dec(swaps) '.debug #CR, "Median: ", dec i, " Array size: " , dec SIZE, " Comparisions: ", dec comparisons, " Swaps: ", dec swaps PUB byte_median(addr, n) | k, first, last, i, j first~ last := n - 1 k := n >> 1 repeat while (first < last) i := byte_split(addr, byte[addr][k], first, last) j := i & $ffff i >>= 16 if (i =< k) first := i if (j => k) last := j Pst.Chars("-", 12) Pst.NewLine '.debug rep "-"\12, #CR show(first, last) return byte[addr][k] PUB byte_split (addr, pivot, first, last) | t repeat show(first, last) repeat while byte[addr][first] < pivot first++ comparisons++ repeat while byte[addr][last] > pivot last-- comparisons++ comparisons += 2 if (first < last) t := byte[addr][first] byte[addr][first] := byte[addr][last] byte[addr][last] := t swaps++ first++ last := last - 1 #> first while first < last return first << 16 | (last #> first) PUB show(first, last) | i repeat i from 0 to SIZE - 1 Pst.Char(" ") if array[i] < 10 Pst.Char(" ") Pst.Dec(array[i]) '.debug " ", dec2 array[i] Pst.PositionX(first * 3) Pst.Char("|") Pst.PositionX(last * 3 + 3) Pst.Char("|") Pst.PositionX((SIZE >> 1)* 3) Pst.Char("[") Pst.MoveRight(2) Pst.Char("]") Pst.NewLine '.debug #MOVX, first * 3, 149, #MOVX, last * 3 + 3, 149 '.debug #MOVX, (SIZE >> 1)* 3, "[", rep #RIGHT\2, "]", #LF, #MOVX, 0 DAT array byte 65, 64, 64, 63, 64, 64, 64, 65
I've also attached the above code as a Spin file.
I'll be trying to figure this out but I find the sort algorithm a bit mystifying.
Edit: Nope. I spoke too soon. It crashes with an odd sized array.
LRF Median Demo |164 165 164 164[165]165 165 165 164| 164 164|164 164[165]165 165 165|165 164 164 164 164[165]165 165|165 165 ------------ |164 164 164 164[165]165 165|165 165 ------------ |164 164 164 164[165]165|165 165 165 ------------ |164 164 164 164[165]165|165 165 165 ------------ |164 164 164 164[165]165|165 165 165 ------------ |164 164 164 164[165]165|165 165 165
'array byte 64, 65, 64, 64, 65, 65, 65, 65, 64 ' does not resolve 'array byte 64, 65, 64, 64, 64, 65, 65, 65, 64 ' Comparisons: 18 Swaps: 6 'array byte 65, 65, 64, 64, 65, 65, 65, 65, 64 ' Comparisons: 15 Swaps: 4 'array byte 64, 64, 64, 64, 65, 65, 65, 65, 65 ' does not resolve 'array byte 64, 64, 64, 64, 64, 65, 65, 65, 65 ' Comparisons: 20 Swaps: 4 'array byte 65, 65, 64, 64, 65, 65, 65, 64, 64 ' does not resolve array byte 65, 65, 64, 64, 65, 65, 65, 64, 63 ' does not resolve
I looked at the original C# code and I don't think Phil's translation completely agrees with the original code. The differences are with comparisons including "or equal to" additions or not. Even when I make the few changes I can see, the code still ends up in endless loops with some of the above data.
I'm starting to wonder if the original C# code is correct.
I wanted to use this code with my experiments using the Laser Rangefinder. One thing I noticed with the data I'm using is much of the data is identical. It's relatively likely the median value will be present in one of the first few elements of the data collected.
I tried to come up with a median algorithm to take advantage of this similar data.
This is the jest of my algorithm:
PUB ByteMedian(arrayAddress, elements) | halfTheElements, toCompareIndex, { } compareAgainstIndex, elementsLarger, elementsSmaller, maxIndex maxIndex := elements - 1 halfTheElements := elements / 2 repeat toCompareIndex from 0 to maxIndex elementsLarger := 0 elementsSmaller := 0 repeat compareAgainstIndex from 0 to maxIndex if byte[arrayAddress][toCompareIndex] > byte[arrayAddress][compareAgainstIndex] elementsSmaller++ elseif byte[arrayAddress][toCompareIndex] < byte[arrayAddress][compareAgainstIndex] elementsLarger++ comparisons += elements * 2 + 2 if elementsSmaller =< halfTheElements and elementsLarger =< halfTheElements loops := toCompareIndex + 1 return byte[arrayAddress][toCompareIndex]
It counts up how many values in the array are larger than element being tested. If the element being tested doesn't have at least half the other elements either larger than it or smaller than it, then it is assumed the element is the median. With lots of identical elements, it doesn't take long to find one which is equal to the median.
If anyone gets the QuickMedian algorithm working with the arrays I've posted, I hope they let us know.
I've attached the program I used to test my algorithm. In order to generate data without much variation, I changed the random number generation equation to:
repeat result from 0 to SIZE - 1 array[result] := (?seed >> 16) // 4 + 10 ' values 10 through 13
As noted in the comments, this limits the values to numbers between 10 and 13 inclusive.
Here's some sample output:
Median Demo 13, 12, 13, 13, 12, 11, 12, 10, 11 Median = 12 Array size: 9 Comparisons: 40 loops: 2 Median Demo 12, 10, 10, 11, 12, 10, 10, 10, 10 Median = 10 Array size: 9 Comparisons: 40 loops: 2 Median Demo 12, 13, 13, 12, 13, 11, 12, 13, 12 Median = 12 Array size: 9 Comparisons: 20 loops: 1 Median Demo 10, 11, 11, 11, 11, 10, 13, 10, 10 Median = 11 Array size: 9 Comparisons: 40 loops: 2 Median Demo 10, 11, 13, 13, 12, 10, 11, 13, 13 Median = 12 Array size: 9 Comparisons: 100 loops: 5 Median Demo 12, 12, 13, 11, 13, 11, 13, 10, 13 Median = 12 Array size: 9 Comparisons: 20 loops: 1
I tested the code with random data similar to yours, and it did end up in an endless loop. I wish I had the time to devote to tracking down the reason, but I don't at the moment.
Worst of all, though, I'd completely forgotten about this thread. I mean ... I looked at it and wondered, "Did I really post this? And it's only been ... what ... 2 1/2 years ago?" Hopefully, 10 ... 20 ... years from now, when I've completely lost it, a forumista or two might come to visit me at the "home."
-Phil
New-type-of-"sort"-algorithmic-function-(I-think)
And another one, while PBASIC, is informative. If I'm not mistaken, Phil's approach there made its way translated into Spin into his median filter for the Parallax #29124 altimeter.
Computing-a-running-median
Thanks, Tracy!
-Phil
I just found a big typo in Phil's code. He had the i and j swapped. Give me a few minutes to try to fix this.
now swap values if they are in the wrong part: if (i <= j) { float t = a[i]; a[i] = a[j]; a[j] = t; i++;j--; }
I think the "i" and "j" are swapped.
Are the "i" and "j" swapped?
I've now got code which no longer loops endlessly but I want to test it a bit more.
Edit: This is Phil's typo (assuming he was translating the C# code).
if (i =< k) first := i if (j => k) last := j
Here's the C#
if (j < k) L = i; if (k < i) R = j;
Fixing this wasn't enough to get the code to work. I'm still testing the latest version.
So far my latest version doesn't work. If you get this working kuroneko, I hope you let us know.
PUB byte_median(addr, n) | k, first, last, i, j first~ last := n - 1 k := n >> 1 repeat while (first < last) i := byte_split(addr, byte[addr][k], first, last) j := i & $ffff i >>= 16 if (j < k) first := i if (k < i) last := j '.debug rep "-"\12, #CR show(first, last) return byte[addr][k] PUB byte_split(addr, pivot, first, last) | t repeat show(first, last) repeat while byte[addr][first] < pivot first++ comparisons++ repeat while pivot < byte[addr][last] last-- comparisons++ comparisons += 2 if (first =< last) t := byte[addr][first] byte[addr][first] := byte[addr][last] byte[addr][last] := t swaps++ first++ last-- while first =< last return first << 16 | last
I just tested it and it appears to work.
Now I need to see what you did that I missed.
Thanks for your help.
Here's the output from one of the arrays (which had been) giving me trouble.
Quick Median Demo |65 65 64 64[65]65 65 64 64| 64|65 64 64[65]65 65 64|65 64 64|64 64[65]65 65|65 65 64 64 64 64[65]65|65 65 65 |64 64 64 64[65]65 65 65 65 64 64 64 64[65]65 65 65 65 Median: 65 Array size: 9 Comparisons: 16 Swaps: 5
Edit: But from a quick glance it looks like that never happens.
I modified the code to sort longs. While it's unlikely I'll be sorting arrays with more elements than the 16-bit return value Phil was using, I changed the code to pass pointers to the 'i" and "j" values. I'm thinking these changes probably slow down the code.
PUB long_median(addr, n) | k, first, last, i, j first := 0 last := n - 1 k := n >> 1 repeat while (first < last) i := first j := last long_split(addr, n, long[addr][k], @i, @j) if j < k first := i if k < i last := j Pst.Chars("-", 12) Pst.NewLine show(addr, n, first, last, MAX_DIGITS) return long[addr][k] PUB long_split (addr, n, pivot, iPtr, jPtr) | t repeat show(addr, n, long[iPtr], long[jPtr], MAX_DIGITS) repeat while long[addr][long[iPtr]] < pivot long[iPtr]++ comparisons++ repeat while pivot < long[addr][long[jPtr]] long[jPtr]-- comparisons++ comparisons += 2 if long[iPtr] =< long[jPtr] t := long[addr][long[iPtr]] long[addr][long[iPtr]] := long[addr][long[jPtr]] long[addr][long[jPtr]] := t swaps++ long[iPtr]++ long[jPtr]-- while long[iPtr] =< long[jPtr]
Thanks Phil and kuroneko for (once again) continuing my education.
I wanted to experiment with using multiple sensor readings from SF02 as discussed a bit in this thread.
I'm having my super cheapy pan and tilt mechanism scan over an array of points and record the readings at each point. I use a scan of a level area as the baseline in order to detect obstacles or drop offs. Subsequent scans would be compared with the data from the level area in order to find high and low spots.
The filter would be applied to the individual points.
I'm not sure if filtering the data will help. The LRF does its job just fine. It's the $3 servos which are giving me some trouble.
I also found a file named SuperFastMedian3.txt with these two lines
z:=a#>b<#(a<#b#>c)
44 microseconds
Do I win and unreadable code award?
John Abshier
I too like the running median as a pre-filter to remove shot noise, single far outliners in the case of a three entry median. The output of the running median can feed a subsequent lowpass or window or adaptive filter. One unfortunate property of a long median filter comes when the robot! suddenly comes to the edge of a cliff. It takes some time for the high values in the table fill up, then suddenly after a delay of 1/2 the table size the median takes a sudden plunge. Not that traditional filters are much better, smooth rather than sudden.
The data is almost real. I subtracted 100 from the distance values so the numbers would work with Phil's two digit sort display.
These values were taken with the robot on the bench without any disturbances. I was thinking a median filter would take care of issues like a leaf blowing in front of the LRF or a possible stray wire swinging in front of the sensor.
I'm sure I'll be looking for ways to speed up the data collection but for now, I'm just hoping to get some reliable numbers to see if my $10 (including the price of the servos) pan and tilt mechanism (when used to point the SF02) is capable to detecting problems in the robot's path.