comp.lang.idl-pvwave archive
Messages from Usenet group comp.lang.idl-pvwave, compiled by Paulo Penteado

Mode function for floating point arrays [message #85114] Fri, 05 July 2013 11:55
Matthew Argall
PREAMBLE:
I need a function that finds the mode of a floating point array. I have read David Fanning's article about integer arrays:

http://www.idlcoyote.com/code_tips/mode.html

From this article about majority voting, it seems like "Hist_ND" works for floating point values, but I have no experience with the magic of HISTOGRAM:

https://groups.google.com/forum/#!searchin/comp.lang.idl-pvwave/Mode$20of$20a$20floating$20point$20array/comp.lang.idl-pvwave/YZK2ey-O5sE/9fLvx_AG2IAJ


QUESTION:
Here is my attempt. Can anyone make it better/faster?

;-----------------------------------------------------
function mrmode, array, $
                 EPSILON=epsilon
    compile_opt idl2

    ;Number of points in ARRAY
    npts = n_elements(array)

    ;Default value for EPSILON
    if n_elements(epsilon) eq 0 then epsilon = 1d-5

    ;[index, count] for keeping track of mode statistics
    mode_count = lonarr(2, npts)

    ;Store the first ~unique number. Count how many ~unique numbers there are.
    mode_count[*,0] = [0,1]
    nunique = 1

    ;Step through all points in ARRAY
    for i = 1, npts - 1 do begin
        match_found = 0

        ;Try to pair the new point with other mode candidates
        for j = 0, nunique - 1 do begin
            if array[i] gt array[mode_count[0,j]]-epsilon && $
               array[i] lt array[mode_count[0,j]]+epsilon $
            then begin
                mode_count[1,j] += 1
                match_found = 1
            endif
        endfor

        ;If no match was found, create a new mode candidate
        if match_found eq 0 then begin
            mode_count[*,nunique] = [i,1]
            nunique += 1
        endif
    endfor

    ;Get the mode
    void = max(mode_count[1,*], iMode)
    mode = array[mode_count[0,iMode]]

    return, mode
end


;---------------------------------------------------------------------
;Example Program (IDL> .r mrmode) /////////////////////////////////////////
;---------------------------------------------------------------------
array = [1.2, 0.1, 3.3, 0.1, 2.0, 3.3, 4.8, 1.2, 0.1, 0.1, 6.7, 3.3]
mode = MrMode(array)
print, FORMAT='(%"The mode is: %f")', mode

end
Re: Mode function for floating point arrays [message #85119 is a reply to message #85114] Mon, 08 July 2013 01:01
Rob Klooster
Hi Matthew,

Histogram also works on floating point arrays, you just need to set the binsize:

hist = histogram(array, binsize=epsilon, locations=locations)
mode = locations[where(hist eq max(hist))]

Note that for small values of epsilon, the resulting histogram array can become very large.
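
A rough sketch of how the histogram grows as the bin size shrinks. The bin sizes below are arbitrary illustration values, not taken from this thread. LOCATIONS holds the starting value of each bin, so a mode found this way is only known to within one bin width.

```idl
array = [1.2, 0.1, 3.3, 0.1, 2.0, 3.3, 4.8, 1.2, 0.1, 0.1, 6.7, 3.3]

; Assumed bin sizes, chosen only to show the growth in histogram length
binsizes = [0.5, 0.05, 0.0005]

for k = 0, n_elements(binsizes) - 1 do begin
    hist = histogram(array, binsize=binsizes[k], locations=locations)
    void = max(hist, imax)
    ; Bin size, start of the fullest bin, and number of bins allocated
    print, binsizes[k], locations[imax], n_elements(hist)
endfor
end
```

The number of bins scales roughly as (max - min) / binsize, which is why a small epsilon combined with a wide data range becomes expensive.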

Regards,
Rob.


Re: Mode function for floating point arrays [message #85120 is a reply to message #85119] Mon, 08 July 2013 01:24
Rob Klooster
On second thought, it will be more efficient to treat the array as a sparse array and use value_locate, as in David's article:

http://www.idlcoyote.com/code_tips/valuelocate.html

sortedarray = array[Sort(array)]
arrayenum = sortedarray[Uniq(sortedarray)]
mappedarray = Value_Locate(arrayenum, array)
hist = histogram(mappedarray, min=0)
mode = arrayenum[where(hist eq max(hist))]

Maybe you can update the uniq function to accept a value for epsilon to decide whether two floating values are equal or not.

Regards,
Rob.



Re: Mode function for floating point arrays [message #85131 is a reply to message #85120] Tue, 09 July 2013 06:36
Matthew Argall
It seems like VALUE_LOCATE and HISTOGRAM solutions would have large limitations. The bin size for HISTOGRAM would have to be "2*epsilon", which would rule out data with a large dynamic range. Also, the bin should be centered on the data point so that two points falling within "epsilon" of one another do not get separated because the bins are offset.

For the sparse array idea, I would need to know beforehand what the data is going to look like in order to create an acceptable map.

> Maybe you can update the uniq function to accept a value for epsilon to decide whether two floating values are equal or not.

This seems promising. I will check out the source code!
Re: Mode function for floating point arrays [message #85132 is a reply to message #85131] Tue, 09 July 2013 07:01
David Fanning
Matthew Argall writes:

>> Maybe you can update the uniq function to accept a value for epsilon to decide whether two floating values are equal or not.
>
> This seems promising. I will check out the source code!

FLOATS_EQUAL in the Coyote Library may be a place to start.

http://www.idlcoyote.com/programs/floats_equal.pro

Cheers,

David



--
David Fanning, Ph.D.
Fanning Software Consulting, Inc.
Coyote's Guide to IDL Programming: http://www.idlcoyote.com/
Sepore ma de ni thue. ("Perhaps thou speakest truth.")
Re: Mode function for floating point arrays [message #85224 is a reply to message #85131] Wed, 17 July 2013 07:08
Rob Klooster
On Tuesday, 9 July 2013 15:36:46 UTC+2, Matthew Argall wrote:
> It seems like VALUE_LOCATE and HISTOGRAM solutions would have large limitations. The bin size for HISTOGRAM would have to be "2*epsilon", which would rule out data with a large dynamic range. Also, the bin should be centered on the data point so that two points falling within "epsilon" of one another do not get separated because the bins are offset.

A large dynamic range is precisely the reason why I used VALUE_LOCATE instead of a plain HISTOGRAM with binsize set. Define the function like this:

function mode, array
    sortedarray = array[Sort(array)]
    arrayenum = sortedarray[Uniq(sortedarray)]
    mappedarray = Value_Locate(arrayenum, array)
    hist = histogram(mappedarray, min=0)
    return, arrayenum[where(hist eq max(hist))]
end

Example:
print, mode([1., 10.^8, 10.^8])
1.00000e+008
print, mode([10.^8, 10.^8+1, 1.])
1.00000e+008
print, mode([10.^8, 10.^8+10, 1.])
1.00000 1.00000e+008 1.00000e+008

So in this case the machine precision is about 7 significant digits, as expected for floats. Note that two floats are only assumed equal when they have the exact same binary value.
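
A quick way to see this limit at the IDL prompt, as a sketch: in single precision, adding 1 to 10.^8 yields the same binary value, whereas double precision keeps the two apart. The expected results in the comments assume IEEE single precision (24-bit significand), where neighbouring floats near 10^8 are 8 apart.

```idl
print, (10.^8 + 1) eq 10.^8       ; expected 1: identical single-precision values
print, (10.^8 + 10) eq 10.^8      ; expected 0: rounds to a different float
print, (10d0^8 + 1) eq 10d0^8     ; expected 0: double precision resolves the +1
```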
Re: Mode function for floating point arrays [message #85269 is a reply to message #85224] Fri, 19 July 2013 16:50
Matthew Argall
It seems like the "goodness" of this lies in how well the UNIQ function can determine if two numbers are truly unique. Then, after that, how well Value_Locate can match unique values to their duplicates. Is that right?

> Note that two floats are only assumed equal when they have the exact same binary value.

I think there is more information in this sentence than I can grasp at the moment... Is there any reason to suspect that the precision of the result is less than the precision of the numeric type of the input array?
Re: Mode function for floating point arrays [message #85273 is a reply to message #85269] Mon, 22 July 2013 04:24
Rob Klooster
On Saturday, 20 July 2013 01:50:08 UTC+2, Matthew Argall wrote:
> It seems like the "goodness" of this lies in how well the UNIQ function can determine if two numbers are truly unique. Then, after that, how well Value_Locate can match unique values to their duplicates. Is that right?

Exactly. UNIQ() compares the floats to decide whether they are equal or not. You could change the comparison in that function from:
    indices = where(q ne shift(q,-1), count)
to:
    indices = where(abs(q - shift(q,-1)) gt eps, count)
for a fixed value of eps. Be careful with these kinds of comparisons, as the right value to take for eps is not well defined. Take a look at this article, which explains the pitfalls of comparing floating point numbers:

http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm

Value_locate() will work whatever the input is, since it does not look at exact matches. It will just find the interval to which a specific number belongs. You just need to make sure that the UNIQ() function outputs the lowest number of a particular bin.
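
A minimal sketch of one way to combine these pieces without editing UNIQ itself. The function name MODE_EPS and its EPS argument are hypothetical, not part of IDL or the Coyote Library. It assumes finite input (no NaNs) and treats sorted neighbours closer than EPS as the same value, so a chain of closely spaced points merges into a single group.

```idl
; Hypothetical helper: mode of a float array, where sorted neighbours
; that differ by no more than EPS are counted as one value.
function mode_eps, array, eps
    compile_opt idl2

    n = n_elements(array)
    if n eq 1 then return, array[0]

    sorted = array[sort(array)]

    ; Gap between each sorted value and its predecessor
    gaps = sorted[1:*] - sorted[0:n-2]

    ; A new group starts wherever the gap exceeds EPS
    breaks = where(gaps gt eps, nbreaks)

    ; Everything lies in one group: report its lowest member
    if nbreaks eq 0 then return, sorted[0]

    ; Lowest member of each group, in increasing order
    starts = [0L, breaks + 1]
    groupmin = sorted[starts]

    ; Map every input value to its group, then count group membership
    mapped = value_locate(groupmin, array)
    hist = histogram(mapped, min=0)

    ; Lowest member of the most populated group(s)
    return, groupmin[where(hist eq max(hist))]
end
```

Calling `mode_eps(array, 1e-5)` on the example array from the first message should pick the group at 0.1, which has four members. Like the exact version, it returns every tied group when there is no unique mode.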


>> Note that two floats are only assumed equal when they have the exact same binary value.
>
> I think there is more information in this sentence than I can grasp at the moment... Is there any reason to suspect that the precision of the result is less than the precision of the numeric type of the input array?

Again, have a look at the article. It will make things a bit clearer.

Rob.