comp.lang.idl-pvwave archive: archive » Re: Need Some Advice on Seperating Out Some Data

Re: Need Some Advice on Seperating Out Some Data [message #49659 is a reply to message #49657]

Tue, 08 August 2006 14:45

JD Smith

On Tue, 08 Aug 2006 16:57:28 -0400, Ben Tupper wrote:
> Hi,
>
> Just an end-of-the-day wildcard, but I would bin the data into a 2d
> histogram (ala JD's HIST_ND or the built-in HIST_2D). Then I would try to
> find the "saddle" between the data and noise. You'll have to fiddle with
> the binsize a bit to balance "lumping" and "splitting" - maybe that can be
> done dynamically. I dunno. But it should be quick.
>
> It is an interesting problem that we also face here with flow cytometry -
> but we work the region manually as you do. I'll be interested to see what
> your final solution is.

A related concept would be to:

1. Bin the original data into a 2D image with HIST_ND, requesting
REVERSE_INDICES (call this RI#1).
2. Threshold this binned image so that it's zero below, and 1 above
some threshold value representing the "no data" saddle. This
threshold could be zero, but doesn't have to be (e.g. to take care
of random noisy points in the distribution). As Ben mentions,
you'll have to experiment to pick a good bin size.
3. Use LABEL_REGION to find all contiguous blobs of data in the
bi-valued, thresholded, binned image.
4. Use HISTOGRAM with REVERSE_INDICES (RI#2) on the resulting "label
image" to find the extents/centroid/etc. of the data in each "blob"
(either roughly via the bin positions present in the blob, or more
precisely using RI#2 and RI#1 to locate the original un-binned data
which fall in the blob, performing an average over the data).
5. Pick the blob which is at the lower-right, and is large enough,
etc. The criteria you use here can be quite flexible, assuming the
"blobs" always arrive in the same pattern. You might even choose
just to exclude certain blobs that have a given shape and relative
position, and then take everything else.
6. Find the bins which belong to the chosen blob(s), using RI#2, and
then locate the data points within these original bins, with RI#1.
7. Give yourself a raise.
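
For readers who want to see the flow of steps 1-6 end to end, here is a
minimal sketch in plain Python rather than IDL. It is only an analogue:
a dict of bin -> point indices stands in for RI#1, a small flood fill
stands in for LABEL_REGION, and a dict of label -> bins stands in for
RI#2. The function names, the toy data, the 4-connectivity, and the
"lower-right" score (centroid x minus centroid y) are all assumptions
made for illustration, not anything from HIST_ND or LABEL_REGION.

```python
from collections import defaultdict, deque

def bin_points(points, binsize, origin=(0.0, 0.0)):
    """Step 1: 2D histogram kept as {(ix, iy): [point indices]} (stand-in for RI#1)."""
    bins = defaultdict(list)
    for i, (x, y) in enumerate(points):
        ix = int((x - origin[0]) // binsize)
        iy = int((y - origin[1]) // binsize)
        bins[(ix, iy)].append(i)
    return bins

def label_blobs(bins, threshold=0):
    """Steps 2-4: keep bins with count > threshold, then group 4-connected
    bins. Returns {label: [bin keys]} (stand-in for RI#2)."""
    occupied = {b for b, idx in bins.items() if len(idx) > threshold}
    blobs, seen, label = {}, set(), 0
    for start in sorted(occupied):
        if start in seen:
            continue
        label += 1
        seen.add(start)
        queue, members = deque([start]), []
        while queue:
            bx, by = queue.popleft()
            members.append((bx, by))
            for nb in ((bx + 1, by), (bx - 1, by), (bx, by + 1), (bx, by - 1)):
                if nb in occupied and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        blobs[label] = members
    return blobs

def blob_points(blob_bins, bins):
    """Step 6: go from a blob's bins back to the original point indices."""
    return sorted(i for b in blob_bins for i in bins[b])

def centroid(indices, points):
    """Average of the un-binned data that fall in a blob."""
    n = len(indices)
    return (sum(points[i][0] for i in indices) / n,
            sum(points[i][1] for i in indices) / n)

def pick_lower_right(blobs, bins, points, min_points=2):
    """Step 5 (one possible criterion): among blobs with enough points,
    take the one whose centroid has the largest x - y."""
    best, best_score = [], None
    for members in blobs.values():
        idx = blob_points(members, bins)
        if len(idx) < min_points:
            continue
        cx, cy = centroid(idx, points)
        if best_score is None or cx - cy > best_score:
            best, best_score = idx, cx - cy
    return best

# Toy data: a clump near the origin, a clump at lower right, one stray point.
points = [(0.1, 0.2), (0.4, 0.3), (1.2, 0.5), (0.6, 1.1),
          (8.1, 0.3), (8.6, 0.7), (9.2, 0.4),
          (4.5, 7.5)]
bins = bin_points(points, binsize=1.0)
blobs = label_blobs(bins, threshold=0)
chosen = pick_lower_right(blobs, bins, points)
print(len(blobs), chosen)
```

With this toy data the three points of the lower-right clump are
selected, and the stray point is dropped by the minimum-size test.
Raising the threshold or changing the bin size changes which bins
survive, which is the experimenting mentioned in step 2.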

This is actually a very good exercise to try if you want to know
everything about HISTOGRAM and REVERSE_INDICES.
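
As a starting point for that exercise, the sketch below mimics, in plain
Python, the packed layout of the vector HISTOGRAM returns through
REVERSE_INDICES: the first nbins+1 entries are offsets into the vector
itself, followed by the data indices grouped by bin. In IDL the members
of bin i are ri[ri[i]:ri[i+1]-1]; Python slices exclude the end, so the
-1 disappears. The helper names histogram_with_ri and in_bin are made up
for this illustration, and only non-negative offsets from vmin are
handled.

```python
def histogram_with_ri(values, binsize, vmin):
    """Return (counts, ri), with ri laid out like IDL's REVERSE_INDICES."""
    nbins = int((max(values) - vmin) // binsize) + 1
    members = [[] for _ in range(nbins)]
    for i, v in enumerate(values):
        members[int((v - vmin) // binsize)].append(i)
    counts = [len(m) for m in members]
    # Offset block: ri[b] is where bin b's indices start inside ri.
    ri = [0] * (nbins + 1)
    ri[0] = nbins + 1
    for b in range(nbins):
        ri[b + 1] = ri[b] + counts[b]
    # Index block: original positions of the data, grouped by bin.
    for m in members:
        ri.extend(m)
    return counts, ri

def in_bin(ri, b):
    """Indices of the original data that fell in bin b (empty if none)."""
    return ri[ri[b]:ri[b + 1]]

values = [3, 0, 1, 3, 5]
counts, ri = histogram_with_ri(values, binsize=1, vmin=0)
print(counts)         # [1, 1, 0, 2, 0, 1]
print(in_bin(ri, 3))  # [0, 3]
print(in_bin(ri, 2))  # []
```

An empty bin shows up as two equal neighbouring offsets, which is why
IDL code usually tests ri[i] ne ri[i+1] before subscripting.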

JD
