Re: compute quartiles of a distribution [message #77909] |
Tue, 18 October 2011 21:32 |
Jeremy Bailin
Messages: 618 Registered: April 2008
|
Senior Member |
|
|
On 10/18/11 3:48 PM, Jeremy Bailin wrote:
> On 10/18/11 12:12 PM, bing999 wrote:
>> Thanks to both of you for your answers.
>>
>> The procedures in summary.pro and cgBoxPlot.pro compute "real"
>> quartiles. Actually, I should not have used this word in my case i
>> guess.
>>
>> What I want is the interval [M-Q;M+Q] which encompass 75% of the
>> values of the sample around the mean (not the median) value M, where Q
>> is unique (i.e the same at lower and higher values around M). I do not
>> want the 37.5% above M and the 37.5% below. It makes a little
>> difference with what is calculated with your routines.
>> The idea would be to span the sample starting from the mean, and
>> counting the points at lower and higher values around the mean in an
>> iterative manner, until I have counted 75% of sample. This would give
>> the value of Q at which the 75% is reached. I have a crude idea to do
>> that with for loops but it will take forever...
>>
>> If you see what I mean, and if you have a piece of code, this could
>> help a lot!
>>
>> Thanks again.
>>
>>
>>> bing999 writes:
>>>> I have sample of data (which distribution is unknown) of mean M. I
>>>> would like to calculate the quartiles with IDL, i.e what is the value
>>>> of Q for which 25% (or 75%) of the sample is comprised between [M-Q;M
>>>> +Q] ?
>>>> Do you know a routine which does that?
>>>
>>> cgBoxPlot.
>>>
>>> Cheers,
>>>
>>> David
>>>
>>> --
>>> David Fanning, Ph.D.
>>> Fanning Software Consulting, Inc.
>>> Coyote's Guide to IDL Programming:http://www.idlcoyote.com/
>>> Sepore ma de ni thui. ("Perhaps thou speakest truth.")
>>
>
> Easy enough (untested):
>
> data = [......]
> frac_to_enclose = 0.75
> meanval = mean(data)
> absdiff = abs(data-meanval)
> quartile_index = floor(n_elements(absdiff) * frac_to_enclose)
> q = absdiff[quartile_index]
>
>
> But I share David's concern that this may not really be what you want...
>
> -Jeremy.
Okay, now that I've tested it, there's clearly a SORT missing.
Substitute the last line with:

q = absdiff[(sort(absdiff))[quartile_index]]

-Jeremy.
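
For reference, the corrected snippet can be wrapped into a small function. This is only a sketch: the function name mean_centered_halfwidth and its argument names are hypothetical, not part of any library. It also departs from the snippet above in one respect, which is my assumption rather than anything stated in the thread: it picks the index with ceil(n*frac)-1 instead of floor(n*frac), so that the returned Q is the smallest distance enclosing at least the requested fraction, and the index cannot run past the end of the array when frac is 1.

```idl
; Sketch: half-width Q such that [mean-Q, mean+Q] holds at least
; FRAC of the sample. The function name is hypothetical.
function mean_centered_halfwidth, data, frac
  compile_opt idl2

  n = n_elements(data)
  meanval = mean(data)
  absdiff = abs(data - meanval)

  ; Distances from the mean, smallest first.
  sorted = absdiff[sort(absdiff)]

  ; Zero-based index of the k-th smallest distance, where k is the
  ; smallest count with k/n >= frac, clamped to the valid range.
  ; (Assumption: this replaces floor(n*frac) from the post above.)
  idx = (ceil(n * frac) - 1L) > 0L
  idx = idx < (n - 1L)

  return, sorted[idx]
end
```

With this compiled, something like q = mean_centered_halfwidth(data, 0.75) should give the Q asked for, with the interval then being [mean(data)-q, mean(data)+q].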
|
|
|
Re: compute quartiles of a distribution [message #77911 is a reply to message #77909] |
Tue, 18 October 2011 12:48  |
Jeremy Bailin
Messages: 618 Registered: April 2008
|
Senior Member |
|
|
On 10/18/11 12:12 PM, bing999 wrote:
> Thanks to both of you for your answers.
>
> The procedures in summary.pro and cgBoxPlot.pro compute "real"
> quartiles. Actually, I should not have used this word in my case i
> guess.
>
> What I want is the interval [M-Q;M+Q] which encompass 75% of the
> values of the sample around the mean (not the median) value M, where Q
> is unique (i.e the same at lower and higher values around M). I do not
> want the 37.5% above M and the 37.5% below. It makes a little
> difference with what is calculated with your routines.
> The idea would be to span the sample starting from the mean, and
> counting the points at lower and higher values around the mean in an
> iterative manner, until I have counted 75% of sample. This would give
> the value of Q at which the 75% is reached. I have a crude idea to do
> that with for loops but it will take forever...
>
> If you see what I mean, and if you have a piece of code, this could
> help a lot!
>
> Thanks again.
>
>
>> bing999 writes:
>>> I have sample of data (which distribution is unknown) of mean M. I
>>> would like to calculate the quartiles with IDL, i.e what is the value
>>> of Q for which 25% (or 75%) of the sample is comprised between [M-Q;M
>>> +Q] ?
>>> Do you know a routine which does that?
>>
>> cgBoxPlot.
>>
>> Cheers,
>>
>> David
>>
>> --
>> David Fanning, Ph.D.
>> Fanning Software Consulting, Inc.
>> Coyote's Guide to IDL Programming:http://www.idlcoyote.com/
>> Sepore ma de ni thui. ("Perhaps thou speakest truth.")
>
Easy enough (untested):

data = [......]
frac_to_enclose = 0.75
meanval = mean(data)
absdiff = abs(data-meanval)
quartile_index = floor(n_elements(absdiff) * frac_to_enclose)
q = absdiff[quartile_index]

But I share David's concern that this may not really be what you want...

-Jeremy.
|
|
|
|
Re: compute quartiles of a distribution [message #77914 is a reply to message #77912] |
Tue, 18 October 2011 09:36  |
Thibault Garel
Messages: 55 Registered: October 2009
|
Member |
|
|
:) On this one, I am my own reviewer!

I know what I ask sounds weird but that is really what I'd like to
compute. As I want to work with the means, not medians, "statistically
justifiable real" quartiles do not really help. In my case, the mean and
the median may be quite different, so that normal 75% quartiles may be
out of the sample...

I am gonna try to find a way to code that.

Thanks again,
Cheers
bing

> bing999 writes:
>> The procedures in summary.pro and cgBoxPlot.pro compute "real"
>> quartiles. Actually, I should not have used this word in my case i
>> guess.
>
>> What I want is the interval [M-Q;M+Q] which encompass 75% of the
>> values of the sample around the mean (not the median) value M, where Q
>> is unique (i.e the same at lower and higher values around M). I do not
>> want the 37.5% above M and the 37.5% below. It makes a little
>> difference with what is calculated with your routines.
>> The idea would be to span the sample starting from the mean, and
>> counting the points at lower and higher values around the mean in an
>> iterative manner, until I have counted 75% of sample. This would give
>> the value of Q at which the 75% is reached. I have a crude idea to do
>> that with for loops but it will take forever...
>
> I'm guessing you are going to have a hard time
> explaining to your reviewers why your "fake"
> quartiles are better than the statistically
> justifiable real quartiles. :-)
>
> Cheers,
>
> David
>
> --
> David Fanning, Ph.D.
> Fanning Software Consulting, Inc.
> Coyote's Guide to IDL Programming:http://www.idlcoyote.com/
> Sepore ma de ni thui. ("Perhaps thou speakest truth.")
|
|
|
Re: compute quartiles of a distribution [message #77916 is a reply to message #77914] |
Tue, 18 October 2011 09:25  |
David Fanning
Messages: 11724 Registered: August 2001
|
Senior Member |
|
|
bing999 writes:
> The procedures in summary.pro and cgBoxPlot.pro compute "real"
> quartiles. Actually, I should not have used this word in my case i
> guess.
>
> What I want is the interval [M-Q;M+Q] which encompass 75% of the
> values of the sample around the mean (not the median) value M, where Q
> is unique (i.e the same at lower and higher values around M). I do not
> want the 37.5% above M and the 37.5% below. It makes a little
> difference with what is calculated with your routines.
> The idea would be to span the sample starting from the mean, and
> counting the points at lower and higher values around the mean in an
> iterative manner, until I have counted 75% of sample. This would give
> the value of Q at which the 75% is reached. I have a crude idea to do
> that with for loops but it will take forever...
I'm guessing you are going to have a hard time
explaining to your reviewers why your "fake"
quartiles are better than the statistically
justifiable real quartiles. :-)
Cheers,
David
--
David Fanning, Ph.D.
Fanning Software Consulting, Inc.
Coyote's Guide to IDL Programming: http://www.idlcoyote.com/
Sepore ma de ni thui. ("Perhaps thou speakest truth.")
|
|
|
Re: compute quartiles of a distribution [message #77917 is a reply to message #77916] |
Tue, 18 October 2011 09:12  |
Thibault Garel
Messages: 55 Registered: October 2009
|
Member |
|
|
Thanks to both of you for your answers.

The procedures in summary.pro and cgBoxPlot.pro compute "real"
quartiles. Actually, I should not have used this word in my case, I
guess.

What I want is the interval [M-Q;M+Q] which encompasses 75% of the
values of the sample around the mean (not the median) value M, where Q
is unique (i.e. the same at lower and higher values around M). I do not
want the 37.5% above M and the 37.5% below. This differs slightly
from what is calculated with your routines.

The idea would be to span the sample starting from the mean, and
count the points at lower and higher values around the mean in an
iterative manner, until I have counted 75% of the sample. This would
give the value of Q at which the 75% is reached. I have a crude idea to
do that with for loops but it will take forever...

If you see what I mean, and if you have a piece of code, this could
help a lot!

Thanks again.

> bing999 writes:
>> I have sample of data (which distribution is unknown) of mean M. I
>> would like to calculate the quartiles with IDL, i.e what is the value
>> of Q for which 25% (or 75%) of the sample is comprised between [M-Q;M
>> +Q] ?
>> Do you know a routine which does that?
>
> cgBoxPlot.
>
> Cheers,
>
> David
>
> --
> David Fanning, Ph.D.
> Fanning Software Consulting, Inc.
> Coyote's Guide to IDL Programming:http://www.idlcoyote.com/
> Sepore ma de ni thui. ("Perhaps thou speakest truth.")
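
The counting described in this post does not need explicit for loops: for any trial Q, one vectorized comparison counts the enclosed points at once. A minimal sketch, assuming the sample is in a numeric array; the function name enclosed_fraction is hypothetical.

```idl
; Sketch: fraction of the sample inside [meanval-q, meanval+q].
; The function name is hypothetical.
function enclosed_fraction, data, q
  compile_opt idl2

  meanval = mean(data)

  ; The comparison yields 1 where a point is inside and 0 elsewhere,
  ; so the total is the number of enclosed points.
  inside = abs(data - meanval) le q

  return, total(inside) / n_elements(data)
end
```

This is mainly useful as a check: for a candidate Q, enclosed_fraction(data, Q) should come out at or just above 0.75 if Q is the half-width being sought.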
|
|
|
Re: compute quartiles of a distribution [message #77927 is a reply to message #77917] |
Mon, 17 October 2011 18:26  |
David Fanning
Messages: 11724 Registered: August 2001
|
Senior Member |
|
|
bing999 writes:
> I have sample of data (which distribution is unknown) of mean M. I
> would like to calculate the quartiles with IDL, i.e what is the value
> of Q for which 25% (or 75%) of the sample is comprised between [M-Q;M
> +Q] ?
> Do you know a routine which does that?
cgBoxPlot.
Cheers,
David
--
David Fanning, Ph.D.
Fanning Software Consulting, Inc.
Coyote's Guide to IDL Programming: http://www.idlcoyote.com/
Sepore ma de ni thui. ("Perhaps thou speakest truth.")
|
|
|
|