comp.lang.idl-pvwave archive
Messages from the Usenet group comp.lang.idl-pvwave, compiled by Paulo Penteado

REGRESS Question [message #31982] Wed, 04 September 2002 14:21
David Fanning
Folks,

I have a client who has asked me to create a pixel density
function between two images and then perform a regression
analysis on the resulting distribution. No problem doing all
this, but she finds that the results of my regression analysis
differ from the same analysis performed in other statistics
packages. In fact, three different packages give the same
answer, and then there is IDL. :-(

For example, if the other packages calculate a "goodness
of fit" of 0.95, IDL might report 0.97.

Here is my question. Are there any known problems with REGRESS?
Or, can I assume that this problem comes from my own mathematical
ignorance?

Cheers,

David

--
David W. Fanning, Ph.D.
Fanning Software Consulting, Inc.
Phone: 970-221-0438, E-mail: david@dfanning.com
Coyote's Guide to IDL Programming: http://www.dfanning.com/
Toll-Free IDL Book Orders: 1-888-461-0155
Re: REGRESS Question [message #32046 is a reply to message #31982] Fri, 06 September 2002 00:26
Mike Alport
I think Bill may have a point - either R or R^2 is sometimes used as a
measure of "goodness of fit". One way to check this would be to compare this
quantity from both programs to, say, 5 decimal places and see if one is the
SQRT of the other.
Mike
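
A quick way to run the check Mike suggests in IDL (only a sketch, with
made-up data standing in for the pixel-density vectors, not code from the
thread):

x = FINDGEN(100)
y = 2.0*x + 10.0*RANDOMN(seed, 100)
r = CORRELATE(x, y)                ; Pearson correlation coefficient
PRINT, r, r^2, FORMAT='(2F10.5)'   ; if one package reports r and another
                                   ; reports r^2, one of these two numbers
                                   ; will match its output to 5 places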

"Bill" <wclodius@lanl.gov> wrote in message
news:3D7794C1.17DD18AB@lanl.gov...
> <snip: David's original question and Bill's reply, quoted in full in
> message #32061 below>
Re: REGRESS Question [message #32058 is a reply to message #31982] Thu, 05 September 2002 11:31
julia[1]
David --

I ran into problems with the regress routine a few years ago, trying
to regress large amounts of data. The problem is that regress.pro
calls the total routine in single (floating-point) precision. I had
more obvious problems than the 2% difference in goodness of fit that
you're reporting, but I found I had to modify regress.pro to call
total in double precision.

Julia
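
A minimal sketch of the workaround Julia describes, without editing
regress.pro: promote the inputs to double precision so the internal
TOTALs run in double. (The data here are invented, and newer versions
of REGRESS also accept a /DOUBLE keyword.)

x = FINDGEN(100000)
y = 0.5*x + 3.0 + RANDOMN(seed, 100000)
slope   = REGRESS(x, y, CONST=c)                    ; totals in single precision
slope_d = REGRESS(DOUBLE(x), DOUBLE(y), CONST=c_d)  ; totals in double precision
PRINT, slope[0], c, slope_d[0], c_d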



David Fanning <david@dfanning.com> wrote in message news:<MPG.17e027561a59e932989994@news.frii.com>...
> <snip: original question quoted above>
Re: REGRESS Question [message #32061 is a reply to message #31982] Thu, 05 September 2002 10:31
William Clodius
David Fanning wrote:

> <snip: original question quoted above>

Almost any package can have problems, but the original REGRESS in
Bevington has stood the test of time. IDL's version works for me, but it
is possible that they introduced some problems. One thing that bothers
me is that 0.95 is, to a good approximation, 0.97^2. Could you be fitting
the square root of the customer's data?
Re: REGRESS Question [message #36720 is a reply to message #31982] Wed, 22 October 2003 07:01
justspam03
Hi Kevin,

you may be mixing up 'regress' and 'linfit' - at least your argument
seems to relate to the latter.
Cheers
Oliver
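
For a single independent variable the two routines should agree; a small
sketch (invented data, not from the thread) of how each returns its
results:

x = FINDGEN(10)
y = 3.0*x + 2.0
p = LINFIT(x, y)              ; LINFIT returns [intercept, slope]
m = REGRESS(x, y, CONST=c)    ; REGRESS returns the slope(s), intercept in CONST
PRINT, p[0], p[1], c, m[0]    ; 2.0, 3.0, 2.0, 3.0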
Re: REGRESS Question [message #36721 is a reply to message #31982] Wed, 22 October 2003 06:03
wmconnolley
"Christopher Lee" <cl> wrote:
> "Kevin M. Lausten" <kevinlausten@hotmail.com> wrote:

>> <snip: Kevin's question, quoted in full in message #36722 below>

You've certainly misunderstood some basic maths: the inverse (as in
reciprocal) of -1 is -1, not 1. If the regression of y against x
has a negative slope, then you would expect the regression of x against y
to have one too.

OTOH the relation is *not* reciprocal anyway, unless the fit is perfect
(probably because the fit is asymmetric: y values are assumed uncertain,
x values exact).

-W.

--
William M Connolley | wmc@bas.ac.uk | http://www.antarctica.ac.uk/met/wmc/
Climate Modeller, British Antarctic Survey | Disclaimer: I speak for myself
I'm a .signature virus! copy me into your .signature file & help me spread!
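
A quick numerical illustration of that last point (made-up data, not from
the original posts): for noisy data the two fitted slopes multiply to r^2
rather than to 1.

n = 1000
x = FINDGEN(n)/100.0
y = 2.0*x + 5.0 + RANDOMN(seed, n)   ; noisy straight line
m_yx = REGRESS(x, y, CONST=c1)       ; slope of y regressed on x
m_xy = REGRESS(y, x, CONST=c2)       ; slope of x regressed on y
r = CORRELATE(x, y)
PRINT, m_yx[0]*m_xy[0], r^2          ; equal, and < 1 unless the fit is perfect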
Re: REGRESS Question [message #36722 is a reply to message #31982] Wed, 22 October 2003 03:44
Chris Lee
In article <932b9720.0310210627.f93c6f2@posting.google.com>, "Kevin M.
Lausten" <kevinlausten@hotmail.com> wrote:


> I am having difficulty working with the REGRESS function. I continually
> get values <1 for my slope when doing a regression between two vectors.
> When I do a regression mapping y to x (slope = regress(x, y, const =
> const)) and when I do a regression mapping x to y (slope = regress(y, x,
> const = const) I get a slope<1 for both calculations. Shouldn't the
> y=mx+b of these two regressions be inverses of each other (leading to
> one slope>1, and one<1?) Maybe I am misunderstanding regressions?
> Thanks,
> kevin

Hi,

If you try the regression with the simplest possible straight line

y = mx + c

where m = 1 and c = 0, so

y = x

then if you regress with y = f(x), you get a value of 1 (and a constant of 0),
and if you regress with x = f(y), you get a value of 1, again.

If the gradient is negative for y = f(x), it has to be negative for x = f(y).
The two equations you are assuming in the regressions are

y = mx + c   OR   x = (y - c)/m = ny + d

where n = 1/m, so the sign is preserved (and d = -c/m = -cn).


HTH

Chris.
Re: regress question [message #64049 is a reply to message #31982] Mon, 01 December 2008 00:38
Wout De Nolf
On Sun, 30 Nov 2008 23:42:38 -0800 (PST), James McCreight
<mccreigh@gmail.com> wrote:

> <snip: CURVEFIT documentation, quoted in full in message #64050 below>

Why use a non-linear least-squares fitting algorithm for a linear
problem? Fixing parameters is not all that difficult using the linear
algorithms (i.e. orthogonal decomposition methods like SVD), although
you have to do it yourself.

Suppose y = a*x1 + b*x2 + c; then you find the least-squares solution by
(X1 and X2 being column vectors)

SVDC, [X1, X2, replicate(1, 1, n_elements(X1))], W, U, V
result = SVSOL(U, W, V, Y)        ; gives the least-squares solution [a,b,c]

Suppose I want to fix b = 3; then you would do this:

SVDC, [X1, replicate(1, 1, n_elements(X1))], W, U, V
result = SVSOL(U, W, V, Y - 3*X2) ; gives the least-squares solution [a,c]
Re: regress question [message #64050 is a reply to message #31982] Sun, 30 November 2008 23:42
mccreigh
I have some vague recollection of doing this once within an IDL
function. A quick look turned up this, which looks promising and like
something I've seen before:

Curvefit( X, Y, Weights, A [, Sigma] [, CHISQ=variable] [, /DOUBLE] [,
FITA=vector] [, FUNCTION_NAME=string] [, ITER=variable] [,
ITMAX=value] [, /NODERIVATIVE] [, STATUS={0 | 1 | 2}] [, TOL=value] [,
YERROR=variable] )
A
A vector with as many elements as the number of terms in the user-
supplied function, containing the initial estimate for each parameter.
On return, the vector A contains the fitted model parameters.

FITA
Set this keyword to a vector, with as many elements as A, which
contains a zero for each fixed parameter, and a non-zero value for
elements of A to fit. If not supplied, all parameters are taken to be
non-fixed.
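
A sketch of how those two pieces fit together for the zero-intercept
problem (the procedure name linfunc and the data are invented; CURVEFIT
needs a user-supplied model even for a straight line):

PRO linfunc, x, a, f, pder
  ; model: y = a[0] + a[1]*x
  f = a[0] + a[1]*x
  IF N_PARAMS() GE 4 THEN $
    pder = [[REPLICATE(1.0, N_ELEMENTS(x))], [x]]  ; d f/d a[0], d f/d a[1]
END

x = FINDGEN(20)
y = 2.5*x + 0.3*RANDOMN(seed, 20)
w = REPLICATE(1.0, 20)
a = [0.0, 1.0]                  ; start values; a[0] is the intercept
yfit = CURVEFIT(x, y, w, a, FUNCTION_NAME='linfunc', FITA=[0, 1])
PRINT, a                        ; a[0] stays fixed at 0, a[1] is the fitted slope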
Re: regress question [message #64056 is a reply to message #31982] Sat, 29 November 2008 07:58
Kenneth P. Bowman
In article
<7bfe8515-07f4-4c20-ad19-e2de871e3cc7@x38g2000yqj.googlegroups.com>,
Brian Larsen <balarsen@gmail.com> wrote:

> <snip: Brian's reply, quoted in full in message #64083 below>

Like Brian, I am too lazy to work this out myself, but it occurred to me
that you could use MPFIT to fit a general linear function and put very
tight constraints on the intercept. Because the problem is linear, it
should converge almost instantaneously.

Ken Bowman
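
A sketch along those lines with Craig Markwardt's MPFITFUN (the model
function name lin2 and the data are invented; instead of very tight
limits, PARINFO can also simply hold the intercept fixed at zero):

FUNCTION lin2, x, p
  ; x is a [2, n] array holding the two independent variables
  RETURN, REFORM(p[0] + p[1]*x[0,*] + p[2]*x[1,*])
END

x1 = FINDGEN(20)
x2 = FINDGEN(20)^2
x  = TRANSPOSE([[x1], [x2]])
y  = 3.0*x1 - 4.0*x2                   ; true intercept is zero
err = REPLICATE(1.0, 20)

parinfo = REPLICATE({value: 0.D, fixed: 0, limited: [0, 0], $
                     limits: [0.D, 0.D]}, 3)
parinfo[0].fixed = 1                   ; hold the intercept at its start value, 0
p = MPFITFUN('lin2', x, y, err, [0.D, 1.D, 1.D], PARINFO=parinfo, /QUIET)
PRINT, p                               ; expect roughly [0, 3, -4]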
Re: regress question [message #64083 is a reply to message #31982] Thu, 27 November 2008 07:37
Brian Larsen
Russ,

this has been discussed on this newsgroup for y = mx + b before. I have
turkey on the brain now rather than regression, but extending this idea
to multiple variables is probably not too bad (if it turns out to be the
right thing). Even if it is not easy to do, it is an interesting thread
that is good to remind oneself of.


This is the thread: http://tinyurl.com/2bfhl9
Here's a nice summary: http://tinyurl.com/2aqlgx


Cheers,

Brian

---------------------------------------------------------------------------
Brian Larsen
Boston University
Center for Space Physics
Re: regress question [message #64088 is a reply to message #31982] Thu, 27 November 2008 02:47
Wout De Nolf
On Thu, 27 Nov 2008 01:23:06 -0800 (PST), russ <rlayberry@hotmail.com>
wrote:

> Hi
>
> I'm doing multiple linear regression using the REGRESS function. This
> gives me
>
> y = c + a1*x1 + a2*x2 + ... + an*xn
>
> with the coefficients a1, a2, etc.
>
> What I want to do is the above but force the constant to be zero, i.e.
> find the coefficients that give the best linear fit whilst the function
> goes through the origin (which it should do for physical reasons).
>
> Any ideas?
>
> Thanks
>
> Russ


You can create the design matrix yourself and then use some
factorization like LU, SVD, Cholesky, QR, ... (is your linear system
over- or under-determined?). The example below uses SVD. First it solves
a system not going through the origin, by REGRESS and then by SVD.
Finally, SVD is used for a system that goes through the origin.


X1 = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
X2 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
X = transpose([[X1],[X2]])
Y = 3*X1 - 4*X2 + 5
Yorg = 3*X1 - 4*X2

; Regress
result1=regress(X,Y,const=const)
result1=[reform(result1),const]

; SVD (concat. X with 1's for the const)
SVDC, [X,replicate(1,1,n_elements(Y))], W, U, V
result2=reform(SVSOL(U, W, V, Y))

; SVD (origin)
SVDC, X, W, U, V
result3=reform(SVSOL(U, W, V, Yorg))

print,'Regress: ',result1
print,'SVD: ',result2
print,'SVD(origin): ',result3