 Calculating the Center of Mass of a 2D Array

QUESTION: Can you show me how to calculate the center of mass or centroid of a 2D array? ANSWER: Thanks to David Foster at UCSD for this algorithm. You can use this Centroid program to return a two-element array containing the center of mass of a 2D array. The program code is extremely simple.

FUNCTION Centroid, array
s = Size(array, /Dimensions)
totalMass = Total(array)
xcm = Total( Total(array, 2) * Indgen(s) ) / totalMass
ycm = Total( Total(array, 1) * Indgen(s) ) / totalMass
RETURN, [xcm, ycm]
END

ADDENDUM: This question recently came up on the IDL newsgroup, where the questioner wanted to eventually extend this idea to a 3D array. As often happens, that led some readers to the N-dimesional case, at which time I became completely lost. :-)

So here is the discussion verbatium for those of you who are interested. Subject: Re: Center of mass???
From: Paul Hick
Date: Mon, 08 Nov 1999 19:21:26 -0800

David Fanning wrote:
>
> Anders Wennerberg (anders@mrc.ks.se) writes:
>
> >    I have tried to find a routine that could calculate the center of
> > mass, in preliminary 2D but in near future 3D.   I would be happy if
> > anyone could give direction where to find that kind of routine.
>
> Here is how you do it in 2D. I've never had to extend
> this to 3D, but when you figure it out, please let us
> know. I'll write up an article on it. :-)
>
>   s = Size(array, /Dimensions)
>   totalMass = Total(array)
>   xcm = Total(Total(array,2) * Indgen(s)) / totalMass
>   ycm = Total(Total(array,1) * Indgen(s)) / totalMass
>
> Cheers,
>
> David

Reasoning by analogy to the 2D case, this should work, I think:

xcm = Total( Total(Total(array,3),2) * Indgen(s)) / totalMass
ycm = Total( Total(Total(array,3),1) * Indgen(s)) / totalMass
zcm = Total( Total(Total(array,2),1) * Indgen(s)) / totalMass

--
_________________________________________________________________
|                 Paul Hick (pphick@ucsd.edu)                     |
| Office : SERF Rm. 302          Smail  : UCSD/CASS/0424          |
| Phone  : (858) 534-8965                 9500 Gilman Drive       |
| Fax    : (858) 534-0177                 La Jolla  CA 92093-0424 |
| WWW    : http://casswww.ucsd.edu/personal/Plh.html              |
|_________________________________________________________________| Subject: Re: Center of mass??? From: "J.D. Smith" Date: Tue, 09 Nov 1999 15:51:06 -0500 David Fanning wrote: > > Paul Hick (pphick@ucsd.edu) writes: > > > Reasoning by analogy to the 2D case, this should work, I think: > > > > xcm = Total( Total(Total(array,3),2) * Indgen(s)) / totalMass > > ycm = Total( Total(Total(array,3),1) * Indgen(s)) / totalMass > > zcm = Total( Total(Total(array,2),1) * Indgen(s)) / totalMass > > Well, it seems to work in the simple-minded cases I > tried it on. :-) > > I'll write it up in an article if no one else has > serious objections. > Why stop at 3 dimensions? How about any dimension: function com, arr,DOUBLE=dbl s=size(arr,/DIMENSIONS) n=n_elements(s) tot=total(arr,DOUBLE=dbl) if keyword_set(dbl) then ret=dblarr(n,/NOZERO) else ret=fltarr(n,/NOZERO) for i=0,n-1 do begin tmp=arr for j=0,i-1 do tmp=total(tmp,1,DOUBLE=dbl) for j=i+1,n-1 do tmp=total(tmp,2,DOUBLE=dbl) ret[i]=total(findgen(s[i])*tmp,DOUBLE=dbl)/tot endfor return,ret end I would have posted this yesterday, but I couldn't help but feel there must be a better way to do it. Here's why: In the 3D example above, Paul calculates total(array,3) twice, which means once too many. It could have been saved between steps. The number of times total() is called on the array in this straightforward, brute-force method is n*(n-1) (for each of n dimensions total the array over the other n-1), not counting the final index total. And many of those are repeats to the exact same total() call with the exact same array! Since we needed to do all directional sub-totals, there must be a more efficient way, analogous to the 3D case of saving total(array,3). After a bit of scratching on paper, I found that I could do it with only (n-1)(n+2)/2 calls to total(), by working simultaneously from the last dimension backward and the first dimension forward, saving sub-totals which can be shared by subsequent calls in both cases. For 10 dimensions that becomes 54 vs. 90 calls to total() (though only 5 vs. 6 for 3 dimens -- 1 saved call as described above). The problem is how to code this model. Two reciprocating mutually recursive functions should do the trick, but I haven't had time to explore. Any takers? Anybody think they can save more calls to total()? This may be folly, since few will use it for more than 3D, but it's an interesting problem. JD P.S. Another wrinkle: The total()'s you'd prefer to save are the ones that do the most work... those which total the "largest" dimensions. So sorting by dimension size first of all might speed things up even more. -- J.D. Smith |*| WORK: (607) 255-5842 Cornell University Dept. of Astronomy |*| (607) 255-6263 304 Space Sciences Bldg. |*| FAX: (607) 255-5875 Ithaca, NY 14853 |*| Subject: Re: Center of mass??? From: Jonathan Joseph Date: Tue, 09 Nov 1999 19:05:07 -0500 Well, I'm not going to take up JD's challenge, but are you all sure you are answering the right question? I mean sure, great, if you happen to have an MxNx.... array of masses then you've got everything you need. But when I first read Anders' post, I thought, "gee that sounds simple." I thought of N masses at N locations, m = 1D array of N masses pos = D x N array of locations of the masses in D dimensions then: s=size(pos, /dimensions) mm = m ## replicate(1,s(0)) cm = total(pos * mm, 2) / total(m) Please someone correct me if I'm wrong. Also, Is there a better way of multiplying an MxN array by a one dimensional array of length N such that each row of the MxN array is multiplied by the corresponding element of the one dimensional array? -Jonathan Subject: Re: Center of mass??? From: "J.D. Smith" Date: Wed, 10 Nov 1999 13:16:25 -0500 Jonathan Joseph wrote: > > Well, I'm not going to take up JD's challenge, > but are you all sure you are answering the right > question? > > I mean sure, great, if you happen to have > an MxNx.... array of masses then you've got > everything you need. But when I first read > Anders' post, I thought, "gee that sounds simple." > > I thought of N masses at N locations, > > m = 1D array of N masses > pos = D x N array of locations of the masses in D dimensions > > then: > > s=size(pos, /dimensions) > mm = m ## replicate(1,s(0)) > cm = total(pos * mm, 2) / total(m) > > Please someone correct me if I'm wrong. > Also, Is there a better way of multiplying > an MxN array by a one dimensional array of > length N such that each row of the MxN array > is multiplied by the corresponding element > of the one dimensional array? You can use : rebin(reform(m,1,N),D,N,/SAMP)*pos but the array multiplication method also works. The relative speeds are system dependent. As far as your method of C.O.M. calculation, it's clearly good when you have such a DxN array, or even a sparse array of almost all zeroes (which you can safely ignore in the calculation). However, the application I imagine is some plane or cube or hypercube of data for which the C.O.M. is required. In that case, to use your method, you'd have to inflate your data by a factor of D. That is, you're paying for all those repeated indices being multiplied *before* totalling the data. This is an Index first rather than total first method. Here is a replacement routine using your idea which takes regular MxNx... arrays of data and makes the DxN index array. function com2, arr,DOUBLE=dbl s=size(arr,/DIMENSIONS) d=n_elements(s) n=n_elements(arr) inds=lonarr(d,n,/NOZERO) fac=1 for i=0,d-1 do begin inds[i,*]=lindgen(n)/fac mod s[i] fac=fac*s[i] endfor return, total(inds*rebin(reform(arr,1,n),d,n,/SAMP),2,DOUBLE=dbl)/ \$ total(arr,DOUBLE=dbl) end You see the work here is in generating the index array and inflating the data array. I compared this routine to my other one for a random array of size 10x10x10x10x10. The results were: Index First Method Time: 0.45424998 Total First Method Time: 0.048227489 How about 1024x1024: Index First Method Time: 1.9181580 Total First Method Time: 0.23625147 And for something really ludicrous... 5x5x5x5x5x5x5x5 Index First Method Time: 2.7887635 Total First Method Time: 0.44504005 So you see, even for many dimension, for which the Total First routine is currently inefficient, it is always faster. And for big arrays, like 100x100x100x20, I couldn't even get the Index First method to run (memory issues). Too much copying of data. JD -- J.D. Smith |*| WORK: (607) 255-5842 Cornell University Dept. of Astronomy |*| (607) 255-6263 304 Space Sciences Bldg. |*| FAX: (607) 255-5875 Ithaca, NY 14853 |*|  Web Coyote's Guide to IDL Programming