Re: large matrix operations [message #62872]
Tue, 14 October 2008 06:10
Vince Hradil
On Oct 13, 5:14 pm, "mgal...@gmail.com" <mgal...@gmail.com> wrote:
> On Oct 13, 3:17 pm, Vince Hradil <vincehra...@gmail.com> wrote:
>
>> Thanks for cleaning that up. I think the conclusions are the same -
>> matrix_multiply does not speed things up. I wonder what happens if we
>> really push the envelope wrt RAM and matrix size. Unfortunately, I
>> don't have a lot of time to play with this.
>
> MATRIX_MULTIPLY is about 15% faster in my tests, but I guess you were
> wanting a *very* significant speedup i.e. like the speedup you get
> when vectorizing code?
>
> Mike
> --
> www.michaelgalloy.com
> Tech-X Corporation
> Software Developer II
Yeah - I guess if you're doing a lot of these and each one can be 15%
faster, then you might gain something. I was really expecting
(hoping?) MATRIX_MULTIPLY to be 50% faster or better. Besides, if my
programs run too fast, how can I justify my salary? 8^)
I'm reminded of this: http://xkcd.com/303/
Re: large matrix operations [message #62886 is a reply to message #62872]
Mon, 13 October 2008 15:14
Michael Galloy
On Oct 13, 3:17 pm, Vince Hradil <vincehra...@gmail.com> wrote:
> Thanks for cleaning that up. I think the conclusions are the same -
> matrix_multiply does not speed things up. I wonder what happens if we
> really push the envelope wrt RAM and matrix size. Unfortunately, I
> don't have a lot of time to play with this.
MATRIX_MULTIPLY is about 15% faster in my tests, but I guess you were
wanting a *very* significant speedup i.e. like the speedup you get
when vectorizing code?
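For reference, a minimal sketch of the loop-versus-vectorized gap being
alluded to (the array size and the procedure name are only illustrative):

; Sum of squares: explicit loop vs. a vectorized expression.
pro vectorize_demo
   x = randomu(seed, 10000000L)

   t0 = systime(1)
   s_loop = 0.0
   for i = 0L, n_elements(x) - 1L do s_loop += x[i] * x[i]
   print, 'loop:       ', systime(1) - t0, ' sec'

   t0 = systime(1)
   s_vec = total(x * x)
   print, 'vectorized: ', systime(1) - t0, ' sec'
end

In IDL the vectorized form typically runs orders of magnitude faster than
the explicit loop, which is the scale of speedup being contrasted with the
~15% from MATRIX_MULTIPLY.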
Mike
--
www.michaelgalloy.com
Tech-X Corporation
Software Developer II
Re: large matrix operations [message #62888 is a reply to message #62886]
Mon, 13 October 2008 14:17
Vince Hradil
On Oct 13, 3:51 pm, "mgal...@gmail.com" <mgal...@gmail.com> wrote:
> On Oct 13, 11:07 am, Vince Hradil <vincehra...@gmail.com> wrote:
>
>
>
>> So much for that theory...
>
>> Hardware:
>> IDL> print, !version
>> { x86 Win32 Windows Microsoft Windows 7.0 Oct 25 2007 32 64}
>
>> Results:
>> % TEST: Allocating A array
>> % TEST: took 0.14000010 sec
>> % TEST: Allocating B array
>> % TEST: took 0.21899986 sec
>> % TEST: A#B
>> % TEST: took 40.878000 sec
>> % TEST: matrix_multiply(A,B)
>> % TEST: took 42.284000 sec
>> % TEST: transpose(A)#B
>> % TEST: took 42.645000 sec
>> % TEST: matrix_multiply(A,B,/atranspose)
>> % TEST: took 43.449000 sec
>> % TEST: transpose(temporary(A))#B
>> % TEST: took 43.387000 sec
>> % TEST: matrix_multiply(temporary(A),B,/atranspose)
>> % TEST: took 50.029000 sec
>
> I think there are problems in the testing mechanism:
>
> * you're only timing one execution of the operation
> * you're timing MESSAGE which has an I/O component
> * allocating B was 50% longer than allocating A? They are the same
> size, only about 15 MB.
>
> I ran the same tests 10 times and averaged the results, also removing
> the MESSAGE statements from the timed portions.
>
> Here are my results:
>
> IDL> print, !version
> { i386 darwin unix Mac OS X 7.0 Oct 25 2007 32 64}
>
> % TEST: Allocating A: 0.0767788
> % TEST: Allocating B: 0.0775957
> % TEST: A # B: 7.68955
> % TEST: MATRIX_MULTIPLY(A, B): 7.62423
> % TEST: TRANSPOSE(A) # B: 7.64414
> % TEST: MATRIX_MULTIPLY(A, B, /ATRANSPOSE): 6.66077
> % TEST: TRANSPOSE(TEMPORARY(A)) # B: 7.51523
> % TEST: MATRIX_MULTIPLY(TEMPORARY(A), B, /ATRANSPOSE): 6.50667
>
> Here is my code:
>
> pro test
> nel = 2000L
> ntests = 10L
> times = fltarr(8, ntests)
>
> for i = 0L, ntests - 1L do begin
> print, 'Running test ' + strtrim(i, 2) + '...'
> t0 = systime(1)
> a = randomu(seed,[nel,nel])
> times[0, i] = systime(1)-t0
>
> t0 = systime(1)
> b = randomu(seed,[nel,nel])
> times[1, i] = systime(1)-t0
>
> t0 = systime(1)
> c = a#b
> times[2, i] = systime(1)-t0
>
> t0 = systime(1)
> c = matrix_multiply(a,b)
> times[3, i] = systime(1)-t0
>
> t0 = systime(1)
> c = transpose(a)#b
> times[4, i] = systime(1)-t0
>
> t0 = systime(1)
> c = matrix_multiply(a,b,/atranspose)
> times[5, i] = systime(1)-t0
>
> ahold = a
>
> t0 = systime(1)
> c = transpose(temporary(a))#b
> times[6, i] = systime(1)-t0
>
> a = ahold
>
> t0 = systime(1)
> c = matrix_multiply(temporary(a),b,/atranspose)
> times[7, i] = systime(1)-t0
> endfor
>
> message, 'Allocating A: ' + strtrim(mean(times[0, *]), 2), /info
> message, 'Allocating B: ' + strtrim(mean(times[1, *]), 2), /info
> message, 'A # B: ' + strtrim(mean(times[2, *]), 2), /info
> message, 'MATRIX_MULTIPLY(A, B): ' + strtrim(mean(times[3, *]), 2), /info
> message, 'TRANSPOSE(A) # B: ' + strtrim(mean(times[4, *]), 2), /info
> message, 'MATRIX_MULTIPLY(A, B, /ATRANSPOSE): ' + $
>   strtrim(mean(times[5, *]), 2), /info
> message, 'TRANSPOSE(TEMPORARY(A)) # B: ' + $
>   strtrim(mean(times[6, *]), 2), /info
> message, 'MATRIX_MULTIPLY(TEMPORARY(A), B, /ATRANSPOSE): ' + $
>   strtrim(mean(times[7, *]), 2), /info
> end
>
> Mike
> --
> www.michaelgalloy.com
> Tech-X Corporation
> Software Developer II
Mike,
Thanks for cleaning that up. I think the conclusions are the same -
matrix_multiply does not speed things up. I wonder what happens if we
really push the envelope wrt RAM and matrix size. Unfortunately, I
don't have a lot of time to play with this.
Vince
Re: large matrix operations [message #62889 is a reply to message #62888]
Mon, 13 October 2008 13:51
Michael Galloy
On Oct 13, 11:07 am, Vince Hradil <vincehra...@gmail.com> wrote:
> So much for that theory...
>
> Hardware:
> IDL> print, !version
> { x86 Win32 Windows Microsoft Windows 7.0 Oct 25 2007 32 64}
>
> Results:
> % TEST: Allocating A array
> % TEST: took 0.14000010 sec
> % TEST: Allocating B array
> % TEST: took 0.21899986 sec
> % TEST: A#B
> % TEST: took 40.878000 sec
> % TEST: matrix_multiply(A,B)
> % TEST: took 42.284000 sec
> % TEST: transpose(A)#B
> % TEST: took 42.645000 sec
> % TEST: matrix_multiply(A,B,/atranspose)
> % TEST: took 43.449000 sec
> % TEST: transpose(temporary(A))#B
> % TEST: took 43.387000 sec
> % TEST: matrix_multiply(temporary(A),B,/atranspose)
> % TEST: took 50.029000 sec
I think there are problems in the testing mechanism:
* you're only timing one execution of the operation
* you're timing MESSAGE which has an I/O component
* allocating B was 50% longer than allocating A? They are the same
size, only about 15 MB.
I ran the same tests 10 times and averaged the results, also removing
the MESSAGE statements from the timed portions.
Here are my results:
IDL> print, !version
{ i386 darwin unix Mac OS X 7.0 Oct 25 2007 32 64}
% TEST: Allocating A: 0.0767788
% TEST: Allocating B: 0.0775957
% TEST: A # B: 7.68955
% TEST: MATRIX_MULTIPLY(A, B): 7.62423
% TEST: TRANSPOSE(A) # B: 7.64414
% TEST: MATRIX_MULTIPLY(A, B, /ATRANSPOSE): 6.66077
% TEST: TRANSPOSE(TEMPORARY(A)) # B: 7.51523
% TEST: MATRIX_MULTIPLY(TEMPORARY(A), B, /ATRANSPOSE): 6.50667
Here is my code:
pro test
nel = 2000L
ntests = 10L
times = fltarr(8, ntests)
for i = 0L, ntests - 1L do begin
print, 'Running test ' + strtrim(i, 2) + '...'
t0 = systime(1)
a = randomu(seed,[nel,nel])
times[0, i] = systime(1)-t0
t0 = systime(1)
b = randomu(seed,[nel,nel])
times[1, i] = systime(1)-t0
t0 = systime(1)
c = a#b
times[2, i] = systime(1)-t0
t0 = systime(1)
c = matrix_multiply(a,b)
times[3, i] = systime(1)-t0
t0 = systime(1)
c = transpose(a)#b
times[4, i] = systime(1)-t0
t0 = systime(1)
c = matrix_multiply(a,b,/atranspose)
times[5, i] = systime(1)-t0
ahold = a
t0 = systime(1)
c = transpose(temporary(a))#b
times[6, i] = systime(1)-t0
a = ahold
t0 = systime(1)
c = matrix_multiply(temporary(a),b,/atranspose)
times[7, i] = systime(1)-t0
endfor
message, 'Allocating A: ' + strtrim(mean(times[0, *]), 2), /info
message, 'Allocating B: ' + strtrim(mean(times[1, *]), 2), /info
message, 'A # B: ' + strtrim(mean(times[2, *]), 2), /info
message, 'MATRIX_MULTIPLY(A, B): ' + strtrim(mean(times[3, *]), 2), /info
message, 'TRANSPOSE(A) # B: ' + strtrim(mean(times[4, *]), 2), /info
message, 'MATRIX_MULTIPLY(A, B, /ATRANSPOSE): ' + $
  strtrim(mean(times[5, *]), 2), /info
message, 'TRANSPOSE(TEMPORARY(A)) # B: ' + $
  strtrim(mean(times[6, *]), 2), /info
message, 'MATRIX_MULTIPLY(TEMPORARY(A), B, /ATRANSPOSE): ' + $
  strtrim(mean(times[7, *]), 2), /info
end
Mike
--
www.michaelgalloy.com
Tech-X Corporation
Software Developer II
Re: large matrix operations [message #62900 is a reply to message #62889]
Mon, 13 October 2008 10:39
David Fanning
Vince Hradil writes:
> So much for that theory...
>
> Hardware:
> IDL> print, !version
> { x86 Win32 Windows Microsoft Windows 7.0 Oct 25 2007 32 64}
>
> Results:
> % TEST: Allocating A array
> % TEST: took 0.14000010 sec
> % TEST: Allocating B array
> % TEST: took 0.21899986 sec
> % TEST: A#B
> % TEST: took 40.878000 sec
> % TEST: matrix_multiply(A,B)
> % TEST: took 42.284000 sec
> % TEST: transpose(A)#B
> % TEST: took 42.645000 sec
> % TEST: matrix_multiply(A,B,/atranspose)
> % TEST: took 43.449000 sec
> % TEST: transpose(temporary(A))#B
> % TEST: took 43.387000 sec
> % TEST: matrix_multiply(temporary(A),B,/atranspose)
> % TEST: took 50.029000 sec
I'm glad to hear this, because I always thought that theory
leaned a little toward the bogus side. In my experience, slow
matrix operations are always due to memory paging problems.
According to a VERY interesting program on the History Channel
last night, shamanic rituals are sometimes effective in these
situations where we are up against demonic forces.
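For anyone who would rather test the paging theory than call in a shaman,
here is a minimal sketch using IDL's MEMORY function (sized to match the
2000x2000 test above; check the /HIGHWATER semantics against the
documentation for your IDL version):

pro multiply_memory_check
   nel = 2000L
   a = randomu(seed, nel, nel)   ; roughly 15 MB each in single precision
   b = randomu(seed, nel, nel)

   ; MEMORY(/CURRENT) is dynamic memory in use now; MEMORY(/HIGHWATER)
   ; is the peak, so the difference approximates the extra memory the
   ; multiply needed.
   start_mem = memory(/current)
   c = matrix_multiply(a, b, /atranspose)
   print, 'Peak extra memory (bytes): ', memory(/highwater) - start_mem
end

If that number is anywhere near physical RAM, the machine is paging; if it
is a few tens of MB on a machine with gigabytes free, the slowdown is
coming from somewhere else.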
Cheers,
David
--
David Fanning, Ph.D.
Fanning Software Consulting, Inc.
Coyote's Guide to IDL Programming: http://www.dfanning.com/
Sepore ma de ni thui. ("Perhaps thou speakest truth.")
Re: large matrix operations [message #62901 is a reply to message #62900]
Mon, 13 October 2008 10:07
Vince Hradil
On Oct 13, 11:00 am, Vince Hradil <vincehra...@gmail.com> wrote:
> On Oct 13, 10:08 am, Vince Hradil <vincehra...@gmail.com> wrote:
>
>> On Oct 13, 9:44 am, arp...@gmail.com wrote:
>
>>> It seems that very large matrix operations, e.g. 1.0e6*1.0e5*100, such
>>> as multiplication and matrix multiplication, are very slow in IDL.
>>> Anyone know how to speed up large matrix operations?
>
>> Multiplication should be pretty fast. Matrix multiplication can be
>> slow if you have to transpose the matrices. This can be sped up by
>> using MatrixMultiply() instead of # or ##. Perhaps Temporary() can be
>> used to avoid allocation of extra memory. I think David has the best
>> ideas, though.
>
> oops, that should be MATRIX_MULTIPLY()
So much for that theory...
Hardware:
IDL> print, !version
{ x86 Win32 Windows Microsoft Windows 7.0 Oct 25 2007 32 64}
Results:
% TEST: Allocating A array
% TEST: took 0.14000010 sec
% TEST: Allocating B array
% TEST: took 0.21899986 sec
% TEST: A#B
% TEST: took 40.878000 sec
% TEST: matrix_multiply(A,B)
% TEST: took 42.284000 sec
% TEST: transpose(A)#B
% TEST: took 42.645000 sec
% TEST: matrix_multiply(A,B,/atranspose)
% TEST: took 43.449000 sec
% TEST: transpose(temporary(A))#B
% TEST: took 43.387000 sec
% TEST: matrix_multiply(temporary(A),B,/atranspose)
% TEST: took 50.029000 sec
Code:
;;;;;;;;;;;;;;;;;;;;;;;;;
pro test
nel = 2000L
t0 = systime(1)
message, 'Allocating A array', /info
a = randomu(seed,[nel,nel])
message, 'took '+strtrim(systime(1)-t0,2)+' sec', /info
t0 = systime(1)
message, 'Allocating B array', /info
b = randomu(seed,[nel,nel])
message, 'took '+strtrim(systime(1)-t0,2)+' sec', /info
t0 = systime(1)
message, 'A#B', /info
c = a#b
message, 'took '+strtrim(systime(1)-t0,2)+' sec', /info
t0 = systime(1)
message, 'matrix_multiply(A,B)', /info
c = matrix_multiply(a,b)
message, 'took '+strtrim(systime(1)-t0,2)+' sec', /info
t0 = systime(1)
message, 'transpose(A)#B', /info
c = transpose(a)#b
message, 'took '+strtrim(systime(1)-t0,2)+' sec', /info
t0 = systime(1)
message, 'matrix_multiply(A,B,/atranspose)', /info
c = matrix_multiply(a,b,/atranspose)
message, 'took '+strtrim(systime(1)-t0,2)+' sec', /info
ahold = a
t0 = systime(1)
message, 'transpose(temporary(A))#B', /info
c = transpose(temporary(a))#b
message, 'took '+strtrim(systime(1)-t0,2)+' sec', /info
a = ahold
t0 = systime(1)
message, 'matrix_multiply(temporary(A),B,/atranspose)', /info
c = matrix_multiply(temporary(a),b,/atranspose)
message, 'took '+strtrim(systime(1)-t0,2)+' sec', /info
return
end
Re: large matrix operations [message #62904 is a reply to message #62901]
Mon, 13 October 2008 09:00
Vince Hradil
On Oct 13, 10:08 am, Vince Hradil <vincehra...@gmail.com> wrote:
> On Oct 13, 9:44 am, arp...@gmail.com wrote:
>
>> It seems that very large matrix operations, e.g. 1.0e6*1.0e5*100, such
>> as multiplication and matrix multiplication, are very slow in IDL.
>> Anyone know how to speed up large matrix operations?
>
> Multiplication should be pretty fast. Matrix multiplication can be
> slow if you have to transpose the matrices. This can be sped up by
> using MatrixMultiply() instead of # or ##. Perhaps Temporary() can be
> used to avoid allocation of extra memory. I think David has the best
> ideas, though.
oops, that should be MATRIX_MULTIPLY()
Re: large matrix operations [message #62905 is a reply to message #62904]
Mon, 13 October 2008 08:08
Vince Hradil
On Oct 13, 9:44 am, arp...@gmail.com wrote:
> It seems that very large matrix operations, e.g. 1.0e6*1.0e5*100, such
> as multiplication and matrix multiplication, are very slow in IDL.
> Anyone know how to speed up large matrix operations?
Multiplication should be pretty fast. Matrix multiplication can be
slow if you have to transpose the matrices. This can be sped up by
using MatrixMultiply() instead of # or ##. Perhaps Temporary() can be
used to avoid allocation of extra memory. I think David has the best
ideas, though.
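To make that concrete, here is a minimal sketch of the equivalent forms
(the 2000x2000 size and the procedure name are only for illustration):

; Three ways to compute transpose(A) # B.
pro transpose_multiply_demo
   nel = 2000L
   a = randomu(seed, nel, nel)
   b = randomu(seed, nel, nel)

   ; 1. Explicit transpose: materializes a transposed copy of A first.
   c1 = transpose(a) # b

   ; 2. MATRIX_MULTIPLY with /ATRANSPOSE folds the transpose into the
   ;    multiply, so no separate transposed copy of A is created.
   c2 = matrix_multiply(a, b, /atranspose)

   ; 3. TEMPORARY hands A's storage to the callee instead of copying it;
   ;    note that A is undefined afterwards.
   c3 = matrix_multiply(temporary(a), b, /atranspose)

   print, 'max difference, 1 vs 2: ', max(abs(c1 - c2))
   print, 'max difference, 1 vs 3: ', max(abs(c1 - c3))
end

Whether this buys much is what the benchmarks above measure; in Mike's
runs the /ATRANSPOSE forms came out about 15% faster than the explicit
transpose.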
Re: large matrix operations [message #62906 is a reply to message #62905]
Mon, 13 October 2008 08:04
David Fanning
arp244@gmail.com writes:
> It seems that very large matrix operations, e.g. 1.0e6*1.0e5*100, such
> as multiplication and matrix multiplication, are very slow in IDL.
> Anyone know how to speed up large matrix operations?
More RAM? 64-bit OS? Prayer?
Cheers,
David
--
David Fanning, Ph.D.
Fanning Software Consulting, Inc.
Coyote's Guide to IDL Programming: http://www.dfanning.com/
Sepore ma de ni thui. ("Perhaps thou speakest truth.")