comp.lang.idl-pvwave archive
Messages from Usenet group comp.lang.idl-pvwave, compiled by Paulo Penteado

Merits of different ways of 'extending' arrays [message #85726] Thu, 29 August 2013 08:44
Andy Sayer
Hi all,

I am writing some code where I am loading a whole bunch of files one by one, querying them for valid data, and putting the valid data from each file into an array (for later use). I don't know ahead of time how many files there will be, or how many valid data points there will be in a file.

The way I have written my code so far is like this:

var_1_arr=[!values.f_nan]
var_2_arr=[!values.f_nan]
var_3_arr=[!values.f_nan]

f=file_search( [path and identifier to files],count=nfiles)

for i=0l,nfiles-1 do begin

[load contents of file f[i] into a structure]

is_valid=where(blah blah,n_valid)

if n_valid gt 0 then begin
var_1_arr=[var_1_arr,f.var_1[is_valid]]
var_2_arr=[var_2_arr,f.var_2[is_valid]]
var_3_arr=[var_3_arr,f.var_3[is_valid]]

endif

endfor

So, hopefully you get the idea. I only have a small subset of the test data to work with at the moment (the rest is a few months off).

It occurs to me that I could code it something like this:

max_points=1.e7

var_1_arr=fltarr(max_points)
var_1_arr(*)=!values.f_nan
var_2_arr=var_1_arr
var_3_arr=var_1_arr

f=file_search( [path and identifier to files],count=nfiles)

ctr=0l

for i=0l,nfiles-1 do begin

[load contents of file f[i] into a structure]

is_valid=where(blah blah,n_valid)

if n_valid gt 0 then begin
var_1_arr[ctr:ctr+n_valid]=f.var_1[is_valid]
var_2_arr[ctr:ctr+n_valid]=f.var_2[is_valid]
var_3_arr[ctr:ctr+n_valid]=f.var_3[is_valid]
ctr=ctr+n_valid
endif

endfor

This has the drawback that I have to know in advance the maximum number of data points I could have (but I can set max_points to some arbitrary high number to be safe). Does anyone know whether any one method is better/less memory-intensive than the other, when it comes to largeish data volumes (tens of millions of points)? I only have a few percent of the final data so far, so am interested in the likely merits of each method. Google didn't help but perhaps I was using the wrong search keywords.

In case relevant, this is IDL 7.1.1 or 8.2.2.

Thanks,

Andy
Re: Merits of different ways of 'extending' arrays [message #85727 is a reply to message #85726] Thu, 29 August 2013 08:57
Andy Sayer
I always find typos after I click the button to post. :) The second code snippet should probably be [ctr:ctr+n_valid-1] rather than [ctr:ctr+n_valid]. Also, by 'better/less memory-intensive' I really mean 'faster/less memory-intensive'.

Re: Merits of different ways of 'extending' arrays [message #85728 is a reply to message #85726] Thu, 29 August 2013 09:29
David Fanning
AMS writes:

> This has the drawback that I have to know in advance the maximum number of data points I could have (but I can set max_points to some arbitrary high number to be safe). Does anyone know whether any one method is better/less memory-intensive than the other, when it comes to largeish data volumes (tens of millions of points)? I only have a few percent of the final data so far, so am interested in the likely merits of each method. Google didn't help but perhaps I was using the wrong search keywords.

You are MUCH better off to allocate memory in large chunks and then trim
or add to your arrays (in more large chunks) as necessary. This will
keep you from fragmenting your memory space, which is the single biggest
problem when working with large arrays.
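
For example, a rough sketch of the chunk idea (the chunk size here is arbitrary and would need tuning for your data volumes; f, is_valid and n_valid as in your post):

chunk = 1000000L
var_1_arr = fltarr(chunk)
ctr = 0L

for i=0L, nfiles-1 do begin

   ; load contents of file f[i], find is_valid and n_valid as before

   if n_valid gt 0 then begin
      ; grow by another chunk whenever the next write would overflow
      while ctr + n_valid gt n_elements(var_1_arr) do $
         var_1_arr = [var_1_arr, fltarr(chunk)]
      var_1_arr[ctr] = f.var_1[is_valid]   ; writes n_valid values starting at ctr
      ctr += n_valid
   endif

endfor

var_1_arr = var_1_arr[0:ctr-1]   ; trim the unused tail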

Cheers,

David



--
David Fanning, Ph.D.
Fanning Software Consulting, Inc.
Coyote's Guide to IDL Programming: http://www.idlcoyote.com/
Sepore ma de ni thue. ("Perhaps thou speakest truth.")
Re: Merits of different ways of 'extending' arrays [message #85729 is a reply to message #85728] Thu, 29 August 2013 09:59
Andy Sayer
Thanks; happily, this was a simple recode to make. :)

Andy

Re: Merits of different ways of 'extending' arrays [message #85730 is a reply to message #85729] Thu, 29 August 2013 10:11
Chris Torrence
Actually, if you use lists, you can add each chunk of data to the appropriate list as you go, and then use ToArray() with the DIMENSION keyword. For example:

l = list(findgen(20))
l.add, findgen(20) + 20
help, l.ToArray(DIM=1)
<Expression> FLOAT = Array[40]

This will be both the fastest way and will use the least memory.
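
Applied to your loop, it might look something like this (IDL 8 syntax, one list per variable, names as in your post):

var_1_list = list()
for i=0L, nfiles-1 do begin
   ; load contents of file f[i], find is_valid and n_valid as before
   if n_valid gt 0 then var_1_list.add, f.var_1[is_valid]
endfor
var_1_arr = var_1_list.ToArray(DIMENSION=1)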

Cheers,
Chris
ExelisVIS
Re: Merits of different ways of 'extending' arrays [message #85731 is a reply to message #85730] Thu, 29 August 2013 10:13
Andy Sayer
Thanks; some of our machines are on IDL 7.1.1, though, so I don't think we can use lists, for code portability reasons. :)

Andy

Re: Merits of different ways of 'extending' arrays [message #85742 is a reply to message #85731] Fri, 30 August 2013 02:54
Fabzi
On 08/29/2013 07:13 PM, AMS wrote:
> Thanks; some of our machines are on IDL 7.1.1, though, so
> I don't think we can use lists, for code portability reasons. :)
>
> Andy

For versions before 8 you also have the List from Michael Galloy, which
is very fast:

http://docs.idldev.com/idllib/collection/dir-overview.html

Cheers
Re: Merits of different ways of 'extending' arrays [message #85749 is a reply to message #85742] Fri, 30 August 2013 14:08
Michael Galloy
On 8/30/13 3:54 AM, Fabien wrote:
> On 08/29/2013 07:13 PM, AMS wrote:
>> Thanks; some of our machines are on IDL 7.1.1, though, so
>> I don't think we can use lists, for code portability reasons. :)
>>
>> Andy
>
> for versions before 8. you have the List from Michael Galloy also, which
> is very fast:
>
> http://docs.idldev.com/idllib/collection/dir-overview.html
>
> Cheers

MGcoArrayList implements the block strategy that David was talking
about. Set the BLOCK_SIZE keyword to the size of chunks you want to
allocate.
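
Usage looks roughly like this (a sketch only; check the source for the exact keywords):

arr_list = obj_new('MGcoArrayList', type=4, block_size=100000L)   ; type 4 = float
; inside the file loop:
arr_list->add, f.var_1[is_valid]
; when done:
var_1_arr = arr_list->get(/all, count=n)
obj_destroy, arr_list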

Also, my library has moved to GitHub, so you should go there to make
sure you always get the most recent versions:

https://github.com/mgalloy/mglib

It's in src/collection and you need several other of the files in the
collection directory.

Mike
--
Michael Galloy
www.michaelgalloy.com
Modern IDL: A Guide to IDL Programming (http://modernidl.idldev.com)
Research Mathematician
Tech-X Corporation
Re: Merits of different ways of 'extending' arrays [message #85798 is a reply to message #85726] Mon, 09 September 2013 15:41
suicidaleggroll
Another option is to set up a pointer array nfiles long before the loop. Inside the loop, load the file and find the valid points, then put that file's valid data into its pointer, incrementing a counter to keep track of the total number of points. When you're done, you have all of your data saved in pointers (one per file) and a count of the total number of valid points. Then you allocate your array, loop back through the elements of the pointer array, and fill the array as necessary. Something like:

f = file_search(path, count=nfiles)
ptrs = ptrarr(nfiles)
num = 0l
for i=0l,nfiles-1 do begin
;; load contents of file
is_valid = where(stuff, n_valid)
if n_valid gt 0 then begin
num += n_valid
ptrs[i] = ptr_new(f.var_1[is_valid])
endif
endfor

data = fltarr(num)
idx = 0l
for i=0l,nfiles-1 do begin
if ptr_valid(ptrs[i]) then begin
num = n_elements(*ptrs[i])
data[idx:idx+num-1] = *ptrs[i]
ptr_free, ptrs[i]
endif
endfor
Re: Merits of different ways of 'extending' arrays [message #85805 is a reply to message #85798] Tue, 10 September 2013 13:08
suicidaleggroll
Wish I could edit my post...

There should be an "idx += num" next to the ptr_free at the end of the second loop.
Re: Merits of different ways of 'extending' arrays [message #85862 is a reply to message #85805] Sat, 14 September 2013 18:11
Andy Sayer
Thanks for the continuing tips!

The first suggestion (allocate a 'big enough' array up-front, rather than continually extend) worked great for my purposes, so that's what I stuck with, given that it was also very simple. I do appreciate the continued suggestions, though.

Andy

Re: Merits of different ways of 'extending' arrays [message #85877 is a reply to message #85862] Mon, 16 September 2013 07:47
suicidaleggroll
The only problem with that is, what is "big enough"? It's going to change from application to application. What happens when your assumption of "big enough" breaks down? Do you have support for re-allocating the arrays when you hit their limits? In order to avoid this you have to make the array SO big that you can start to run into significant memory allocation delays, even when loading just a small amount of data.

Allocating and expanding in fixed "blocks" as suggested before is a way to elegantly handle this problem; however, the block size needs to be tuned for every application or you can start to get some big slowdowns.

Just some things to consider when choosing your approach.

Re: Merits of different ways of 'extending' arrays [message #85908 is a reply to message #85877] Tue, 17 September 2013 10:40
Andy Sayer
Right, but for this particular piece of code, I do know the maximum possible size (it's a standalone function I will call for one piece of analysis, as opposed to something I am plugging into the guts of a larger project), so it's good for this application. :)

Re: Merits of different ways of 'extending' arrays [message #85912 is a reply to message #85877] Tue, 17 September 2013 12:34
Yngvar Larsen
On Monday, 16 September 2013 15:47:38 UTC+1, suicida...@gmail.com wrote:

> Allocating and expanding in fixed "blocks" as suggested before is a way to elegantly handle this problem, however the block size needs to be tuned for every application or you can start to get some big slowdowns.

In the generic case when "big enough" is not known, the best algorithm is to double the size of the array every time you hit the current capacity. (Or 3x, or 1.5x; it does not matter as long as the growth is exponential as a function of the number of resizes.)

See

http://en.wikipedia.org/wiki/Dynamic_array
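
A rough sketch in IDL, using the variable names from the original post:

capacity = 1024L
var_1_arr = fltarr(capacity)
ctr = 0L

; inside the file loop, before writing n_valid new points:
while ctr + n_valid gt capacity do begin
   capacity *= 2
   tmp = fltarr(capacity)
   tmp[0] = var_1_arr           ; copy existing data to the front
   var_1_arr = temporary(tmp)
endwhile
var_1_arr[ctr] = f.var_1[is_valid]
ctr += n_valid

; after the loop, trim to the used portion:
var_1_arr = var_1_arr[0:ctr-1]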



--
Yngvar
Re: Merits of different ways of 'extending' arrays [message #85913 is a reply to message #85912] Tue, 17 September 2013 14:47
Michael Galloy
On 9/17/13 1:34 PM, Yngvar Larsen wrote:
> On Monday, 16 September 2013 15:47:38 UTC+1, suicida...@gmail.com
> wrote:
>
>> Allocating and expanding in fixed "blocks" as suggested before is a
>> way to elegantly handle this problem, however the block size needs
>> to be tuned for every application or you can start to get some big
>> slowdowns.
>
> In the generic case when "big enough" is not known, the best
> algorithm is to double the size of the array every time you hit the
> current capacity. (Or 3x, or 1.5x; it does not matter as long as the
> growth is exponential as a function of the number of resizes.)
>
> See
>
> http://en.wikipedia.org/wiki/Dynamic_array

This is what MGcoArrayList does.


https://github.com/mgalloy/mglib/blob/master/src/collection/mgcoarraylist__define.pro

I used to add capacity in increments of a BLOCK_SIZE property set by the
user, but I think the doubling is the way to go.

Mike
--
Michael Galloy
www.michaelgalloy.com
Modern IDL: A Guide to IDL Programming (http://modernidl.idldev.com)
Research Mathematician
Tech-X Corporation