comp.lang.idl-pvwave archive: archive » Re: What's better: 1 big HDF file or several samller ones??

Re: What's better: 1 big HDF file or several samller ones?? [message #32226]

Wed, 25 September 2002 07:06

James Kuyper

Brian Huether wrote:
>
> I have radar data that is broken down based on target orientation (i.e. the
> data is split into 72 5 degree azimuthal windows). So do I create one HDF
> file with 72 datasets? The other thing to consider is this: I need to run
> computational algorithms that will need to access the data so are there
> speed considerations when saving the data this way? Basically I have two data
> storage goals: 1) make the data readily shareable, 2) make the data storage
> appropriate for quick retrieval.
>
> I suppose when it comes to the computational stuff, I can use the one big
> file and then one time I can just read all the info into an array. So the
> awkwardness of the big file will only be problematic one time.

The basic issues you have to consider are file size limits and how you
intend to use the data. On many systems there's an upper limit on the
size of files, set either by the available disk space or by a file
addressing limit (typically 2 GB). You have to break up your file if
it would exceed that limit.
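
As a rough way to check the size question, here is a minimal Python sketch. It assumes a 2 GiB limit from signed 32-bit file offsets and a made-up size per azimuth bin; the function name and the sizes are hypothetical, and HDF's own file overhead is ignored.

```python
import math

# Assumption: a 2 GiB limit, as on systems with signed 32-bit file offsets.
LIMIT_BYTES = 2**31 - 1


def files_needed(n_bins, bytes_per_bin, limit=LIMIT_BYTES):
    """Return the minimum number of files needed so that no file exceeds
    `limit`, keeping every azimuth bin whole inside one file."""
    if bytes_per_bin > limit:
        raise ValueError("a single bin exceeds the file size limit")
    bins_per_file = limit // bytes_per_bin
    return math.ceil(n_bins / bins_per_file)


# 72 bins of 10 MiB each fit in one file; 72 bins of 100 MiB each do not.
small = files_needed(72, 10 * 2**20)
large = files_needed(72, 100 * 2**20)
```

If the result is 1, the limit is not a reason to split and the choice comes down to how the data will be read.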

How often will the code that reads this (these) file(s) need to access
data across multiple azimuth bins? If never, then you should split each
azimuth bin into a separate file. If frequently, they should be in the
same file if possible, and you should consider the possibility of
merging them into a single SDS with one dimension for the azimuth bins.
If infrequently, it's a judgement call.
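
To illustrate the single-dataset idea, here is a small Python sketch of a two-dimensional array indexed as [azimuth bin][sample] and stored row-major in one flat buffer. It is not HDF code: the sample count per bin, the function names, and the fill values are hypothetical, and only the 72 bins of 5 degrees come from the original question.

```python
from array import array

N_BINS = 72           # 72 windows of 5 degrees, from the original question
BIN_WIDTH_DEG = 5
SAMPLES_PER_BIN = 4   # hypothetical; real radar data would be far larger


def azimuth_bin(angle_deg):
    """Map an azimuth in degrees to its bin index (0..71)."""
    return int((angle_deg % 360) // BIN_WIDTH_DEG)


def bin_slice(first_bin, last_bin):
    """Slice of the flat array covering bins first_bin..last_bin inclusive."""
    return slice(first_bin * SAMPLES_PER_BIN, (last_bin + 1) * SAMPLES_PER_BIN)


# Stand-in data: one flat buffer holding all bins back to back.
data = array("f", (float(i) for i in range(N_BINS * SAMPLES_PER_BIN)))

# Reading across adjacent azimuth bins is a single contiguous slice.
window = data[bin_slice(azimuth_bin(10.0), azimuth_bin(19.9))]
```

With the azimuth as a dimension, a request that spans several neighbouring bins becomes one contiguous read rather than several file opens.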


Re: What's better: 1 big HDF file or several samller ones?? [message #32231 is a reply to message #32226]

Wed, 25 September 2002 05:23

Robert Stockwell

Brian Huether wrote:
> I have radar data that is broken down based on target orientation (i.e. the
> data is split into 72 5 degree azimuthal windows). So do I create one HDF
> file with 72 datasets? The other thing to consider is this: I need to run
> computational algorithms that will need to access the data so are there
> speed considerations when saving the data this way? Basically I have two data
> storage goals: 1) make the data readily shareable, 2) make the data storage
> appropriate for quick retrieval.
>
> I suppose when it comes to the computational stuff, I can use the one big
> file and then one time I can just read all the info into an array. So the
> awkwardness of the big file will only be problematic one time.
>
> Any thoughts?
>
> -brian
>
>

Hi Brian,

I tend to vote for "one big file". It is easier to keep track of, to back up
(assuming it fits on your media), and to transfer. With many smaller files,
it is not always easy to tell that one of them has "disappeared".
In fact I did have that problem with my satellite data. It had 13,000 files
in a directory, and retrieving the directory listing resulted in some
100 random files getting dropped (on a slightly older version of linux, 7.0 maybe).
It is very difficult to find an error like that, which would not happen if the
data were in giant files.
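
If you do go with many files, one way to catch a dropped file is to compare the directory listing against the list of names you expect. A minimal Python sketch, assuming a made-up naming scheme (az_000.hdf, az_005.hdf, ...) and hypothetical function names:

```python
import os


def expected_names(n_bins, width_deg=5):
    """Hypothetical naming scheme: one file per azimuth window."""
    return [f"az_{i * width_deg:03d}.hdf" for i in range(n_bins)]


def missing_files(directory, names):
    """Return the expected names that do not appear in the directory listing."""
    present = set(os.listdir(directory))
    return sorted(set(names) - present)
```

Calling `missing_files(".", expected_names(72))` returns an empty list when all 72 files show up in the listing.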

I store global data maps (x,y,t) in large (about 1 GB) yearly files as contiguous
spatial maps stacked one after another, but I also store a second copy of the
data as a collection of contiguous time series. So whichever type of data you
want (a snapshot of the data at a particular time, or all the data from one
site as a time series), it sits in one contiguous block and the retrieval is
blazingly fast.
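
A small Python sketch of why the two copies help. It assumes both are flat arrays and differ only in which index varies fastest; the grid dimensions and function names are hypothetical.

```python
# Hypothetical small grid: NX by NY sites, NT time steps.
NX, NY, NT = 4, 3, 5


def offset_map_major(x, y, t):
    """Copy 1: whole maps are contiguous (t varies slowest)."""
    return (t * NY + y) * NX + x


def offset_series_major(x, y, t):
    """Copy 2: each site's time series is contiguous (t varies fastest)."""
    return (y * NX + x) * NT + t


# Element offsets touched by a snapshot at t=2, read from copy 1.
snapshot = sorted(offset_map_major(x, y, 2) for y in range(NY) for x in range(NX))

# Element offsets touched by the time series at site (1, 2), read from copy 2.
series = [offset_series_major(1, 2, t) for t in range(NT)]
```

Here the snapshot covers offsets 24 through 35 of copy 1 and the time series covers offsets 45 through 49 of copy 2, so each request is a single contiguous read. The cost is storing the data twice.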

Cheers,
bob

