comp.lang.idl-pvwave archive
Messages from Usenet group comp.lang.idl-pvwave, compiled by Paulo Penteado

Reading h5 dataset by chunks [message #93293] Tue, 07 June 2016 02:33
Nikola
Gentle folks,

I'm trying to read a dataset from an h5 file by chunks. Let's say that the file holds a dataset called 'temperature' that contains a 3D matrix (nx x ny x nz). Normally I use H5D_READ to read the entire dataset/cube at once. Since the cube may be huge (I easily run out of memory), I wonder whether it is possible to read h5 datasets chunk by chunk (slice by slice, for example), something like using ASSOC to read large binary files.

I'm lost in the list of h5-related IDL routines. Any help will be appreciated!

Thanks!

Nikola
Re: Reading h5 dataset by chunks [message #93294 is a reply to message #93293] Tue, 07 June 2016 05:53
lecacheux.alain
On Tuesday, 7 June 2016 11:33:10 UTC+2, Nikola Vitas wrote:
> Gentle folks,
>
> I'm trying to read a dataset from an h5 file by chunks. Let's say that the file holds a dataset called 'temperature' that contains a 3D matrix (nx x ny x nz). Normally I use H5D_READ to read the entire dataset/cube at once. Since the cube may be huge (I easily run out of memory), I wonder whether it is possible to read h5 datasets chunk by chunk (slice by slice, for example), something like using ASSOC to read large binary files.
>
> I'm lost in the list of h5-related IDL routines. Any help will be appreciated!
>
> Thanks!
>
> Nikola

With the IDL implementation of the HDF5 library, the recipe might be the following:
- open your file: fileId = H5F_OPEN(...)
- open your 3D dataset: dsId = H5D_OPEN(fileId, ...)
- get the corresponding dataspace: dId = H5D_GET_SPACE(dsId)
- define the memory space to hold each chunk as it is read:
  mId = H5S_CREATE_SIMPLE(dims)
  (dims is the 3-vector containing the sizes of the 3D slice)
Inside the reading loop:
- select an individual chunk: H5S_SELECT_HYPERSLAB, dId, start, dims, /RESET
  (start is the 3-vector containing the zero-based position of the 3D slice)
- read the data subset: data = H5D_READ(dsId, FILE_SPACE=dId, MEMORY_SPACE=mId)
Loop over as many chunks as you need.
When finished, close all the opened Ids.
I guess that the reading performance will depend on how the file was originally written (for example, on its chunk layout).
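Put together, the steps above might look roughly like the sketch below. It is untested against your file and makes a few assumptions that are not from this thread: the procedure name read_h5_by_slices is made up, the dataset is 3D, and one slice means a full nx x ny plane at a single index along the last dimension. The print line is only a placeholder for whatever processing you do per slice.

```idl
; Minimal sketch: read a 3D HDF5 dataset one slice at a time.
; Hypothetical names: read_h5_by_slices, filename, dataset_name.
pro read_h5_by_slices, filename, dataset_name
  compile_opt strictarr

  fileId = h5f_open(filename)            ; open the file read-only
  dsId   = h5d_open(fileId, dataset_name)
  dId    = h5d_get_space(dsId)           ; dataspace describing the file layout

  ; Full extent of the dataset in IDL order: [nx, ny, nz]
  fullDims = h5s_get_simple_extent_dims(dId)
  nz = long64(fullDims[2])

  ; One slice is nx x ny x 1; the memory space has the same shape
  dims = [fullDims[0], fullDims[1], 1]
  mId  = h5s_create_simple(dims)

  for iz = 0LL, nz - 1 do begin
    ; Select the iz-th plane in the file (start is zero-based)
    start = [0, 0, iz]
    h5s_select_hyperslab, dId, start, dims, /reset

    ; Only the selected hyperslab is read into memory
    data = reform(h5d_read(dsId, file_space=dId, memory_space=mId))

    ; Placeholder for per-slice processing
    print, iz, mean(data)
  endfor

  ; Close everything that was opened, in reverse order
  h5s_close, mId
  h5s_close, dId
  h5d_close, dsId
  h5f_close, fileId
end
```

You would call it as, for example, read_h5_by_slices, 'data.h5', '/temperature' (both names are placeholders). To read thicker slabs, put a larger value in the last element of dims and step start[2] by the same amount.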
Cheers,
alx
Re: Reading h5 dataset by chunks [message #93300 is a reply to message #93294] Thu, 09 June 2016 08:25
Michael Galloy

Yes, I believe those are all the steps/routines you need. Check out
MG_H5_GETDATA for an example of doing this (or just use it, if that
suits your purposes):

https://github.com/mgalloy/mglib/blob/master/src/hdf5/mg_h5_getdata.pro
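A call that pulls out one slice at a time might look like the sketch below. It assumes mglib is on your IDL !PATH and that the BOUNDS keyword accepts an IDL-style index string, as described in the routine's header comments; please check the header for the exact syntax. The file name, dataset path and nz are placeholders.

```idl
; Sketch only: read one nx x ny slice per call with MG_H5_GETDATA.
; Assumed: BOUNDS takes an IDL-style index string (see the routine header).
filename = 'data.h5'        ; hypothetical file name
nz = 10                     ; hypothetical number of slices

for iz = 0L, nz - 1 do begin
  ; All of x and y, a single index along z
  bounds = '*, *, ' + strtrim(iz, 2)
  slice = mg_h5_getdata(filename, '/temperature', bounds=bounds)
  help, slice
endfor
```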

Mike
--
Michael Galloy
www.michaelgalloy.com
Modern IDL: A Guide to IDL Programming (http://modernidl.idldev.com)
Re: Reading h5 dataset by chunks [message #94185 is a reply to message #93300] Fri, 17 February 2017 01:38
Nikola
Hi Mike,

Sorry that I haven't given you any feedback earlier. I got distracted by another project at the time and have only now tried your routine. It's excellent and it does exactly what I need. Many thanks!

Best, Nikola