comp.lang.idl-pvwave archive
Messages from Usenet group comp.lang.idl-pvwave, compiled by Paulo Penteado

Re: Reading a very large ascii data file [message #26441 is a reply to message #26440] Sat, 25 August 2001 04:32
R.Bauer
Mirko Vukovic wrote:
>
> Paul van Delst <paul.vandelst@noaa.gov> wrote in message news:<3B8698E0.B3F13251@noaa.gov>...
>> Mirko Vukovic wrote:
>>>
>>> I am reading some large ascii data files in csv (comma separated
>>> fields) format, and would like to speed the process up.
>>>
>>> I recall someone discussing reading such files as binaries and then
> >>> converting to ascii after finding line breaks, but was unable to find
>>> the discussion on the group.
>>>
>>> Can anyone offer pointers, code, or suggestions on who might have
>>> discussed it (so that I can look again on the newsgroup).
>>
> >> Can you provide more information about your data files? E.g. is the number of columns
> >> fixed? Is the number of lines fixed? If not, is there a maximum number of lines which the
> >> files won't exceed?
>>
>> Try the DDREAD.PRO and associated IDL code. Have a look at
>>
>> http://www.dfanning.com/tips/unknown_rows.html
>>
>> for some issues and a link to the source code.
>>
>> paulv
>
> Thanks for the comments,
>
> The file format is variable. The file contains a log of data of a
> variable number of channels, and of arbitrary duration. It is
> generated by the TrendLink software from Fluke.
>
> The file consists of a header, which has as many lines as diagnostics.
> Next comes the data, with one column for the time and date, and
> one column for each channel.
>
> I therefore use a two-pass system. In the first, I read all the
> lines, and count their number, and from the last line also extract the
> number of channels.
>
> With this info, I then initialize the header and data structures, and
> then go again through the file, and store the stuff.
>
> In that sense, I am not using the very slow procedure noted by Martin
> (appending a line to the matrix). However, I am going explicitly
> through a very long loop, twice.
>
> One method may be to open the file in binary mode, get info about the
> number of bytes, initialize a byte vector to appropriate size, and
> then read the file into it. Now, with the file stored in memory
> (although it can be megabytes in size), go through it, "reading"
> line by line.
>
> This actually looks to be a quite generic procedure. Any idea whether
> it has been implemented already?
>
> Any more suggestions?
>
> Thanks,
>
> Mirko
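
The single-read idea quoted above (load the whole file as bytes, then find the line breaks in memory) can be sketched as follows. This is a minimal illustration in Python, not IDL, and not code from anyone in this thread. The function name read_log and the assumption that the number of header lines is known in advance are hypothetical; the layout (time/date in the first column, one numeric column per channel) follows the description in the quoted message.

```python
def read_log(path, n_header):
    """Read a comma-separated log with one binary read.

    n_header: number of header lines (assumed known; hypothetical parameter).
    Returns (header, times, values): header lines as strings, the first
    column as strings, and the remaining columns as lists of floats.
    """
    # One binary read pulls the whole file into memory.
    with open(path, "rb") as f:
        raw = f.read()

    # Find the line breaks in memory; bytes.splitlines handles \n and \r\n.
    # Blank lines are dropped.
    lines = [ln.decode("ascii") for ln in raw.splitlines() if ln.strip()]

    header = lines[:n_header]
    times, values = [], []
    for ln in lines[n_header:]:
        fields = ln.split(",")
        times.append(fields[0].strip())                 # time/date column
        values.append([float(x) for x in fields[1:]])   # one value per channel
    return header, times, values
```

The file is touched once, and the number of lines and channels falls out of the in-memory split, so no separate counting pass over the file is needed.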

Dear Mirko,

you should use our read_data_file.

The routine itself separates the header, data block, and trailer.
The data block must be a table of numbers.
Because your files have no trailer, you get back a structure with
the fields .header, .separator, and .data.

data is a table of n columns and m lines.

This routine is very fast.
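
To illustrate the kind of separation described here, the following is a minimal sketch in Python. It is not the read_data_file source (which is IDL and not shown in this message), and it makes its own assumptions: the function name split_blocks is hypothetical, the separator is passed in rather than detected, and a line counts as part of the data block only when every field parses as a number.

```python
def split_blocks(lines, sep=","):
    """Split text lines into header, numeric data block, and trailer.

    sep: field separator (assumed known here; None means whitespace).
    The data block is the first contiguous run of all-numeric lines.
    """
    def as_numbers(line):
        fields = line.split(sep)
        if not fields:
            return None          # empty line under whitespace splitting
        try:
            return [float(x) for x in fields]
        except ValueError:
            return None          # at least one field is not a number

    parsed = [as_numbers(ln) for ln in lines]

    # The first numeric line starts the data block.
    start = next((i for i, row in enumerate(parsed) if row is not None),
                 len(lines))
    # The block ends at the first following non-numeric line.
    end = start
    while end < len(lines) and parsed[end] is not None:
        end += 1

    return {
        "header": list(lines[:start]),
        "separator": sep,
        "data": parsed[start:end],
        "trailer": list(lines[end:]),
    }
```

Under this rule a date/time field is not numeric, so a file whose rows begin with a date/time column would need that column converted or stripped before such a test could recognize the rows as data.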

http://www.fz-juelich.de/icg/icg1/idl_icglib/idl_source/idl_html/dbase/download/read_data_file.tar.gz


regards

Reimar

--
Reimar Bauer

Institut fuer Stratosphaerische Chemie (ICG-1)
Forschungszentrum Juelich
email: R.Bauer@fz-juelich.de
http://www.fz-juelich.de/icg/icg1/
==================================================================
an IDL library at ForschungsZentrum Juelich
http://www.fz-juelich.de/icg/icg1/idl_icglib/idl_lib_intro.html

http://www.fz-juelich.de/zb/text/publikation/juel3786.html
==================================================================

read something about linux / windows
http://www.suse.de/de/news/hotnews/MS.html