comp.lang.idl-pvwave archive
Messages from Usenet group comp.lang.idl-pvwave, compiled by Paulo Penteado

READ_CSV() gotcha [message #90721] Fri, 03 April 2015 10:44
wlandsman
The documentation for READ_CSV() describes how a column is stored as Double, rather than an integer type, if it has a decimal point or an exponent.

What it doesn't say is that READ_CSV() only looks at the first 100 rows, so that if these first 100 numbers are compatible with an integer (no decimal points or exponents) then the entire column is read as an integer. In my case, the data becomes floating point after about 2000 rows (with values between 0 and 1), so these are all truncated to zero.
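
To make the failure mode concrete, here is a minimal Python sketch of sample-based type inference. It is not READ_CSV()'s actual implementation; the function names, the `n_sample` parameter, and the test column are all hypothetical and chosen only to mimic the behavior described above (first 100 rows inspected, later fractional values truncated).

```python
def infer_column_type(values, n_sample=100):
    """Guess 'int' or 'double' by inspecting only the first n_sample strings."""
    for text in values[:n_sample]:
        # A decimal point or an exponent marks the column as floating point.
        if any(ch in text for ch in ".eE"):
            return "double"
    return "int"


def convert_column(values, n_sample=100):
    """Convert a column of strings using the type guessed from the sample."""
    if infer_column_type(values, n_sample) == "int":
        # Emulates the truncation described above: fractions are dropped.
        return [int(float(text)) for text in values]
    return [float(text) for text in values]


# Hypothetical column: 2000 integer-looking rows, then values between 0 and 1.
column = [str(i) for i in range(2000)] + ["0.25", "0.5", "0.75"]

sampled = convert_column(column)                     # looks at 100 rows only
full = convert_column(column, n_sample=len(column))  # looks at every row

print(sampled[-3:])  # [0, 0, 0]
print(full[-3:])     # [0.25, 0.5, 0.75]
```

Scanning every row costs one extra pass over the column, but it is the only way the late fractional values can influence the chosen type.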

There doesn't seem to be an easy fix, e.g. a way to force a column to be read as Double, so I ended up writing a specialized reader. --Wayne
Re: READ_CSV() gotcha [message #90742 is a reply to message #90721] Wed, 08 April 2015 16:36
penteado
Hello Wayne,

This comes in a bit late, but I have an altered version of read_csv() that provides a number of additional options. It lets the user choose how many rows to use for type testing (select 0 to use all rows), names the fields of the resulting structure after the csv header, and has more elaborate options to control the testing for column types:

http://ppenteado.net/idl/pp_lib/doc/read_csv_pp.html

I generally use it with the /transp keyword, so that the result is a structure array, with one element per csv row, which I find more useful.
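
As a rough illustration of the difference between the two layouts (one structure holding a column array per field, versus an array with one structure per row), here is a small Python sketch. It only mirrors the idea; the function name and the sample data are hypothetical, and it says nothing about how read_csv_pp implements the keyword.

```python
def transpose_columns(columns):
    """Turn {field: [values...]} into a list with one record per row."""
    names = list(columns)
    return [dict(zip(names, row)) for row in zip(*columns.values())]


# Hypothetical column-oriented result: one list per csv column.
columns = {"id": [1, 2, 3], "flux": [0.1, 0.2, 0.3]}

rows = transpose_columns(columns)
print(len(rows))  # 3
print(rows[1])    # {'id': 2, 'flux': 0.2}
```

With the row-oriented form, selecting or sorting rows keeps all fields of a row together, which is the convenience referred to above.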

It also has a companion, write_csv_pp, a wrapper to write_csv that creates a csv whose table header is taken from a structure's field names.

http://ppenteado.net/idl/pp_lib/doc


