comp.lang.idl-pvwave archive: archive » READ


READ_CSV() gotcha [message #90721]

Fri, 03 April 2015 10:44

wlandsman
Messages: 743
Registered: June 2000

Senior Member

The documentation for READ_CSV() describes how a column is stored as Double, rather than an integer type, if it has a decimal point or an exponent.

What it doesn't say is that READ_CSV() only examines the first 100 rows when choosing a column's type, so if those first 100 values are compatible with an integer (no decimal points or exponents), the entire column is read as an integer. In my case, the data become floating point after about 2000 rows (with values between 0 and 1), so these are all truncated to zero.

There doesn't seem to be an easy fix, e.g. a way to force a column to be read as Double, so I ended up writing a specialized reader. --Wayne
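For readers who want to reproduce the behavior, here is a minimal sketch, not the specialized reader mentioned above. It writes a hypothetical two-column test file (`gotcha.csv`, a name made up for illustration) whose second column only becomes fractional after 2000 rows, lets READ_CSV() infer the types, and then reads the same file line by line as text, converting explicitly to Double. The text-based reader assumes no header line, comma delimiters, and no quoted fields.

```idl
; Hypothetical test file: 2000 integer-looking rows, then fractional values.
file = 'gotcha.csv'
OPENW, lun, file, /GET_LUN
FOR i = 0L, 1999 DO PRINTF, lun, '1,0'
FOR i = 0L, 9 DO PRINTF, lun, '1,0.5'
FREE_LUN, lun

; Let READ_CSV infer the column types, then inspect what it chose.
r = READ_CSV(file)
HELP, r, /STRUCTURE

; Workaround sketch: read every line as text and convert explicitly.
; Assumes no header line, comma delimiters, and no quoted fields.
nrows = FILE_LINES(file)
lines = STRARR(nrows)
OPENR, lun, file, /GET_LUN
READF, lun, lines
FREE_LUN, lun

; Count the columns from the first line, then fill a Double array row by row.
ncols = N_ELEMENTS(STRSPLIT(lines[0], ',', /EXTRACT, /PRESERVE_NULL))
data = DBLARR(ncols, nrows)
FOR i = 0L, nrows-1 DO $
  data[*, i] = DOUBLE(STRSPLIT(lines[i], ',', /EXTRACT, /PRESERVE_NULL))

; The second column keeps its fractional values.
PRINT, MAX(data[1, *])
```

Converting from strings sidesteps type inference entirely, at the cost of handling headers, quoting, and missing values by hand.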


Re: READ_CSV() gotcha [message #90742 is a reply to message #90721]

Wed, 08 April 2015 16:36

penteado
Messages: 866
Registered: February 2018

Senior Member
Administrator

Hello Wayne,

This comes a bit late, but I have a modified version of read_csv() that provides a number of additional options. It lets the user choose how many rows to use for type testing (select 0 to use all rows), names the fields of the resulting structure after the csv header, and has more elaborate options to control the testing for column types:

http://ppenteado.net/idl/pp_lib/doc/read_csv_pp.html

I generally use it with the /transp keyword, so that the result is a structure array, with one element per csv row, which I find more useful.

It also has a companion, write_csv_pp, a wrapper around write_csv that writes a csv whose table header is taken from a structure's field names.

http://ppenteado.net/idl/pp_lib/doc
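A minimal usage sketch of the /transp call described above, assuming pp_lib is on the IDL path and that read_csv_pp, like read_csv, takes the file name as its first argument (an assumption, not checked against the linked documentation). The file name `data.csv` is hypothetical. The keyword that sets the number of rows used for type testing is not named in this post, so it is left out here; see the documentation link for it.

```idl
; Hypothetical file name; any csv with a header line would do.
file = 'data.csv'

; /transp is the keyword mentioned in the post: the result is described
; there as a structure array with one element per csv row.
s = read_csv_pp(file, /transp)

HELP, s                    ; overall type and dimensions of the result
HELP, s[0], /STRUCTURE     ; fields of the first row, named after the header
```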

