Read_ASCII and 'invalid' ascii files [message #41949]
Fri, 10 December 2004 11:01
jaden
Hi Everyone,
I have been using IDL's read_ascii function to read in data files
using a template. The only problem is that all the source files have a
+/- symbol in the header, and IDL reports that these files are not
valid ascii files, when they clearly are. When I open a file in a text
editor and remove the suspect character, the function runs perfectly.
Any suggestions?
Jaden
Re: Read_ASCII and 'invalid' ascii files [message #42046 is a reply to message #41949]
Fri, 10 December 2004 12:14
Michael Wallace
>> I have been using IDL's read_ascii function to read in data files
>> using a template. The only problem is that all the source files have a
>> +/- symbol in the header, and IDL reports that these files are not
>> valid ascii files, when they clearly are. When I open with a text
>> editor, and remove the suspect character, the function runs perfectly.
>> Any suggestions?
>>
>> Jaden
>
>
> I think the ASCII code defines only 128 characters (7 bit). The +/- sign
> is part of an "extended" ASCII code, which makes use of the 8th bit but
> is not strictly defined.
There's a pretty standard technique for determining whether a file is
ASCII or binary. Basically, you look at the bytes within the file and
determine what percentage of the bytes fall into the printable standard
ASCII range. If this percentage is really high, you can guess that the
file is probably text. Otherwise the file is probably binary. IDL's
check appears to be much stricter: if any byte falls outside the
printable standard ASCII range, it throws the error you see.
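To make the difference between the two checks concrete, here is a minimal Python sketch. It is not IDL's actual implementation; the function names, the 0.95 threshold, and the sample header are all assumptions chosen for illustration.

```python
# Printable 7-bit ASCII plus common whitespace (tab, LF, CR).
PRINTABLE = set(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}


def printable_fraction(data: bytes) -> float:
    """Return the fraction of bytes that are printable 7-bit ASCII."""
    if not data:
        return 1.0  # assumption: treat an empty file as text
    return sum(b in PRINTABLE for b in data) / len(data)


def looks_like_text(data: bytes, threshold: float = 0.95) -> bool:
    """Lenient heuristic: mostly printable bytes means probably text.

    The 0.95 threshold is an arbitrary illustrative value.
    """
    return printable_fraction(data) >= threshold


def is_strict_ascii(data: bytes) -> bool:
    """Strict check: a single byte outside the printable set fails."""
    return all(b in PRINTABLE for b in data)


# Hypothetical header line; 0xB1 is the plus/minus sign in Latin-1.
header = b"Temp \xb1 0.5 K\n1.0 2.0 3.0\n"
print(looks_like_text(header), is_strict_ascii(header))  # True False
```

One stray byte in 25 still passes the lenient heuristic but fails the strict one, which matches the behaviour described above.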
Now, what to do about it? I don't know what system you're on, but if
you're using *nix, you can use a simple little sed command to remove or
replace the plus/minus sign. For example, the following command finds
all occurrences of the plus/minus, removes them and saves the result in
a new file.
$ sed "s/±//g" file.txt > new_file.txt
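If sed isn't available (or the locale makes matching the byte awkward), the same cleanup can be sketched in Python. This assumes the plus/minus sign is stored either as the single Latin-1 byte 0xB1 or as the UTF-8 pair 0xC2 0xB1; the function names and file names are hypothetical.

```python
from pathlib import Path


def strip_non_ascii(data: bytes, plus_minus: bytes = b"+/-") -> bytes:
    """Replace the plus/minus sign and drop any other non-ASCII byte."""
    # Handle the two-byte UTF-8 form first, then the one-byte Latin-1 form.
    data = data.replace(b"\xc2\xb1", plus_minus).replace(b"\xb1", plus_minus)
    # Remove any remaining byte that has the 8th bit set.
    return bytes(b for b in data if b < 0x80)


def clean_file(src: str, dst: str) -> None:
    """Write a 7-bit-clean copy of src to dst (both paths are placeholders)."""
    Path(dst).write_bytes(strip_non_ascii(Path(src).read_bytes()))


# Example call with hypothetical file names:
# clean_file("file.txt", "new_file.txt")
```

Pass `plus_minus=b""` to delete the sign outright, as the sed command does, instead of substituting "+/-".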
-Mike
Re: Read_ASCII and 'invalid' ascii files [message #42048 is a reply to message #41949]
Fri, 10 December 2004 11:15
Benjamin Hornberger
Jaden wrote:
> Hi Everyone,
>
> I have been using IDL's read_ascii function to read in data files
> using a template. The only problem is that all the source files have a
> +/- symbol in the header, and IDL reports that these files are not
> valid ascii files, when they clearly are. When I open with a text
> editor, and remove the suspect character, the function runs perfectly.
> Any suggestions?
>
> Jaden
I think the ASCII code defines only 128 characters (7 bit). The +/- sign
is part of an "extended" ASCII code, which makes use of the 8th bit.
There are several incompatible 8-bit extensions, so "extended ASCII" is
not a single, strictly defined standard.
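As a quick illustration of the 7-bit point, this short Python sketch encodes the plus/minus sign in a few common encodings and checks whether the result fits in 7 bits. The choice of encodings is just an example, and the helper name is made up.

```python
PLUS_MINUS = "\u00b1"  # Unicode PLUS-MINUS SIGN


def fits_in_7_bits(data: bytes) -> bool:
    """True if every byte is in the 0..127 range that ASCII defines."""
    return all(b < 0x80 for b in data)


# The same character maps to different bytes in different 8-bit extensions.
for codec in ("latin-1", "cp437", "utf-8"):
    encoded = PLUS_MINUS.encode(codec)
    print(codec, encoded.hex(), fits_in_7_bits(encoded))

# Plain ASCII has no plus/minus sign at all, so encoding fails outright.
try:
    PLUS_MINUS.encode("ascii")
except UnicodeEncodeError:
    print("not representable in 7-bit ASCII")
```

None of the encoded forms fit in 7 bits, while the three-character spelling "+/-" does.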
Benjamin