comp.lang.idl-pvwave archive
Messages from Usenet group comp.lang.idl-pvwave, compiled by Paulo Penteado

Cut down a big file into several files based on its a column. [message #53012] Tue, 13 March 2007 18:22
kim20026
Good day, everyone! (I wrote 'Good morning', but maybe it is not
morning for others. Anyway.) I would like to split a big data file
into several files based on one of its columns, but I have not
succeeded yet. Please give me some suggestions.

The contents of my original data file are...

90|2000|1|1|95|95|95|95|95|96|95|95|94|93|93|93|94|94|94|95|95|93|91|90|94|95|96|95|
90|2000|1|2|96|95|94|96|93|93|76|74|76|81|85|84|76|53|43|40|39|41|33|33|32|32|33|35|
90|2000|1|3|35|34|38|35|29|28|30|29|26|23|25|22|22|30|29|24|23|24|36|31|34|39|31|34|
.
.
.

This is a 2-D array, which I named 'data(*, *)'.

What I want to do now is split this file into 6 files based on the
information in the 2nd column, data(1,*). This column holds the year,
and the values run from 2000 to 2006.

I tried this way so far.

pro hum_year
; 1. Read original file, and set the initial values of
;    variables and arrays.

file='hum.txt'
ndata=file_lines(file)
data=intarr(28, ndata)

; 2. Close any unit files before processing.
close,1
close,2
close,3
close,4
close,5
close,6

; 3. Open files and prepare for writing.
openw,1,'HumNWS_2000.txt'
openw,2,'HumNWS_2001.txt'
openw,3,'HumNWS_2002.txt'
openw,4,'HumNWS_2003.txt'
openw,5,'HumNWS_2004.txt'
openw,6,'HumNWS_2005.txt'

; 4. Classify data and write them into designated file.

for t=0L,ndata-1 do begin

case data[1,t] of
2000:printf,1,format='(28i6)',data[*,t]
2001:printf,2,format='(28i6)',data[*,t]
2002:printf,3,format='(28i6)',data[*,t]
2003:printf,4,format='(28i6)',data[*,t]
2004:printf,5,format='(28i6)',data[*,t]
2005:printf,6,format='(28i6)',data[*,t]
endcase

endfor

; 5. Close files.
close,1
close,2
close,3
close,4
close,5
close,6

end
----------------------

However, compile stops process 2 every time. I don't understand. Do I
have to do something to treat the '|'s in the original file? Please give
me any suggestions and comments. Thank you in advance.
Re: Cut down a big file into several files based on its a column. [message #53060 is a reply to message #53012] Fri, 16 March 2007 08:54
Maarten[1]
On Mar 14, 1:22 am, "DirtyHarry" <kim20...@gmail.com> wrote:

> However, compile stops process 2 everytime. I don't understand. Do I
> have do something to treat the '|'s in original file? Please give me
> any suggestions and comments. Thank you in advance.


This threw me off a bit: you're not reading your data at all.

Now, the delimiter is a bit screwed up. Read the help on read_ascii
and ascii_template. You'll end up with your data in a structure, and
then you can use where as you see fit.
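
For reference, one way to get the '|'-separated records into data
before any loop runs is sketched below. It swaps in READF and STRSPLIT
for the read_ascii route, and it assumes every line of hum.txt holds
exactly one record of 28 integer fields, with no blank lines. The
function name read_hum is made up for illustration.

```idl
; Hypothetical helper (not from the thread): returns data[28, ndata].
; Assumes one '|'-delimited record of 28 integers per line.
function read_hum, file
  ndata = file_lines(file)
  lines = strarr(ndata)

  ; Read the whole file, one string element per line.
  openr, lun, file, /get_lun
  readf, lun, lines
  free_lun, lun

  data = intarr(28, ndata)
  for t = 0L, ndata-1 do begin
    ; STRSPLIT drops empty fields by default, so the trailing '|'
    ; at the end of each record does not produce an extra field.
    fields = strsplit(lines[t], '|', /extract)
    if n_elements(fields) ne 28 then $
      message, 'Unexpected field count on line ' + strtrim(t+1, 2)
    data[*, t] = fix(fields)   ; convert the strings to integers
  endfor
  return, data
end
```

With that in place, data = read_hum('hum.txt') would replace the
data=intarr(28, ndata) line, and data[1,*] then holds the year.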

Maarten
Re: Cut down a big file into several files based on its a column. [message #53061 is a reply to message #53012] Fri, 16 March 2007 08:34
Maarten[1]
On Mar 16, 9:46 am, "DirtyHarry" <kim20...@gmail.com> wrote:
> Thanks, Maarten... but... T.T
>
> Is there any way to do this in Windows version of IDL?

This was my first try. It doesn't quite work, because I can't get the
format statement right (how do you insert line breaks?)

;;; failed attempt.
file='hum.txt'
ndata=file_lines(file)
data=intarr(28, ndata)

for year=2000,2005 do begin
openw, unit, string(format="('HumNWS_',I04,'.txt')", year), /get_lun

idx = where(data[1,*] eq year, cnt)
if cnt gt 0 then begin
fmt=string(format="('(',I04,'(28i6))')", cnt)
; it would be neat if IDL actually allowed adding newlines between
; the blocks here.
printf, unit, format=fmt, data[*,idx]
endif

free_lun, unit
endfor
end

;;; Second attempt: use explicit for loop.
file='hum.txt'
ndata=file_lines(file)
data=intarr(28, ndata)

for year=2000,2005 do begin
openw, unit, string(format="('HumNWS_',I04,'.txt')", year), /get_lun

idx = where(data[1,*] eq year, cnt)
fmt='(28i6)'
if cnt gt 0 then $
for ii=0,cnt-1 do $
printf, unit, format=fmt, data[*,idx[ii]]

free_lun, unit
endfor
end
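
On the line-break question from the first attempt: IDL format strings
follow Fortran-style reversion, so when the data outlast the format, a
new output line is started and the format is reused. A plain '(28i6)'
should therefore write one record per line without an explicit loop. A
small sketch with made-up data (the procedure name and file name are
hypothetical):

```idl
; Sketch only: relies on format reversion to start a new line
; after every 28 values, so no inner loop is needed.
pro write_year_demo
  data = indgen(28, 3)                 ; stand-in for data[28, ndata]
  data[1, *] = [2000, 2001, 2000]      ; fake year column

  idx = where(data[1, *] eq 2000, cnt) ; rows for one year
  if cnt gt 0 then begin
    openw, unit, 'HumNWS_demo.txt', /get_lun
    ; data[*, idx] is 28 x cnt; each group of 28 values lands on
    ; its own line because the format is reused per record.
    printf, unit, format='(28i6)', data[*, idx]
    free_lun, unit
  endif
end
```

Here two of the three fake records match, so the demo file should end
up with two lines of 28 values each.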

HTMH,

Maarten
Re: Cut down a big file into several files based on its a column. [message #53064 is a reply to message #53012] Fri, 16 March 2007 03:24
Peter Clinch
DirtyHarry wrote:
> Thanks, Maarten... but... T.T
>
> Is there any way to do this in Windows version of IDL?

Perl is usually better for this sort of string handling IME, and Perl is
freely available for Windows and is a very handy thing to have to
pre-process data files.
It'll probably be quicker to hack on in IDL for this immediate problem,
but I'd look into Perl for future jobs, and get the O'Reilly Learning
Perl "Llama book" by Schwartz and Phoenix to go with it.

Pete.
--
Peter Clinch Medical Physics IT Officer
Tel 44 1382 660111 ext. 33637 Univ. of Dundee, Ninewells Hospital
Fax 44 1382 640177 Dundee DD1 9SY Scotland UK
net p.j.clinch@dundee.ac.uk http://www.dundee.ac.uk/~pjclinch/