Fanning Software Consulting

Determining the Number of Columns in an ASCII File

QUESTION: I have a number of ASCII text files to read. The data are always in columns of floating point numbers, but I don't always know ahead of time how many columns of numbers are in the file. Is there a way I can determine the number of columns at run-time?

ANSWER: There are several ways to determine this. If your columns are separated by white space (blank characters or tabs), you can read the first line and count the columns with StrSplit. Your code might look like this:

   file = 'mydatafile.dat'

   ; Determine the number of rows in the file.
   rows = File_Lines(file)

   ; Determine the number of columns in the file by reading
   ; the first line and parsing it into column units.
   OpenR, lun, file, /Get_Lun
   line = ""
   ReadF, lun, line

   ; Find the number of columns in the line.
   cols = N_Elements(StrSplit(line, /RegEx, /Extract))

   ; Create a variable to hold the data.
   data = FltArr(cols, rows)

   ; Rewind the data file to its start.
   Point_Lun, lun, 0

   ; Read the data.
   ReadF, lun, data
   Free_Lun, lun

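Note that you do not need to pass a pattern to StrSplit: when the REGEX keyword is set and no pattern is given, the default pattern matches any run of white space (spaces or tabs). Note, too, that reading the first line moves the file pointer past it, so you have to rewind the data file to the start with POINT_LUN before you read the data into the larger data array. The code also assumes that every line in the file is a data row, since File_Lines counts all lines, blank ones included.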
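To see the column count in isolation, here is a minimal sketch that applies the same StrSplit call to a string instead of a line read from a file. The sample strings are made up for illustration, and the comma-separated variation is an assumption about what a differently delimited file might need; it is not something the code above handles.

```idl
; Hypothetical sample line: leading blanks, then three values
; separated by a run of spaces and a tab.
line = '   1.5   2.25' + String(9B) + '3.0'

; No pattern is given, so /RegEx falls back to "any run of spaces or tabs".
parts = StrSplit(line, /RegEx, /Extract)
cols = N_Elements(parts)
Print, 'White-space columns: ', cols   ; 3 for this sample

; Assumed variation: comma-separated values need an explicit pattern.
csvLine = '1.5,2.25,3.0'
csvCols = N_Elements(StrSplit(csvLine, ',', /Extract))
Print, 'Comma-separated columns: ', csvCols   ; 3 for this sample
```

The leading blanks do not add a column, because StrSplit drops empty substrings unless the PRESERVE_NULL keyword is set.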

Here is a slight variation for a data file that has a header. Assume, for the sake of this illustration, that the file starts with a three-line ASCII header. That is, there are three lines of some kind of information in the file before the actual data begins. Then you might read the data file like this:

   file = 'mydatafile.dat'

   ; Determine the number of rows in the file.
   rows = File_Lines(file)

   ; Open the file and read the three line header. When you
   ; are finished, mark this location in the file as the "currentLocation".
   OpenR, lun, file, /Get_Lun
   header = StrArr(3)
   ReadF, lun, header
   Point_Lun, -lun, currentLocation

   ; Read the first data line and parse it into column units.
   line = ""
   ReadF, lun, line

   ; Find the number of columns in the line.
   cols = N_Elements(StrSplit(line, /RegEx, /Extract))

   ; Create a variable to hold the data.
   data = FltArr(cols, rows-N_Elements(header))

   ; Rewind the data file to the start of the data.
   Point_Lun, lun, currentLocation

   ; Read the data.
   ReadF, lun, data
   Free_Lun, lun
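
If you read many files like this, the steps above can be collected into one function. The following is only a sketch: the function name Read_Column_Data and its HeaderLines keyword are hypothetical, and it assumes that everything after the header is a white-space separated data row.

```idl
FUNCTION Read_Column_Data, file, HeaderLines=headerLines

   ; Hypothetical helper: the name and the HeaderLines keyword are
   ; illustrative, not part of IDL or of the original article.
   IF N_Elements(headerLines) EQ 0 THEN headerLines = 0

   ; Every line that is not a header line is assumed to be a data row.
   rows = File_Lines(file) - headerLines

   OpenR, lun, file, /Get_Lun

   ; Skip the header, if there is one.
   IF headerLines GT 0 THEN BEGIN
      header = StrArr(headerLines)
      ReadF, lun, header
   ENDIF

   ; Remember where the data starts.
   Point_Lun, -lun, dataStart

   ; Read the first data line and count its columns.
   line = ''
   ReadF, lun, line
   cols = N_Elements(StrSplit(line, /RegEx, /Extract))

   ; Go back to the start of the data and read all of it.
   data = FltArr(cols, rows)
   Point_Lun, lun, dataStart
   ReadF, lun, data
   Free_Lun, lun

   RETURN, data
END
```

With this compiled, the three-line header example becomes data = Read_Column_Data('mydatafile.dat', HeaderLines=3), and leaving the keyword off reads a file with no header.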

Another Method for Counting Columns

Here is another routine, named Count_Columns, that uses a different method. It reads the first line of the data file and then tries to read progressively more numbers from that line, while keeping track of how many numbers it can read before the read fails.

   FUNCTION Count_Columns, filename, MaxColumns = maxcolumns

      ; This utility routine is used to count the number of
      ; columns in an ASCII data file. It uses the first row
      ; as the count example.

      IF N_Elements(maxcolumns) EQ 0 THEN maxcolumns = 500

      OpenR, lun, filename, /Get_Lun

      ; ReadS throws an error when asked for more numbers than the
      ; line holds. At that point the previous count is the answer.
      Catch, theError
      IF theError NE 0 THEN BEGIN
         Catch, /Cancel
         Free_Lun, lun
         count = count-1
         RETURN, count
      ENDIF

      count = 1
      line = ''
      ReadF, lun, line
      FOR j=count, maxcolumns DO BEGIN
         text = FltArr(j)
         ReadS, line, text
         count = count + 1
      ENDFOR

      ; The line holds at least maxcolumns numbers, so no count was found.
      Free_Lun, lun
      RETURN, -1
   END
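
As a usage sketch, Count_Columns can stand in for the StrSplit step of the first example. This assumes that Count_Columns has been compiled, that the file has no header, and that every line is a data row; the file name is the placeholder used earlier.

```idl
file = 'mydatafile.dat'

; Count the columns from the first line and the rows from the whole file.
cols = Count_Columns(file)
rows = File_Lines(file)

; Count_Columns returns -1 if it reaches MaxColumns without a failed read,
; and 0 if no number could be read from the first line.
IF cols LE 0 THEN Message, 'Could not count the columns in ' + file

; Count_Columns opens its own logical unit, so there is nothing to
; rewind here. Open the file and read all of the data in one call.
data = FltArr(cols, rows)
OpenR, lun, file, /Get_Lun
ReadF, lun, data
Free_Lun, lun
```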

One of these two methods, or some combination of them (sometimes tweaked a little for your particular data file), works with most ASCII data files containing numerical data.
