comp.lang.idl-pvwave archive
Messages from Usenet group comp.lang.idl-pvwave, compiled by Paulo Penteado

bug in stregex? [message #31532] Fri, 19 July 2002 13:54
Vapuser
I don't know, you tell me. Is this a bug?

IDL> tt=stregex('cdefaz',"(.*)(a|b|c)z",/extract)
IDL> help,tt
TT STRING = 'cdefaz'

My understanding of regular expressions says it's a bug. The first
'(.*)' should only match up to the 'a' and the second subexpression,
since it is *unqualified*, should handle the 'az', i.e. TT should
have two parts: tt[0] should be 'cdef' and tt[1] should be 'a'.

If my regex had been '(.*)((a|b|c)z)*', the result would be
understandable; then the 'greediness' of the (.*) regular expression
should have consumed the whole string, because the 'zero or more'
qualifier applied to the second subexpression would have been
satisfied, i.e. zero matches. But in my example the engine should have
failed when attempting this match and should have backtracked two
characters to produce the output I suggest above.
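
As a rough cross-check of the two cases described above, here is a minimal sketch using Python's `re` module as a stand-in backtracking engine. This is an assumption for illustration only: Python is not otherwise part of this post, IDL's stregex may behave differently, and the helper name `groups_for` is hypothetical.

```python
import re

def groups_for(pattern, text):
    """Return the captured subexpressions of the first match, or None."""
    m = re.search(pattern, text)
    return m.groups() if m else None

s = "cdefaz"

# Mandatory second group: (.*) first grabs the whole string, then has to
# give back two characters so (a|b|c) can take 'a' and the literal z matches.
required = groups_for(r"(.*)(a|b|c)z", s)

# Zero-or-more tail: (.*) keeps the whole string and the starred group
# matches zero times, so its captures come back as None.
optional = groups_for(r"(.*)((a|b|c)z)*", s)

print(required)  # ('cdef', 'a')
print(optional)  # ('cdefaz', None, None)
```

In this engine, at least, the only difference between the two results is whether the trailing group is required, which is the distinction drawn above.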

Perl certainly does it this way:

% perl
$s="cdefaz";
print "s=$s\n";
@tt=($s =~/(.*)(a|b|c)z/);
foreach (@tt) { print "$_\n";}

This code produces the output:

s=cdefaz
cdef
a

i.e. $tt[0] = cdef and $tt[1] = a, as expected.



But regular expressions are always somewhat mysterious. Am I missing
something here?

Comments?

whd
--
William Daffer: 818-354-0161: William.Daffer@jpl.nasa.gov