> As long as you're using the same product for implementation, I think
> you'll have to go with its results.
At the end of the day, you'll indeed have to accept the results of the
regex engine you're using for the actual implementation, including all
its bugs and limitations.
> I did try the JavaScript RegExp tester at:
>
> http://www.regular-expressions.info/javascriptexample.html
This tester's results will depend on the JavaScript implementation of
your web browser.

In all the regexes in this thread, I noticed everybody is using what I
call a "lazy dot": .*?  I haven't tried any of the regexes, but I'll
bet many can be improved by replacing the lazy dot with a negated
character class. E.g. instead of <a href=".*?" use
<a href="[^"<>\r\n]+"  You know that URLs can't contain quotes, line
breaks or angle brackets, so tell the regex engine. Remember that if
whatever you put in the regex after <a href=".*?" fails to match, the
regex engine will expand the .*? to include the closing quote, etc.,
all the way to the end of the subject string if need be. With multiple
.*? in a regex, you're setting yourself up for some pretty heavy
backtracking, as I explain at
http://www.regular-expressions.info/atomic.html
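
To make the difference concrete, here is a minimal sketch in Python (my choice for illustration; other backtracking engines should behave the same way here). The subject string and the trailing id="t" condition are made up for the example. When the part after the URL fails to match on the first link, the lazy dot simply keeps expanding past the closing quote, while the negated character class cannot.

```python
import re

# Hypothetical subject: two links, only the second one has id="t".
subject = ('<a href="one.html" class="x">One</a> '
           '<a href="two.html" id="t">Two</a>')

# Lazy dot: free to expand across quotes and tags until the rest matches.
lazy = re.compile(r'<a href="(.*?)" id="t">')

# Negated character class: the URL can never run past its closing quote.
negated = re.compile(r'<a href="([^"<>\r\n]+)" id="t">')

lazy_url = lazy.search(subject).group(1)
negated_url = negated.search(subject).group(1)

print(repr(lazy_url))     # 'one.html" class="x">One</a> <a href="two.html'
print(repr(negated_url))  # 'two.html'
```

The lazy version starts at the first link and swallows everything up to the second link's closing quote, so the captured "URL" is garbage. The negated class fails quickly on the first link and then matches the second one cleanly.
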
Kind regards,
Jan Goyvaerts
--
RegexBuddy makes regular expressions easy.
http://www.regexbuddy.com/