regex is a high-level, general-purpose, interpreted, dynamic programming language. regex was originally developed by Larry Wall, a linguist working as a systems administrator for NASA, in 1987, as a general purpose Unix scripting language to make report processing easier. Since then, it has undergone many changes and revisions and became widely popular among programmers. Larry Wall continues to oversee development of the core language, and its newest version, regex 6.
Name: Anonymous 2008-10-20 20:12
s/regex/BRO FIST/iS
Name: Anonymous 2008-10-20 20:43
>>2 who kill thier babbys. becuse these babby cant frigth back?
couldn't find anyone actually answering ur question, so: on the implementation level, regexp matching is about the most basic shit you can do in programming. loop over each character and check it against the pattern, then do the same for the second character, and so on
look at the source code for perl's regexp engine sometime, it tells you a lot about how stuff like this is done. there are some while loops in there spanning thousands of lines, all to process each character and the overall syntax of your regexp
if you don't understand that very basic explanation then gtfo
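For what it's worth, the loop-and-check idea can be sketched in a few lines. This is only a toy backtracking matcher, not how Perl does it: the name `match_here` is made up, and it handles nothing but literals, `.`, and a single-character `?` or `*`.

```python
def match_here(pattern: str, text: str) -> bool:
    """Return True if pattern matches ALL of text.

    Toy syntax only: literal characters, '.', and 'x?' / 'x*' on one character.
    """
    if not pattern:
        return not text
    # Does the first pattern character accept the first text character?
    first_ok = bool(text) and pattern[0] in (".", text[0])
    if len(pattern) >= 2 and pattern[1] == "?":
        # Try consuming one character; if that fails, back up and skip the item.
        return (first_ok and match_here(pattern[2:], text[1:])) or match_here(pattern[2:], text)
    if len(pattern) >= 2 and pattern[1] == "*":
        # Try consuming one more character; if that fails, stop repeating.
        return (first_ok and match_here(pattern, text[1:])) or match_here(pattern[2:], text)
    return first_ok and match_here(pattern[1:], text[1:])
```

The "back up and try the other branch" step is the part that hurts: every `?` can double the number of paths that get tried before a match is found or ruled out.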
>>30
I think that paper is a bit dated. Last I heard, Perl's regexes fall back on an NFA-based evaluator when none of the Perl regex extensions (lookaheads, etc.) are used. That said, I have no citations for this. Re-running those experiments is probably the easiest way to verify/disprove the claim.
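To make "NFA-based evaluator" concrete, here is a rough sketch of the state-set simulation, assuming a pattern language of just literals and `?`. The names `parse`, `closure`, and `nfa_match` are made up for illustration, and none of this is taken from Perl's source.

```python
def parse(pattern):
    """Split 'a?b' into [('a', True), ('b', False)]; only literals and '?' are handled."""
    items = []
    i = 0
    while i < len(pattern):
        optional = i + 1 < len(pattern) and pattern[i + 1] == "?"
        items.append((pattern[i], optional))
        i += 2 if optional else 1
    return items


def closure(states, items):
    """Add every state reachable by skipping optional items without reading input."""
    out = set()
    for s in states:
        out.add(s)
        while s < len(items) and items[s][1]:
            s += 1
            out.add(s)
    return out


def nfa_match(pattern, text):
    """Track ALL positions the pattern could be at, advancing them one character at a time."""
    items = parse(pattern)
    states = closure({0}, items)
    for ch in text:
        stepped = {s + 1 for s in states if s < len(items) and items[s][0] == ch}
        states = closure(stepped, items)
        if not states:
            return False
    # State len(items) means the whole pattern has been consumed.
    return len(items) in states
```

Since the state set can never hold more than `len(items) + 1` positions, the work per input character is bounded by the pattern size, so there is no exponential blowup on the `a?a?...aaa` case.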
btw, here's the code I tested them with. I didn't bother adapting all of the tests since this pretty much gets the point across anyway, but it shouldn't be hard to do.
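# adapt these as you like, but n = 25 is enough to show how bad perl sucks at this
# repeat N STR: print STR N times (assumed helper, the original isn't shown)
repeat() { k=0; while [ "$k" -lt "$1" ]; do printf '%s' "$2"; k=$((k+1)); done; }
for n in 1 5 10 20 25; do
    i=$(repeat "$n" a)          # text: n copies of "a"
    r=$(repeat "$n" 'a?')$i     # regex: n copies of "a?" followed by n copies of "a"
    echo "n = $n"
    for body in awk perl python; do
        echo "$body:"
        # testawk, testperl and testpython are the per-language matchers (not shown here)
        time "test$body" "$i" "$r"
    done
    echo
done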
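The test scripts themselves aren't posted, so the following is only a guess at what `testpython` might look like: take the text and the regex as the two arguments, in the order the loop passes them, and report whether the regex matches. The helper name `matches` and the use of `re.fullmatch` are assumptions.

```python
import re
import sys


def matches(text: str, regex: str) -> bool:
    """Return True if regex matches the whole of text, using Python's backtracking engine."""
    return re.fullmatch(regex, text) is not None


# Run as: testpython TEXT REGEX (argument order assumed from the shell loop)
if __name__ == "__main__" and len(sys.argv) == 3:
    text, regex = sys.argv[1], sys.argv[2]
    print("match" if matches(text, regex) else "no match")
```

On this input the only way to match is for every `a?` to match nothing, and a greedy backtracking engine tries that combination last, which is why the timings climb so fast with n.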
Name: Anonymous 2008-10-24 23:35
>>31
From that paper: ``As of Perl 5.6, Perl's regular expression engine is said to memoize the recursive backtracking search, which should, at some memory cost, keep the search from taking exponential amounts of time unless backreferences are being used.''
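A rough sketch of what "memoize the recursive backtracking search" could mean, assuming a toy pattern language of literals, `.`, and `?`. The name `memo_match` is made up and this is not Perl's actual implementation; it just caches the answer for each (pattern position, text position) pair so that no pair is explored twice.

```python
from functools import lru_cache


def memo_match(pattern: str, text: str) -> bool:
    """Backtracking match of pattern against all of text, with results cached per (p, t)."""

    @lru_cache(maxsize=None)
    def go(p: int, t: int) -> bool:
        if p == len(pattern):
            return t == len(text)
        first_ok = t < len(text) and pattern[p] in (".", text[t])
        if p + 1 < len(pattern) and pattern[p + 1] == "?":
            # Same two branches as plain backtracking, but each (p, t) is solved once.
            return (first_ok and go(p + 2, t + 1)) or go(p + 2, t)
        return first_ok and go(p + 1, t + 1)

    return go(0, 0)
```

The cache holds at most (len(pattern) + 1) * (len(text) + 1) entries, which is the "memory cost" part of the trade. With backreferences the result at (p, t) would also depend on what the earlier groups captured, so a cache keyed on position alone would no longer be valid.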