Return Styles: Pseud0ch, Terminal, Valhalla, NES, Geocities, Blue Moon.

Pages: 1-

regexen

Name: Anonymous 2011-12-07 18:42

Is there any way (in any language) to get a list of all the strings that would be matched by any given regular expression?

For example "[fm]aggot" would generate "faggot", "maggot"

Of course + and * would cause trouble, no need to support them. In fact I only really need []. I was hoping there's a better way than a nested for loop.

Name: Anonymous 2011-12-07 19:05

i do nt kno w so fAASADDSADDDDDDDSdD NEVERTHELESS.

Name: Anonymous 2011-12-07 19:35

In Perl 6 it's not so troublesome, supposing you have done the work to get a list of alternating: 1. a string with the [...] replaced with %s, and 2. a reference to the extracted character class is placed in an array, something like this:


my @groups = ();
for @list -> $string, $alternations {
    # fmt($string, $_) for each $alternation:
    my $tmp = $alternations>>.fmt: $string;
    @groups.push: $tmp;
}

# reduce @groups to a single product, .flat is needed in most releases:
my @possibilities = @groups.reduce: { ($^a X $^b).flat };


Nesting character classes in proper alternations (or alternations within alternations) is going to ruin this completely. I guess you could apply it recursively and only do the outermost alternations in a given pass.

If you use named character classes you'll need to expand them yourself in the associated array. If you use negated ones, god help you.

Name: Anonymous 2011-12-08 3:39

>>1

1. Convert regular expression to an epsilon nfa.
2. Convert epsilon nfa to an nfa with no epsilon transions.
3. Try to convert nfa to a dfa.
4. If too many states were generated, abandon the dfa and goto 5.
4.1 Take the dfa and find all paths that stem from the source, and emit the collected string whenever an accept state is passed.
4.2 Take note of cycles. These represent repeating substrings in the string that can repeat arbitrarily many times. You may want to use a special notation to represent them, or process them in a special way to avoid infinite computation.

5. Do the same algorithm depicted in steps 4.1 and 4.2 but for an nfa. You will start in the start state, and maintain a set of active states as you expand out. Whenever at least one active state is in an accept state, emit the current string. Take note of cycles to avoid infinite computation.

Don't change these.
Name: Email:
Entire Thread Thread List