Return Styles: Pseud0ch, Terminal, Valhalla, NES, Geocities, Blue Moon.

Pages: 1-

regex

Name: sage 2012-01-11 18:23

Is there a common regex notation for "at least one of each?"

For example, I want to specify that a string must contain "a", "b", and "c" in any order.  So these are all matched:


"abcabcabcbacbacb"
"acb"
"aaabbbccc"
"cba"
"cccbbbaaa"


but not "ab" or "aaaaaaaaaaaabbbbbbb".

Similarly, how about "exactly one of each" so that one_of_each("a" "b" "c") would match


abc
acb
bca


but not "aabc" or "aaaaa"?

Name: Anonymous 2012-01-11 18:29

why don't you munch on some poop this is /g/ after all

Name: sage 2012-01-11 18:31

>>2
Aren't regular expressions related to technology?

Name: Anonymous 2012-01-11 18:41

>>1
but not "aabc"
why not?

Name: Anonymous 2012-01-11 18:58

Read SICP.

Name: Anonymous 2012-01-11 19:08

Sounds like a context-dependant grammar aka NP-complete aka impossible to solve with today's technology.  Maybe we will prove that P=NP and then you'll be able to solve this.

Name: Anonymous 2012-01-11 19:08

>>1
This is not a regular expression anymore, it is a context-free grammar.

Name: >>7 2012-01-11 19:12

>>6
Right, it's context-sensitive.

Name: Anonymous 2012-01-11 19:22

$ cat list | grep -e a -e b -e c
[code]$ cat list | grep -e a -e b -e c | grep -v -e a{2,} -e b{2,} -e c{2,}

Name: kodak_gallery_programmer !!kCq+A64Losi56ze 2012-01-11 19:35

They 90's called. They want their preprocessor back.

Name: sage 2012-01-11 19:35

>>7
Good call, I'm actually writing a grammar parser.  I was just wondering what the common notation for such an expression was.

Name: Anonymous 2012-01-11 19:50

>>11
writing a grammar parser
what the common notation was?
fail

Name: sage 2012-01-11 19:57

>>12
Well, I just mean, for example, that I use the * operator to mean "zero or more," the + operator to mean "one or more," etc... and I thought there might be a fairly common symbol for "one of each."

But it doesn't really matter.  I've decided to go with the & operator, like so:

a & b & c - matches "abc", "cba", "bca"...
+a & +b & +c - matches "abc", "cba", "abcabcacbacb"...

Name: Anonymous 2012-01-11 20:03

#include <stdbool.h>
bool has_matched_all_of(char *s) {
     bool matches[strlen(s)];
     /* The remainder of the function is left as an exercise to the reader. */
}

Name: Anonymous 2012-01-11 20:08

>>13
a|b|c is the usual one_of, but that matches a; b; c; ab; bc; bbbaacccc; cccbbbaaa; aaabbbccc; etc.

Name: Anonymous 2012-01-11 20:12

>>15
Also [abc] if they're single characters.

Name: sage 2012-01-12 1:00

>>13,15
Strange, it turns out that a & b & c is actually quite hard to implement.  Maybe that's why it's uncommon.

Implementing a | b | c is easy.  I use a tree structure where each node has up to two children.  So a | b | c essentially becomes (a | b) | c.  But with a & b & c, you can't reduce that to (a & b) & c, because it's not semantically the same.  For example, the string "acb" would not match that tree because it does not contain a substring that satisfies (a & b).  Ho hum.

Name: OP IS A !FAG 2012-01-12 3:36

Some people, when presented with a problem, say "I know!  I'll use regular expressions!"  Now they have two problems.
-- Jamie W. Zawinski


No, you can't use a regex for that.

[code]
bool match(char* s){
 int a=0, b=0, c=0;

 for(;*s;++s){
  switch(*s){
   case 'a': ++a; break;
   case 'b': ++b; break;
   case 'c': ++c; break;
  }
 }

 if(a<2 || b<2 || c<2) return false else return true;
}

Name: Anonymous 2012-01-12 4:26

>>17
Like I said, it's context sensitive.
In your grammar, (a & b & c)+, corresponds to { a^n b^n c^n | n >= 1 } (and b^n a^n c^n, etc, but we don't care about them), which is the canonical example of non-context free grammar. Context-sensitive grammars are no joke.

Don't change these.
Name: Email:
Entire Thread Thread List