public static string CleanString(this string str)
{
var result = new StringBuilder(str.Length);
for (int i = 0; i < str.Length; i++)
{
if (!badChars.Contains(str[i]))
result.Append(str[i]);
}
return result.ToString();
}
This algorithm makes use of the .NET 3.5 'HashSet' class to give O(1) look up time for detecting a bad char. This makes the overall algorithm O(n) rather than the O(nm) of your posted one (m being the number of bad chars); it also is lot a better with memory usage, as explained above.
char badChars[256] = { /* the contents of
this are left as an exercise for the reader */ };
void CleanString(char *s) {
char *d=s;
do
if(badChars[*s])
*d++ = *s;
while(*s++);
}
>>2
I never thought I'd see this approach used on /prog/, and in fact I've avoided mentioning it in other contexts (to avoid tedious explanations.) Props.
>>5
That's but a small part of my aversion to tedious explanation. AFAIK to the average /prog/lodyte, >>2 might as well be using a Y-combinator--which I suspect is better understood around here... and at least it has an evocative name and possibly bewildering implementation to wield against people who would write it off.
>>8
Hm I'm actually intrigued. How would you implement a filter+lambda using the Y combinator? HIBT?
Name:
Anonymous2010-08-22 18:58
>>9
You seem confused. The purpose of the Y combinator is to show that lambda is the only thing necessary to implement recursion; it cannot be used to implement lambd.
>>9
You haven't been trolled, just uh... look, >>8 is saying that /prog/lodytes aren't any more (or even as) likely to understand what's going on in >>2 than they are to understand code that uses a Y-combinator. It has nothing to do with implementing >>2 with a Y-combinator. (I have no idea where the lambda part in your post comes from.)
>>11 How so?
A lot of languages memoize for you, and my biggest problem is when it's relevant to portable assembly discussions. You can safely bring up the Y-combinator with HASKALites because, at worst, that's what they think fix does.
Here at least, if you bring memoization into a discussion about using memoization in C you'll probably end up facing someone who will not believe it's possible to solve any kind of search in O(1). That is my fear at least. It's only been relevant when this kind of prejudice has been already presented in the discussion so I've never brought it up. Might as well argue religion with someone who firmly believes whatever you don't.
>>13
It's close enough. The best examples of this kind of LUT use are to cache searches as needed.
Name:
Anonymous2010-08-22 21:55
>>2
Okay, I think I know what happens but I have a lingering question. After you've changed the contents of the array s points to how do you know the length of the valid array? what happens to the /0 character?
>>20
Same deal, but with a larger (mostly unallocated) array and direct access to a very cooperative MMU. Or, possibly doable with your virtual memory system, who knows?
Hashtable lookup would be worth it if the number of items you check against is large enough, and those items were large/complex enough (not chars). For item counts like yours, the character compare is faster. An even faster method would be to use a 256character array (for ASCII) or bitvector to mark the characters that should be filtered, lookup would be as fast as one memory access there (or shift+memory access), and it would be much faster than your hashtable lookup. Oh wait, I should have read the thread before I posted this >>2 already gave you this idea.