strtok is a horribly unsafe function. It keeps its state in an internal static variable. It would be much safer if it returned a structure holding the state, or if you used an object or (pseudo-)closure to do the same thing (C has no native support for closures, but they're easy to emulate). If you still want to use strtok, that's okay too. What exactly don't you understand about it?
strtok(STRING, TOKENS);
skips any leading characters from TOKENS and returns a pointer to the first substring of STRING that contains no characters from TOKENS
strtok(NULL, TOKENS);
returns a pointer to the next such substring; strtok keeps its position in the old string internally
if a call to strtok returns NULL then there are no more substrings that contain no characters from TOKENS
strtok modifies the original string by overwriting delimiters with \0s, so make a copy if you need to keep it.
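To make that concrete, here's a minimal sketch. The helper name count_tokens and the sample input are made up for illustration; the point is just the call pattern (string first, then NULL) and the fact that the buffer gets chopped up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints each token of buf and returns how many it found.
 * buf must be writable, because strtok() overwrites delimiters with '\0'. */
int count_tokens(char *buf, const char *delims)
{
    assert(buf != NULL && delims != NULL);

    int n = 0;
    /* The first call passes the string; later calls pass NULL to continue. */
    for (char *tok = strtok(buf, delims); tok != NULL; tok = strtok(NULL, delims)) {
        printf("token %d: \"%s\"\n", n, tok);
        n++;
    }
    return n;
}
```

Called on a writable array such as char s[] = "a,b;c"; with delimiters ",;" this finds three tokens, and afterwards s reads as just "a" because the first delimiter has been overwritten.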
I've never used strtok_r, but it seems from the man page that it takes an additional pointer argument to store the state between calls instead of using an internal static variable. It still modifies the original string, just like strtok.
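For what it's worth, a rough sketch of strtok_r with the state held by the caller. The helper count_pairs and the "k=v;k=v" input format are assumptions for the example, and strtok_r itself is POSIX rather than ISO C, so this won't build everywhere.

```c
#define _POSIX_C_SOURCE 200809L  /* strtok_r() is POSIX, not ISO C */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: splits "k=v;k=v" style input and returns the number of
 * complete key/value pairs. Two tokenizers run at once, each with its own
 * state variable, so they cannot clobber each other. */
int count_pairs(char *buf)
{
    assert(buf != NULL);

    int pairs = 0;
    char *outer_state = NULL;           /* state for the ';' tokenizer */
    for (char *item = strtok_r(buf, ";", &outer_state);
         item != NULL;
         item = strtok_r(NULL, ";", &outer_state)) {
        char *inner_state = NULL;       /* separate state for the '=' tokenizer */
        char *key = strtok_r(item, "=", &inner_state);
        char *val = strtok_r(NULL, "=", &inner_state);
        if (key != NULL && val != NULL) {
            printf("%s -> %s\n", key, val);
            pairs++;
        }
    }
    return pairs;
}
```

The buffer is still written to, exactly as with strtok; only the bookkeeping has moved out of the library and into outer_state and inner_state.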
>>4
I just realised how confusing it is that I used TOKENS for the second parameter. Don't think TOKENS, think DELIMITERS.
Name:
kinghajj!kiNgHAJjDw 2010-02-16 13:46
strtok() is shit, as >>2 said. Use gettok() instead.
/*******************************************************************************
* gettok.c -- an improved strtok(). *
* Copyright (c) 2008, Samuel Fredrickson <kinghajj@gmail.com> *
* All rights reserved. *
* *
* Redistribution and use in source and binary forms, with or without *
* modification, are permitted provided that the following conditions are met: *
* * Redistributions of source code must retain the above copyright *
* notice, this list of conditions and the following disclaimer. *
* * Redistributions in binary form must reproduce the above copyright *
* notice, this list of conditions and the following disclaimer in the *
* documentation and/or other materials provided with the distribution. *
* *
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER ``AS IS'' AND ANY EXPRESS *
* OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED *
* WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE *
* DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER BE LIABLE FOR ANY *
* DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES *
* (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR *
* SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER *
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT *
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY *
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH *
* DAMAGE. *
******************************************************************************/
/* strtok() is a neat little function, but it's somewhat awkward to use, and
* it's not thread-safe. this one is thread-safe, and (I think) makes more
* sense.
*/
#include <string.h> /* strpbrk(), strchr() */
/**
* @param start A pointer to a string; keeps track of current position in
* original string.
* @param delims Characters that will delimit tokens.
* @return A pointer to the first found token, or NULL if none found.
*
* gettok() works by taking in a pointer to a string, searching through that
* string for delimiters, then updating that pointer so that on the next call,
* it will return the next token.
*
* Like strtok(), gettok() modifies the original string by inserting NULs where
* it finds delimiters.
*
* @code
* void foo(char *str)
* {
* char *start = str, *tok;
*
* while((tok = gettok(&start, " ")))
* processToken(tok);
* }
* @endcode
*/
char *gettok(char **start, const char *delims)
{
    char *token = NULL;

    if(start && *start && **start && delims)
        // Find the first occurrence of a delimiter.
        if((*start = strpbrk(token = *start, delims)))
            // Nullify consecutive delimiters.
            while(**start && strchr(delims, **start))
                *(*start)++ = '\0';

    // If token is NULL, let it pass. If token is not NULL but points to a '\0',
    // then the token hasn't really been found, so recurse to find it.
    return (!token || *token) ? token : gettok(start, delims);
}
#ifdef _GETTOK_DEMO_
int main(void)
{
    static char s[] = " ;; This is a;,; ;string;with; ;delims.";
    static char output[64];
    char *start = s, *tok;
>>9
I just wanted to make clear that I was the author, and therefore one should not just take my word that my work (gettok.c) was good; rather, the work should be independently judged, and hopefully others will like it as I do. (gettok() is one of my favorite functions because it's so simple, and recursive at that.)
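For anyone who does want to judge it independently, here's a small harness sketch. It carries its own copy of gettok() from the post above (re-indented, with delims taken as const char *) so it compiles on its own; tokens_match is a made-up helper, not part of gettok.c.

```c
#include <assert.h>
#include <string.h>

/* Copy of gettok() from the earlier post, re-indented; delims made const. */
char *gettok(char **start, const char *delims)
{
    char *token = NULL;

    if (start && *start && **start && delims)
        if ((*start = strpbrk(token = *start, delims)))
            while (**start && strchr(delims, **start))
                *(*start)++ = '\0';

    return (!token || *token) ? token : gettok(start, delims);
}

/* Hypothetical harness: checks that tokenizing buf yields exactly the n
 * tokens in expected[], in order. Returns 1 on a match, 0 otherwise.
 * buf must be writable, since gettok() inserts NULs into it. */
int tokens_match(char *buf, const char *delims,
                 const char *const expected[], int n)
{
    assert(buf != NULL && delims != NULL);

    char *start = buf, *tok;
    int i = 0;
    while ((tok = gettok(&start, delims)) != NULL) {
        if (i >= n || strcmp(tok, expected[i]) != 0)
            return 0;
        i++;
    }
    return i == n;
}
```

Worth trying on the awkward cases: leading delimiters, runs of mixed delimiters, and a string made of nothing but delimiters, which should produce no tokens at all.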
And I just wrote this right now:
(defun make-tokenizer (string delimiters &aux (delimiters
                                               (coerce delimiters 'list)))
  "MAKE-TOKENIZER returns a tokenizer function, which
will return the next token in the string separated
by a delimiter; if there is no next token, it returns NIL.
STRING is the input string.
DELIMITERS is a sequence of character delimiters to look for."
  #'(lambda ()
      ;; Split at the earliest delimiter of any kind. When no delimiter is
      ;; left, return the rest of the string as the final token.
      (when string
        (let ((pos (position-if #'(lambda (c) (member c delimiters))
                                string)))
          (prog1 (subseq string 0 pos)
            (setf string (and pos (subseq string (1+ pos)))))))))
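The closure above keeps its state in a captured variable. In C roughly the same thing can be sketched with a small struct that the caller owns. All the names here (struct tokenizer, make_tokenizer, next_token) are made up for the example, and unlike the Lisp version it skips empty tokens the way strtok does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical "closure": everything the tokenizer needs between calls. */
struct tokenizer {
    const char *pos;     /* where the next search starts */
    const char *delims;  /* delimiter characters */
};

struct tokenizer make_tokenizer(const char *string, const char *delims)
{
    assert(string != NULL && delims != NULL);
    struct tokenizer t = { string, delims };
    return t;
}

/* Returns a pointer to the next token and stores its length in *len, or
 * returns NULL when nothing is left. The input is never written to, so
 * tokens are not NUL-terminated; use the length. */
const char *next_token(struct tokenizer *t, size_t *len)
{
    t->pos += strspn(t->pos, t->delims);   /* skip leading delimiters */
    if (*t->pos == '\0')
        return NULL;

    const char *tok = t->pos;
    *len = strcspn(tok, t->delims);        /* token runs to the next delimiter */
    t->pos = tok + *len;
    return tok;
}
```

Since nothing is written to the input, this works directly on string literals, and a token can be printed with printf("%.*s\n", (int)len, tok). Each struct tokenizer is independent, so nesting two of them is not a problem.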
>>13
No, I wrote >>7 on my own because I needed it for a personal project. I later extracted it to its own file because I thought it useful for other C projects, current and future.
Name:
Anonymous 2010-02-17 1:24
>>1
So, since no one else has posted the answer yet, it's because string literals cannot be modified, and strtok modifies its string argument (it replaces the delimiters with null bytes). Attempting to modify a string literal is undefined behavior, and in practice that usually means poof, segfault.
Wrap it up in a strdup(), like this:
char *buf = strdup("5/90/45");
That gives you a mutable copy. Then free() it when you're done. Adding strdup() to both calls fixes it.
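Spelled out a bit more, as a sketch: parse_fields is a made-up helper and the "/" separator just follows the "5/90/45" example. Note that strdup() is POSIX rather than C99, hence the feature-test macro.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup(), which is POSIX rather than C99 */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parses up to max '/'-separated integers out of text
 * (e.g. "5/90/45") into out[]. Returns the count, or -1 if strdup() fails. */
int parse_fields(const char *text, long out[], int max)
{
    assert(text != NULL && out != NULL);

    char *buf = strdup(text);   /* writable copy; the literal stays untouched */
    if (buf == NULL)
        return -1;

    int n = 0;
    for (char *tok = strtok(buf, "/"); tok != NULL && n < max;
         tok = strtok(NULL, "/"))
        out[n++] = strtol(tok, NULL, 10);

    free(buf);                  /* strdup() memory comes from malloc() */
    return n;
}
```

Because the copy is made inside the helper, callers can pass a string literal directly without tripping over the write.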
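Also, building with g++ (or with gcc using -Wwrite-strings) would give you a warning along the lines of: deprecated conversion from string constant to ‘char*’. In C++ string literals are const; in C they technically aren't, but you still must not modify them, and strtok takes non-const.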
And for everyone who complains about strtok(), it's only unsafe in a multi-threaded environment, and only on some platforms. AFAIK some C libraries (MSVC's, for one) store strtok()'s internal state in a thread-local, though others (glibc, for one) use a plain static, so don't count on it.
Name:
Anonymous 2010-02-17 1:33
>>18
Actually now that I think about it, strtok() can still break if e.g. you use it multiple times in a nested loop. Yeah it's not terribly safe. Here's an implementation of strtok_r() that uses only strchr(), as a drop-in replacement for strtok():