dear world4chan, I wrote this to return a random line read from a file, but I hate it. what would you do?
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define MAX_LINE_SIZE 1024
void readQuote(int sock_desc)
{
//data
FILE* quoteSource; //source file pointer
int numLines; //max number of lines
int rndLine; //random line
int i;
char buff[MAX_LINE_SIZE] = {0};
//initialize the random number generator
srand(time(NULL));
//open file; check for errors
quoteSource = fopen("quotes.txt", "r");
if (quoteSource == NULL)
{
printf("!! Error accessing file.\n");
return;
}
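//count the lines; if there are none, exit
numLines = 0;
while (!feof(quoteSource))
if (fgetc(quoteSource) == '\n') numLines++;
if (numLines == 0)
{
//rand() % 0 below would be undefined, so bail out here
fclose(quoteSource);
return;
}
rewind(quoteSource);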
//select a line and read until that line is found
rndLine = (rand() % numLines) + 1;
printf("line %d of %d: ", rndLine, numLines);
for (i = 0; i < rndLine; i++)
fgets(buff, MAX_LINE_SIZE, quoteSource);
printf("%s\n", buff);
fclose(quoteSource);
}
Name: Anonymous : 2006-03-06 10:43
$ ruby -ne 'l = $_ if rand < 1/$..to_f; END { puts l }' < /usr/share/dict/words
Name: Anonymous : 2006-03-06 10:47
let's stick to C, please.
Name: Anonymous : 2006-03-06 11:18
>>1
Ouch, that algorithm hurts. You don't have to know the number of lines to pick a random line. Just do some magic with RAND_MAX and division and you'll manage it. Too bad I can't remember how :p
Name: Shirizaan : 2006-03-06 11:42
Hrm. Yeah that's a touch messy but I can't think of a better way to do it...
Name: Anonymous : 2006-03-06 16:38
I would pick a random character from the file and work backwards and forwards from it until I find a full line. This means that the probability of picking a particular line is weighted towards the length of the line, but it is a lot quicker than reading them all in.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#define MAX_LINE_SIZE 1024
void readQuote()
{
FILE* quoteSource; //source file pointer
long fileSize = 0;
long currentPos = 0;
int readLength;
int secondReadLength;
// this is a circular buffer, so +2 for the null terminators,
// one in middle and one at end
char buff[MAX_LINE_SIZE+2];
char *endOfLinePtr;
char *startOfLinePtr;
int i;
// initialize the random number generator
srand(time(NULL));
//open file; check for errors
quoteSource = fopen("quotes.txt", "r");
if (quoteSource == NULL)
{
printf("!! Error accessing file.\n");
return;
}
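//get the file size, if zero, exit
fseek(quoteSource, 0, SEEK_END);
fileSize = ftell(quoteSource);
if (fileSize <= 0)
{
// an empty file (or a failed ftell) would make the modulo below undefined
fclose(quoteSource);
return;
}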
// select random position in the file
// if RAND_MAX < filesize then make sure enough rand()s are called to compensate
for (i=0;i<=fileSize/RAND_MAX;i++)
currentPos += rand();
currentPos %= fileSize;
fseek(quoteSource, currentPos, SEEK_SET);
// get and print line that character resides in
readLength = fread(buff, 1, MAX_LINE_SIZE, quoteSource);
buff[readLength] = '\0'; // fread does not null-terminate, and strchr needs a terminator
endOfLinePtr = strchr(buff, '\n');
if (endOfLinePtr == NULL)
{
// if '\n' wasn't found we either have an entire MAX_LINE_SIZE size
// line or we just read less than that up to the end of the file.
if (readLength < MAX_LINE_SIZE)
{
endOfLinePtr = buff + readLength;
}
}
if (endOfLinePtr != NULL)
{
// fill remainder of buffer up with previous part of line
endOfLinePtr[0] = '\0';
endOfLinePtr++;
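secondReadLength = MAX_LINE_SIZE-(endOfLinePtr-buff);
// don't seek to before the start of the file
if (secondReadLength > currentPos)
secondReadLength = currentPos;
fseek(quoteSource, currentPos-secondReadLength, SEEK_SET);
secondReadLength = fread(endOfLinePtr, 1, secondReadLength, quoteSource);
endOfLinePtr[secondReadLength] = '\0'; // terminate the second half for strrchr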
// then search for last newline for start of line
startOfLinePtr = strrchr(endOfLinePtr, '\n');
if (startOfLinePtr == NULL)
{
// if not found, must be entire line in buffer
startOfLinePtr = endOfLinePtr;
}
else
{
startOfLinePtr++; // skip newline
}
// print latter part of buffer
printf("%s", startOfLinePtr);
}
// print from start of buffer
printf("%s\n", buff);
fclose(quoteSource);
}
Name: Anonymous : 2006-03-06 18:17
// select random position in the file
// if RAND_MAX < filesize then make sure enough rand()s are called to compensate
for (i=0;i<=fileSize/RAND_MAX;i++)
currentPos += rand();
currentPos %= fileSize;
FAIL for not just scaling it up or down
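A rough sketch of what "scaling" could look like, assuming the goal is just an offset in [0, fileSize). The name randomOffset is made up for illustration. Note that a single rand() can only produce RAND_MAX + 1 distinct values, so on a file bigger than that some offsets are never reachable.

```c
#include <stdlib.h>

/* Hypothetical helper: map one rand() result onto [0, fileSize).
   Dividing by RAND_MAX + 1.0 gives a value in [0, 1), which is then
   stretched (or shrunk) to the file size. Assumes fileSize > 0. */
long randomOffset(long fileSize)
{
    double unit = rand() / ((double)RAND_MAX + 1.0); /* 0 <= unit < 1 */
    return (long)(unit * fileSize);
}
```

In the code from >>6 this would stand in for the rand() summing loop and the modulo, e.g. currentPos = randomOffset(fileSize);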
printf("%s", quote);
fclose(quoteSource);
}
This is the algorithm used in >>4, which is also in perlfaq5, and The Art of Computer Programming by Knuth.
Name: Anonymous : 2006-03-13 3:50
>>13
Actually, I used this: rand() < (RAND_MAX / counter)
Name: Anonymous : 2006-03-13 19:14
Unfortunately, >>13 fails it somewhat. If the line chosen is line N then (N-1) previous lines have to be read in from the file first - what a waste of time! Thus the complexity is O(N), as compared to >>6 which has complexity O(1) 'cause it just obtains file size and does up to two reads.
Name: Anonymous : 2006-03-13 19:41
>>15
as other people have stated, >>6 does not give each line an equal probability of being shown unless all lines are exactly the same length (maybe you could pad with spaces and trim it afterwards -- probably worth the small increase in cpu time)
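A minimal sketch of the padding idea, assuming every line in the file has been padded with spaces to exactly RECORD_SIZE bytes including the newline. RECORD_SIZE and readPaddedQuote are names invented here, not anything from the posts above. With fixed-width records a single seek lands on a whole line, so each line is equally likely.

```c
#include <stdio.h>
#include <stdlib.h>

#define RECORD_SIZE 64 /* assumed: 63 characters of text/padding plus '\n' */

/* Hypothetical helper: read one uniformly chosen fixed-width record.
   Returns 1 on success, 0 on failure.
   out must have room for RECORD_SIZE + 1 bytes. */
int readPaddedQuote(const char *path, char *out)
{
    FILE *f = fopen(path, "rb"); /* binary mode so offsets are plain byte counts */
    long fileSize, numRecords, pick;
    size_t len;

    if (f == NULL)
        return 0;
    if (fseek(f, 0, SEEK_END) != 0 || (fileSize = ftell(f)) < RECORD_SIZE) {
        fclose(f);
        return 0;
    }
    numRecords = fileSize / RECORD_SIZE;
    pick = rand() % numRecords; /* slight modulo bias, probably fine for quotes */
    if (fseek(f, pick * RECORD_SIZE, SEEK_SET) != 0 ||
        fread(out, 1, RECORD_SIZE, f) != RECORD_SIZE) {
        fclose(f);
        return 0;
    }
    fclose(f);

    /* trim the trailing newline and padding spaces */
    len = RECORD_SIZE;
    while (len > 0 && (out[len - 1] == ' ' || out[len - 1] == '\n'))
        len--;
    out[len] = '\0';
    return 1;
}
```

The trade-off is a bigger file and a hard cap on line length, in exchange for one seek and one read per quote.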
>>$ ruby -ne 'l = $_ if rand < 1/$..to_f; END { puts l }' < /usr/share/dict/words
Ruby is such a piece of shit. What that code does is choose line n with a probability of 1/n, replacing whatever was chosen before. So line 1: 1/1, line 2: 1/2, line 3: 1/3 ...
Anyway, when you work through the probabilities (line k has to be picked with 1/k and then survive every later line) you find that it actually chooses a line uniformly at random from a stream, without needing to know the size beforehand.
The reason why ruby is a shit eating fuck tard language is because that code doesn't explain any of this AT ALL.
here's some C pseudo code(make the routines readline and endofline).
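What follows is not the poster's snippet, just a sketch of the 1/n selection described above, using fgets in place of the readline and endofline routines. The name pickRandomLine is made up, and it assumes no line is longer than MAX_LINE_SIZE - 1 characters. The reasoning: line k is kept with probability 1/k and then survives lines k+1..N with probability k/(k+1) * (k+1)/(k+2) * ... * (N-1)/N, which multiplies out to 1/N for every k.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LINE_SIZE 1024

/* Hypothetical helper: pick one line uniformly from a stream in one pass.
   Line n replaces the current choice with probability 1/n.
   Returns the number of lines read; chosen must hold MAX_LINE_SIZE bytes
   and is left empty if the stream has no lines. */
long pickRandomLine(FILE *in, char *chosen)
{
    char line[MAX_LINE_SIZE];
    long n = 0;

    chosen[0] = '\0';
    while (fgets(line, sizeof line, in) != NULL) {
        n++;
        /* rand() / (RAND_MAX + 1.0) lies in [0, 1); keep line n with chance 1/n */
        if (rand() / ((double)RAND_MAX + 1.0) < 1.0 / n)
            strcpy(chosen, line);
    }
    return n;
}
```

Call srand() once at program start rather than inside the function, otherwise two calls within the same second pick the same line.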
>>25
We know you're a pretty shit eating programmer too
Name: Anonymous : 2006-03-23 5:17
>>30
Implement Mersenne Twister one more time, yourself, using inline assembly, with AT&T's shitty fugly syntax, because you're hardcore. Then use it to choose between 1 or 1.
Name: Anonymous : 2006-03-23 6:47
AT&T's shitty fugly syntax
SAY IT LIKE IT IS!
Why the fuck is it so common in UNIX land? Intel-style please!
Name: Anonymous : 2006-03-23 7:49
It's popular among the GNU and Unix people because one or more of these reasons:
- It's easier to parse
- It's not Intel
- It's not what Microsoft uses; what Microsoft uses is evil so they must do things the other way
- It's fucking ugly, and these people sometimes have a taste for the fucking ugly
Name: Anonymous : 2006-03-23 13:10
>>35
Fails. While it's true there's the "anything not M$ is teh win" mentality, it's mostly because when gas et al were being written, AT&T syntax was what was being used on the development machines (which I think were VAXen and Suns and other UNIX machines).
Also, GNU was written with UNIX compatibility _specifically in mind_; and guess what syntax is big on real UNIX boxes. AT&T.
When you, sir, go to linux user group meetings you buy beer and cigarettes. Before arriving home after another awesome LUG meeting you pour the beer all over yourself and surround yourself with burning cigarettes in order to convince your parents that you do have a social life. Sometimes you apply lipstick to your hand and give yourself kisses.
Name: Anonymous : 2006-03-23 18:24 (sage)
>>37 is very experienced, please read his words carefully.
Name: Anonymous : 2010-09-21 13:23
I wouldn't want to read a random line from a file in C.
>>43
Do we really have to point it out?
38 Name: Anonymous : 2006-03-23 18:24 (sage)
>>37 is very experienced, please read his words carefully.
40 Name: Anonymous : 2010-09-21 13:23
I wouldn't want to read a random line from a file in C.