indexing with SQL database

Name: Anonymous 2008-09-15 10:13

there was a thread here not too long ago about searching /prog/ and i think the general consensus was that downloading and indexing the entire /prog/ and then indexing it was probably the best way of going about it.

but, that got me thinking, what is the best way of putting all this info into a database?
would each word in each thread have its own entry? or would it be entire posts that got stored?
also, how to deal with all the stupid ascii art and silly tags?

Name: Anonymous 2008-09-15 10:30

warning: redundant phrase detected near "indexing it"

Name: Anonymous 2008-09-15 10:38

>>2
indeed, please disregard when reading

Name: Anonymous 2008-09-15 11:44

The HTML that /prog/ serves is completely regular, so it's easy to strip all the container elements and keep nothing but the markup that carries each post.
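
A rough sketch of that stripping step, using only Python's standard-library html.parser. The sample markup below is made up for illustration and is not the real /prog/ page structure, and PostTextExtractor and strip_markup are hypothetical names. It keeps the text nodes, turns <br> into a newline, and throws away every other tag.

```python
from html.parser import HTMLParser


class PostTextExtractor(HTMLParser):
    """Collects text nodes and discards the tags wrapped around them."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; for us.
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        # Keep line breaks so multi-line posts stay readable.
        if tag == "br":
            self.chunks.append("\n")

    def handle_data(self, data):
        self.chunks.append(data)


def strip_markup(fragment):
    """Return the plain text of an HTML fragment (hypothetical helper)."""
    parser = PostTextExtractor()
    parser.feed(fragment)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.chunks).strip()


# Invented sample, NOT the actual board markup.
sample = "<blockquote><p>index <b>everything</b><br>twice &amp; again</p></blockquote>"
print(strip_markup(sample))
```

Since the board's own tags would pass through the same way, deciding which of them to keep (code blocks, say) would mean adding cases to handle_starttag.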

Name: Anonymous 2008-09-15 12:41

Just store the entire posts and use full-text search. The whole thing was only about 8MB last time I looked, IIRC.
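
Something like this, as a minimal sketch: one row per post in an SQLite FTS5 table, driven from Python's sqlite3 module. It assumes your SQLite build has FTS5 compiled in (most do, but not all). The table layout and the function names open_index, add_post and search are invented for the example.

```python
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) the post index. Assumes FTS5 is available."""
    db = sqlite3.connect(path)
    # thread and num are stored but not tokenized; only body is searchable.
    db.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS posts "
        "USING fts5(thread UNINDEXED, num UNINDEXED, body)"
    )
    return db


def add_post(db, thread, num, body):
    """Store one whole post as a single row."""
    db.execute(
        "INSERT INTO posts (thread, num, body) VALUES (?, ?, ?)",
        (thread, num, body),
    )


def search(db, query):
    """Return matching posts, best match first.

    query uses FTS5 query syntax, so raw user input containing
    punctuation may need quoting before it is passed in.
    """
    rows = db.execute(
        "SELECT thread, num, body FROM posts WHERE posts MATCH ? ORDER BY rank",
        (query,),
    )
    return rows.fetchall()


db = open_index()
add_post(db, "indexing with SQL database", 1,
         "what is the best way of putting all this info into a database?")
add_post(db, "indexing with SQL database", 5,
         "Just store the entire posts and use full-text search.")
for thread, num, body in search(db, "database"):
    print(thread, num, body)
```

At around 8MB of text there is no real need for a hand-rolled word table; the FTS index does the per-word bookkeeping internally.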

Name: Anonymous 2008-09-15 12:50

MAKE SURE TO INDEX IT TWICE GUYS

Name: Anonymous 2008-09-15 13:02

u meen index liek in boox?

Name: Anonymous 2008-09-15 14:18

Name: Anonymous 2011-01-31 20:04

<-- check em dubz

Name: Anonymous 2011-02-02 23:52

