Is an open source search engine backed by donations viable? Some of the processing could even be offloaded to volunteers, the way SETI@home does it.
>>12
Make this happen! I wonder how large an index of the web typically is, though. Terabytes? Pitabytes? Yodabytes? Probably not easy to operate a local cache of that.
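For a rough feel of the scale, here is a back-of-envelope sketch. Every number in it is a placeholder assumption, not a measurement: the page count, the average page size and the index-to-raw-data ratio are all made up for illustration, and the function names are hypothetical.

```python
def index_size_bytes(pages, avg_page_bytes, index_ratio):
    """Estimate index size as a fraction of the raw crawled data."""
    return pages * avg_page_bytes * index_ratio


def human(n):
    """Format a byte count using decimal (power-of-1000) units."""
    units = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]
    n = float(n)
    i = 0
    while n >= 1000 and i < len(units) - 1:
        n /= 1000
        i += 1
    return f"{n:.1f} {units[i]}"


# Placeholder inputs (assumptions, not real figures):
# 10 billion pages, 50 KB per page, index is 30% of the raw data.
estimate = index_size_bytes(10_000_000_000, 50_000, 0.3)
print(human(estimate))
```

Plug in your own guesses; the point is only that the answer moves between terabytes and petabytes depending on what you assume.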
Name: Anonymous 2012-07-30 6:41
No such search engine maintaining a current web index exists. You'll have to maintain your own index.
Name: Anonymous 2012-07-30 6:44
duckduckgo's crawlers can't even find my anus.
Name: Anonymous 2012-07-30 6:45
>>23
iirc duck duck go buys crawler data from yahoo and uses lucene for indexing.
Name: Anonymous 2012-07-30 6:47
You guys could just enhance progscrape a bit and have it fetch from Wikipedia, the first 10 Touhou results on Google, Sussman's webpage, losethos.com, some torrent/anime sites and some porn tube sites. You'd be none the wiser as to whether the search engine's index was complete.
Name: Anonymous 2012-07-30 6:48
>>25
The indexing engine (lucene) is the data processing system for the web index data. Please don't conflate the processing software with the data that needs processing.
Name: Anonymous 2012-07-30 6:49
>>25
>buys crawler data from yahoo
No wonder there isn't anything in there
Name: Anonymous 2012-07-30 6:55
>>27
Not following. "Uses lucene for indexing" = takes the data and puts it into lucene, which then analyzes and indexes it. You probably misread me.
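To make the "data goes in, gets analyzed, gets indexed" split concrete, here is a toy inverted index. It is a plain Python sketch of the general idea, not lucene's API or its actual internals; the `crawled` pages and all function names are made up for illustration.

```python
import re
from collections import defaultdict


def analyze(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the sorted ids of the documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search(index, query):
    """Return ids of documents that contain every term in the query."""
    terms = analyze(query)
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


# Hypothetical crawler output: this is the data, not the engine.
crawled = {
    "page1": "Lucene is an indexing library",
    "page2": "Crawler data comes from somewhere else",
    "page3": "The crawler feeds the indexing step",
}
index = build_index(crawled)
print(search(index, "crawler indexing"))
```

The crawl data (`crawled`) and the code that indexes it are separate things, which is the distinction being argued about above.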