>>38 here, >>40 was my first back to x please, you made my day bro.
Name:
Anonymous 2010-06-26 1:50
>>40
change the useragent string, problem solved.
oh, and now you have to set verify_trips to true, because xarn couldn't figure out how to change the useragent string.
>>49
The server isn't broken. Scrapers are very intentionally blocked from accessing the HTML interface, and it's just common fucking courtesy to respect that.
If you don't understand that, maybe you should stick to the imageboards.
>>57
The HTML interface is working again for me too. I don't know whether the filter was turned off or an exception was added for progscrape specifically or something else was going on.
The latest subject.txt corruption happened when MrVacBob deleted the 327 spam posts yesterday. He also deleted some threads entirely, so the number of posts in your database may no longer agree with the number of posts subject.txt says there should be.
Name:
Anonymous 2010-06-27 4:44
Traceback (most recent call last):
  File "./progscrape.py", line 248, in <module>
    (thread[0], post, p['name'], p['meiru'], p['trip'], p['now'], p['com']))
sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.
>>68
adding db_conn.text_factory = str after the db_conn = sqlite3.connect(db_name) line seems to have fixed that.
i'd rather not muck around with FIOC's idiotic type system to figure out how to do the "highly recommended" fix instead.
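For anyone who does want the "highly recommended" route, here is a minimal sketch of what it could look like: decode the raw bytes to Unicode before handing them to sqlite3, instead of changing text_factory. It is written for Python 3, where str is already Unicode; the traceback above comes from Python 2, where the equivalent would be passing unicode objects. The posts table, its columns, and the to_text helper are made up for illustration and are not progscrape's actual schema or code.

```python
import sqlite3


def to_text(raw, encoding="utf-8"):
    """Return raw as a Unicode string (hypothetical helper, not from progscrape).

    Bytes are decoded with the given encoding; undecodable bytes become
    U+FFFD instead of raising. Values that are already str pass through.
    """
    if isinstance(raw, bytes):
        return raw.decode(encoding, errors="replace")
    return raw


# In-memory database with an assumed, simplified schema.
db_conn = sqlite3.connect(":memory:")
db_conn.execute("CREATE TABLE posts (thread INTEGER, post INTEGER, com TEXT)")

# Simulate a comment that arrived from the network as raw bytes.
raw_comment = "日本語 test".encode("utf-8")

# Decode first, then insert: sqlite3 only ever sees Unicode text.
db_conn.execute(
    "INSERT INTO posts VALUES (?, ?, ?)",
    (1, 1, to_text(raw_comment)),
)
db_conn.commit()

stored = db_conn.execute("SELECT com FROM posts").fetchone()[0]
```

The trade-off: text_factory = str (on Python 2) stores whatever bytes came in, while decoding up front means everything in the database is in one known encoding, at the cost of having to guess the input encoding correctly.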
>>74
Namely, because web browsers must support hundreds, possibly thousands of unique encodings, not just a few that are ``still in widespread use."
Solution: all strings should be valid XML with an encoding declaration.
Name:
Anonymous 2010-06-28 4:11
>>74
[code]if string is not valid utf8 or ascii
    if string is valid shift_jis
        convert string from shift_jis to utf8
    else
        convert string from iso 8859-1 to utf8
return string[/code]
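The pseudocode above could be sketched in Python roughly like this. It assumes the input is raw bytes and that "convert to utf8" means "decode to a Unicode string"; decode_post is a hypothetical name, and "is valid" is approximated by simply trying the decode and catching the failure.

```python
def decode_post(raw: bytes) -> str:
    """Decode raw bytes by trying UTF-8, then Shift_JIS, then ISO 8859-1.

    ASCII is a subset of UTF-8, so the first attempt covers both.
    """
    for encoding in ("utf-8", "shift_jis"):
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # ISO 8859-1 maps every possible byte to a character, so this
    # final fallback cannot fail.
    return raw.decode("iso-8859-1")


# Usage: each input lands on a different branch of the fallback chain.
utf8_text = decode_post("日本語".encode("utf-8"))
sjis_text = decode_post("こんにちは".encode("shift_jis"))
latin_text = decode_post("café".encode("iso-8859-1"))
```

This is a heuristic, not real detection: some ISO 8859-1 byte sequences also happen to be valid Shift_JIS, so they would be misread as Japanese.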