/prog/ - smoke Xarn everyday

Name: Anonymous 2010-06-23 2:51

http://cairnarvon.rotahall.org/

Name: Anonymous 2010-06-23 3:22

You know, I like visitors on my website too, but spamming it all over /prog/ is just sad and pathetic. Don't give me this bullshit that you are not him. It's sad.

Name: XARNOSAURUS 2010-06-23 3:27

This may surprise you, but I am the real Xarn.

Name: Anonymous 2010-06-23 3:43

>>1-2
SPAWHBTC

Name: Anonymous 2010-06-23 5:02

This may suprise you, but I am not xarn

Name: Anonymous 2010-06-23 5:59

I Xarn myself everyday

Name: Anonymous 2010-06-24 8:01

he makes programming less annoying

Name: Anonymous 2010-06-24 11:08

This may suprise you, but I wrote ``surprise'' wrong.

Name: Anonymous 2010-06-24 13:41

HAXUS

Name: Anonymous 2010-06-24 14:28

Hax anus everyday.
It makes defecation less annoying

Name: FrozenVoid 2010-06-24 15:15

autism lulz

__________________
Orbis terrarum delenda est

Name: Anonymous 2010-06-24 18:22

>>2
Since nearly all of his posts make it to the front page of r/programming and r/coding nowadays, I don't think Xarn is terribly anxious to get the three or four extra hits being linked to on /prog/ will get him.

Name: Anonymous 2010-06-25 4:06

mode change 100755 => 100644 progscrape.py
If you just want to scrape world4ch's /prog/, you can run the script directly (./progscrape.py, or python2.5 progscrape.py if you have several versions of Python installed).
HIBT?

Name: Anonymous 2010-06-25 6:06

>>14
What's the problem?

Name: [b]Xarn[/b]. 2010-06-25 14:00

Xarn.

Name: Anonymous 2010-06-25 14:03

I've just got back from Kazikxarn. I had a great time

Name: Anonymous 2010-06-25 16:58

>>15
removing executable permissions and then telling people to run it as an executable file.
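
For anyone who missed the point: git stores only whether a file is executable, shown as mode 100755 or 100644. A minimal Python sketch of what that mode change means on a POSIX system. It uses a throwaway temp file as a stand-in, not the real progscrape.py:

```python
import os
import stat
import tempfile

def is_executable(path):
    """Return True if the owner execute bit is set on path."""
    return bool(os.stat(path).st_mode & stat.S_IXUSR)

# A throwaway stand-in for the script; not the real progscrape.py.
fd, script = tempfile.mkstemp(suffix=".py")
os.close(fd)

os.chmod(script, 0o755)          # what git shows as mode 100755
before = is_executable(script)   # ./script may be run directly

os.chmod(script, 0o644)          # what git shows as mode 100644
after = is_executable(script)    # ./script is now refused by the shell

os.remove(script)
print(before, after)
```

With the execute bit gone, the script has to be started through the interpreter instead of as ./progscrape.py.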

Name: Anonymous 2010-06-25 17:31

>>33
EXECUTE MY ANUS

Name: Xarn !Rmk.XarnE2!xGIX62dlJesBTK+ 2010-06-25 17:45

>>33
I don't know what git is up to, but I did the reverse of what that log entry says.

Incidentally, this thread is a disease.

Name: Anonymous 2010-06-25 18:14

>>35
Incidentally, this thread is a disease.
I concur.

Name: Anonymous 2010-06-25 18:35

>>35
cancer that is killing /prog/!

Name: Anonymous 2010-06-25 19:56

void main(){for(;;){malloc(1);}}

Name: Anonymous 2010-06-25 20:13

>>38
int main(void){for(uintmax_t i=0;i+(i%10000?0:fork());++i)malloc(1);}

Name: Anonymous 2010-06-25 20:17

>>38-39
Back to /b/, please.

This thread is now about progscrape. Is anyone else getting 403 errors when trying to verify tripcodes, or is it just me? Are scrapers banned from using the HTML interface now?

Name: Anonymous 2010-06-25 20:27

>>40
403 MY ANUS

Name: Anonymous 2010-06-25 21:27

$ git clone git://github.com/Cairnarvon/progscrape.git
Initialized empty Git repository in progscrape/.git/
remote: Counting objects: 45, done.
remote: Compressing objects: 100% (43/43), done.
remote: Total 45 (delta 11), reused 0 (delta 0)
Receiving objects: 100% (45/45), 17.53 MiB | 1836 KiB/s, done.
Resolving deltas: 100% (11/11), done.
$ du
33      ./progscrape/.git/hooks
1       ./progscrape/.git/info
1       ./progscrape/.git/logs/refs/heads
1       ./progscrape/.git/logs/refs
2       ./progscrape/.git/logs
0       ./progscrape/.git/objects/info
17960   ./progscrape/.git/objects/pack
17960   ./progscrape/.git/objects
1       ./progscrape/.git/refs/heads
1       ./progscrape/.git/refs/remotes/origin
1       ./progscrape/.git/refs/remotes
0       ./progscrape/.git/refs/tags
2       ./progscrape/.git/refs
18007   ./progscrape/.git
18031   ./progscrape
18031   .

IHBT

Name: Anonymous 2010-06-25 21:55

>>38 here, >>40 was my first back to x please, you made my day bro.

Name: Anonymous 2010-06-26 1:50

>>40
change the useragent string, problem solved.
oh, and now you have to set verify_trips to true, because xarn couldn't figure out how to change the useragent string.
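
Setting the User-Agent is a one-liner in most HTTP libraries. A rough sketch with Python 3's urllib.request follows. This is not progscrape's actual code (which targeted Python 2), and both the URL and the agent string are made up for illustration:

```python
import urllib.request

# Hypothetical values for illustration; not progscrape's real URL or User-Agent.
URL = "http://example.org/prog/subject.txt"
USER_AGENT = "example-scraper/0.1 (+http://example.org/contact)"

def build_request(url, user_agent):
    """Build a request that announces itself via the User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

req = build_request(URL, USER_AGENT)

# Request normalises header names, so the key is stored as "User-agent".
print(req.get_header("User-agent"))

# urllib.request.urlopen(req) would send it; omitted so nothing touches the network.
```

Whether the string should name the scraper or imitate a browser is the policy question; the mechanism is the same either way.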

Name: Anonymous 2010-06-26 2:17

>>44
Did you miss the fact that the same update also changed the User-Agent?

Name: Anonymous 2010-06-26 2:34

>>44
Your insistence on being a knob-gobbler has created a syntax error.

Name: Anonymous 2010-06-26 2:39

>>45
no, what's the point of the how part. he changed it, but not in the correct way.

Name: Anonymous 2010-06-26 2:42

>>47
If you're suggesting progscrape should impersonate browsers, I take strong issue with calling that ``correct''.

Name: Anonymous 2010-06-26 2:44

>>48
so we should just say "oh well, the server is broken." and give up instead of working around shit like that?

Name: Anonymous 2010-06-26 6:31

>>12
Did you just associate Xarn with autism? You're a fag!

Name: FrozenVoid 2010-06-26 6:47

>>50
Did you just associate >>12 with homosexuality? You're quite hypocritical.

__________________
Orbis terrarum delenda est

Name: Anonymous 2010-06-26 11:47

>>49
The server isn't broken. Scrapers are very intentionally blocked from accessing the HTML interface, and it's just common fucking courtesy to respect that.
If you don't understand that, maybe you should stick to the imageboards.

Name: Anonymous 2010-06-26 13:17

>>52
scrapers are blocked while shitposting scripts aren't. i'd consider that "broken".

Name: Anonymous 2010-06-26 23:45

>>53
That's because you're an idiot.

Name: Anonymous 2010-06-27 3:39

>>54
the front page is full of shitposts. are you saying it should be like that?

Name: Anonymous 2010-06-27 3:41

>>55
I'm saying that it has no bearing whatsoever on whether or not progscrape should lie.

Name: Anonymous 2010-06-27 4:09

I'm just running progscrape without receiving any 403. However, I noted that now there are two parsing failures at subject.txt.

Shitchan becomes even more broken every day. Amazing.

Name: Anonymous 2010-06-27 4:13

Also, I completely agree with >>16-30

Does it mean that /prog/ does actually have moderation? Un-fucking-believable!

Name: Anonymous 2010-06-27 4:25

>>57
The HTML interface is working again for me too. I don't know whether the filter was turned off or an exception was added for progscrape specifically or something else was going on.

The latest subject.txt corruption happened when Mr^Vac_Bob deleted the 327 spam posts yesterday. He also deleted some threads entirely, so the number of posts in your database may no longer agree with the number of posts subject.txt says there should be.

Name: Anonymous 2010-06-27 4:44

Traceback (most recent call last):
  File "./progscrape.py", line 248, in <module>
    (thread[0], post, p['name'], p['meiru'], p['trip'], p['now'], p['com']))
sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.

Name: Anonymous 2010-06-27 4:52

>>68
adding db_conn.textfactory = str after the db_conn = sqlite3.connect(db_name) line seems to have fixed that.
i'd rather not muck around with FIOC's idiotic type system to figure out how to do the "highly recommended" fix instead.

Name: Anonymous 2010-06-27 4:53

>>61
s/textfactory/text_factory/
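
For reference, a small sketch of what text_factory controls, written for Python 3's sqlite3 module. The traceback above comes from Python 2, where the default behaviour differs, so treat this as an illustration of the attribute and not a reproduction of the bug. The table and column names are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (com TEXT)")

# CAST stores the raw bytes as a TEXT value; b"caf\xe9" is Latin-1, not UTF-8.
conn.execute("INSERT INTO posts VALUES (CAST(? AS TEXT))", (b"caf\xe9",))

# Option 1: hand TEXT values back as raw bytes, untouched.
conn.text_factory = bytes
raw = conn.execute("SELECT com FROM posts").fetchone()[0]

# Option 2: decode leniently, replacing undecodable bytes with U+FFFD.
conn.text_factory = lambda b: b.decode("utf-8", errors="replace")
lenient = conn.execute("SELECT com FROM posts").fetchone()[0]

print(repr(raw), repr(lenient))
conn.close()
```

Either setting only changes how TEXT values are returned; the bytes stored in the database stay the same.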

Name: Anonymous 2010-06-27 5:28

>>60-62
The funny part is that 2.6 is not supposed to be the incompatible release.

Name: Anonymous 2010-06-27 7:49

>>61
If switching to Unicode isn't trivial, the program was broken anyway.

Name: Anonymous 2010-06-27 11:32

BREAK MY ANUS

Name: Anonymous 2010-06-27 13:05

>>64
There isn't a single non-trivial program in a single language for which that is true.

Name: Anonymous 2010-06-28 0:32

>>64
yes, FIOC is broken when it comes to unicode.

Name: Anonymous 2010-06-28 1:18

>>16-30
There is a discontinuity in the /prog/trix! Agents are coming! Run!

Name: Anonymous 2010-06-28 1:26

>>67
Not really. It's just that some libraries handle it inconsistently.

Name: Anonymous 2010-06-28 2:16

God Xarn you.

Name: Anonymous 2010-06-28 3:03

>>69
there's no reason not to make all strings unicode, or at least handle the conversions automatically.

Name: Anonymous 2010-06-28 3:06

>>71
handle the conversions automatically
It must be nice to live in a world with only one character encoding.

Name: Anonymous 2010-06-28 3:17

>>72
it's fairly easy to write code that can identify any encodings that are still in widespread use.

Name: Anonymous 2010-06-28 3:33

>>73
What gives you that idea, other than breath-taking ignorance of the issues involved? The terrific jobs web browsers are doing in that department?

Name: Anonymous 2010-06-28 3:38

>>74
Namely, because web browsers must conform to hundreds, possibly thousands of unique encodings, not just a few that are ``still in widespread use''.

Name: Anonymous 2010-06-28 3:50

>>75
If anything, programming languages have to support even more.

Name: Anonymous 2010-06-28 3:58

Solution: all strings should be valid XML with an encoding declaration.

Name: Anonymous 2010-06-28 4:11

>>74
[code]if string is not valid utf8 or ascii
    if string is valid shift_jis
        convert string from shift_jis to utf8
    else convert string from iso 8859-1 to utf8
return string[/code]
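
The pseudocode above can be written out as a short Python 3 sketch. The function name to_utf8 is made up, and the fallback order (UTF-8, then Shift_JIS, then ISO 8859-1) is taken straight from the post, not from any library's detection logic:

```python
def to_utf8(raw: bytes) -> bytes:
    """Guess the encoding of raw in a fixed order and return UTF-8 bytes."""
    try:
        raw.decode("utf-8")      # ASCII is a subset of UTF-8, so one check covers both
        return raw
    except UnicodeDecodeError:
        pass
    try:
        return raw.decode("shift_jis").encode("utf-8")
    except UnicodeDecodeError:
        # ISO 8859-1 maps every byte to a code point, so this branch never fails.
        return raw.decode("iso-8859-1").encode("utf-8")

print(to_utf8(b"plain ascii"))
print(to_utf8("日本語".encode("shift_jis")).decode("utf-8"))
print(to_utf8(b"caf\xe9").decode("utf-8"))
```

Note that the order matters: some byte sequences are valid in more than one of these encodings, so a scheme like this can silently pick the wrong one.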

Name: Anonymous 2010-06-28 4:58

>>78
I like how even with your magic determine-the-encoding functions your implementation is ridiculously insufficient.

Name: Anonymous 2010-06-28 5:09

DecoderFactory.CreateDecoder(EncodedString.encoding).Decode(EncodedString,NULL,NULL,NULL,NULL);

Name: Anonymous 2010-06-28 5:13

>>79
it's trivial to determine if a particular byte sequence is not valid ascii (any bytes with 8th bit set), utf8 (any bytes with 8th bit set that aren't part of a valid utf8 multibyte character), or shift_jis (any bytes other than 00-7F,A1-DF that aren't part of a valid shift_jis double-byte character). no magic necessary.

Name: Anonymous 2010-06-28 5:22

>>71,73,78,81
Morons like this are precisely why Unicode support is so shitty in most high-level languages: on the one hand there's the assumption that character encodings are easy to get right, and on the other there's the belief that you can reasonably hide character encoding details from the programmer.
Both are obvious bullshit, and history has borne this out again and again.

Name: Anonymous 2010-06-28 7:38

>>82
history has Bjarne this out again and again
Fixed that for you.

Name: Anonymous 2010-06-28 8:29

What I'd like a programming language to do is allow me to convert a string between encodings I tell it to, that is: no more "unrecognized literal at position ..." in Python, where I can't even DO anything with the string.

Name: Anonymous 2010-06-28 10:45

>>84
Configure Python correctly, then.

Alternately, Ruby has no proper notion of Unicode, the best thing it has is converting from one encoding to another. So if you want to keep your head in the sand go use that.

Name: Anonymous 2010-06-28 11:48

>>84
Most people who complain about Unicode support in language X (where X is anything besides PHP) just haven't read the documentation. You're no exception.

Name: Anonymous 2010-06-28 12:09

>>86
:<
No, I do read the documentation when needed. I think that I was able to fix most of my problems by using codecs with the errors='replace' option, or by wrapping a stream in a stream decoder.
Still, it's really annoying when you just want to write a simple script for something (though I should learn Perl for that), or when (as >>69 said) the problem is caused by a library you're using.
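
The two workarounds mentioned here look roughly like this in Python 3. The sample bytes are invented, and codecs.getreader is one way to get a stream decoder, not necessarily the one the poster used:

```python
import codecs
import io

data = b"caf\xe9 ok"  # Latin-1 bytes that are not valid UTF-8

# One-shot decode: the bad byte becomes U+FFFD instead of raising.
text = data.decode("utf-8", errors="replace")

# Same error policy applied to a byte stream via a StreamReader wrapper.
reader = codecs.getreader("utf-8")(io.BytesIO(data), errors="replace")
streamed = reader.read()

print(repr(text), repr(streamed))
```

Both approaches lose the original byte, so they suit logging or display more than round-tripping data.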

Name: Anonymous 2010-06-28 12:11

Oh, also: Python's glob has the clever behaviour of changing the return type based on the input type, that is returning unicode strings when you do glob(u'*'), and ASCII strings (which are incorrect) when you do glob('*').

Name: Anonymous 2010-06-28 12:47

>>88
Except file paths that cannot be decoded to unicode are still returned as byte strings.
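
The same "output type follows input type" behaviour survives in Python 3, only with str and bytes in place of unicode and str. A quick check in a temporary directory (the file name is arbitrary):

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one empty file to match against.
    open(os.path.join(tmp, "a.txt"), "w").close()

    as_text = glob.glob(os.path.join(tmp, "*"))                 # str pattern
    as_bytes = glob.glob(os.path.join(os.fsencode(tmp), b"*"))  # bytes pattern

print(type(as_text[0]).__name__, type(as_bytes[0]).__name__)
```

A str pattern yields str paths and a bytes pattern yields bytes paths, so the caller picks the representation by choosing the pattern type.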

Name: Anonymous 2010-06-28 13:10

>>87
And many of those libraries come standard.

>>> urllib.quote(u'\N{snowman}')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/urllib.py", line 1222, in quote
    res = map(safe_map.__getitem__, s)
KeyError: u'\u2603'

Granted 3.x fixed a lot of the Unicode idiocy, but at the expense of making broken filenames completely invisible and inaccessible, and I'm not sure that was the best tradeoff.
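
For comparison, the 3.x counterpart of the call above lives in urllib.parse and encodes the string as UTF-8 before percent-escaping it. A minimal check:

```python
from urllib.parse import quote, unquote

snowman = "\N{SNOWMAN}"      # U+2603
quoted = quote(snowman)      # encoded as UTF-8 by default, then percent-escaped
roundtrip = unquote(quoted)  # decodes the escapes back to the original string

print(quoted)                # %E2%98%83
```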

Name: Anonymous 2010-06-28 15:27

>>90
Weren't they going to have a dual bytestring and unicode interface? And then they were going to add some dangerously magical auto-quoting to the unicode interface as well.

Name: Anonymous 2010-06-28 15:48

>>91
There's a bytes type, which is actually quite useful and sensible -- individual elements are numeric, so b'ABCDE'[1] == 66. Works a lot like char * in C, actually.

I'm not sure what sort of auto-quoting you're referring to.
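
The bytes indexing described above is easy to verify in a Python 3 interpreter:

```python
data = b"ABCDE"

element = data[1]       # indexing yields an int
piece = data[1:2]       # slicing keeps the bytes type
rebuilt = bytes([66])   # and an int list converts back to bytes

print(element, piece, rebuilt)   # 66 b'B' b'B'
```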
