>Plan 9
Wrapping all system calls in a network protocol is considered not harmful at all.
Name:
Anonymous2013-02-25 7:40
dey like go cuz rob pike made it
Name:
Anonymous2013-02-25 8:25
plan 9 is just rob pike's "muh cloud" pipedream where you don't have any local storage, it's all managed by an external entity that you can TOTALLY trust.
>>1
LEEEEEEEEEEEEEEL
>LE 2013
>NOT LE USING LE /G/
XDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
LEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEL
>>6
If you're going to play silly quoting games and make posts that are filled with pointless yammering and ad hominem, you ought to jump back into The Herd of Retards on /g/. You don't belong here.
Name:
Anonymous2013-02-26 2:01
>>9,11
Sure showed him guys! That negative attention is definitely going to have him running back to /g/ with no consequences here! He definitely won't do it again to spite you. And it certainly isn't a theme troll like LEL-cunt, so we are in the clear.
Thanks for looking out for /prog/ guys. Go ahead and pat yourselves on the back.
>>1
Have to agree that autoconf/automake is a nightmare. Stallman invented it to torture goyim.
Name:
Anonymous2013-02-26 4:48
>>18
Every time I read an article about a mathematical concept named after someone, I just can't help reading about the author themself as well, and if they're dead I get all sad and depressed about how death indiscriminately took away yet another brilliant mind, and how the work they left behind serves as a reminder of one's mortality; and if they're not dead, they're usually really old, and it makes me sad that soon they too will be lost to the nothingness and there's nothing anyone can do about it.
And it makes me realize that the same will happen to me: no matter how many papers I write and how many discoveries I make, I will slowly degrade and eventually cease existing. I look at other people and wonder how they deal with the futility of their own lives, how they deal with their knowingly limited existence. Maybe they simply grew accustomed to the idea; they became desensitized to the plight of being human, of being mortal. Or maybe they simply never thought about it, never considered more than a few months or years into the future. Maybe they believe god will swoop them away from the forever hungry claws of oblivion and ensure their continued existence, be it heaven, hell, or anything in between; perhaps this is some sort of denial, perhaps it is one of those delusions one creates and revels in to avoid dealing with the excruciatingly painful truth. I wish I could be so careless; I wish I could believe there is a saviour.
Alas, I am doomed to lead a bounded life, to do a finite number of things, too few things, far fewer than I would have wanted, and to cease existing before the year 2100.
And no I'm not the panic attack guy. He asked me to fill in for him today.
>>13
lel i hate reddit but i fuckin/g/ love E/G/IN MEMES XDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD LE STAY MAD /G/ENTOO/G/ROS XDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
Name:
Anonymous2013-02-26 17:30
go back to /r/programming and stay there
Name:
Anonymous2013-02-26 18:04
>>31
LLLLLLLLLLLLLLLLLLEEEEEEEEEEEEEELLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
>U AR LE RETARDED XDDDDD
>EGIN
>>35
Nobody takes cat-v.org seriously there either. Perhaps you meant cat-v.org.
Name:
Anonymous2013-02-26 19:34
I agree with >>2,4 on the security side of things; a flag for disabling and managing this would be an awesome addition, kind of like pf but for system calls, so that you can have client servers and masters.
But I really want to crack at all the things on that site and give reasons why or why not. Some of them I disagree with, but the rest are OK. You /prog/s are welcome to contribute, and see the reasoning why or why not:
Harmful | Less harmful alt | Reason/Objection
SGML,XML,YAML | JSON,CSV,ndb(6),UTF-8 txt | duh, all markup is already in UTF-8
NFS,SMB,AFS,WebDAV | 9P | um, I know NFS & LDAP were bad hacks, but a distributed FS, 9P? Why not BigTable for data, and SSH in a jail for flat files?
C++,Java,Vala,D,Python,Ruby | C,Go,Limbo | No explanation needed here (NENH). The point is made obvious if OOP _must_ be used.
pthreads (POSIX threads) | CSP-style concurrency [e.g.] | Agreed --examples
Perl,Ruby | rc, awk | HAHAHA, I love the joke too. But so true, these are better alternatives.
PCRE | structural regexps (SREs) | Fine, as long as there is a master standard for regexps.
Bash,tcsh,zsh | rc, pdksh, ash/dash | add mksh & awk to the mix. scsh when ready ☺
GNU Coreutils | Plan 9 from User Space | well, anything that does the basic *nix. No one wants to reinvent the wheel.
GNU Screen | tmux | NENH. But Nicholas, why did you make the default bindings so weird? Leave a ^A default too.
GNU info | man pages | 'nuff said.
GCC | 8c,tcc | I applaud their effort, but the examples are not complete. LLVM is already ahead, and it is not finished.
glibc | uClibc,dietlibc | add musl
GNU auto{conf,make} | mk / portable makefiles | NENH. See line 2.
GLib | libc (↑), p9p C libs | ↸
GTK,Qt,wxWindows | Tk, textual interfaces | GTK+ for the GUI-needy. ncurses or termbox for the rest.
Vim,Emacs,etc. | Acme, Sam, ed, ex-vi | sed or m4 for wilder edits ex cannot do.
UTF-*, other encodings | UTF-8 | Obvious, but I never liked UTF-8; I always thought we only needed ASCII, and code page 437 when it came out.
iSCSI,FCoE | AoE | duh. And SATA3 for more bandwidth.
PAM | factotum | Not my area, but I know PAM is a bad implementation.
Jabber & XMPP | IRC,STOMP | ALL OF MY YES. pf for protection.
IMAP | SMAP | I need to try SMAP. But I know IMAP has too many stupid things.
SQL databases | Tutorial D,pq,BigTable,*nix fs | If we are talking about syntax, HELL YEAH. If we are talking about ACIDity, WTF?! For ACID any SQL is fine, pgSQL being best, Berkeley DB second, SQLite a contender. I can also tolerate good structs.
svn | Git,hg,CVS,.tar | Add Fossil to the list, and any DB from the top.
FreeBSD,NetBSD,Solaris | OpenBSD | Um, and no Inferno or Plan 9 advertising? FreeBSD is needed for the Enterprise things. OpenBSD as a great personal OS, certainly; maybe for server applications. NetBSD for the other devices *nix hasn't been ported to. This is one of the necessary evils.
Apache,lighttpd | thttpd, OpenBSD apache, nginx, noHTTP | I can only agree more. gopher is mainly what 70% of people need.
SVG | PS (PostScript) | Certainly. And to expand: pic, ideal, grn (old), & grap are better alternatives.
PDF | PS (PostScript), DjVu | ↖ ditto. troff++
EPUB | DjVu | ↵
ALSA | OSS4 | yep. And MIDI for the music format.
GPL,LGPL,Apache,MPL,CC | ISC,MIT/X,[Free]BSD,CC0,public domain | certainly, they make the most logical and technical sense. --public domain
head | sed 11q | nice hack. I use most, and cat. This almost goes with line 17.
Name:
Anonymous2013-02-26 20:41
If OpenBSD is so great, how come it can't do wifi?
I don't have an opinion on the miscellaneous unix apps Uriel listed, but the bias is obvious.
On the programming language section: did he even try D, or did he lump it in with C++ because it uses just as many semicolons? There's a complete lack of Lisp, or at least Scheme, which is far simpler than anything else in this list. He omits Erlang as a programming language, and then lists it last (Go first, of course) among the less harmful threading models, so as not to look like he's hyping Go hard when goroutines are basically the same thing, along with the typed Limbo channels. Also on that note, I despise Go and its insistence on wasting my time with unused variable errors.
Limbo is just Oberon with more curlies and Go's obvious previous attempt; nobody uses Limbo at all, every implementation of Scheme sees more active use than Limbo. Might as well advocate an obscure Wirth language which is the exact same thing down to the variable declaration syntax, the same Wirth whose language the C guys bashed in a 1981 paper. C++, Java, Python and Ruby are shitty for various reasons, but I haven't tried Vala so I don't have an opinion on that one (someone here used to mention it occasionally).
>Harmful: SQL databases. Less harmful alternatives: [...], BigTable, plain old hierarchical filesystems.
Good grief.
>Harmful: IMAP. Less harmful alternatives: SMAP (Simple Mail Access Protocol).
Nobody uses SMAP. Considering SMTP already exists for the simple case, and the scope IMAP was trying to cover, IMAP is just Lisp over the wire; there's even a mention of gensym in the RFC. Or is it because Crispin made fun of le UNIX philosophy? http://web.archive.org/web/20060110153507/http://panda.com/tops-20/
>>37 I never liked UTF-8, I always thought we only needed ASCII
Then how are Russian Israelis supposed to intermix Hebrew and Cyrillic symbols in their code, you schlemazl?
Name:
Anonymous2013-02-27 5:57
So, I have two questions.
Is Plan9 an elaborate joke? I mean, I went to look at http://swtch.com/plan9port/man/man1/cat.html, for reasons that will become obvious later. It doesn't support any interesting flags (including "-v"), but all right, simplicity and shit.
But: "Read always executes a single write for each line of input, which can be helpful when preparing input to programs that expect line-at-a-time data." -- what the fuck? Did they break the fundamental pipe abstraction, so it's now kind of like streams but actually sometimes sort of messages?
Second, do I understand correctly that "cat-v" stands for "cat -v" and means that they intend to make visible the ugly stuff? Then, why doesn't Plan9 support "-v" switch, and what's the opposite of "-v" switch, the reverse operation?
>>46
Blocking until you see a full line of input doesn't "break the fundamental pipe abstraction". It's actually a great example of how the abstraction works - read tokenizes input so the next process in the pipeline doesn't have to.
The -v switch is what's ``considered harmful'' in the title. cat concatenates files, so having a flag like -v that turns it into a filter makes no sense.
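If you want it spelled out, here's roughly what the behaviour amounts to; a minimal sketch in C, not the actual Plan 9 source (error handling trimmed):

  #include <stdlib.h>
  #include <unistd.h>

  /* sketch of a read(1)-style filter: gather one line from stdin,
     emit it with a single write(2) */
  int main(void) {
      char *buf = NULL;
      size_t cap = 0, len = 0;
      char c;
      ssize_t n;
      while ((n = read(0, &c, 1)) == 1) {      /* one byte at a time */
          if (len == cap) {
              cap = cap ? 2 * cap : 8192;      /* grow to fit any line */
              buf = realloc(buf, cap);
              if (buf == NULL)
                  return 1;
          }
          buf[len++] = c;
          if (c == '\n')
              break;                           /* stop exactly at end of line */
      }
      if (len > 0)
          write(1, buf, len);                  /* the single write */
      free(buf);
      return n < 0;
  }

Since it reads one byte at a time it never consumes past the newline, so the rest of the input is left for the next consumer.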
>>46
Pretty sure UNIX has handled text line-by-line for decades. Run cat, type ``hello'', hit Return, it prints ``hello''.
The original ``cat -v'' paper (hosted on the site at http://harmful.cat-v.org/cat-v/unix_prog_design.pdf) recommends writing a separate program, vis, whose job it is to print non-visible characters. This way it can be used in conjunction with any program, not just cat.
Name:
Anonymous2013-02-27 14:47
>>49
This PDF appears to be malformed. Page 3: (The file copy program found on operating systems like or is an example.)
Name:
Anonymous2013-02-27 14:59
>>46
How is the -v flag for cat ``interesting''? I'll never understand.
>>40,42
Totally agree. The only minus I have with POP3 is that you have to fetch ALL the email. And you are right that SMAP is not used; it is still in beta and needs testing. I'd prefer a STOMP-based protocol for implementing a simpler IMAP.
>>45
I am not against UTF-8, for exactly those reasons. The thing that I dislike in the UTF-8 Character set is things like: ⁇﹖⁈⁉‽‼❕❗❢❣ꜝꜞꜟ﹗!ᵃ ᵇ ᶜ ᵈ ᵉ ᶠ ᵍ ʰ ⁱ ʲ ᵏ ˡ ᵐ ⁿ ᵒ ᵖ ʳ ˢ ᵗ ᵘ ᵛ ʷ ˣ ʸ ᶻ⁰ ¹ ² ³ ⁴ ⁵ ⁶ ⁷ ⁸ ⁹ ⁺ ⁻ ⁼ ⁽ ⁾ ₀ ₁ ₂ ₃ ₄ ₅ ₆ ₇ ₈ ₉ ₊ ₋ ₌ ₍ ₎
Basically unnecessary things, esp. things that can be represented using simple circumventions and discourse. What I dislike most are the things that are repeated. Why have ☺ when ^.^ is just enough? I just saved 2 bytes, whoopee!
>>46
I think it is more of a satirical piece. But there are some truths in there. I tried cracking only the page on this discussion.
>>47
Hahaha, so right. I would have used gofish just to make the readers cry to make it more ironic.
>>49
Or sed for that matter. Their hack made my day.
>>52
Um, character set does not determine how connection control is handled. HTTP, POP3, IMAP, NNTP, etc. have only ever used US-ASCII bytes for their connection control, since you do not need that many bytes to represent flow. The only minus is markup in the flow. HTTP is the most cluttered protocol of them all, using paragraph-long specifications for a simple GET and POST. I'd rather support Waka than live with the cluttered specification of HTTP/1.1. Gopher does it right, in that you only need a byte to represent something. One byte, nothing more.
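e.g. a gopher menu item is exactly one type byte followed by tab-separated fields (per RFC 1436; the hostname here is made up):

  0About this server	/about.txt	gopher.example.org	70

The leading 0 means ``plain text file'', 1 would mean ``menu''; that single byte is the entire markup.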
Also Latin-1 is well within UTF-8, so much it is backwards compatible!
>>53 The thing that I dislike in the UTF-8 Character set is things like: [...]
Blame Unicode and the Unicode Consortium. They try too hard to make their character set a complete typesetting and graphics package. I guess they ran out of useful glyphs a while ago.
UTF-8 itself is fine. It's a good encoding.
Name:
Anonymous2013-02-27 17:39
>>53
Uh, so you leave accented letters, Cyrillic, JEWS' alphabet, Nikita's alphabet, hanzi, hiragana/katakana/kanji and hangul out just because Unicode went too far?
>>54,55
Whoa, you /prog/s misread. UTF-8 is absolutely fine; it is a great encoding. I am barking about some characters that are stupid, not the encoding. Of course I welcome all alphabets and glyphs, esp. the Chinese ones that have not been added yet since there are too many. There is even lots of room for more in case we make more glyphs, and we can change them at any time. I especially like the Klingon Unicode character set.
Name:
Anonymous2013-02-27 17:48
>>56
Then call it the Unicode character set. UTF-8 is an encoding, not a character set.
>>53 Gopher does it right, in that you only need a byte to represent something. One Byte, nothing more.
You can't seriously be this full of shit. Mail protocols come with de facto and actual-RFC standards to turn base64 back into useful information, at the cost of 33% space overhead. Gopher has no such thing; servers serve up whatever, clients always treat it as Latin-1 and try to display it as such.
>Also Latin-1 is well within UTF-8, so much it is backwards compatible!
Plain ASCII is literally the only overlap between UTF-8 and Latin-1.
Actual gopher users (as opposed to hipsters who pretend to like gopher because it's ``obscure'', but don't actually use it) recognise this is a problem, and one Gopher+ doesn't address. Exactly one server implementation and exactly one client can use cap files to specify encodings, but that still leaves output mangled on every other client, and requires (in principle) an additional connection per request.
UTF-8-by-default would go a long way toward fixing gopher, but explicit control semantics (yes, like HTTP's headers) would be even better.
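Concretely, the kind of thing gopher has no way to say (a bog-standard HTTP response header, nothing exotic):

  HTTP/1.1 200 OK
  Content-Type: text/plain; charset=UTF-8

One header line and the client knows exactly how to decode the body; a gopher client just has to guess.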
>>58
Gopher, like HTTP, supports MIME and uses it to support other data types. The client only needs to request the file in gopher and read the MIME type to know how to interpret the file. HTTP does the same, with more markup and verbosity than needed in its headers. MIME is fine on its own.
You are blaming client programs for their stupidity, not the gopher protocol. Where have you seen UTF-8 characters/bytes in HTTP/1.1? I sure would like to see some. I do not see them in the standard: http://tools.ietf.org/html/rfc2616
>>59
Your inane MIME drivel misses the point. Gopher, as a protocol, specifies Latin-1 for text encoding, and is designed to transfer text across the Internet. This is a deficiency in the protocol, and the fact that you could hypothetically pile on other protocols is the same sort of harmful bullshit that got us HTML email and XML configuration files.
This page should be ignored. The authors of the page clearly don't know what they are talking about. There are a few good suggestions in there, like using JSON instead of XML. However, the good is vastly outweighed by the bad.
Tk instead of Qt?
8c instead of GCC?
Sam instead of Vim?
It's contrarian for the sake of being contrarian. That in itself makes the page more harmful than many of the apparently harmful things listed.
>>62
Certainly, the more I look at its pages. At least suckless.org has its content right.
But Qt? GTK+ is much cleaner and more comprehensible. Even then, I just use ncurses and libcaca if GUIs are needed.
>>61
I wish I could believe you're just trolling, but there are a lot of genuine morons on the Internet and you're definitely behaving like one of them.
>I read it. That is MIME.
That is MIME built into the standard, dipshit.
>Gopher uses it too, but we call it BinHex and UUEncode:
Bullshit. That doesn't even slightly resemble the flexibility of MIME.
Suggesting you use it to try to hide gopher's broken approach to character encoding means not using 0 or 1 files, and (more importantly) never using links, because 4 and 6 aren't parsed as gophermaps. So, you know, the things gopher was actually designed for.
>Only a suggestion is made to use Latin-1, for those stupid client programs.
RFC 1436 isn't one of those ones that painstakingly defines ``should'' and ``may'' and ``must''. Here, ``should'' is ``must''.
Those stupid client programs are literally all of them.
>All I am saying is: use the BEST tool for a job.
And I'm saying that that tool is never gopher, because gopher is broken.
>>66
Yeah, if needed. I deal with data, using PostgreSQL and Berkeley_DB, so I do not have to make many GUIs. If I do, for businesses, I use ncurses and I am done. You do not need much to run a business with.
>>67
I see, so we both agree that UTF-8 encoding and bytes are not used in HTTP itself, only in MIME's specification of Content-Type:text/plain;charset=UTF-8;encoding=B;encoded-text=gyYjOTU2O2sgJiM1NDE7ZQ==;
We know MIME is integrated into multiple protocols, but why all the other headers? And what is wrong with placing MIME on a UUEncoded, labeled file? Should not a protocol be ignorant of the datatype and encoding of a file, and allow the client to process the file through description headers? And we are not talking about protocols whose exact purpose is to know what the data transfer and encoding should be, like RTP, libpq, SHOUTcast, SSH, etc.
What kind of job do you do that may not use gopher? I seldom need it myself, but compared to what most web sites use and need, gopher is enough. A real broken protocol is something like SVN or XMPP. Even FTP, if we want to talk about how vulnerable it is.
>>69
You're even dumber than Cudder. Go play in traffic.
Name:
Anonymous2013-02-28 7:20
>>48,49 Blocking until you see a full line of input doesn't "break the fundamental pipe abstraction". It's actually a great example of how the abstraction works - read tokenizes input so the next process in the pipeline doesn't have to.
It's not that it blocks until it sees a full line, it's that it guarantees that it will output the full line with a single `write` call, presumably so that it will be returned by a single `read` call to the next program in the pipeline. Which means that those programs rely on a broken stream abstraction: suddenly fragmentation is no longer an implementation detail but an essential side-channel, the stream protocol is transformed into a datagram protocol, in a completely ad-hoc, unreliable way.
What's even more ridiculous, this can't possibly work correctly on UNIX because there's no `read` function variant that allocates the buffer for you, as large as necessary. So a downstream program that relies on its `read`s returning entire lines every time passes a fixed-size buffer, and guess what happens when the line is longer than that buffer: either the program just breaks, or it reallocates the buffer and tokenizes the input again -- in other words, it duplicates the work that the read(1) program did, making it unnecessary.
Obviously most real programs will do the former, because lines longer than 8k don't real. This bullshit is one of the most perfect examples of the smelly unix hacker culture I've ever seen.
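To make it concrete, here's the check every fixed-buffer filter has to carry anyway, guarantee or no guarantee (a sketch, plain POSIX, error handling omitted):

  #include <unistd.h>

  /* a filter that does NOT assume one read(2) == one line */
  int main(void) {
      char buf[8192];
      ssize_t n;
      int midline = 0;
      while ((n = read(0, buf, sizeof buf)) > 0) {
          /* the check the one-write-per-line guarantee was supposed
             to spare you: did this chunk stop on a line boundary? */
          midline = (buf[n - 1] != '\n');
          write(1, buf, n);        /* pass the fragment along regardless */
      }
      return n < 0 || midline;     /* nonzero on error or mid-line EOF */
  }

Any program that skips that check is betting that lines longer than its buffer don't real.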
Name:
Anonymous2013-02-28 7:48
>>49 Pretty sure UNIX has handled text line-by-line for decades. Run cat, type ``hello'', hit Return, it prints ``hello''.
NO!
That's the terminal's line buffering. Read gets absolutely nothing until you press enter. If you pipe shit into cat instead of having it bind to line-buffered stdin, read gets input in blocks, which are independent of where the '\n' is.
Making a general I/O system call internally line-buffered is idiotic.
>>75 It's not that it blocks until it sees a full line, it's that it guarantees that it will output the full line with a single `write` call, presumably so that it will be returned by a single `read` call to the next program in the pipeline. Which means that those programs rely on a broken stream abstraction: suddenly fragmentation is no longer an implementation detail but an essential side-channel, the stream protocol is transformed into a datagram protocol, in a completely ad-hoc, unreliable way.
The fact that read(1) makes this guarantee does not mean that downstream filters are compelled to rely on it. It is still perfectly sensible to continue writing programs that can handle a partial read(3), for compatibility with older Unix. However, if you need to work with a broken program, you at least have the option.
I also don't accept that relying on one-write-per-read behavior is necessarily unreliable. If the reader and the writer agree on the format of the data to be exchanged, they can easily structure it in a way that allows its correctness to be verified by the reader. There's no guarantee that a bad writer won't produce correct-looking input by coincidence, but there's no way to avoid that in all cases anyway.
>>75
Also, if you need to use a program that expects to read(3) only once per line to filter output from a program that fragments its writes, the solution is simple: just pipe the input through read(1) first.
>>88
The solution is to rewrite the broken program. Many of the standard *nix utilities don't even need to hold entire lines, like cut. (How to implement this is left as an exercise for the reader. Anoncoreutils might have some inspiration.)
Name:
Anonymous2013-03-01 7:32
>>89 How to implement this is left as an exercise for the reader
triez man. like da kool kidz do.
Name:
202013-03-01 7:36
oh neffermind. looks like we have a counter kid on our hands
Name:
Anonymous2013-03-01 7:38
>>91
LOL TWENTY? Looks like we have a caterpillar on our hands.
Name:
Anonymous2013-03-01 7:52
>>87,88
That read(1) writes entire lines is a symptom; I argue that the OS itself must be retarded to guarantee that reads are matched to writes.
This breaks an abstraction of a stream protocol, converting it to a somewhat message-oriented protocol.
This is bad in practice because now if you want to transparently pipe stuff through a real stream protocol, like tcp/ip, suddenly the OS has to implement an actual message-oriented protocol on top of that, on the off-chance that some moron depends on the guaranteed property.
What's worse, the guaranteed property is useless since read(2) can't allocate the buffer itself, so it is impossible to utilize it reliably in a general-purpose utility, so it's going to be used by morons writing unreliable programs only, indeed. And! And it encourages morons to write broken programs!
And if you're writing a special purpose program, you can use a real message-oriented format, maybe? And a library function like strtok for tokenizing your shit, instead of an external utility? Because you can't have external utilities all the way down, at some point you have to use a library function for parsing the output of an external utility?
Any guarantee is a liability. Any additional programs exposing and complementing the guarantee increase the burden. A guarantee completely unrelated to the abstraction at hand is bad. Making a guarantee that is not only mostly useless, but also impossible to use correctly, is insane.
I'm saying that with a fucking baobab like that in their collective eye, Plan9 developers have no business Considering stuff Harmful.
PS: don't put the empty line after a quote, this is not reddit.
>>93 I argue that the OS itself must be retarded to guarantee that reads are matched to writes.
The OS isn't doing that. read(1), the user program, does that; other programs can do as they like.
>This is bad in practice because now if you want to transparently pipe stuff through a real stream protocol, like tcp/ip, suddenly the OS has to implement an actual message-oriented protocol on top of that, on the off-chance that some moron depends on the guaranteed property.
Again, you don't have to do what read(1) does. It would be insane to suggest otherwise.
>What's worse, the guaranteed property is useless since read(2) can't allocate the buffer itself, so it is impossible to utilize it reliably in a general-purpose utility, so it's going to be used by morons writing unreliable programs only, indeed. And! And it encourages morons to write broken programs!
Caller-allocates-memory is standard practice for C programs, and for a language that operates at a systems level it's the only sane thing to do. The read routine should not be responsible for allocating memory when the caller knows best where the read buffer ought to be.
Also, saying that it's impossible to use read properly is just total bullshit. Allocate a fixed size buffer, call read with the size of the buffer, check the return value. That can't overflow, ever.
Name:
Anonymous2013-03-01 18:35
>I argue that the OS itself must be retarded to guarantee that reads are matched to writes.
>The OS isn't doing that. read(1), the user program, does that; other programs can do as they like.
ur tarded. You can't implement read(1) guaranteeing atomic `read` on an OS that doesn't guarantee atomicness of reads and writes. The fact that they flaunt the tarded decision by implementing userspace programs that reinforce the guarantee means that they're all retarded.
>Again, you don't have to do what read(1) does. It would be insane to suggest otherwise.
Not me, nobody has to do that, nobody wants to do that, and yet they guarantee that.
>Caller-allocates-memory is standard practice for C programs, and for a language that operates at a systems level it's the only sane thing to do. The read routine should not be responsible for allocating memory when the caller knows best where the read buffer ought to be.
Why do you explain the obvious? Do you believe that if you explain the obvious, no, wait a second
>Also, saying that it's impossible to use read properly is just total bullshit. Allocate a fixed size buffer, call read with the size of the buffer, check the return value. That can't overflow, ever.
Ah! You are retarded in truth! Like, I'm not trying to offend you, but I realize that you're actually retarded and unilaterally terminate the discussion.
I like your custom cowsay, >>101-san. It is quite kawaii[1] if I do say so myself.
____________________________________
[1] Translator's note: ``kawaii'' means ``cute''.
>>105
C# is a better Java, and Scala is better than both.
Name:
Anonymous2013-03-02 9:41
>>97
I understand that you might have gotten confused by the documentation, but you make too many assumptions and criticize it based on them. You suggest that the single write(2) call that read(1) performs after it has found the line break will be matched by a single read(2) by the next process in the pipeline, but this is not the case. It says that it's helpful, and that's all it is, and I'll tell you why.
The process that receives what read(1) wrote can have a buffer of any size and can call fgets(3) or read(2) directly to get as much as possible of the line at a time. If the buffer fills, then read(2) more. What the receiver doesn't have to do is to look for the line break. This does not mean that the whole line was received in one read(2) call.
When read(2) returns a non-negative number less than the size of the buffer, the receiver knows that it has gotten one line. Does this mean that the receiver has to assume that its input came from read(1)? No, what the line-at-a-time program will do is parse the buffer again on its end and look for a line break (with the help of fgets(3), for instance). The parsing starts when read(2) returns, and wouldn't you know it, the input looks exactly like you want it to; there's not even anything past the line break, and nothing in the buffer has to be moved for the next read(2). Of course, if the line is too long and the buffer is too small, read(2) will return many times and the buffer has to be resized.
The processes that receive their input from read(1) will also get the input as soon as it's available, which is both helpful and useful for interactive shell scripts.
>>107 >>97's point is that requiring write(2) to always complete on the first attempt places an undue burden on the OS. If the output device or file cannot accept writes over a certain size, the OS is now obligated to buffer the whole input and fragment it itself, rather than just accepting as much as will fit and relying on the caller to handle the rest.
The argument is that this added complexity is pointless as in the general case the reader will end up fragmenting the data anyway because it can't guarantee all of it will fit in the buffer it allocates for read(2).
>>113
jesus christ... did you actually read anything more than the last 5 posts? did you SERIOUSLY read the thread? or are you just advertising your own post?
>>112
In the write(2) syscall, keeping track of how much has been written is very little added complexity for much gain. Also, when you write(2) to a pipe, it doesn't go directly to the read(2)er's buffer, no matter how big or small it is, but to the pipe buffer (you can start writing before there's a reader). You only write to a file descriptor, not to a receiving read(2) call.
On UNIX, when you write(2) to a file descriptor (say, a socket) that isn't ready to receive the whole buffer at once, the call might return early with a partial write success. What will the caller do now, when only some data has been written? Call write again and again until everything has been written, or write(2) returns a negative value. Is there any other valid action to take than to continue writing? You can never know (or should never have to care) what's small enough for the file descriptor you're writing to so that write(2) is guaranteed to succeed.
It's no problem for an operating system to perform this loop itself. It knows everything about the file descriptor, things that user space programs shouldn't have to care about. It knows when it's possible to write again and how it's most efficiently done. When a socket or pipe buffer is full, the thread can wait until it is writable again, and yield to the dispatcher in the meantime. When it's possible to write again, resume the write by moving a new range of bytes from user memory to the IO device's buffer page, and repeat until everything's written and return to user space, or if there's an IO error return early. Why is it better to have every program include a write(2) loop, rather than having a better write(2) syscall?
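(For reference, the loop in question, the one every careful userspace program otherwise carries around; a sketch:)

  #include <errno.h>
  #include <unistd.h>

  /* keep calling write(2) until everything is out or a real error occurs */
  ssize_t write_all(int fd, const char *p, size_t n) {
      size_t left = n;
      while (left > 0) {
          ssize_t w = write(fd, p, left);
          if (w < 0) {
              if (errno == EINTR)
                  continue;           /* interrupted by a signal: retry */
              return -1;              /* real error, errno says why */
          }
          p += w;                     /* partial write: advance and go again */
          left -= (size_t)w;
      }
      return (ssize_t)n;
  }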
Plan9 does writing this way. The manual states that if write returns anything less than what was intended, it should be considered an error (http://plan9.bell-labs.com/magic/man2html/2/read). Now, say you run read(1) from plan9port on linux. There's a single write(2), which might not write everything the first time, and the receiver will only get as much as fit in the pipe. read(1) has a bug when run on this system, although it won't appear very often.
>>107,112,117
No, wait, you're discussing the wrong things.
* I don't have a problem with write blocking until all data was written, in fact I think that's the Right behaviour.
* After some consideration, I think that it should be guaranteed that `read` will return early if there is insufficient data in buffer, because it is necessary for interactive programs to work and is a more or less natural, expected behaviour.
* It must not be guaranteed that read will never return less data than the corresponding write has written; in other words the OS should be allowed to introduce additional fragmentation.
Not giving such guarantee is useful because most other stream protocols do not give it either, so not allowing OS to introduce additional fragmentation for pipes means that instead of sending data directly to the tcp/ip driver it should wrap every `writ`ten buffer in a message with length and shit, send that, collect the entire message in a dynamically reallocated buffer on the other side, and only then give it to the program waiting on read.
Giving such guarantee is useless because you have to check if your buffer ends with newline or whatever other delimiter you expect your source to use when read returned sizeof(buffer) anyway, so just check buffer[len-1] always.
Giving such guarantee is harmful because it encourages programmers to assume that lines longer than 8192 or whatever bytes don't real instead of doing the aforementioned check.
* It probably should not be guaranteed that read will never return more data than a single write has written; in other words the OS should be allowed to remove fragmentation when it has enough data available.
This is useful because tcp/ip does this (for performance reasons, which are important to us as well), as far as I know. Not forcing OS to do retarded things to stream pipes over tcp/ip is good.
Having such guarantee is mostly useless, because why don't you play safe and properly search the data for the delimiter you want, instead of having an unwritten requirement that something earlier in the pipe should split output using that delimiter? The only thing that you can't do otherwise is, well, consume a single line from input, but idk, that actually sounds harmful. That's not how piping should work, in my opinion. You want to send parts of the stream to a different program, use xargs or something; that shit is too rarely used and too magical to be allowed to contaminate the core abstraction.
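(You can watch an ordinary Unix pipe remove fragmentation exactly like that; a quick demo, behaviour as observed on Linux, error checks skipped:)

  #include <stdio.h>
  #include <unistd.h>

  int main(void) {
      int fd[2];
      char buf[64];
      ssize_t n;
      pipe(fd);
      write(fd[1], "foo\n", 4);            /* two separate writes... */
      write(fd[1], "bar\n", 4);
      n = read(fd[0], buf, sizeof buf);    /* ...collected by one read */
      printf("one read(2) returned %zd bytes\n", n);   /* 8 here, not 4 */
      return 0;
  }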
Name:
Anonymous2013-03-03 10:45
I LIKE MY OPERATING SYSTEMS LIKE I LIKE MY WOMEN
IF THEY DON'T FINISH I JUST JAM IT IN AGAIN
>* I don't have a problem with write blocking until all data was written, in fact I think that's the Right behaviour.
Okay, good. This is what plan9 does specifically, and what probably all serious UNIX operating systems normally do, even though POSIX allows for early return.
>* After some consideration, I think that it should be guaranteed that `read` will return early if there is insufficient data in buffer, because it is necessary for interactive programs to work and is a more or less natural, expected behaviour.
This is what read(2) has always done. It fills the buffer with bytes read from the file descriptor, and if it's less than the size of the buffer, either you've reached end-of-file, or you're reading from a pipe, terminal or socket. However, read(2) might block until there's at least something to read, which might happen when the writing process block-buffers its output. If you want to check whether reading will block, poll the file or perform an explicit nonblocking read (which might return nothing).
>* It must not be guaranteed that read will never return less data than the corresponding write has written; in other words the OS should be allowed to introduce additional fragmentation.
There is no corresponding write to a read. You don't write(2) into a read(2), you write to file descriptors and read from file descriptors. Once written to a file descriptor there's no information whether it was written in one call to write(2) or ten.
Nobody does what you are criticizing, but if they did, you'd be correct in that it'd be harmful. But this is not what the "one write" that read(1) promises is about. read(1) will write once, one line, which will be read in full and parsed again for newline on the receiving end. Whether this is read with one call to read(2) or ten doesn't matter. If you run read(1) with the option of reading two lines, it might write both before any single line has even been read.
>* It probably should not be guaranteed that read will never return more data than a single write has written; in other words the OS should be allowed to remove fragmentation when it has enough data available.
Yes, of course, and nobody does that. Again, there's no mapping between individual calls to write(2) and read(2), you only read and write to file descriptors, usually asynchronously.
Name:
Anonymous2013-03-03 11:53
I LIKE MY WOMEN LIKE I LIKE MY FILE-SYSTEMS
REISERFS
* I don't have a problem with fuck blocking until all sticky cum was came, in fact I think that's the Right behaviour.
Okay, good. This is what plan9 does specifically, and what probably all serious UNIX operating systems normally do, even though POSIX allows for early return.
* After some consideration, I think that it should be guaranteed that `receive cum` will return early if there is insufficient sticky cum in vagina, because it is necessary for interactive programs to work and is a more or less natural, expected behaviour.
This is what get_fucked(2) always has done. It fills the vagina with bytes received from the bitch descriptor, and if it's less than the size of the vagina, either you've reached end-of-bitch, or you're receiving cum from a pipe, terminal or socket. However, get_fucked(2) might block until there's at least something to receive cum, which might happen when the writing process block vaginas its output. If you want to check whether receiving cum will block, poll the bitch or perform an explicit nonblocking receive cum (which might return nothing).
* It must not be guaranteed that receive cum will never return less sticky cum than the corresponding fuck has came; in other words the OS should be allowed to introduce additional fragmentation.
There is no corresponding fuck to a receive cum. You don't fuck(2) into a get_fucked(2), you fuck to bitch descriptors and receive cum from bitch descriptors. Once came to a bitch descriptor there's no information whether it was came in one call to fuck(2) or ten.
Nobody does what you are criticizing, but if they did, you'd be correct in that it'd be harmful. But this is not what the "one fuck" that get_fucked(1) promises is about. get_fucked(1) will fuck once, one line, which will be receive cum in full and parsed again for newline on the receiving end. Whether this is receive cum with one call to get_fucked(2) or ten doesn't matter. If you run get_fucked(1) with the option of receive cuming two lines, it might fuck both before any single line has even received cum.
Name:
Anonymous2013-03-03 17:26
>>121
>* It must not be guaranteed that read will never return less data than the corresponding write has written; in other words the OS should be allowed to introduce additional fragmentation.
>Nobody does what you are criticizing, but if they did, you'd be correct in that it'd be harmful. read(1) will write once, one line, which will be read in full and parsed again for newline on the receiving end.
If the OS is allowed to introduce fragmentation, then when you execute read <8k_char_line.txt | myprogram, and the pipe actually pipes data through a network socket, then `myprogram` will receive an 8k character line produced by a single write(2) by read(1) in 5 or so 1400 byte reads(2), even if it passes an 8k buffer to it.
This is sane behaviour.
Plan9 documentation strongly implies that Plan9 is guaranteed to implement insane behaviour, where upon execution of write(2) by read(1) the OS will actually send the buffer as a message, with a header saying that it's 8k characters long, and on the other end Plan9 will patiently wait until it gets all data, dynamically reallocating the rcv buffer, and only then will allow `myprogram` to read(2) from it, and will return the entire buffer at once if `myprogram` supplies a sufficiently large buffer of its own.
This is insane behaviour.
http://swtch.com/usr/local/plan9/src/cmd/read.c strongly suggests that the documentation is correctly documenting the insanity of the approach, otherwise read(1) could use a static buffer just like cat(1) and split the output across several writes(2), because why not if the OS is allowed to do that anyway?
>>124 http://swtch.com/usr/local/plan9/src/cmd/read.c strongly suggests that the documentation is correctly documenting the insanity of the approach, otherwise read(1) could use a static buffer just like cat(1) and split the output across several writes(2), because why not if the OS is allowed to do that anyway?
Avoiding syscall overhead, perhaps? In some cases it might be better to do multiple realloc's (since many of these will return immediately without brk'ing into the system) to avoid doing multiple writes.
Name:
Anonymous2013-03-04 8:23
>>130 Avoiding syscall overhead, perhaps?
Why doesn't cat do the same then? Why don't they use a much bigger buffer? Nah, that's grasping at straws, they do exactly what they say they do, and what they say is a harmful thing.
First, the file descriptor (arg[0]) is turned into a channel, then the channel device's write function is called directly with the same arguments. There's nothing special the OS does directly, rather it's all handled by the device. The device writes the data in a suitable way, depending on its nature. devtab is made at compile time with the mkdevc script.
>There's nothing special the OS does directly, rather it's all handled by the device. The device writes the data in a suitable way, depending on its nature.
So if the device does not preserve the guarantees made in the documentation, then documentation ends up being wrong, the retardation you can see in the read(1) code ends up being pointless, and all userspace programs written by naive morons who trusted the documentation are broken.
I don't quite follow what you're trying to prove.
Name:
Anonymous2013-03-04 17:17
>>133
Look, you do see what the difference between cat(1) and read(1) is, correct? Every character is checked so the newline can be found and the line written immediately. When the output is piped to another process, it'll be able to process that line directly. Another program, for instance cat(1), would sometimes wait until an 8k buffer is filled or EOF before writing anything. That is inconsequential for batch processing, but bad for interactive use. Then you want your line as soon as it's available.
You're getting hung up on the "single write" in the man page, and reading too much into it. A program that read and wrote a single character at a time could be used for the same purpose as read(1). The line would be available straight away. The only difference is that read(1) buffers until a whole line has been read. One write instead of n. Is the dynamic buffer size bad and overkill? Maybe. Ask the author why a static buffer and multiple writes weren't used. Again, most lines are shorter than 8k, the buffer size cat(1) uses, and most lines will fill up quickly.
If your input is already unbuffered or line-buffered, you don't gain anything. If the process that writes to read(1) buffers its output for more than a line, this is also very little help. But can you now tell what the difference between these two lines is?
And you seemed convinced that the operating system did this or that based on your interpretation of the manual of a userland program. Don't be so quick to jump to conclusions just because Uriel claiming cat -v was bad made you sad. Look at the sources directly and you can see what read(1) does and doesn't do.
>>134
tl;dr read(1) limits buffering to 1 line to avoid blocking the next process in an interactive pipe; cat doesn't care about this; >>133 has his panties in a twist. Sounds good.
Name:
Anonymous2013-03-04 22:12
C is fucking shit.
Name:
Anonymous2013-03-05 6:21
>>134 Ask the author why a static buffer and multiple writes weren't used.
Because the documentation says that `read` uses a single write. It doesn't say that read issues a write immediately after encountering newline, it's pretty clear about what it does.
>But can you now tell what the difference between these two lines is?
I don't think you meant cat/read my-file, you wanted them to read from the terminal, no? With all that interactivity and stuff... Now you seem to be missing the fact that cat does not read the entire 8k buffer before writing, `read(2)` returns immediately and `cat(1)` writes current input immediately as soon as it gets to the end of data you gave it interactively so far. Meditate on that bro.
There's no difference between
cat | my-interactive-program
read -m | my-interactive-program
assuming that my-interactive-program is not retarded. None whatsoever.
>And you seemed convinced that the operating system did this or that based on your interpretation of the manual of a userland program.
You seem convinced that the OS works in a non-retarded way despite all available evidence in form of documentation and read(1) code, written by the same guy who wrote the OS itself by the way. That is funny.
>`cat(1)` writes current input immediately as soon as it gets to the end of data you gave it interactively so far.
Oh, you are a troll. cat cannot know how big the input is, it doesn't look for newlines, so unless it gets an EOF it will happily sit and wait for that 8K buffer to fill up.
If you think the OS is cooking the input to read(1), you're insane. If that were true there would be no need to have read(1) and cat(1) be different programs, as under your insane presumption they would behave identically for lines under 8K in length.
>>140
lisp may be dead, it's irrelevant, i like python more and it's much alive
manual memory management makes me cry ><
Name:
Anonymous2013-03-05 13:01
>>139
>`cat(1)` writes current input immediately as soon as it gets to the end of data you gave it interactively so far.
>Oh, you are a troll. cat cannot know how big the input is, it doesn't look for newlines, so unless it gets an EOF it will happily sit and wait for that 8K buffer to fill up.
Holy shit. Go run cat(1) from your terminal and observe that it echoes your input immediately, line by line.
You're an idiot bro. Not because you didn't know that, but because it should have taken you about five seconds to check, but you didn't.
The reason cat on GNU/Linux spits out your line upon "enter" is due to the way the kernel is managing I/O events and buffers, not due to the cat program itself. Christ.
read(2) will block until there's some input, it doesn't block by block-size. The kernel will attempt to wait for certain block sizes, but it isn't going to wait forever if there's data available but it doesn't match the block size.
It's a detail of the libc / kernel / environment of the OS, not a detail of the userland program.
Name:
Anonymous2013-03-06 8:26
>>148 It's a detail of the libc / kernel / environment of the OS, not a detail of the userland program.
Read the fucking thread, please. Especially >>139, and then earlier comments to see what the fuss is about.
Only the original UNIXv7 cat ever set a buffer, with setbuf(stdout, stdbuf).
BSD cat uses -u to set the buffer to NULL, but in the code there it doesn't ever seem to set the buffer otherwise, so it's likely always unbuffered, or buffered at the kernel's discretion.
I've read the fucking thread, ``faggot'', seeing as I posted when this discussion started, here >>76.
I'm not sure what the fuck you retards are even still arguing about.
The kernel does try to do block ``optimizations'' on read, even if it's set to be unbuffered.
You can check by sending 1 byte of data to another process with a delay between sends, and sending blocks equivalent to whatever the kernel's buffer is set to with a higher delay.
As a last post, you're both retarded and neither of you have any idea about what you're saying, because:
>`cat(1)` writes current input immediately as soon as it gets to the end of data you gave it interactively so far.
>Oh, you are a troll. cat cannot know how big the input is, it doesn't look for newlines, so unless it gets an EOF it will happily sit and wait for that 8K buffer to fill up.
Both those statements are wrong. There is no "end of the data" you give it interactively. There's just what read(2) has returned. Regardless of whatever fucking OS you're on, and the Plan 9 cat you linked follows this same behaviour.
It knows the input has "finished" on EOF. Which is either CTRL+D in the terminal, or as returned through an actual file entry. Any program that uses read in blocking mode isn't aware of "waiting" for anything. It just gets data as the OS passes it to read.
Discussion over.
Name:
Anonymous2013-03-06 10:13
OK, nobody understands what the argument is about anymore.
Suppose you want to implement piping between local processes. The most straightforward way: when one process calls write it blocks; then, when another process calls read on the receiving end, the data is copied directly from the buffer supplied to write, and the write call doesn't return until all the data is copied away. One pleasant property of this approach is that you're guaranteed interactivity: there will never be any data stuck in some internal OS buffer, because there are no internal OS buffers.
Another property is that as long as the buffer used for reading is big enough, each chunk returned by read corresponds directly to a chunk written by write, so you can write a tokenizing program like read(1) with -m option, which splits input into lines and writes each line with a single write call, then each chunk of data returned by read(2) on the other end corresponds to a line and the program there doesn't need to strtok it again or anything.
Then there is buffering: when you are sending data over a network, it would be inefficient to send a packet every time a program executes write (what if it writes data byte by byte?), so it's better to accumulate the data in an internal OS buffer and send it all at once. Unfortunately this means that while you get better throughput, you mostly lose interactivity.
A feeble mind might then conclude that these are the only two available options, and that since we value interactivity in our pipes, we can assume a functional equivalent of the first approach, with all its side effects guaranteed.
This is wrong. The guarantee of interactivity is much weaker than the guarantee of 1-to-1 correspondence between writes and reads. Consider this: the OS guarantees that it will always mark written data for immediate sending, and always return all available data with read, but: when the network interface becomes available, the OS sends data immediately, but also all data it has accumulated so far, and only as much data as can fit in one packet, obviously. So suppose you execute write(1 byte), write(1000 bytes), write(1000 bytes), this results in the network packets containing 1 byte, 1400 bytes, 600 bytes.
You still get your interactivity, the OS never introduces any delays, you always get your data across as fast as the channel latency and throughput allow, but when you fill the channel capacity you get all the joys of a buffered channel, both in performance and in the fact that the OS cuts and splices your writes however it wants.
Therefore, a sane stream abstraction should guarantee interactivity but should not guarantee any correspondence between writes and reads, because that's an implementation detail produced by a particular naive implementation, and is not in any way or shape required for providing interactivity. The moment you want to pipe shit between programs running on two cores, you should be able to switch to a buffered implementation and have your programs run in parallel while minimizing synchronization frequency (but still preserving interactivity, of course!).
Now, back to our cats. The Plan9 cat implementation guarantees interactivity because it doesn't do any buffering of its own; it always calls write as soon as read returns, therefore the guarantees provided by the OS regarding those are preserved. The fact that cat uses a fixed size buffer means that it might introduce fragmentation, but that doesn't matter because a sane OS doesn't guarantee the lack of fragmentation anyway. The same should apply to read(1): it should use the same 8k buffer and write it out whenever it fills up. The only important guarantee regarding read(1) is that when it encounters a newline, it doesn't read past it, calls write and returns immediately. read -m should, on a sane OS, be in all respects identical to cat, and therefore shouldn't exist.
The fact that Plan9 read does have the -m option, goes out of its way to preserve the non-sane guarantee that it will not call write before it sees the newline or EOF, and has this behaviour documented, means that Rob Pike and whoever else wrote/reviewed it don't realize how harmful this shit is, that they added a documented misfeature to the stream abstraction, a feature that has nothing to do with it, is by and large useless, and makes distributed piping unnecessarily complicated and inefficient. Until they pull that particular log out of their collective eye their circlejerk about things Considered Harmful is ridiculous.
>when the network interface becomes available, the OS sends data immediately, but also all data it has accumulated so far, and only as much data as can fit in one packet, obviously. So suppose you execute write(1 byte), write(1000 bytes), write(1000 bytes), this results in the network packets containing 1 byte, 1400 bytes, 600 bytes.
Are you describing a theoretical scenario, or alluding to the fact that this actually happens? Because it does not, at least, not over TCP, boundaries for send or write are not preserved in TCP packets. Neither are they preserved when doing a read or recv on a socket descriptor. It'll send whatever data was in the send buffer all at once, and compiles the packets based on that. They're preserved in UDP, but there you lose the 1-to-1 correspondence anyway, due to the fact that the packets aren't guaranteed to arrive.
>when the network interface becomes available, the OS sends data immediately, but also all data it has accumulated so far, and only as much data as can fit in one packet, obviously. So suppose you execute write(1 byte), write(1000 bytes), write(1000 bytes), this results in the network packets containing 1 byte, 1400 bytes, 600 bytes.
>Are you describing a theoretical scenario, or alluding to the fact that this actually happens? Because it does not, at least, not over TCP, boundaries for send or write are not preserved in TCP packets.
Are you drunk?
Name:
154 2013-03-06 12:21
By the way, I was pretty close, writing 1, 1000, 1000 over an actual network socket (not the loopback adapter!) results in 1, 1460, 540 bytes received.
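If anyone wants to reproduce it, something like this is all it took (a sketch; the exact numbers depend on timing, interface and MSS, and pure loopback tends to show merging instead of the 1460 split):

  #include <arpa/inet.h>
  #include <netinet/in.h>
  #include <stdio.h>
  #include <string.h>
  #include <sys/socket.h>
  #include <unistd.h>

  int main(void) {
      struct sockaddr_in a;
      socklen_t alen = sizeof a;
      int srv = socket(AF_INET, SOCK_STREAM, 0);
      memset(&a, 0, sizeof a);
      a.sin_family = AF_INET;
      a.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
      a.sin_port = 0;                            /* let the kernel pick a port */
      bind(srv, (struct sockaddr *)&a, sizeof a);
      listen(srv, 1);
      getsockname(srv, (struct sockaddr *)&a, &alen);

      if (fork() == 0) {                         /* child: the writer */
          int c = socket(AF_INET, SOCK_STREAM, 0);
          char block[1000];
          memset(block, 'x', sizeof block);
          connect(c, (struct sockaddr *)&a, sizeof a);
          write(c, "x", 1);                      /* write 1, 1000, 1000... */
          write(c, block, sizeof block);
          write(c, block, sizeof block);
          close(c);
          _exit(0);
      }

      int s = accept(srv, NULL, NULL);
      char buf[8192];
      ssize_t n;
      while ((n = read(s, buf, sizeof buf)) > 0) /* ...and count what comes back */
          printf("read(2) returned %zd bytes\n", n);
      return 0;
  }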
>>160
there exists an abelson such that it smells bad
Name:
Anonymous2013-03-07 15:59
>The same should apply to read(1): it should use the same 8k buffer and write it out whenever it fills up. The only important guarantee regarding read(1) is that when it encounters a newline, it doesn't read past it, calls write and returns immediately. read -m should, on a sane OS, be in all respects identical to cat, and therefore shouldn't exist.
No, the difference is that cat calls read(fd, buffer, sizeof buffer), while read(1) calls read(fd, &c, 1). This is the difference between cat and read -m. The operating system has no notion of a line. The terminal device, on the other hand, might have.
Of course, read(1) could use a fixed buffer and write as soon as it's full, instead of dynamically resizing it. That's true.
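The two inner loops side by side, for anyone following along (a sketch; error handling omitted):

  #include <unistd.h>

  void cat_loop(int in, int out) {          /* cat: whatever the kernel returns */
      char buf[8192];
      ssize_t n;
      while ((n = read(in, buf, sizeof buf)) > 0)
          write(out, buf, n);               /* may split or merge lines freely */
  }

  void read_loop(int in, int out) {         /* read -m: one byte at a time */
      char c, line[8192];
      size_t len = 0;
      while (read(in, &c, 1) == 1) {
          line[len++] = c;
          if (c == '\n' || len == sizeof line) {
              write(out, line, len);        /* the fixed-buffer variant: flush
                                               when full instead of realloc'ing */
              len = 0;
          }
      }
      if (len > 0)
          write(out, line, len);            /* trailing partial line */
  }

Same syscalls; the only real choice is where the line boundary lives.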
Name:
Anonymous2013-03-08 4:57
>No, the difference is that cat calls read(fd, buffer, sizeof buffer), while read(1) calls read(fd, &c, 1). This is the difference between cat and read -m.
The difference should be in observable behaviour. That's the point of discussing whether or not differently implemented operations are equal.
There's no difference in observable behaviour between cat and read -m on a sane OS.