/prog/ - considered harmful

Name: Anonymous 2013-02-25 6:06

http://harmful.cat-v.org/software/
>2013
>using harmful software

Name: Anonymous 2013-03-02 9:41

>>97
I understand that you might have gotten confused by the documentation, but you make too many assumptions and criticize it based on them. You suggest that the single write(2) call that read(1) performs after it has found the line break will be matched by a single read(2) in the next process in the pipeline, but this is not the case. The documentation says only that this is helpful, and that's all it is. I'll tell you why.

The process that receives what read(1) wrote can have a buffer of any size and can call fgets(3) or read(2) directly to get as much as possible of the line at a time. If the buffer fills, then read(2) more. What the receiver doesn't have to do is to look for the line break. This does not mean that the whole line was received in one read(2) call.

When read(2) returns a positive number less than the size of the buffer, the receiver knows that it has gotten one line. Does this mean that the receiver has to assume that its input came from read(1)? No. What the line-at-a-time program will do is parse the buffer again on its end and look for a line break (with the help of fgets(3), for instance). The parsing starts when read(2) returns, and wouldn't you know it, the input looks exactly like you want it to: there's not even anything past the line break, and nothing in the buffer has to be moved for the next read(2). Of course, if the line is too long and the buffer is too small, read(2) will return many times and the buffer has to be resized.
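A rough sketch of the receiving side described above, in C. read_one_line is a made-up name, and the 64-byte starting size is arbitrary. It assumes a blocking descriptor and does not retry on EINTR. It keeps calling read(2), doubles the buffer when it fills, and stops at the first line break or at end of input.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, not from the thread: reads from fd until the buffer
 * holds a line break or the input ends. Returns a malloc'd buffer (caller
 * frees). *total is the number of bytes read; *line_len is the length of the
 * line including the '\n' if one was found. Returns NULL on error. */
char *read_one_line(int fd, size_t *line_len, size_t *total)
{
    size_t cap = 64, used = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        if (used == cap) {              /* buffer full: grow, then read more */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }

        ssize_t n = read(fd, buf + used, cap - used);
        if (n < 0) {
            free(buf);
            return NULL;
        }
        if (n == 0) {                   /* end of input, no line break seen */
            *line_len = used;
            *total = used;
            return buf;
        }

        /* Only the newly read bytes need to be searched. */
        char *nl = memchr(buf + used, '\n', (size_t)n);
        used += (size_t)n;
        if (nl != NULL) {
            *line_len = (size_t)(nl - buf) + 1;
            *total = used;
            return buf;
        }
    }
}
```

If the sender wrote exactly one line and nothing else, *total equals *line_len and nothing has to be moved before the next read(2). With any other sender there may be leftover bytes after the line break that the caller has to keep.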

The processes that receive their input from read(1) will also get the input as soon as it's available, which is both helpful and useful for interactive shell scripts.

Name: Anonymous 2013-03-02 17:27

>>107
>>97's point is that requiring write(2) to always complete on the first attempt places an undue burden on the OS. If the output device or file cannot accept writes over a certain size, the OS is now obligated to buffer the whole input and fragment it itself, rather than just accepting as much as will fit and relying on the caller to handle the rest.

The argument is that this added complexity is pointless because, in the general case, the reader will end up fragmenting the data anyway: it can't guarantee that all of it will fit in the buffer it allocates for read(2).

Name: Anonymous 2013-03-03 5:33

>>112
In the write(2) syscall, keeping track of how much has been written is very little added complexity for much gain. Also, when you write(2) to a pipe, it doesn't go directly to the read(2)er's buffer, no matter how big or small it is, but to the pipe buffer (you can start writing before there's a reader). You only write to a file descriptor, not to a receiving read(2) call.
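To make the pipe buffer point concrete, here is a small C sketch. count_reads is a hypothetical name, and it assumes the message is small enough to fit in the kernel's pipe buffer, otherwise the write would block. The write(2) completes before any read(2) has been issued, and the reader then drains the pipe with whatever buffer size it likes.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical demo, not from the thread: writes msg into a fresh pipe
 * before anyone reads from it, closes the write end, then counts how many
 * read(2) calls with a buffer of `chunk` bytes it takes to drain the pipe.
 * Assumes len fits in the kernel pipe buffer. Returns -1 on error. */
int count_reads(const char *msg, size_t len, size_t chunk)
{
    char buf[64];
    int fds[2];
    int reads = 0;

    if (chunk == 0 || chunk > sizeof buf)
        chunk = sizeof buf;             /* clamp to the local buffer size */
    if (pipe(fds) != 0)
        return -1;

    /* No reader is waiting yet; the bytes land in the pipe buffer. */
    if (write(fds[1], msg, len) != (ssize_t)len) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);                      /* lets the reader see end of input */

    for (;;) {
        ssize_t n = read(fds[0], buf, chunk);
        if (n < 0) {
            close(fds[0]);
            return -1;
        }
        if (n == 0)                     /* pipe drained and writer closed */
            break;
        reads++;
    }
    close(fds[0]);
    return reads;
}
```

One 6-byte write read back through a 4-byte buffer takes two read(2) calls, while a 64-byte buffer gets it in one. The writer never sees the difference.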

On UNIX, when you write(2) to a file descriptor (say, a socket) that isn't ready to receive the whole buffer at once, the call might return early with a partial write success. What will the caller do now, when only some data has been written? Call write again and again until everything has been written, or write(2) returns a negative value. Is there any other valid action to take than to continue writing? You can never know (or should never have to care) what's small enough for the file descriptor you're writing to so that write(2) is guaranteed to succeed.
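For reference, the user space loop in question looks roughly like this in C. write_all is a hypothetical helper name, and the sketch assumes a blocking file descriptor. Retrying on EINTR is a common convention that the post above doesn't mention.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper, not from the thread: keeps calling write(2) until
 * all len bytes are written or a real error occurs. Returns len on success
 * and -1 on error, with errno set by write(2). Assumes fd is blocking. */
ssize_t write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t left = len;

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted before writing: retry */
            return -1;                  /* real error */
        }
        p += n;                         /* skip past what was accepted */
        left -= (size_t)n;
    }
    return (ssize_t)len;
}
```

Every caller that wants all-or-error behaviour ends up carrying some version of this loop.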

It's no problem for an operating system to perform this loop itself. It knows everything about the file descriptor, things that user space programs shouldn't have to care about. It knows when it's possible to write again and how it's most efficiently done. When a socket or pipe buffer is full, the thread can wait until it is writable again, and yield to the dispatcher in the meantime. When it's possible to write again, it resumes the write by moving a new range of bytes from user memory to the IO device's buffer page, and repeats until everything is written, then returns to user space; if there's an IO error, it returns early. Why is it better to have every program include a write(2) loop, rather than having a better write(2) syscall?

Plan 9 does writing this way. The manual states that if write returns anything less than what was intended, it should be considered an error (http://plan9.bell-labs.com/magic/man2html/2/read). Now, say you run read(1) from plan9port on Linux. There's a single write(2), which might not write everything the first time, and the receiver will only get as much as fit in the pipe. read(1) has a bug when run on this system, although it won't appear very often.
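A sketch of that convention carried over to POSIX C, to show where it goes wrong. write_strict and make_nonblocking_pipe are hypothetical names. This is not the plan9port source. The non-blocking pipe is only there to provoke a short write on demand; the 4 MiB figure in the note below assumes the pipe buffer is smaller than that.

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: creates a pipe whose write end is non-blocking, so
 * that an oversized write(2) returns a short count instead of waiting. */
int make_nonblocking_pipe(int fds[2])
{
    if (pipe(fds) != 0)
        return -1;
    int flags = fcntl(fds[1], F_GETFL);
    if (flags < 0 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    return 0;
}

/* Hypothetical helper: one write(2), with anything short of len treated as
 * an error, as the Plan 9 manual page is described above. Returns 0 if the
 * whole buffer was accepted and -1 otherwise. */
int write_strict(int fd, const void *buf, size_t len)
{
    ssize_t n = write(fd, buf, len);
    return (n >= 0 && (size_t)n == len) ? 0 : -1;
}
```

A 3-byte write passes. A 4 MiB write into the same pipe comes back short, so write_strict reports failure even though part of the data has already reached the reader.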

1 Name: Anonymous 2013-02-25 6:06

107 Name: Anonymous 2013-03-02 9:41

112 Name: Anonymous 2013-03-02 17:27

117 Name: Anonymous 2013-03-03 5:33
