
scanf field width

Name: Anonymous 2013-09-20 10:45

For the %s conversion specifier, is the field width in bytes or characters? The C standard suggests that the field width is always measured in characters, but there's this small note that confuses me:
246) No special provisions are made for multibyte characters in the matching rules used by the c, s, and [ conversion specifiers — the extent of the input field is determined on a byte-by-byte basis. The resulting field is nevertheless a sequence of multibyte characters that begins in the initial shift state.
Also, the Open Group documentation (see the description given for the %c specifier as well) seems to suggest that it's in bytes if no l length modifier is present, and in characters otherwise.

plz help me lambda‼

Name: Anonymous 2013-09-20 10:55

/prog/ is dead. move along.

Name: Anonymous 2013-09-20 11:36

install gentoo

Name: Anonymous 2013-09-20 13:04

>>1
You should know that %characters are interpreted bytes in an address thanks to the preprocessor information in your stdio.h function. IOW, you can express %s with 0x6865785f76616c756573.

Also, ask these questions in the proper new site.

Name: Anonymous 2013-09-20 13:21

>>4
What the fuck?

Name: Anonymous 2013-09-20 13:40

Name: Anonymous 2013-09-20 19:33

Name: Anonymous 2013-09-21 7:19

>>7
Good job on failing to understand the question. Try reading it again.

Name: Anonymous 2013-09-21 8:06

>>8
"
>pedophile autist
>understanding the question
"

Choose one.

Name: Anonymous 2013-09-21 8:10

>>9
Sorry?

Name: Anonymous 2013-09-21 8:57

tfw no typesafe string operations

hold me /prog/!

Name: L. A. Calculus !!wKyoNUUHDOmjW7I 2013-09-23 6:29

>>1
IF YA USE DA 'l' LENGTH MODIFIER IT'S DA NUMBER OF MULTIBYTE CHARACTERS DAT R READ, NOT BYTES.

>>4
REED DA FUCKIN STANDARD, STACK BOY.

>>6,7
GO STICK UR DINGDONGS INTO A PICTURE OF UR FAVOURITE TOOHOOS AND GET DA FUCK OUT OF MY THRED, YA JAPANIMATION-LOVING RETOIDS.

>>11
WAT DA FUCK IS UR PROBLEM?

Name: L. A. Calculus !!wKyoNUUHDOmjW7I 2013-09-23 6:31

>>6,7
OR MAYBE U COULD DRESS UP AS DA OTHER PERSON'S FAVOURITE TOOHOO AND FUK EACH OTHER. NOW DER'S A FUCKIN PLAN. GET TO IT.

Name: Anonymous 2013-09-23 6:58

> IF YA USE DA 'l' LENGTH MODIFIER IT'S DA NUMBER OF MULTIBYTE CHARACTERS DAT R READ, NOT BYTES.
Yeah, that's what opengroup was suggesting. The standard isn't clear about it, though (unless it's because I'm reading the draft).

Name: L. A. Calculus !!wKyoNUUHDOmjW7I 2013-09-23 7:21

DIS WEBSITE'S A PILE OF SHIT.

>>14
WELL, IT'S ONLY NORMAL DAT IT WUD USE DA NUMBER OF MULTIBYTE CHARACTERS RATHER THAN BYTES, CUZ IT ACCEPTS A 'wchar_t *' INSTED OF A 'char *'. IF UR COMPUTER USES UTF-8 AND U HAVE AN ARRAY OF 10 wchar_t ELEMENTS, U KNOW U SHUD USE DA "%9ls" FORMAT SPECIFIER. THIS WAY U CAN BE SURE UR READING AT MOST 9 CHARACTERS (PLUS DA TERMINATING NULL), REGARDLESS OF HOW MANY BYTES DAT ACTUALLY IS ON stdin.

U CUD TRY EXPERIMENTING WITH DIS CODE IF U WANT:

#include <locale.h>
#include <stdio.h>
#include <wchar.h>

int main(void)
{
    wchar_t buf[10];

    setlocale(LC_ALL, "");
    while (scanf("%9ls", buf) == 1)
        printf("%ls\n", buf);
}
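
A non-interactive variant of the same experiment, as a rough sketch: it scans a fixed string with sscanf and "%3ls" and reports how many wide characters were stored. The helper names use_utf8_locale and wide_field_len are made up for illustration, and the locale names "C.UTF-8" and "en_US.UTF-8" are assumptions that may not exist on every system.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: try to select a UTF-8 locale for character handling.
 * The two locale names are assumptions; they are common on Linux but are
 * not guaranteed to exist. Returns 1 on success, 0 otherwise. */
int use_utf8_locale(void)
{
    return setlocale(LC_CTYPE, "C.UTF-8") != NULL
        || setlocale(LC_CTYPE, "en_US.UTF-8") != NULL;
}

/* Hypothetical helper: scan one field from `input` with "%3ls" and return
 * the number of wide characters stored, or -1 if nothing matched. */
int wide_field_len(const char *input)
{
    wchar_t wbuf[4]; /* 3 wide characters plus the terminating null */

    assert(input != NULL);
    if (sscanf(input, "%3ls", wbuf) != 1)
        return -1;
    return (int)wcslen(wbuf);
}
```

If use_utf8_locale() succeeds, wide_field_len on a UTF-8 string of several three-byte characters should come back as 3, even though those three characters occupy 9 bytes of input.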


> The standard isn't clear about it, though (unless it's because I'm reading the draft).

DA STANDARD'S WRITTEN BY A BUNCH OF FUCKIN HYENAS WHO SUCKED DA LIFE OUT OF DEANIS RICKY. IT'LL MAKE SENSE ONCE U FAMILIARISE URSELF WITH DER LAWYERESE GIBBERISH.

Name: Anonymous 2013-09-23 7:31

L.A. CALCULUS do you live in Los Angeles?

Name: L. A. Calculus !!wKyoNUUHDOmjW7I 2013-09-23 8:06

>>16
DAT'S LAMBDA ARTHUR CALCULUS, YA RETOID.

Name: Anonymous 2013-09-23 9:09

>>15
I meant the standard isn't clear about the whole thing, i.e. including the other case (without the l modifier). The only hint that ``character'' in that context means single-byte character is that ``multibyte'' isn't explicitly mentioned (and perhaps the note I quoted).

Side note: I wonder how multibyte character to wide character conversions are done in Windows. The wchar_t type is 16 bits and uses UTF-16. If a multibyte character is outside Unicode's BMP (is it possible with ANSI code pages? If not, char* could still use UTF-8, right?) then it has to be coded as a surrogate pair.
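
For reference, the surrogate pair arithmetic itself is small. This is only a sketch of the standard UTF-16 encoding rule, not of whatever Windows does internally when converting from a code page; the function name utf16_encode is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-16.
 * Returns the number of 16-bit units written to `out` (1 or 2),
 * or 0 if `cp` is not a valid Unicode scalar value. */
int utf16_encode(uint32_t cp, uint16_t out[2])
{
    assert(out != NULL);

    /* Reject values beyond U+10FFFF and the surrogate range itself. */
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;

    /* Code points inside the BMP fit in a single unit. */
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;
        return 1;
    }

    /* Outside the BMP: split the 20-bit offset into two 10-bit halves. */
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 + (cp >> 10));   /* high (lead) surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF)); /* low (trail) surrogate */
    return 2;
}
```

For example, U+1F600 becomes the pair 0xD83D 0xDE00, so a single character outside the BMP takes two 16-bit wchar_t elements.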

Name: L. A. Calculus !!wKyoNUUHDOmjW7I 2013-09-23 10:09

>>18
IF U OMIT DA 'l' MODIFIER DEN IT DOESN'T TRANSLATE MULTI-BYTE SEQUENCES AT ALL. SAME AS FUCKIN fgets. U CAN CHECK DA PAGE FOR fscanf IN DA STANDARD, DA SUBSECTIONS ON DA 'c', 's', AND '[' CONVERSION SPECIFIERS, >>18.
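
A quick way to see the byte-wise behaviour without typing anything on stdin, as a rough sketch: scan a fixed UTF-8 string with a plain "%3s" and count the bytes that were stored. The helper name narrow_field_bytes is made up for illustration, and the inputs are assumed to be UTF-8 encoded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: scan one field from `input` with "%3s" into `out`
 * (room for 3 bytes plus the terminating null) and return the number of
 * bytes stored, or 0 if nothing matched. */
size_t narrow_field_bytes(const char *input, char out[4])
{
    assert(input != NULL && out != NULL);

    out[0] = '\0';
    /* Without the l modifier the field width limits bytes, so a multibyte
     * character counts once per byte it occupies. */
    if (sscanf(input, "%3s", out) != 1)
        return 0;
    return strlen(out);
}
```

Given "h\xC3\xA9llo" (the word "héllo" in UTF-8), the field stops after 3 bytes, which is only the two characters "hé". A width of 2 would stop in the middle of the "é", which is why the buffer should be sized in bytes when the l modifier is absent.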

> I wonder how multibyte character to wide character conversions are done in Windows.
AHAHAHAHHAHA
DATS A QUESTION FOR BILL FUCKING GATES. DROP HIM A LINE. IM PRETTY SURE IT'S ONLY UTF-16 THO. I WAS SURPRISED DAT WRITING SHIT TO stdout DIDNT PRODUCE A FUCKIN WORD(R)(TM)(C) DOCUMENT BACK WEN I TRIED PROGRAMMING WITH DAT SHIT.

Name: L. A. Calculus !!wKyoNUUHDOmjW7I 2013-09-23 10:14

> is it possible with ANSI code pages?
WAT DA FUK R DEY?

> If not, char* could still use UTF-8, right?
YEA, BUT U PROBABLY NEED TO WRITE UR OWN FUCKIN LIBRARY (OR DOWNLOAD ONE) TO TRANSLATE DAT SHIT. Plan9 AND I THINK OpenBSD BOTH HAVE DECENT LIBS FOR DAT.

Name: Anonymous 2013-09-23 11:23

>>20
Multibyte characters in Windows are encoded using whatever the active code page is by default, e.g. Latin-1. http://en.wikipedia.org/wiki/Windows_code_page. Yeah, I got the name wrong. They're ``OEM code pages'' apparently... probably picked it up from here: http://alfps.wordpress.com/2011/11/22/unicode-part-1-windows-console-io-approaches/.

Name: Anonymous 2013-09-23 17:44

>>20,21
Use musl: http://www.musl-libc.org/

Also join us at progrider.

Name: Anonymous 2013-09-23 21:44

>pedorider
