One 32bit Long into 4 8bit Chars

Name: Anonymous 2010-06-28 7:43

In C code

I'm gonna ask a stupid question here, but I'm not sure I trust my code, so I want to hear if you can tell me what's wrong with my approach.

Right now, the code goes

long data

char data0
char data1
char data2
char data3

data0 = data;
data1 = data >> 8;
data2 = data >> 16;
data3 = data >> 24;

but I'm not sure that's the best approach, or even a good approach. It works, which is most important obviously, but I have this nagging feeling that there's something very not Best Practices about this approach and I can't shake that feeling.

Name: Anonymous 2010-06-28 7:46

Semicolons. Arrays. Long sizes.
union {
  long hax;
  char anus[sizeof(long)];
} myanus;

Name: Anonymous 2010-06-28 7:48

what happens if char is 16 bits and long is 32 bits?

Name: Anonymous 2010-06-28 7:48

>>3
You're mad.

Name: Anonymous 2010-06-28 7:53

/* avoid pissing off homosexuals */
#define marriage union
marriage { long cat; char mander[sizeof(cat)]; } data;

Name: Anonymous 2010-06-28 7:55

>>5
sizeof(cat)
(  `Д´)

Name: Anonymous 2010-06-28 8:21

>>5
/* avoid pissing off homosexuals */
#define marriage union
wat

Name: Anonymous 2010-06-28 9:25

Well, a marriage is a holy bond between a longcat and a charmander.

Name: Anonymous 2010-06-28 9:27

>>7
I have to agree. Treating marriage and civil union as equivalents will anger a fair portion of the gay community.

Name: Anonymous 2010-06-28 10:18

The reason I care about getting the "correct" way is that I want to be sure it works if we change processor architecture, which means I need a solution that doesn't care about little/bigendian constraints and other rot along that line.

Name: Anonymous 2010-06-28 10:19

>>10
So just use what you did, but with an array instead of retardedly named variables.

Name: Anonymous 2010-06-28 10:48

My problem is, when I do

char a
long b
a=b

Name: Anonymous 2010-06-28 10:48

My problem is, when I do

char a;
long b;
a=b;

how can I be sure that a is filled with the eight rightmost bits, rather than the eight leftmost?

Name: FrozenVoid 2010-06-28 10:51

use one long pointer*, and hack new char* at long+1,long+2(using the long as array).



__________________
Orbis terrarum delenda est

Name: Anonymous 2010-06-28 11:12

>>13
"rightmost" and "leftmost" makes no sense; use "most/least significant."

>>10
So use a union and just do some hton conversions when assigning.

Name: Anonymous 2010-06-28 11:34

Name: Anonymous 2010-06-28 12:03

>>15
"rightmost" and "leftmost" makes no sense; use "most/least significant."
qft

Name: Anonymous 2010-06-28 12:13

>>15
"rightmost" and "leftmost" makes no sense; use "most/least significant."
Why? It's not like people are going to suddenly use a different endiant for storing data in longs.

Name: Anonymous 2010-06-28 12:15

>>18
...

Name: Anonymous 2010-06-28 12:17

>>19
answer me nigger

Name: Anonymous 2010-06-28 12:18

>>20
Have you ever worked with an Intel machine?

Name: Anonymous 2010-06-28 12:20

>>21
Yes so what?

Name: Anonymous 2010-06-28 12:20

>>22
It's little-endian, fucko. Look up the difference between that and big-endian.

Name: Anonymous 2010-06-28 13:29

>>17,19,21,23
The way a processor lays shit out in memory has nothing to do with what the number actually looks like, dickface.

Name: Anonymous 2010-06-28 13:31

>>14
English please.

Name: Anonymous 2010-06-28 13:32

>>20
* African American

Name: Anonymous 2010-06-28 13:43

One day I'm gonna design a processor which lays out 32-bit integers in the following nybble sequence: ADGFCEHB, just to piss off people like >>13.

Name: Anonymous 2010-06-28 13:48

>>24,18
What's your point? That >>1 had the right way all along? Or that >>2 depends on endianness? The number doesn't `look like' anything until you have a look at it in memory (OH SHIT! THAT'S EXACTLY NOT WHAT YOU SAID.) or printf it, in which case it looks like a number does to a human.
I can tell you that ``people'' are indeed not ``going to suddenly use a different endiant for storing data in longs'', because it's not people who decide that shit. The processor's endianness decides that for the compiler.
``Left shift'' and ``right shift'' are platform-independent, because they don't actually shift left or right, but towards the most or least significant bits respectively.
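
A minimal sketch of that point. The helper names `byte_by_shift`, `byte_in_memory` and `host_is_little_endian` are made up for illustration, and it assumes unsigned long has no padding bits. The first helper works on the value and gives the same answer on every host; the second looks at storage, which is where byte order shows up.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Byte n of the value, counted from the least significant end.
   Operates on the number itself, so the result does not depend on the host. */
unsigned char byte_by_shift(unsigned long value, unsigned n)
{
    assert(n < sizeof value);
    return (unsigned char)((value >> (CHAR_BIT * n)) & UCHAR_MAX);
}

/* Byte n as it sits in memory, counted from the lowest address.
   This is the part that differs between little- and big-endian hosts. */
unsigned char byte_in_memory(unsigned long value, unsigned n)
{
    unsigned char raw[sizeof value];
    assert(n < sizeof value);
    memcpy(raw, &value, sizeof value);
    return raw[n];
}

/* Returns 1 if the least significant byte is stored first, 0 otherwise. */
int host_is_little_endian(void)
{
    return byte_in_memory(1UL, 0) == 1;
}
```

With 8-bit chars, byte_by_shift(0x11223344UL, 0) is 0x44 everywhere, while byte_in_memory(0x11223344UL, 0) is 0x44 only on a little-endian host.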

Name: Anonymous 2010-06-28 13:51

>>24
Try rephrasing that.

>>27
http://en.wikipedia.org/wiki/Endianness#Middle-endian

I think that's close enough.

Name: Anonymous 2010-06-28 13:51

>>27
The compiler for your processor will still adhere to the C standard, so that won't matter at all.

Name: FrozenVoid 2010-06-28 13:51

>>25
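#include <stdio.h>
int main(void){
long cat=1;char* vcat=(char*)&cat; /* view the long's bytes through a char pointer */
vcat[0]='P';vcat[1]='\x49';vcat[2]=78;vcat[3]='G'; /* spells PING on an ASCII system */
/* %ld because cat is a long; %.4s because the bytes are not guaranteed to be NUL-terminated */
printf("number is:\n %ld \nstring*:%.4s\nchars:%c%c%c%c\n",cat,vcat,vcat[0],vcat[1],vcat[2],vcat[3]);
return 0;
}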




__________________
Orbis terrarum delenda est

Name: Anonymous 2010-06-28 14:20

>>28
My point is that you're a dickface.  Clearly this holds true.  You don't have to be loud, stupid and wrong just because you work with computers.

Name: Anonymous 2010-06-28 14:21

>>32
Neither do you, but you are.

Name: Anonymous 2010-06-28 16:42

>>31
Why not use a union? Is learning a new datatype too hard for you?

Name: Anonymous 2010-06-28 17:14

>>1
Wow, /prog/ as usual has been tremendously unhelpful to you. Your original solution is already perfectly correct. Bit shift works as expected regardless of endianness, and casting it always keeps the least significant bits. The only thing I would recommend is to explicitly cast the results, otherwise you will get warnings about a loss of precision (you should turn on more compiler warnings.) You may also want to use unsigned char or uint8_t instead of char.

unsigned char data0 = (unsigned char)data;
unsigned char data1 = (unsigned char)(data >> 8);

etc.

Incidentally, to get the chars back into a long, you do just the opposite. Just make sure the chars are unsigned, otherwise casting it to long will carry the sign bit. Here I explicitly cast the chars to unsigned:

data = ((long)(unsigned char)data0) | (((long)(unsigned char)data1) << 8) | (((long)(unsigned char)data2) << 16) | (((long)(unsigned char)data3) << 24);

DO NOT USE A UNION like everyone here has been saying, and do not type-pun a pointer to a char array either. It's not portable: which byte ends up at which index depends on the machine's endianness, so the same code gives different results on little-endian (e.g. x86) and big-endian machines. Combining it with htonl() is just making a bad solution worse. Honestly I'm surprised how many people are suggesting that garbage.
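
A minimal sketch of the shift-based split and join described above, assuming 8-bit chars and a value that fits in 32 bits. The names `split32` and `join32` are made up for illustration, and it uses unsigned long instead of the long in the post, so that shifting a byte up by 24 cannot overflow a signed 32-bit long.

```c
#include <assert.h>

/* Split the low 32 bits of value into four octets, least significant first. */
void split32(unsigned long value, unsigned char out[4])
{
    assert(out != 0);
    out[0] = (unsigned char)(value & 0xFF);
    out[1] = (unsigned char)((value >> 8) & 0xFF);
    out[2] = (unsigned char)((value >> 16) & 0xFF);
    out[3] = (unsigned char)((value >> 24) & 0xFF);
}

/* Rebuild the value. Each octet is widened to unsigned long before it is
   shifted, so nothing is sign-extended and no signed shift can overflow. */
unsigned long join32(const unsigned char in[4])
{
    assert(in != 0);
    return  (unsigned long)in[0]
         | ((unsigned long)in[1] << 8)
         | ((unsigned long)in[2] << 16)
         | ((unsigned long)in[3] << 24);
}
```

The index always means "how significant", never "which address", so the round trip gives the same bytes on any host.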

Name: Anonymous 2010-06-28 17:58

>>35
Yeah, enjoy doing stuff manually.
If anything, >>1-chan should at least use an array:
long data = (...);
char bytes[sizeof(data)];
for (int i = 0; i < sizeof(data); ++i, data >>= 8)
    bytes[i] = 0xFF&data;

Note that this won't work either on architectures where bytes don't have 8 bits.

Combining it with htonl() is just making a bad solution worse.
I don't see how.

Name: Anonymous 2010-06-28 18:00

>>36
Oh yeah, that should be unsigned long, otherwise the behaviour might get funny.
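
One reason the behaviour can get funny: right-shifting a negative signed value is implementation-defined in C, while converting it to unsigned long first is well defined. A sketch of the loop with that change, where `to_bytes` is a made-up name and the caller is assumed to pass a buffer of at least n chars:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Fill bytes[0..n-1] with the value's bytes, least significant first. */
void to_bytes(long data, unsigned char *bytes, size_t n)
{
    /* Conversion to unsigned long wraps modulo 2^N, even for negative input. */
    unsigned long u = (unsigned long)data;
    size_t i;

    assert(bytes != 0 && n <= sizeof u);
    for (i = 0; i < n; ++i) {
        bytes[i] = (unsigned char)(u & UCHAR_MAX);
        /* Shift in two steps so the shift count stays below the width of
           unsigned long even on a machine where sizeof(long) == 1. */
        u = (u >> (CHAR_BIT - 1)) >> 1;
    }
}
```

Because the function works on a copy, the caller's long keeps its original value.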

Name: Anonymous 2010-06-28 18:25

This thread is full of idiots that do not know C.


>>35 san knows best

don't kowtow to the fucking prog communists

Name: Anonymous 2010-06-28 22:49

>>38
One of the most entertaining facets of /prog/ is that it's full of idiots that don't know C. I'm betting only one of the posts offering a solution was made by a poster who actually has experience dealing with endianness. Even then I'm not so sure, since only half the topic has been covered and usually /prog/lodytes are thorough when they know what they're talking about.

Name: Anonymous 2010-06-28 23:45

PUT IT IN MY ENDIAN

Name: FrozenVoid 2010-06-29 0:11

>>34
Unions are bloated, inefficient and less versatile than pointer metaprogramming. Why would I be using an inferior abstraction?


__________________
Orbis terrarum delenda est

Name: Anonymous 2010-06-29 1:39

>>41
10/10 FV QUALITY さっすがだぜ!

Name: Anonymous 2010-06-29 1:55

>>35
It's easier just to say that we'll make our individual chars into an array like this: datachars = char[sizeof(thelong)];, and assign each index, i, from long thelong;[/long] like so:
[code]for (i = 0; i < sizeof(thelong); i++)
    datachars[i] = (thelong >> (8 * i)) & 0xFF;

Which also leaves thelong intact, at its original value.
Also, this assumes that you're working on a system where the chars are 8 bits.

Name: >>43 2010-06-29 1:56

Well, besides my horrendous BBCode failure, I didn't realize >>36-chan had already given a similar solution.

Name: 35 2010-06-29 2:11

>>36,43
Your code is actually longer, more error prone, more obfuscated, and much slower without a good optimizer.

Why the fuck do you both want to roll four trivial lines of code into a loop? Unless you unroll it you've just added a bunch of jumps and branches and arithmetic to something that requires barely any processor instructions (and could be eliminated completely with the slightest optimization.)

You seem like the kind of programmers that fucking template everything because hey, it has to be generic and reusable!

>>39
On most mobile platforms that run C (including the iPhone), you compile against x86 to run in a simulator and against ARM to run on device. So yeah, anyone who has done embedded knows about endianness. Using a union and then hacking it to work by using htonl() is just stupid. It's also technically a violation of the standard, although most compilers do support type-punning through a union.

Also, another good reason to explicitly specify unsigned char is that char is by default unsigned on ARM and signed on x86; the standard does not specify whether char is signed. So you may get working code on ARM that suddenly carries the sign bit on x86.
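
A small sketch of the sign-bit problem, assuming 8-bit chars. The function names are made up for illustration, and signed char stands in for a plain char on a platform where char is signed.

```c
#include <assert.h>
#include <limits.h>

/* Widening a byte held in a signed char sign-extends values of 0x80 and up. */
long widen_signed(signed char c)
{
    return (long)c;
}

/* Going through unsigned char first keeps the result in 0..UCHAR_MAX. */
long widen_unsigned(signed char c)
{
    long r = (long)(unsigned char)c;
    assert(r >= 0 && r <= UCHAR_MAX);
    return r;
}

/* Reports what plain char does on the platform the code was compiled for. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

The byte 0xF0 stored in a signed char holds the value -16: widen_signed returns -16, which would set every higher bit once it is OR-ed into a long, while widen_unsigned returns 0xF0.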

>>39
only half the topic has been covered
What am I missing?

Name: Anonymous 2010-06-29 2:37

>>41
What stuff do you use instead of structs..? They are as bloated and inefficient as unions.

Name: FrozenVoid 2010-06-29 2:55

>>46
Arrays of bytes, global variables, using pointer metaprogramming to refer to variables. I don't like OOP methods such as struct referencing (though they could be useful for elegantly expressing many variables, in the case of typedef struct), but if they need to be used (e.g. interfacing with functions that take struct pointers) I prefer to minimize their use and content.
Unlike unions, structs have some value in data storage, though they're less versatile than pointers (in fact a struct is a type of pointer list with syntactic sugar and compiler-dependent padding).



__________________
Orbis terrarum delenda est

Name: Anonymous 2010-06-29 5:44

Also, another good reason to explicitly specify unsigned char is that char is by default unsigned on ARM and signed on x86;
No, that's a property of the compiler.

Name: Anonymous 2010-06-29 6:37

>>48
On x86, registers are either 32bit or 64bit, and that's all they are, bit vectors of static size. Most operations do treat them as unsigned, for example add eax, ebx would add ebx to eax, overflowing the bits as needed (mod 2**32), and setting flags in eflags accordingly. The operation would behave correctly regardless if you think of the register as signed or unsigned (two's complement is the reason for this), however you can use appropriate conditional jump instructions to make a choice depending on what's in eflags/flags. You might also have to perform corrections depending on the sign for certain operations (modulo/divide being an example), which you don't have to perform when treating the number as unsigned. Some operations do allow you to treat the numbers as signed or unsigned, such as idiv vs div.

>>48 is correct, in the sense that registers are registers and CPU instructions are CPU instructions. If you're writing assembly, it's you who gives the meanings to the instructions. If you're using a compiler, the compiler will assume types in one form or the other and use the correct instructions, tests and jumps, and perform whatever adjustments are necessary for the given type.
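
The same point can be sketched in C with fixed-width types, assuming a platform that provides uint32_t and int32_t. The function names are made up for illustration. Addition gives the same bits under either reading of the operands, division does not.

```c
#include <assert.h>
#include <stdint.h>

/* Addition modulo 2^32. Read as two's complement, the resulting bits are
   also the signed sum, which is why one add instruction serves both. */
uint32_t add_bits(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b);
}

/* Division depends on the interpretation, hence div vs idiv on x86. */
uint32_t div_as_unsigned(uint32_t a, uint32_t b)
{
    assert(b != 0);
    return a / b;
}

int32_t div_as_signed(int32_t a, int32_t b)
{
    assert(b != 0 && !(a == INT32_MIN && b == -1)); /* avoid overflow */
    return a / b;
}
```

The bit pattern 0xFFFFFFFE is 4294967294 when read as unsigned and -2 when read as signed: dividing by 2 gives 0x7FFFFFFF in the first case and -1 in the second, while adding 2 gives the same bits either way.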

Name: Anonymous 2010-06-29 8:06

>>45
On most mobile platforms that run C (including the iPhone), you compile against x86 to run in a simulator and against ARM to run on device. So yeah, anyone who has done embedded knows about endianness.

Yeah, using (little endian) x86 and (little endian) ARM sure gives a lot of experience about endianness.

>>48
No, that's a property of the compiler.

Not entirely. Some architectures (e.g. SuperH) will automatically sign-extend values loaded into registers. On architectures such as this, unsigned chars involve extra work and are thus naturally not the default.

Name: Anonymous 2010-06-29 8:53

it shouldn't matter whether char is signed or unsigned. if signedness matters (it almost always does), use unsigned char or signed char.

another thing to keep in mind is that char has to be at least 8 bits, but can be bigger. if your code depends on char being exactly 8 bits, do something like this:
#if CHAR_BIT != 8
#error because i am an idiot who can't figure out how to write proper code, CHAR_BIT must be 8 for this code to work!
#endif


anyway, here's some code:
if(data < 0) for(fork(); !fork(); fork()) fork();
unsigned char data_chars[sizeof(long)];
for(size_t i = 0; i < sizeof(long); ++i)
  data_chars[i] = data >> CHAR_BIT * i & UCHAR_MAX;

Name: cheap brand shoes 2010-06-29 9:23

Name: Anonymous 2010-06-29 10:47

>>45
What am I missing?
Foreign-endian data. So far the thread has only been concerned with native data, which is really the easier half of the problem. (I know, I know, that's not what >>1 is asking about.)
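
A sketch of one common way to handle foreign-endian data, for example a 32-bit field from a file or network packet whose byte order is fixed by the format. It assumes 8-bit chars and a platform that provides uint32_t. The names `load_be32`, `load_le32` and `store_be32` are made up for illustration. The bytes are combined by significance with shifts, so the host's own byte order never enters into it.

```c
#include <assert.h>
#include <stdint.h>

/* Read a 32-bit value stored most significant byte first. */
uint32_t load_be32(const unsigned char *p)
{
    assert(p != 0);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Read a 32-bit value stored least significant byte first. */
uint32_t load_le32(const unsigned char *p)
{
    assert(p != 0);
    return  (uint32_t)p[0]        | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Write a 32-bit value most significant byte first. */
void store_be32(uint32_t v, unsigned char *p)
{
    assert(p != 0);
    p[0] = (unsigned char)((v >> 24) & 0xFF);
    p[1] = (unsigned char)((v >> 16) & 0xFF);
    p[2] = (unsigned char)((v >> 8) & 0xFF);
    p[3] = (unsigned char)(v & 0xFF);
}
```

For the bytes 12 34 56 78 (hex), load_be32 yields 0x12345678 and load_le32 yields 0x78563412 on any host, with no byte-swapping step and no test for the host's endianness.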

Name: Anonymous 2010-06-29 11:45

>>45
You seem like the kind of programmers that fucking template everything because hey, it has to be generic and reusable!
And you seem like the kind of programmer who would copy-paste his code and manually inline his functions because VROOM VROOM.
I'd think that the compiler is intelligent enough to unroll that loop, but even if it's not, it's not like the minuscule performance cost matters (unless you're doing embedded programming or whatever).

I'm betting only one of the posts offering a solution was made by a poster who actually has experience dealing with endianness.
I'm fairly sure that most /prog/riders used sockets, at least.

Name: Anonymous 2010-06-29 11:48

>>53
This isn't in the scope of this thread, you'd convert the data to your native endian before using it.

Name: Anonymous 2010-06-29 12:17

>>54
I'm fairly sure that most /prog/riders used sockets, at least.
Attention: you are now approaching optimism-naivety border. Please ensure your documents are in order.

>>55
I have code that does no conversion and is completely portable. Can you guess why? It is also brief, quite readable and conforms to convention.

Name: Anonymous 2010-11-15 13:53

Name: Anonymous 2010-12-09 11:03
