for (c = 1; c < inpsize; c++) {
    l0 = getb0(input[c]);   /* bits 0-1 */
    l1 = getb1(input[c]);   /* bits 2-3 */
    l2 = getb2(input[c]);   /* bits 4-5 */
    l3 = getb3(input[c]);   /* bits 6-7 */
    /* parentheses required: == binds tighter than ^, so the original
       l0^3==l3 actually parsed as l0^(3==l3) */
    if ((l0 ^ 3) == l3) {
        total++;
        switch (startnib) {
        case 0: bitsave |= l0 | (l1 << 2) | (l2 << 4); startnib = 3; break;
        case 1: bitsave |= (l0 << 2) | (l1 << 4) | (l2 << 6);
                fputc(bitsave, datafile); bitsave = 0; startnib = 0; break;
        case 2: bitsave |= (l0 << 4) | (l1 << 6);
                fputc(bitsave, datafile); bitsave = l2; startnib = 1; break;
        case 3: bitsave |= (l0 << 6); fputc(bitsave, datafile);
                bitsave = l1 | (l2 << 2); startnib = 2; break;
        default: break;
        }
        output[opos] |= (1 << bitpos);   /* mark "3/4 byte saved" in the bitmap */
    } else {                             /* save all four parts */
        switch (startnib) {              /* no change in startnib */
        case 0: bitsave |= l0 | (l1 << 2) | (l2 << 4) | (l3 << 6);
                fputc(bitsave, datafile); bitsave = 0; break;
        case 1: bitsave |= (l0 << 2) | (l1 << 4) | (l2 << 6);
                fputc(bitsave, datafile); bitsave = l3; break;
        case 2: bitsave |= (l0 << 4) | (l1 << 6);
                fputc(bitsave, datafile); bitsave = l2 | (l3 << 2); break;
        case 3: bitsave |= (l0 << 6); fputc(bitsave, datafile);
                bitsave = l1 | (l2 << 2) | (l3 << 4); break;  /* was l2<<4: a typo */
        default: break;
        }
>>2
I formatted it for easy reading, added comments, and posted it with [code] tags. What is wrong with it?
Name:
Anonymous2011-10-12 11:59
>>3
You forgot the int from main(int argc,char**argv){
It might be valid C code, but it's very hard to read.
Name:
Anonymous2011-10-12 12:21
>C89
Why? Why are you stuck in the 80s?
>u1-8
WHY?
Name:
FrozenVoid2011-10-12 12:24
>>4
It's superfluous.
>>5
u8 is easy to type instead of "unsigned long long": it's 2 vs 18 chars (a 9x difference in space).
It's easy to discern:
u8 = unsigned 8 bytes = quad
u4 = unsigned 4 bytes = int
u2 = unsigned 2 bytes = short
u1 = unsigned 1 byte = char
Name:
FrozenVoid2011-10-12 12:25
Here is the complete type defines from void.h
#define u1 unsigned char
#define u2 unsigned short
#define u4 unsigned int
#define u8 unsigned long long
#define s1 signed char
#define s2 signed short
#define s4 signed int
#define s8 signed long long
#define f2 short float
#define f4 float
#define f8 double
#define f10 long double
Unconventional, because such numbering typically counts bits: u32, for example.
Unnecessary, because stdint.h already provides uintN_t (plus the uint_leastN_t and uint_fastN_t variants).
Wrong, because, for example, unsigned long is not 32 bits in size, but at least 32 bits in size.
Also, if you're using C89, long long is not defined.
Also, why not typedefs instead of macros?
IOW: extremely shitty code.
Name:
Anonymous2011-10-12 12:31
It lacks some defines; here:
#define uf10 unsigned long double
#define uf8 unsigned double
#define uf6 unsigned long float
#define uf4 unsigned float
#define uf2 unsigned short float
>>8
I want to easily define some quantity of bytes: I don't think in terms of bits.
Bits are not directly addressable via a pointer, only bytes.
I want a variable whose size can be defined in bits, and I don't think such a thing exists: in fact, I have solved over a dozen corruption errors which all stem from the fact that bits cannot be addressed individually.
There is no uint12bits or whatever nonsense. There are bytes, just bytes, and ways to address them. To get a real 12 bits, you have to define a struct with bitfields, which contains byte-level variables.
And in practice, it's not worth the abstraction if you can write bithacks for fast and optimized code using shifts/masks.
Name:
FrozenVoid2011-10-12 12:46
>>12
It's easy to understand:
1. Every byte is split into 4 parts of 2 bits each.
2. If part0^3 == part3 (part3 is part0 inverted), save only 3/4 of the byte and set a 1 bit in the bitmap.
3. If part0^3 != part3, save the entire byte and set a 0 bit in the bitmap.
4. The bitmap can be compressed conventionally for a further 20-40% reduction in size.
5. The data with the saved bytes is a separate file which can be recursively compressed again (about a 3-10% reduction in size).
>>15
It does not violate any "pigeonhole principle"; it just wouldn't compress some data (that which has unlinked bit parts, as in >>14).
Name:
FrozenVoid2011-10-12 13:29
The idea of compressing a single byte is about redundancy of the representation, as described in http://dis.4chan.org/read/prog/1318181159/
The xor part is about avoiding reliance on static properties of the data,
e.g. compressing all 00's or 101's; it shifts the generation onto the data context, making it more dynamic (if you notice, the "removed" bit parts are essentially random, since they reflect the underlying data structure from which they are reconstructed).
Assuming a homogeneous distribution:
- 25% of the time (when the parts match), you spend 7 bits per byte (6 data + 1 bitmap);
- 75% of the time (when they don't), you spend 9 bits per byte (8 data + 1 bitmap).
That is, on average, 8.5 bits per byte. Awesome compression.
Also, why the fuck do you need to "invert" parts? For "fast and optimized code"?
Seriously, just stop trying. You visibly have no knowledge of either C or information theory.
And, not to pose as a completely mean guy: read about Huffman coding. It has been proven optimal for symbol-by-symbol entropy coding, and it is easy to implement. Oh, unless your algorithm is rather "specialized" for some "special data format" I'm unaware of.
Name:
FrozenVoid2011-10-12 13:41
>That is, on average, 8.5 bits per byte. Awesome compression.
The filesize is reduced:
1. A completely incompressible file of 1 million binary digits is
now 337kb (with a 51kb bitmap, which is further compressible since it contains redundancy) instead of 415kb. http://marknelson.us/2006/06/20/million-digit-challenge/
Name:
FrozenVoid2011-10-12 13:59
2. Since the encoder is context-sensitive, it does not care whether
25% of the possible values occur (that is the theoretical frequency for a static, single-value match); it just xors and matches data from the stream.
I will demonstrate why your thinking is deficient:
1. The "invert" is an xor generation step.
2. It generates data from context: if the data were the same all the time, such as all 00's, it shows up as no matches; it is meant for more random data.
3. There is a small pool of 4 possible bitstrings of 2 bits.
Xoring with 3 flips all bits of a 2-bit string: 00^3=11, 01^3=10, 10^3=01, 11^3=00. This ensures the 2 bits of the last part
are different, and in random data there is a distance of 4 bits from part0 to part3, which makes this more random still.
4. The value system is meaningless: there is no static partX which is found and eliminated; it scans for "change".
The bytes are only used as a container for start and end: the scheme could be reworked with bitfields and more optimal generation, but the principle remains the same:
Context can generate its own data, as if it were a fractal.
Name:
Anonymous2011-10-12 16:08
Someone just tell this nigger what a fast fourier transform is already.
Name:
FrozenVoid2011-10-12 18:55
>>21
This is about lossless compression. Infinite lossless compression.