
New and revolutionary data compression scheme!

Name: Anonymous 2009-06-13 17:17

Infinite compression?

I've always been interested in how compressed files work and why the compression factor is so low.
The entropy explanation isn't something I would accept without tinkering with the data. The idea of my compression algorithm (currently under development) is to use algebraic properties of files to express the data in a more concise way.
The thing might sound trivial, but its implementation is not.
Normal compression works by splitting the FILE and finding redundant pieces to express them in minimum volume.
Imagine a FILE read and converted to an arbitrary-precision integer. Now consider all the myriad ways to generate said integer. Sounds difficult? Not so much.
1. Let's take a prime number, e.g. 2, and raise it to the power (e.g. 20000) that lands closest to our integer. Note the difference (FILE - result).
2. Get the smallest difference with the powers available, and proceed to the next step:
3. If the difference is negative: find 2 raised to the power of X which would close the gap, by subtracting it from #1.
If the difference is positive, just add 2 raised to the power closest to the difference.
The end result could be something like
2^6+2^5-2^3+2^1-2^0=Integer=FILE
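
The steps above can be sketched in Python as a greedy loop. This is only one reading of the post, under assumptions: the FILE is taken as a big-endian unsigned integer, only base 2 is used, ties go to the smaller power, and the names nearest_power_exponent, signed_power_terms and evaluate are made up for illustration.

```python
def nearest_power_exponent(r):
    """Exponent k such that 2**k is the power of two closest to r (r > 0)."""
    k = r.bit_length() - 1            # 2**k <= r < 2**(k + 1)
    low, high = 1 << k, 1 << (k + 1)
    # Assumption: on a tie, prefer the smaller power.
    return k if r - low <= high - r else k + 1


def signed_power_terms(n):
    """Greedily write n as a sum of signed powers of two.

    Returns a list of (sign, exponent) pairs, largest correction first.
    """
    terms = []
    remainder = n
    while remainder != 0:
        sign = 1 if remainder > 0 else -1
        k = nearest_power_exponent(abs(remainder))
        terms.append((sign, k))
        remainder -= sign * (1 << k)  # the gap shrinks on every pass
    return terms


def evaluate(terms):
    """Rebuild the integer from its (sign, exponent) terms."""
    return sum(sign * (1 << k) for sign, k in terms)


# Treat some bytes as the FILE (assumed big-endian, unsigned).
file_int = int.from_bytes(b"Z", "big")   # 90
terms = signed_power_terms(file_int)     # 2^6 + 2^5 - 2^2 - 2^1
assert evaluate(terms) == file_int
```

Each term still has to carry a sign and an exponent, so this sketch says nothing about whether the term list ends up smaller than the FILE itself.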
It can be improved further by using powers of other prime numbers, with the idea of the expression taking the least space.
The same thing can be accomplished with arbitrary-length floating-point exponents like 2^123712.1282, which would converge faster but would require some fixing to convert to a stable binary result.
Posted by FrozenVoid at 15:37

Name: Anonymous 2009-06-15 12:24

The only way I see this compression scheme working and being feasible in polynomial time is when the data is sufficiently large and sparse.
The header could consist of the original file length (and perhaps the number used for decompression, but let's assume that this is 2).
Then the compressed file would look like {header}{pos}{pos}{pos}.... The length (in bits) of a single {pos} would be ceiling $ logBase 2 originalLen.
Essentially, this scheme would specify where the 1 bits are, so a file 000000100000001000010000000000000000000000000000001 would be compressed as header + 4 positions.
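
A minimal Python sketch of that positional layout, for illustration only. It works on a string of '0'/'1' characters rather than packed bits, leaves the header out of the size count, uses 0-based positions, and the names pos_width, encode_positions, decode_positions and payload_bits are hypothetical.

```python
def pos_width(original_len):
    """Bits per {pos}: ceiling(log2(original_len)), using integers only."""
    return max(original_len - 1, 0).bit_length()


def encode_positions(bits):
    """Return (original length, 0-based positions of the 1 bits)."""
    return len(bits), [i for i, b in enumerate(bits) if b == "1"]


def decode_positions(original_len, positions):
    """Rebuild the bit string from its length and the 1-bit positions."""
    out = ["0"] * original_len
    for i in positions:
        out[i] = "1"
    return "".join(out)


def payload_bits(original_len, positions):
    """Size of the {pos}{pos}... part in bits (header not counted)."""
    return len(positions) * pos_width(original_len)


# The example file from the post.
sample = "000000100000001000010000000000000000000000000000001"
length, positions = encode_positions(sample)
assert len(positions) == 4
assert decode_positions(length, positions) == sample
print(payload_bits(length, positions), "payload bits vs", length, "original bits")
```

The payload only beats the original when the number of 1 bits times the {pos} width is smaller than the file length, which is why the data has to be sparse.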
The scheme could also specify whether the positions specify 0s or 1s. Using a number p other than 2 is possible too, but in that case the packet length will increase to represent the state of the digit (ceiling $ logBase 2 $ originalLen * (p - 1)).
Additionally, if p was, say, 8, but the only digits appearing (aside from 0, being the base digit) were, say, [1, 2, 5, 7] (notice length digits == 4), the number of states would be 4, so the length of one {pos} would increase by 2 bits, not 3.
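
The packet-length arithmetic can be checked with a few lines of Python. This is a sketch under assumptions: packet_width and digit_states are hypothetical names, and the file length of 1024 digits is invented for the example since the post gives none.

```python
def ceil_log2(n):
    """ceiling(log2(n)) for n >= 1, computed without floating point."""
    return (n - 1).bit_length()


def packet_width(original_len, states=1):
    """Bits per packet: ceiling(log2(original_len * states)).

    states is the number of non-base digit values a packet may carry:
    1 for p == 2, p - 1 for a general base p, or fewer if only some
    digits actually appear in the file.
    """
    return ceil_log2(original_len * states)


def digit_states(digits):
    """Count the distinct non-zero digits that need to be represented."""
    return len({d for d in digits if d != 0})


original_len = 1024                      # assumed length, for illustration
base2 = packet_width(original_len)       # position only
full_p8 = packet_width(original_len, 8 - 1)
seen = digit_states([1, 2, 5, 7])        # the digits from the post
restricted = packet_width(original_len, seen)

assert seen == 4
assert restricted - base2 == 2           # 2 extra bits
assert full_p8 - base2 == 3              # 3 extra bits with all 7 digits
```

A decoder would also need the list of digits that actually occur, presumably in the header; that cost is not counted here.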
That's just expanding on >>91, though.
