New and revolutionary data compression scheme!

Name: Anonymous 2009-06-13 17:17

Infinite compression?

I've always been interested in how compressed files work and why the compression factor is so low.
The entropy explanation isn't something I would accept without tinkering with the data. The idea of my compression algorithm (currently under development) is to
use algebraic properties of files to express the data in a more concise way.
The thing might sound trivial, but its implementation is not.
Normal compression works by splitting the FILE and finding redundant pieces to express them in minimum volume.
Imagine a FILE read and converted to an arbitrary-precision integer. Now consider all the myriad ways to generate said integer. Sounds difficult? Not so much.
1. Let's take a prime number, e.g. 2, and raise it to the power that lands closest to our integer, e.g. 20000. Note the difference (FILE - result).
2. Take the smallest difference among the powers available,
and proceed to the next step:
3. If the difference is negative: find the power of 2, 2^X, that best closes the gap, and subtract it from the result of step 1.
If the difference is positive, just add the power of 2 closest to the difference.
Repeat on whatever difference remains until it reaches zero.
The end result could be something like
2^6+2^5-2^3+2^1-2^0=Integer=FILE
It can be improved further by using powers of other prime numbers, with the aim of the expression taking the least space.
The same thing can be accomplished with arbitrary-length floating-point exponents like 2^123712.1282, which would converge faster but will require some fixing to convert to a stable binary result.
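
A minimal sketch of steps 1-3 in Python, restricted to powers of 2 only. The function names (file_to_int, signed_power_terms, terms_to_int) and the big-endian byte order are assumptions made for illustration, not part of the scheme as posted; ties between two equally close powers are broken towards the smaller one.

```python
def file_to_int(data: bytes) -> int:
    """Read the FILE as one big-endian arbitrary-precision integer (assumed byte order)."""
    return int.from_bytes(data, "big")


def signed_power_terms(n: int) -> list[tuple[int, int]]:
    """Greedily express n as a sum of signed powers of 2.

    Returns (sign, exponent) pairs, where sign is +1 or -1.
    """
    terms = []
    remaining = n
    while remaining != 0:
        sign = 1 if remaining > 0 else -1
        magnitude = abs(remaining)
        low = magnitude.bit_length() - 1  # 2**low <= magnitude < 2**(low + 1)
        # Pick whichever neighbouring power of 2 is closer to the gap.
        if (1 << (low + 1)) - magnitude < magnitude - (1 << low):
            exponent = low + 1
        else:
            exponent = low
        terms.append((sign, exponent))
        remaining -= sign * (1 << exponent)
    return terms


def terms_to_int(terms: list[tuple[int, int]]) -> int:
    """Evaluate the expression back into the integer."""
    return sum(sign * (1 << exponent) for sign, exponent in terms)


# 89 comes out as 2^6 + 2^5 - 2^3 + 2^0
print(signed_power_terms(89))  # [(1, 6), (1, 5), (-1, 3), (1, 0)]
```

To judge whether the expression is actually smaller than the FILE, compare len(terms) times the bits needed for one sign plus one exponent against file_to_int(data).bit_length().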
Posted by FrozenVoid at 15:37

Name: Anonymous 2009-06-15 3:02

What if the integer representation of the data happens to be the product of two prime numbers? I think you're going to be waiting a really long time to produce any usable results.
Summary of the problems I found:
* No way to represent leading zeros
* Not uniquely decodable unless some additional information is added (number of terms, length and position of each term, i.e. you're always bounded by the entropy of the source)
* The factorization problem places bounds upon how quickly this thing can work. This is virtually unusable for anything such as stream coding (DVD-ROM, GSM, WiFi, etc.), as well as any practically sized data sets.
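
The first two points can be checked with a few lines of Python. This is only a sketch: the helper names (file_to_int, int_to_file, shorter_strings) and the big-endian byte order are assumptions for illustration. It shows two different files collapsing to the same integer, the original length having to be stored alongside to undo that, and a simple count of why not every n-bit file can map to something shorter.

```python
def file_to_int(data: bytes) -> int:
    """Interpret the file as one big-endian integer (assumed byte order)."""
    return int.from_bytes(data, "big")


def int_to_file(n: int, length: int) -> bytes:
    """Rebuild the file; the original length must be supplied from outside."""
    return n.to_bytes(length, "big")


def shorter_strings(n_bits: int) -> int:
    """Number of distinct bit strings strictly shorter than n_bits (lengths 0..n_bits-1)."""
    return sum(2 ** k for k in range(n_bits))


# Two different files, one integer: the leading zero bytes are lost.
a = b"\x00\x00\x2a"
b = b"\x2a"
print(file_to_int(a) == file_to_int(b))  # True

# Recovering `a` needs its length as extra stored information.
print(int_to_file(file_to_int(a), len(a)) == a)  # True

# There are 2**8 files of 8 bits but only 2**8 - 1 shorter bit strings,
# so at least one 8-bit file cannot be given a shorter unique code.
print(shorter_strings(8), 2 ** 8)  # 255 256
```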
