
New and revolutionary data compression scheme!

Name: Anonymous 2009-06-13 17:17

Infinite compression?

I've always been interested in how compressed files work and why the compression factor is so low.
The entropy explanation isn't something I would accept without tinkering with the data. The idea of my compression algorithm (currently under development) is to
use the algebraic properties of files to express the data in a more concise way.
The thing might sound trivial, but its implementation is not.
Normal compression works by splitting a FILE into pieces and expressing the redundant ones in the minimum volume.
Imagine a FILE read and converted to an arbitrary-precision integer. Now consider all the myriad ways to generate said integer. Sounds difficult? Not so much.
1. Take a prime number, e.g. 2, and raise it to the power closest to our integer (e.g. an exponent of 20000). Note the difference (FILE - result).
2. Keep the power that gives the smallest difference,
and proceed to the next step:
3. If the difference is negative, find 2 raised to a power X
that closes the gap, by subtracting it from the result of step 1.
If the difference is positive, just add 2 raised to the power closest to the difference.
The end result could be something like
2^6 + 2^5 - 2^3 + 2^1 - 2^0 = Integer = FILE
It can be improved further by using the powers of other prime numbers, with the goal of an expression taking the least space.
The same thing can be accomplished with arbitrary-length floating-point exponents like 2^123712.1282, which would converge faster but require some fixing to convert to a stable binary result.
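For what it's worth, steps 1-3 above can be sketched directly in Haskell. This is a minimal sketch of the signed-powers idea only; the names nearestExp and signedPowers are mine, not from the post or any library:


-- Sketch of the signed scheme above: repeatedly pick the power of 2
-- nearest to the remainder, record its signed exponent, and recurse
-- on the difference.

-- Exponent of the power of 2 nearest to n (for n >= 1).
nearestExp :: Integer -> Integer
nearestExp n = if n - 2 ^ lo <= 2 ^ (lo + 1) - n then lo else lo + 1
  where
    lo = floorLog2 n
    floorLog2 m = if m < 2 then 0 else 1 + floorLog2 (m `div` 2)

-- Decompose n into signed powers of 2, so that
-- n == sum [s * 2^e | (s, e) <- signedPowers n].
signedPowers :: Integer -> [(Integer, Integer)]
signedPowers 0 = []
signedPowers n
  | n < 0     = [(negate s, e) | (s, e) <- signedPowers (negate n)]
  | otherwise = (1, e) : signedPowers (n - 2 ^ e)
  where
    e = nearestExp n

main :: IO ()
main = print (signedPowers 89)
-- 89 comes out as [(1,6),(1,5),(-1,3),(1,0)], i.e. 2^6 + 2^5 - 2^3 + 2^0


Note the catch: each step still leaves a remainder up to half the nearest power, so the exponent list ends up about as long as the set-bit count of the integer.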
Posted by FrozenVoid at 15:37

Name: Anonymous 2009-06-14 5:25

>>18
Just learn Haskell. I've done much of your "work" for you.


module Main where
import Data.Bits ((.&.))
import qualified Data.ByteString.Lazy as BS

-- Interpret a byte string as a big-endian arbitrary-precision integer.
bsToInt :: BS.ByteString -> Integer
bsToInt = foldl (\i w8 -> i * 256 + fromIntegral w8) 0 . BS.unpack

-- Inverse of bsToInt (up to leading zero bytes).
intToBs :: Integer -> BS.ByteString
intToBs = BS.pack . reverse . word8s
  where word8s 0 = []
        word8s n = fromIntegral (n .&. 255) : word8s (n `div` 256)

-- Exponents of the powers of 2 that sum to n, largest first.
fvFactor :: Integer -> Integer -> [Integer]
fvFactor 0 _ = []
fvFactor n e
  | 2 ^ e > n = (e - 1) : fvFactor (n - 2 ^ (e - 1)) 0
  | otherwise = fvFactor n (e + 1)

main :: IO ()
main = do
  conts <- BS.getContents
  print (bsToInt conts)


This program will convert its input into a large integer. fvFactor will factor n into powers of two, starting the exponent search from e, which basically just tells you which bits of the integer are 1.
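As a sanity check, Data.Bits recovers the same list straight from the set-bit positions (setBits is my name, not from the post):


import Data.Bits (testBit)

-- Descending list of set-bit positions of a non-negative Integer;
-- this is exactly what fvFactor n 0 computes.
setBits :: Integer -> [Int]
setBits n = reverse [i | i <- [0 .. bitLen n - 1], testBit n i]
  where bitLen 0 = 0
        bitLen m = 1 + bitLen (m `div` 2)

main :: IO ()
main = print (setBits 714)
-- 714 = 512 + 128 + 64 + 8 + 2, so this prints [9,7,6,3,1]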

For example, let's take this file "This is just a test.\n". Its integer is 123362224183149454760125818890661635634932860857866.

fvFactor 123362224183149454760125818890661635634932860857866 0
  ==
[166,164,162,158,157,155,150,149,147,144,142,141,140,137,
136,133,126,125,123,120,118,117,116,113,112,109,102,101,
99,97,94,93,92,90,88,86,85,84,81,80,78,77,76,74,69,62,61,56,
53,46,45,44,42,38,37,34,32,30,29,28,25,24,22,21,20,18,13,11,
10,9,3,1]

That means that the file is equal to 2^166 + 2^164 + 2^162 + ... + 2^9 + 2^3 + 2^1.

Like someone said above, you're just expressing binary in a different way.

Using a higher base is actually less efficient: the gaps between consecutive powers b^e grow with b, so the residual left between your b^e form and n after each step is greater if your base b is greater.
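To put a number on that, here's a throwaway helper (nearestPowGap is my name) comparing the residual after one greedy step for base 2 versus base 10:


-- Distance from n to the nearest power of b: the residual left after
-- one greedy step of the scheme. Larger bases tend to leave more behind.
nearestPowGap :: Integer -> Integer -> Integer
nearestPowGap b n = minimum [abs (n - b ^ e) | e <- [0 .. ilog n + 1]]
  where ilog m = if m < b then 0 else 1 + ilog (m `div` b)

main :: IO ()
main = do
  print (nearestPowGap 2 5000)   -- 904  (nearest power: 2^12 = 4096)
  print (nearestPowGap 10 5000)  -- 4000 (nearest power: 10^3 = 1000)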
