Return Styles: Pseud0ch, Terminal, Valhalla, NES, Geocities, Blue Moon. Entire thread

FIOC is SAF (Slow as Fuck)

Name: Anonymous 2008-10-13 20:16

While reading my SICP, I found a neat fibonacci function that has log n operations. I looks something like this.


(define (fib n)
  (fib-iter 1 0 0 1 n))

(define (fib-iter a b p q count)
  (cond ((= count 0) b)
        ((even? count)
         (fib-iter a
                   b
                   (+ (* p p) (* q q))     ; compute p'
                   (+ (* q q) (* 2 p q))   ; compute q'
                   (/ count 2)))
        (else (fib-iter (+ (* b q) (* a q) (* a p))
                        (+ (* b p) (* a q))
                        p
                        q
                        (- count 1)))))


So I implemented it in Haskell, just for fun.


fib n = fib' 1 0 0 1 n
  where fib' a b p q 0 = b
        fib' a b p q n | even n =
          fib' a b (p * p + q * q) (q * q + 2 * p * q) (n `div` 2)
        fib' a b p q n =
          fib' (b * q + a * q + a * p) (b * p + a * q) p q (n - 1)


To calculate and print fib(1000000) and fib(10000000) (i.e. the millionth and ten-millionth fibonacci numbers) takes about 3.26 seconds with the Haskell version (when compiled with GHC), and 43.24 seconds with the Scheme version (when interpreted by plt-r5rs, cause I can't figure out how to get it to be r6rs compliant, nor can I figure out how to make the compiler accept r5rs code :( )

Just to have some more fun, I also implemented it iteratively in Python.


def fib(n):
    a = 1
    b = 0
    p = 0
    q = 1

    while n > 0:
        if n % 2 == 0:
            new_p = p * p + q * q
            new_q = q * q + 2 * p * q
            n = n / 2
            p = new_p
            q = new_q
        else:
            new_a = b * q + a * q + a * p
            new_b = b * p + a * q
            n = n - 1
            a = new_a
            b = new_b

    return b


This takes CPython a whopping 1926.49 seconds to calculate the same numbers. Holy fuck, /prog/, maybe you're right about FIOC after all.

Here are the other implementations I've written and their times.

Ruby (995.35 seconds)

def fib(n)
    a = 1
    b = 0
    p = 0
    q = 1

    while n > 0
        if n % 2 == 0:
            new_p = p * p + q * q
            new_q = q * q + 2 * p * q
            n = n / 2
            p = new_p
            q = new_q
        else
            new_a = b * q + a * q + a * p
            new_b = b * p + a * q
            n = n - 1
            a = new_a
            b = new_b
        end
    end
    b
end


C using GMP (3.24 seconds)

void fib(unsigned long n)
{
    mpz_t a, b, p, q, tmp1, tmp2, tmp3;

    mpz_init(a);
    mpz_init(b);
    mpz_init(p);
    mpz_init(q);
    mpz_init(tmp1);
    mpz_init(tmp2);
    mpz_init(tmp3);

    mpz_set_ui(a, 1);
    mpz_set_ui(b, 0);
    mpz_set_ui(p, 0);
    mpz_set_ui(q, 1);

    while(n > 0)
    {
        if(n % 2 == 0)
        {
            // p * p + q * q
            mpz_mul(tmp1, p, p);
            mpz_mul(tmp2, q, q);
            mpz_add(tmp3, tmp1, tmp2);

            // q * q + 2 * p * q;
            mpz_mul_ui(tmp1, p, 2);
            mpz_add(tmp1, tmp1, q);
            mpz_mul(tmp1, tmp1, q);

            mpz_set(p, tmp3);
            mpz_set(q, tmp1);
            n >>= 1;
        }
        else
        {
            // b * q + a * q + a * p;
            mpz_mul(tmp1, b, q);
            mpz_mul(tmp2, a, q);
            mpz_mul(tmp3, a, p);
            mpz_add(tmp3, tmp3, tmp2);
            mpz_add(tmp3, tmp3, tmp1);

            // b * p + a * q;
            mpz_mul(tmp1, b, p);
            mpz_mul(b, a, q);
            mpz_add(b, tmp1, b);

            mpz_set(a, tmp3);
            n--;
        }
    }

    gmp_printf("%Zd\n", b);

    mpz_clear(a);
    mpz_clear(b);
    mpz_clear(p);
    mpz_clear(q);
    mpz_clear(tmp1);
    mpz_clear(tmp2);
    mpz_clear(tmp3);
}

Name: Anonymous 2008-10-13 22:50

>>11
Ah, your problem there is that Python's (and Ruby's, for that matter) str(bigint) conversion is slow as fuck and is causing the bottleneck. Your fibonacci algorithm may be O(log(N)), but binary to decimal conversion is a O(n2)1 operation which takes up the majority of the computation time.

That said, the other languages do the same conversion much faster. I suspect it's a performance bug of some sort (or just an accepted limitation) in the Python and Ruby libraries.

                                 
References: [1] http://mail.python.org/pipermail/python-list/2004-July/270235.html

Newer Posts
Don't change these.
Name: Email:
Entire Thread Thread List