
Fast positive integer power function

Name: Anonymous 2012-04-15 6:00

I'm trying to figure out how to compute an integer power of a double efficiently, with as little overhead as possible (avoiding loops and recursion to minimize the number of calls and jumps). Calculation speed is the priority.
So far I got this trivial piece of code:

double f(double a,char b){
if(b==2)return a*a;
if(b==3)return a*a*a;
if(b==4)return a*a*a*a;
if(b==5)return a*a*a*a*a;
if(b==6)return a*a*a*a*a*a;
if(b==7)return a*a*a*a*a*a*a;
if(b==8)return a*a*a*a*a*a*a*a;
if(b==9)return a*a*a*a*a*a*a*a*a;
        return a*a*a*a*a*a*a*a*a*a;
}


...which of course assumes the exponent is at most 10. The code can easily be expanded if higher exponents are needed.
I'm thinking this is probably optimal, but I know you /prog/riders have some magic tricks up your sleeve, so hit me with them.

Name: Anonymous 2012-04-15 6:08

Stop programming right now. Don't ever program again. Leave this board and never speak of what has occurred today.

Name: Anonymous 2012-04-15 6:16

>>2
Why?

Name: Anonymous 2012-04-15 6:20

here you go op:


>>> def power(n,p):
...   return eval("n" + "*n"*(p-1))
...
>>> power(2,4)
16
>>> power(256,3)
16777216
>>>

Name: Anonymous 2012-04-15 6:33

>>1
By god, you've successfully made it slower than even a non-tail-call-optimized recursive function.

Name: Anonymous 2012-04-15 6:41

>>5
That's incorrect; which is faster is implementation-dependent.

Name: Anonymous 2012-04-15 6:42

>>5
OP here. It is not slower than recursion. I just tested it. It is about two times faster.

Name: Anonymous 2012-04-15 6:45

>>7
Wrong again: implementation dependency. Only if you specify a compiler and a version may you say which is faster.

Name: Anonymous 2012-04-15 7:03

>>8
Well, if ey just tested it, then ey must have a compiler to test it with.

Name: Anonymous 2012-04-15 7:12

I think the fastest implementation involves decomposing the exponent into a sum of powers of two, and then performing repeated squaring of the base

Name: Anonymous 2012-04-15 7:13


f :: Num a => a -> Int -> a
f a b | b == 2 = a2
      | b == 3 = a3
      | b == 4 = a4
      | b == 5 = a4 * a
      | b == 6 = a3 * a3
      | b == 7 = a3 * a3 * a
      | b == 8 = a4 * a4
      | b == 9 = a4 * a4 * a
      | True   = a4 * a4 * a2
      where a2 = a  * a
            a3 = a2 * a
            a4 = a2 * a2

Name: Anonymous 2012-04-15 7:13

Name: VIPPER 2012-04-15 7:27

Do those few nanoseconds matter? Wouldn't it just be better to make an iterative function?

Name: Anonymous 2012-04-15 7:35

>>11
sweet

Name: Anonymous 2012-04-15 7:40

>>9
Yes but are you certain that his compiler is the same as >>5's compiler?

Name: Anonymous 2012-04-15 7:54

a^(2n) = (a^2)^n

double pow(double x, int n) {
  double x2;
  if (n == -1) return 1 / x;
  if (n < 0) return 1 / pow(x, -n);
  if (n == 0) return 1;
  if (n == 1) return x;
  x2 = x * x;
  if (n == 2) return x2;
  if (n % 2 == 0) return pow(x2, n / 2);
  return x * pow(x2, (n - 1) / 2);
}


This seems O(log n).

Name: Anonymous 2012-04-15 8:01

>>16
It'll do O(log(n)) multiplication operations, but if you were doing arbitrary-precision arithmetic, you would also have to account for the time each multiplication takes as the numbers become larger and larger. But since it's just a double, multiplication time is bounded by a constant.

Name: Anonymous 2012-04-15 8:13

>>15
nope, just that there is a compiler C_1 running on platform P_1, and a compiler C_2 running on a platform P_2, such that there exists two non tail call optimized implementations, NTC1 and NTC2, and there exists a reasonably small epsilon, such that:

time(C_1, P_1, >>1) > time(C_2, P_2, NTC1)

and

| 2*time(C_1, P_1, >>1) - time(C_2, P_2, NTC2) | < epsilon

Name: Anonymous 2012-04-15 8:13

>>16
double g(double a, unsigned b)
{
    double c, d = 1;
    unsigned e;

    for (; b > 0; b -= e) {
        c = a;
        for (e = 1; (e<<1) <= b; e <<= 1) { c *= c; }
        d *= c;
    }

    return d;
}


My test indicates that your function gives imprecise results.
f:   0.207229: 9999990000004500129961189827839490010025633955437521662187637223456768.000000
g:   1.470572: 9999990000004500129961189827839490010025633955437521662187637223456768.000000
pow: 1.312694: 9999990000004498597465648961950631651678606805128338043448515039854592.000000


That's for 9999999^10

Name: Anonymous 2012-04-15 8:14

wait, fixed:

time(C_1, P_1, >>1) > time(C_1, P_1, NTC1)

and

| 2*time(C_2, P_2, >>1) - time(C_2, P_2, NTC2) | < epsilon

Name: Anonymous 2012-04-15 8:31

#include <math.h>

double f(double x, double y)
{
    return pow(x, y);
}


Thread over.

Name: Anonymous 2012-04-15 10:09

dubs check them

Name: Anonymous 2012-04-15 10:26

>>21
Slower than a dead frozen snail at the bottom of a cement pool.

Name: Anonymous 2012-04-15 10:43

>>23
Stop using glibc.

Name: kodak_gallery_programmer !!kCq+A64Losi56ze 2012-04-15 11:11

>>21
The function is kind of slow because you call pow(). Nice job. You're still below average. Now go scrub another toilet.

Name: Anonymous 2012-04-15 11:25

>>18,20
And what happens if the compiler does TCO on the non tail call optimized version (you know, like a modern version of GCC would do with optimization)?

And what happens if the compiler creates separate functions for each branch in >>1's function and chooses the corresponding function to call at compile time so the whole procedure becomes branch-less (you know, like a modern version of GCC would do with optimization)?

Your first inequality fails as long as the second optimization is performed, the second of course can never fail since you haven't given any bounds to your epsilon (I presume it is at least greater than zero), which also renders the whole inequality totally worthless since it teaches us nothing new beyond that an absolute value is less than some other value.

Name: Anonymous 2012-04-15 11:42

#include <stdlib.h> /* rand */
#include <math.h>   /* sqrt, pow */

double f(double a, char b){
  while (1) {
    double try_result = rand() / 128.0;
    if (b == 2 && sqrt(try_result) == a)
      return try_result;
    else if (pow(a, b) == try_result)
      return try_result;
  }
}

Name: Anonymous 2012-04-15 12:16

>>26
The (maximum value of) epsilon for double is given in the C standard.

Name: Anonymous 2012-04-15 12:17

>And what happens if the compiler does TCO

You need to tell us what TCO stands for. Sorry, but I can't read the mind of a fucking idiot.

Name: Anonymous 2012-04-15 12:20

>>29
Anus Call Optimization

Name: Anonymous 2012-04-15 12:24

>>30
>>26 is a fucking moron.

Name: Anonymous 2012-04-15 13:43

>>33
nice dubs, mr. sage-no-text

Name: Anonymous 2012-04-15 13:48

>>29
You haven't read SICP, but you think you can call other people idiots?

Name: Anonymous 2012-04-15 13:49

>>32
Close, but no cigarillo.

Name: Anonymous 2012-04-15 14:30

>>33
Not only have I read SICP, but I've also done almost all of the exercises in that book.

Name: Anonymous 2012-04-15 14:57

>>26
>And what happens if the compiler does TCO on the non tail call optimized version (you know, like a modern version of GCC would do with optimization)?
Well, seeing how both posters were comparing >>1 to a non tail call optimized version of the program, I can only trust that they know for a fact that compiler C_i did not produce a tail call optimized version of implementation NTC_i, for platform P_i. We could account for unreliability there by assigning a probability to the event of NTC_i being tail call optimized.

>And what happens if the compiler creates separate functions for each branch in >>1's function and chooses the corresponding function to call at compile time so the whole procedure becomes branch-less (you know, like a modern version of GCC would do with optimization)?
Then that would be a pretty bitchen compiler. Inlining and constant folding ma nigga. But even then that is still describable using this model. Because the compiler and source file are both present in the time function, the compiler is free to generate whatever binary for the platform it wants. The time function only measures the time taken for the generated binary.

>Your first inequality fails as long as the second optimization is performed,
maybe, maybe not. You never know.

>the second of course can never fail since you haven't given any bounds to your epsilon (I presume it is at least greater than zero), which also renders the whole inequality totally worthless since it teaches us nothing new beyond that an absolute value is less than some other value.
There exists a reasonably small epsilon such that the sentence holds true. The value of epsilon depends on >>5-kun's accuracy in perception of time, and is probably something like a tenth of a second, or maybe half a second. Come to think of it, it's probably linearly related to the amount of time ey waits, so it might be better to describe it like that in general.

Name: Anonymous 2012-04-15 14:59

>>36
>maybe, maybe not. You never know.
Given the premises it obviously will, you stupid piece of shit.

>There exists a reasonably small epsilon such that the sentence holds true.
This is not a mathematically precise statement, you dumb shit; any estimate you give will be wrong until you define a compiler.

Read the C standard.

Name: Anonymous 2012-04-15 15:01

>>28
Irrelevant, the epsilon we're referring to is a mathematically unbounded epsilon not a C double.

>>29
You must be mentally challenged, can't you read you stupid piece of shit?

>>35
So you can't read properly, get real eyes or a real brain you fucking mouth breather.

Name: Anonymous 2012-04-15 15:03

>>29
Actually now that I think about it this is actually kind of funny, we're talking about tail call optimization but this genius can't figure out what TCO means?

Name: Anonymous 2012-04-15 15:10

So is kodak still trying to pass calculus 101?
