I'm trying to figure out how to compute an integer power of a double efficiently with as little overhead as possible (trying to avoid loops and recursion to minimize the amount of calls and jumps). Calculation speed is the priority.
So far I got this trivial piece of code:
...which of course assumes the exponent is at most 10. The code can easily be expanded if higher exponents are needed.
I'm thinking this is probably optimal, but I know you /prog/riders have some magic tricks up your sleeve, so hit me with them.
Given the premises then it obviously will you stupid piece of shit.
No really, the model doesn't tell you any information about how C_i does its compiling.
This is not a mathematically precise statement you dumb shit, any estimate you give will be wrong though until you define a compiler.
Exists(epsilon) with reasonably_small_amount_of_time_in_seconds_relative_to_human_perception_of_time(epsilon), such that P(epsilon)
And reasonably_small_amount_of_time_in_seconds_relative_to_human_perception_of_time(epsilon) might be defined as epsilon < .25 seconds, or something similar depending on what one find appropriate for describing the sentence in >>7.
It's true that a compiler has not been defined yet, making these statements not very useful. But they are justifiable in the sense that they are simply collecting truths from the statements provided in >>5 and >>7.
Name:
Anonymous2012-04-15 15:16
>>41
Congratulations, you don't know basic things like TCO, I bet your mom is proud of you mastering how to shitpost you fucking moron.
>>42
You're retarded, I'm going to stop talking to you.
You may also want to have Mathematica or Maple handy, because you will need to generate polynomial coefficients.
Name:
Anonymous2012-04-16 0:37
TCO (only pending computation is in power for 1 call), algorithm optimized, could be made iterative.
double powerEven(double x, int exp, int limit) {
if(exp != limit)
return powerEven(x * x, exp * 2, limit);
else
return x;
}
No, because most hardware doesn't have a built in ^ operator. You need to extract the mantissa and exponent bits directly from the IEEE754 encoding and do some bit twiddling and take the logarithm in base 2.
Name:
Anonymous2012-04-16 1:40
>>53
eh, I was messin with the exponent bits there.
double = (sign, mantissa, exponent)
x : double
y : int
pow(x,y) =
match x with (sign, mantissa, exponent in
(sign, mantissa, exponent * y)
I think that's it, although one would need to handler properly extracting the exponent, and converting from bias 127 or whatever it is and doing the multiplication. Overflows and underflows should take the number +/- inf or zero, depending on where it went. And then there's the cases when x is +/- inf or zero, or nan.
why don't just create n functions for every possible exponent given for your data type? Then you'll avoid the overhead needed for running into all those ifs
Don't forget to include your frozenvoid.h
double pow1(double a) { return a; }
double pow2(double a) { return a*a; }
double pow3(double a) { return a*a*a; }
//...
Name:
Anonymous2012-04-16 4:27
>>51
See >>23
In less than a minute, I can write my own pow function that calculates a non-integer power almost two times faster than the standard pow calculates an integer power.
Guess what niggers, because the standard C floating point arithmetic library is slow, narrow and inconsistent as fuck, I am now going to write my own fucking floating point library with operator overloading and everything. Maybe I will even make it arbitrary-precision.
Multi-valued functions? Superfunctions? Fuck you, I'll define everything however I want. I will make pi equal niggerfaggot if I must. I will implement superpositions and all sorts of crazy shit.
None of you care, but it felt good to say this in public. Why? Because I can pretend some of you care, even if no one does.
>>57
I care. I think it would be fun to overload every seeples operator to do something interesting for numerical stuff. But what's a superposition in this context?
Name:
Anonymous2012-04-16 4:52
TCO now stands for Total Cunt Off
This thread has become a TCO
Name:
Anonymous2012-04-16 5:00
>>58 I care. I think it would be fun to overload every seeples operator to do something interesting for numerical stuff.
I do C, not Sepples. But what's a superposition in this context?
I don't know yet, but I have a feeling I will need it when dealing with infinities, superfunctions, and you know, the unknown. I am faware that math is inherently flawed (according to Gödel) so I will constantly be correcting errors, but never quite get everything right. Then I will implement automatic error correction algorithms, algorithms that correct those algorithms, and so on.
I am so autistic of my nature that I enjoy every second of doing all that shit with no goal, meaning or purpose. As long as it doesn't require conscious effort, it is work that I will do for free. I define conscious effort to be any effort required to hinder oneself. Note that I consider self-destruction and self-domination to be two entirely distinct concepts.
Name:
Anonymous2012-04-16 5:09
>>60
Oh, and remember, kids: If the facts don't fit the theory, change the facts. -Albert Einstein
A Nehalem can do an F2XM1 and FYL2X together in less than 150 clock cycles. (They fucked it up with Sandy Bridge, which takes almost 800 clock cycles.) That's around the range you'll need to beat. A multiplication is about 5 cycles and add is 3 cycles. Exponentiation by squaring will probably do the trick.
If you use the exponentiation by squaring technique, and implement it using this:
#define NUM_BITS sizeof(long long)*8
double powerlonglong(double x, long long n) {
double powers[NUM_BITS]; // powers[i] = x^(2^i)
double mul;
int index;
powers[0] = x; // x^(2^0) = x^1 = x
for(index = 1; index < NUM_BITS; index++) {
powers[index] = powers[index-1]*powers[index-1]; // x^(2^i) = x^((2^i-1)*2) = (x^(2^i-1))^2
}
// x^n = x^(sum c_i2^i) = mul x^(c_i2^i) = mul x_i where x_i = 1 if c_i = 0,
// x_i = x^(2^i) = powers[i] if c_i = 1
mul = 1.0;
for(index = 0; index < NUM_BITS; index++) {
if(n&(1<<index)) {
mul *= powers[index];
}
}
return mul;
}
NUM_BITS is the number of bits taken by the exponent.
It will take NUM_BITS multiplications to generate the powers array. Then the needed elements of the powers array are multiplied together to form the final product. In the worst case (if the exponent is 0b1111111...), then every term in the powers array will be multiplied together again, giving NUM_BITS multiplications again.
So the whole thing can be done usng a worst case 2*NUM_BITS amount of multiplications. So accounting for time in multiplication alone, with would take 10*NUM_BITS clock cycles in the worst case, which would hypothetically give better performance than 150 clock cycles when NUM_BITS < 15, and it could beat sandy bridge when NUM_BITS < 80. So Sandy bridge seems like it would definitely be possible, although you'd have to get the control flow to be efficient.
>>66
index should have started at zero there I think:
got rid of the index variable:
double power3longlong(double x, unsigned long long n) {
double xs = x; // always x^(2^i) power on the ith run of the loop with i starting at zero.
double mul = 1.0;
for(; n; xs = xs*xs, n>>=1) {
if(n & 1) {
mul *= xs;
}
}
return mul;
}
>>72
nothing to wwww about.
what that guy is not-so-subtly telling you is that it's impossible to tell if an algorithm is wrong until it actually goes wrong. to make matters worse, you yourself decide what is wrong. to make matters even worse, if you want to find out the answer from somewhere other than yourself, you will have to deconstruct yourself. to make matters still worse, all self-deconstruction really does is change the definition of you.
Why not just use a lookup table? This why you only need to add the value of the number to the pointer and dereference it. You don't even need to call a function. HIBT?
Name:
Anonymous2012-04-17 8:34
>>75
double is 64-bit floating point variable. That means there are 2^64 different double variables.
The size of lookuptable should be 2^64 * 2^8 * 8 bytes. That quite a lot. Time to buy more RAM.