In a lot of videogame code, whenever a lot of numbers are to be divided by one number, I usually see this:
inv = 1.0 / divisor;
...and then all the numbers * by inv
rather than all the numbers being / by divisor.
I'm guessing, at the CPU level, division is slightly slower and less efficient than multiplication? I can't really think of any other reason why someone would do this.
Name:
Anonymous 2006-09-25 17:25
If divisor is a constant, and your compiler is not smart enough, this should increase performance. Division is more costly than multiplication. Depending on the processor design, the difference can be very large.
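For reference, a minimal sketch of the idiom >>1 describes. The function names scale_div and scale_mul are made up for illustration. The two are not guaranteed to be bit-identical in floating point, because 1.0f / divisor is rounded before it is used. That is why a compiler generally only makes this substitution on its own when it is allowed to relax IEEE semantics (for example with GCC's -ffast-math), or when the reciprocal happens to be exact.

```c
#include <assert.h>
#include <stddef.h>

/* Divide every element by the same divisor: one division per element. */
void scale_div(float *v, size_t n, float divisor)
{
    assert(divisor != 0.0f);
    for (size_t i = 0; i < n; i++)
        v[i] /= divisor;
}

/* Same job with one division up front and one multiply per element.
   Results may differ from scale_div in the last bit, because
   1.0f / divisor is itself a rounded value. */
void scale_mul(float *v, size_t n, float divisor)
{
    assert(divisor != 0.0f);
    const float inv = 1.0f / divisor;
    for (size_t i = 0; i < n; i++)
        v[i] *= inv;
}
```

Calling scale_mul(v, n, 42.0f) in place of scale_div(v, n, 42.0f) is the whole trick. Whether it is worth doing by hand depends on the compiler, the flags, and how much the last bit matters.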
Name:
Anonymous 2006-09-25 17:44
>>2
>If divisor is a constant, and your compiler is not smart enough
That must be some compiler.
Name:
Anonymous 2006-09-25 22:43
It's because a computer thinks in base 2, while we think in base 10, so some simple base 10 decimals are harder to represent as base 2 fractions.
Name:
Anonymous 2006-09-25 22:54
>>4
God, just don't even post if you're going to say something this stupid.
>>4
It has nothing whatsoever to do with base conversions. Don't be so retarded.
Name:
Anonymous 2006-09-26 16:50
>>1
What the fuck kind of question is this? Of COURSE division is more work than multiplication. Ever used a fucking pencil and paper before? Ever done long division?
>>4
Fucking tard, it's a lot EASIER to do division in base 2. Lone prime factor, as opposed to the awful 2x5 decimal system.
You're all idiots.
Name:
Anonymous 2006-09-26 17:12
>>1-7
$ cat multtest.c;echo --;cat divtest.c
#include <stdio.h>
int main() {
    float inv;
    float i;
    inv=1.0/42;
    for(i=0;i<1048576;i++)
        printf("%f\n",i*inv);
    return 0;
}
--
#include <stdio.h>
int main() {
    float i;
    for(i=0;i<1048576;i++)
        printf("%f\n",i/42);
    return 0;
}
$ gcc -O3 -msse3 -ffast-math multtest.c -o multtest
$ gcc -O3 -msse3 -ffast-math divtest.c -o divtest
$ ./multtest|md5
f17a45565039beca9f5ed6ff27b17afd
$ ./divtest|md5
f17a45565039beca9f5ed6ff27b17afd
$ time -h ./multtest>/dev/null
4.07s real 2.97s user 0.01s sys
$ time -h ./divtest>/dev/null
4.07s real 2.97s user 0.01s sys
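A caveat on reading these timings: a million printf calls almost certainly cost far more than a million divisions, so the numbers above mostly measure output formatting. Below is a rough sketch of a variant that accumulates the results instead of printing them. The names sum_div and sum_mul are hypothetical, and clock() is coarse, so treat any numbers from it as indicative only.

```c
#include <assert.h>
#include <time.h>

/* Sum i / d for i in [0, n); CPU seconds used are stored in *secs. */
double sum_div(int n, float d, double *secs)
{
    assert(d != 0.0f);
    const clock_t t0 = clock();
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += (float)i / d;
    *secs = (double)(clock() - t0) / CLOCKS_PER_SEC;
    return sum;
}

/* Same sum using a precomputed reciprocal and a multiply per element. */
double sum_mul(int n, float d, double *secs)
{
    assert(d != 0.0f);
    const float inv = 1.0f / d;
    const clock_t t0 = clock();
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += (float)i * inv;
    *secs = (double)(clock() - t0) / CLOCKS_PER_SEC;
    return sum;
}
```

With n = 1048576 the loops may finish faster than clock() can resolve. Raising n or repeating the call several times gives a more usable measurement. Returning the sum keeps the compiler from discarding the loop.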
Name:
Anonymous 2006-09-26 17:20
>>9
Awesome, gcc for teh winrar. Now do it in Fortran.
>>11
Yes, because we all know video game programmers compile without optimization.
Name:
Anonymous 2006-09-26 19:04
>>9
oh wow...how did you get it to compile to the same assembly? i think you are cheating.
Name:
Anonymous 2006-09-26 19:31
>>11
ok... compiled with only -msse3:
$ /usr/bin/time -h ./multtest > /dev/null
4.75s real 2.96s user 0.01s sys
$ /usr/bin/time -h ./divtest > /dev/null
4.27s real 3.02s user 0.00s sys
but >>12 makes a good point. anyone who cares enough about performance to worry about this is probably going to be using optimization.
>>13
they are functionally equivalent. any good optimizing compiler should compile both to the same assembly.
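Strictly speaking, "functionally equivalent" only holds up to rounding: in IEEE floating point, i/42 and i*(1/42) can disagree in the last bit, which %f with six decimals may well hide. A small sketch for checking how far apart the two forms get (max_rel_gap is a made-up name, not something from the posts above):

```c
#include <assert.h>

/* Largest relative gap between i / d and i * (1 / d) for i = 1..n,
   all computed in single precision. */
float max_rel_gap(int n, float d)
{
    assert(n > 0 && d != 0.0f);
    const float inv = 1.0f / d;
    float worst = 0.0f;
    for (int i = 1; i <= n; i++) {
        const float q = (float)i / d;    /* divide each time */
        const float p = (float)i * inv;  /* multiply by reciprocal */
        float gap = (q - p) / q;
        if (gap < 0.0f)
            gap = -gap;
        if (gap > worst)
            worst = gap;
    }
    return worst;
}
```

For a power-of-two divisor such as 4.0f the reciprocal is exact and the gap is zero. For 42.0f it should stay within a few units in the last place, which is small but not nothing. If this is built with -ffast-math, the compiler may turn both forms into the same code, and the gap then reads as zero for any divisor.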
>>12 >>14
You shouldn't be depending on the compiler to optimize your code for you, just in case it doesn't catch it.
Name:
Anonymous 2006-09-26 22:33
>>18
To play devil's advocate, I'd argue that obfuscating code in the hopes of optimizing it can prevent the compiler from optimizing it properly in ways you didn't think of.
Name:
Anonymous 2006-09-27 1:23
Compilers get better at optimization with every generation, but our cognitive capacities remain exactly the same.
I used to leave nothing to chance, and did all optimization myself. There was just one problem: a few years later none of that was necessary, yet I still couldn't read a shred of what I'd written.
Moral of the story: write clear code. Please. :(
Name:
Anonymous 2006-09-27 12:35
Optimizing code is still very necessary. For example, applying a pattern to an array in blocks, rather than going through each element row by row, reduces cache misses. I don't think compilers will ever be able to do that, at least not in the near future.
>>22
If you have a large 2D array and you move left to right through each row, calculating a value based on adjacent values, then each time you miss the cache for the row you are on, you'll also miss it for the rows above and below it. However, if you divide the problem into blocks based on the cache line size, narrow enough that a few row segments fit in cache at once, then you will only have one miss instead of three each time you move to a new row. The only miss will be for the row below, as the current row and the row above should already be in cache.
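A minimal sketch of the blocking described above, assuming a row-major float array and a simple three-row sum as the "value based on adjacent values". The function names and the tile parameter are illustrative only. A sensible tile width depends on the cache and has to be measured.

```c
#include <assert.h>
#include <stddef.h>

/* Plain traversal: finish each full row before moving to the next. */
void stencil_rowwise(const float *in, float *out, size_t rows, size_t cols)
{
    for (size_t r = 1; r + 1 < rows; r++)
        for (size_t c = 0; c < cols; c++)
            out[r * cols + c] = in[(r - 1) * cols + c]
                              + in[r * cols + c]
                              + in[(r + 1) * cols + c];
}

/* Blocked traversal: walk down the array in vertical strips that are
   `tile` columns wide, so the segments of the row above and the current
   row are likely still cached when the next row is processed. */
void stencil_blocked(const float *in, float *out,
                     size_t rows, size_t cols, size_t tile)
{
    assert(tile > 0);
    for (size_t c0 = 0; c0 < cols; c0 += tile) {
        const size_t c1 = (c0 + tile < cols) ? c0 + tile : cols;
        for (size_t r = 1; r + 1 < rows; r++)
            for (size_t c = c0; c < c1; c++)
                out[r * cols + c] = in[(r - 1) * cols + c]
                                  + in[r * cols + c]
                                  + in[(r + 1) * cols + c];
    }
}
```

Both functions write the same values and leave the first and last rows of out untouched. Only the visiting order differs. Whether the blocked version is actually faster depends on the array size, the cache, and the hardware prefetcher.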
Name:
Anonymous 2006-09-28 12:21
>>10
$ cat multtest.f90;echo --;cat divtest.f90
program multtest
  real :: inv,i
  inv=1.0/42
  do i=0,1048676
    write(*,*) i*inv
  end do
end program multtest
--
program divtest
  real :: i
  do i=0,1048676
    write(*,*) i/42
  end do
end program divtest
$ gcc -O3 -msse3 -ffast-math multtest.f90 -o multtest-f90.s -S
In file multtest.f90:4
do i=0,1048676
1
Warning: Obsolete: REAL DO loop iterator at (1)
$ gcc -O3 -msse3 -ffast-math divtest.f90 -o divtest-f90.s -S
In file divtest.f90:3
do i=0,1048676
1
Warning: Obsolete: REAL DO loop iterator at (1)
$ diff multtest-f90.s divtest-f90.s
1c1
< .file "multtest.f90"
---
> .file "divtest.f90"
4c4
< .string "multtest.f90"
---
> .string "divtest.f90"
33c33
< movl $5, -276(%ebp)
---
> movl $4, -276(%ebp)
same assembly, therefore same speed.
much slower than the c versions, though:
$ time -h ./multtest-c>/dev/null
3.64s real 2.96s user 0.00s sys
$ time -h ./multtest-f90>/dev/null
55.25s real 25.09s user 1.54s sys
Name:
Anonymous 2006-09-29 13:20
LOOK, FORGET ALL THIS CACHE SHIT.
IN LIKE 2 YEARS, THEY WILL COME OUT WITH SRAM DIMMS. THEN ALL RAM WILL RUN AT THE SAME (HIGH) SPEED, AND ALL THIS SHIT ABOUT CACHE WON'T MATTER.
STUPID HARVARD ARCHITECTURE.
Name:
Anonymous 2006-09-29 13:28
LOOK, FORGET ALL THIS PROGRAMMING SHIT.
IN LIKE 2 YEARS, THEY WILL COME OUT WITH COMPUTERS THAT YOU CAN PROGRAM WITH YOUR MIND. THEN ALL PROGRAMS WILL RUN AT THE SAME (HIGH) SPEED, AND ALL THIS SHIT ABOUT PROGRAMMING WON'T MATTER.
STUPID HARVARD PROGRAMMING.
Name:
Anonymous 2006-09-30 13:22
LETS FORGET ABOUT /prog/ because they are worthless game programmers who will never finish any program ever.
Name:
Anonymous 2006-09-30 20:20
>>27
Hey look, it is that idiot that doesn't know what a game programmer is.