division... slow?

Name: Anonymous 2006-09-25 16:25

In a lot of videogame code, whenever a lot of numbers are to be divided by one number, I usually see this:

inv = 1.0 / divisor;
...and then all the numbers * by inv

rather than all the numbers being / by divisor.

I'm guessing, at the CPU level, division is slightly slower and less efficient than multiplication?  I can't really think of any other reason why someone would do this.

Name: Anonymous 2006-09-25 17:25

If divisor is a constant, and your compiler is not smart enough, this should increase performance. Division is more costly than multiplication. Depending on the processor design, the difference can be very large.

Name: Anonymous 2006-09-25 17:44

>>2
>If divisor is a constant, and your compiler is not smart enough

That must be some compiler.

Name: Anonymous 2006-09-25 22:43

It's because a computer thinks in base 2, while we do it in base 10, so some simple base 10 decimals are harder to convert to base 2 fractions.

Name: Anonymous 2006-09-25 22:54

>>4
God, just don't even post if you're going to say something this stupid.

Name: Anonymous 2006-09-26 5:29

>>4
Whut

Name: Anonymous 2006-09-26 11:05

>>4
It has nothing whatsoever to do with base conversions. Don't be so retarded.

Name: Anonymous 2006-09-26 16:50

>>1
What the fuck kind of question is this? Of COURSE division is more work than multiplication. Ever used a fucking pencil and paper before? Ever done long division?

>>4
Fucking tard, it's a lot EASIER to do division in base 2. Only one prime factor, as opposed to the awful 2x5 decimal system.

You're all idiots.

Name: Anonymous 2006-09-26 17:12

>>1-7
$ cat multtest.c;echo --;cat divtest.c
#include <stdio.h>

int main() {
        float inv;
        float i;
        inv=1.0/42;
        for(i=0;i<1048576;i++)
                printf("%f\n",i*inv);
        return 0;
}
--
#include <stdio.h>

int main() {
        float i;
        for(i=0;i<1048576;i++)
                printf("%f\n",i/42);
        return 0;
}
$ gcc -O3 -msse3 -ffast-math multtest.c -o multtest
$ gcc -O3 -msse3 -ffast-math divtest.c -o divtest
$ ./multtest|md5
f17a45565039beca9f5ed6ff27b17afd
$ ./divtest|md5
f17a45565039beca9f5ed6ff27b17afd
$ time -h ./multtest>/dev/null
        4.07s real              2.97s user              0.01s sys
$ time -h ./divtest>/dev/null
        4.07s real              2.97s user              0.01s sys

Name: Anonymous 2006-09-26 17:20

>>9
Awesome, gcc for teh winrar. Now do it in Fortran.

Name: Anonymous 2006-09-26 17:50

>>9
Try it without any optimization flags

Name: Anonymous 2006-09-26 18:31

>>11
Yes, because we all know video game programmers compile without optimization.

Name: Anonymous 2006-09-26 19:04

>>9
oh wow...how did you get it to compile to the same assembly? i think you are cheating.

Name: Anonymous 2006-09-26 19:31

>>11
ok... compiled with only -msse3:
$ /usr/bin/time -h ./multtest > /dev/null
        4.75s real              2.96s user              0.01s sys
$ /usr/bin/time -h ./divtest > /dev/null
        4.27s real              3.02s user              0.00s sys
but >>12 makes a good point. anyone who cares enough about performance to worry about this is probably going to be using optimization.

>>13
they are functionally equivalent. any good optimizing compiler should compile both to the same assembly.

Name: Anonymous 2006-09-26 19:52

>>14

Confirmed. Running gcc 3.1 with the -S option to generate assembly code issued two files that differed only in local label names.

Name: Anonymous 2006-09-26 20:19

>>15
True but it did not compile to the same binary. I have gcc 3.4.2 mingw-special.

Name: Anonymous 2006-09-26 21:11

>>16

That may be because the debugging information references the names of the source files / compiled program. For example:

~$ cat > test1.c
int main(void) { printf("prog"); }
~$ cp test1.c test2.c
~$ gcc test1.c -o test1
~$ gcc test2.c -o test2
~$ diff test1 test2
Binary files test1 and test2 differ
~$ strip test1 test2
~$ diff test1 test2
~$

Name: Anonymous 2006-09-26 22:03

>>12
>>14
You shouldn't be depending on the compiler to optimize your code for you, just in case it doesn't catch it.

Name: Anonymous 2006-09-26 22:33

>>18
To play devil's advocate, I'd argue that obfuscating code in the hopes of optimizing it can prevent the compiler from optimizing it properly in ways you didn't think of.

Name: Anonymous 2006-09-27 1:23

Compilers get better at optimization with every generation, but our cognitive capacities remain exactly the same.

I used to leave nothing to chance, and did all optimization myself. There was just one problem: a few years later none of that was necessary, yet I still couldn't read a shred of what I'd written.

Moral of the story: write clear code. Please. :(

Name: Anonymous 2006-09-27 12:35

Optimizing code is still very necessary. For example, applying a pattern to an array in blocks, rather than going through each element row by row, reduces cache misses.  I don't think compilers will ever be able to do that, at least not in the near future.

Name: Anonymous 2006-09-27 12:45

>>21

Explain the array / cache misses thing please

Name: Anonymous 2006-09-28 10:16

>>22
If you have a large 2D array, and you are moving left to right through all of the rows calculating a value based on adjacent values, each time you miss the cache for the row you are on, you'll also miss it for the rows above and below it.  However, if you divide the problem into blocks based on the cache line size, then you will only have one miss instead of three each time you move to a new row.  The only miss will be the row below it, as the current row and the row above it should already be in cache.

Name: Anonymous 2006-09-28 12:21

>>10
$ cat multtest.f90;echo --;cat divtest.f90
program multtest
  real :: inv,i
  inv=1.0/42
  do i=0,1048676
   write(*,*) i*inv
  end do
end program multtest
--
program divtest
 real :: i
 do i=0,1048676
  write(*,*) i/42
 end do
end program divtest
$ gcc -O3 -msse3 -ffast-math multtest.f90 -o multtest-f90.s -S
 In file multtest.f90:4

  do i=0,1048676
     1
Warning: Obsolete: REAL DO loop iterator at (1)
$ gcc -O3 -msse3 -ffast-math divtest.f90 -o divtest-f90.s -S
 In file divtest.f90:3

 do i=0,1048676
    1
Warning: Obsolete: REAL DO loop iterator at (1)
$ diff multtest-f90.s divtest-f90.s
1c1
<       .file   "multtest.f90"
---
>       .file   "divtest.f90"
4c4
<       .string "multtest.f90"
---
>       .string "divtest.f90"
33c33
<       movl    $5, -276(%ebp)
---
>       movl    $4, -276(%ebp)


same assembly, therefore same speed.
much slower than the c versions, though:
$ time -h ./multtest-c>/dev/null
        3.64s real              2.96s user              0.00s sys
$ time -h ./multtest-f90>/dev/null
        55.25s real             25.09s user             1.54s sys

Name: Anonymous 2006-09-29 13:20

LOOK, FORGET ALL THIS CACHE SHIT.

IN LIKE 2 YEARS, THEY WILL COME OUT WITH SRAM DIMMS.  THEN ALL RAM WILL RUN AT THE SAME (HIGH) SPEED, AND ALL THIS SHIT ABOUT CACHE WON'T MATTER.

STUPID HARVARD ARCHITECTURE.

Name: Anonymous 2006-09-29 13:28

LOOK, FORGET ALL THIS PROGRAMMING SHIT.

IN LIKE 2 YEARS, THEY WILL COME OUT WITH COMPUTERS THAT YOU CAN PROGRAM WITH YOUR MIND.  THEN ALL PROGRAMS WILL RUN AT THE SAME (HIGH) SPEED, AND ALL THIS SHIT ABOUT PROGRAMMING WON'T MATTER.

STUPID HARVARD PROGRAMMING.

Name: Anonymous 2006-09-30 13:22

LETS FORGET ABOUT /prog/ because they are worthless game programmers who will never finish any program ever.

Name: Anonymous 2006-09-30 20:20

>>27
Hey look, it is that idiot that doesn't know what a game programmer is.

Name: Anonymous 2006-09-30 20:41

what the fuck have you retards done to my thread?

Name: Anonymous 2006-09-30 23:20

>>29
welcome to 4chan

Name: Anonymous 2006-10-02 3:51

>>29
You can't expect us to respect a thread without posting memes, taking it mercilessly offtopic, or starting a flame war.

Name: Anonymous 2010-11-15 4:59

Name: Sgt.Kabu迍턛kiman쓌䫑 2012-05-28 20:11

Bringing /prog/ back to its people
All work and no play makes Jack a dull boy
