
Normalized vectors vs. Angles

Name: Anonymous 2011-06-09 6:38

Hello /prog/riders,
I've noticed, while developing a seihou clone in C, that replacing angles (one 4-byte float) with normalized vectors (two 4-byte floats) gives slightly better performance, as I can access the sine and cosine values directly.
However, my ``bullet'' structure is already pretty big, and I wondered if the cache penalty would offset the benefits of avoiding transcendental arithmetic in some conditions.

Name: Anonymous 2011-06-09 7:42

That's a problem you can worry about when you get to it. Unless you've profiled your program and have determined you have a performance problem, you don't have a performance problem.

Name: Anonymous 2011-06-09 7:47

IF YOU DON'T HAVE A PERFORMANCE PROBLEM,
WELL
YOU DON'T HAVE A PERFORMANCE PROBLEM.

Name: Anonymous 2011-06-09 8:02

>>2
>>3
I know, premature optimization is the root of all evil. But the changes are extremely localized and small and don't obscure the code at all.
Anyway, my question is theoretical. Which is more costly: computing sine and cosine, or a cache miss?
From my own research, it would appear that it's better to use vectors unless more than about half of the accesses miss the L2 cache, which is quite unlikely.

Name: Anonymous 2011-06-09 9:46

But the changes are extremely localized and small and don't obscure the code at all.

Precisely why you shouldn't worry about it now. If you later suspect it's a genuine problem and want to try the other approach, you can do it then without breaking anything.

Name: Anonymous 2011-06-09 10:13

If you really really really want to find out, code both versions and run them through measurement tools like cachegrind. Be aware that the results may vary from one chip architecture to the next.

Name: Anonymous 2011-06-09 12:29

Cachegrind? Whoa.

Name: Anonymous 2011-06-09 12:36

valgrind --tool=cachegrind valgrind --tool=cachegrind valgrind --tool=cachegrind valgrind --tool=cachegrind valgrind --tool=cachegrind valgrind --tool=cachegrind valgrind --tool=cachegrind valgrind --tool=cachegrind valgrind --tool=cachegrind valgrind --tool=cachegrind valgrind --tool=cachegrind valgrind --tool=cachegrind /bin/echo dicks

Name: Anonymous 2011-06-09 12:44

lol as fast as C xD

Name: Anonymous 2011-06-09 13:33

>>8
Still running.

Name: Anonymous 2012-01-20 20:27

I have dubs

Name: Anonymous 2012-01-20 20:49

if you rewrite it in python it will be ``fast enough''

Name: Anonymous 2012-01-20 21:13

>>1
You don't need 32 bits of precision per float. Problem solved.

(You do need more than 8 though.)

Name: Anonymous 2012-01-21 5:13

OP, are you profiling them both?

Name: Anonymous 2012-01-21 5:57

It might also run faster because the compiler was able to use specialized instructions to handle two floats that are handled somewhat similarly.

Name: Anonymous 2012-01-21 6:14

Every coordinate system can be efficiently implemented with quaternions. In your case a simple complex datatype would suffice.

Name: Anonymous 2012-01-21 7:30

>>16
>simple complex datatype
>wtfamireading.cpp

Name: Anonymous 2012-01-21 8:14

simply complex

Name: Anonymous 2012-01-21 9:00

>>18
silly me, right
... 'simply' omit that
