/prog/ - Sneaky Internal State

Name: Anonymous 2012-03-28 23:59

I've boiled what I want out of a programming language down to a single example. Problems like this bite me in the ass often and I haven't used a language, yet, that helps.

Suppose I'm writing a library for working with vectors. For simplicity, it's not fancy n-dimensional vectors, etc., let's say just three floats. So I write some code to perform operations on vectors, like add, subtract, dot product, cross product, rotate, scale... and then I come to "normalize."

Here, I come up with this optimization that vectors could also store a flag indicating whether or not they are normal. If that flag is set, then normalize has no work to do. The catch is that I have to decide which operations clear the flag, which operations set it, and which operations leave it unchanged. So I do the work and end up with a nice, efficient vector library.

I use my library for a long time -- long enough that I stop caring how that optimization worked. Eventually, months later, I decide to add some new functions to my vector library. Of course, I've forgotten all about the "normal" flag, and I forget to set/clear it in the new functions. Of course, there's no compile error or even a run-time error. I just observe strange behavior when the application runs and I have to spend hours stepping through floating-point math to figure out what's wrong.

How does your language of choice improve this situation?

Name: Anonymous 2012-03-29 0:06

I try to keep track of such things from the application using the library, and not from within library itself. I'd have operations that only work for normalized vectors, and more general operations that perform the normalization. Then in the application code, I will find spots where I know that a vector must be normalized. I'll then call the library function that assumes the input is already normalized. This way, there are no run time checks, which are complicated and error prone like how you described, but also can invoke inefficiency if the run time checks are performed often enough. In the optimized routines in the library that assume normalized inputs, I may put in assertions that calculate the length of the vector and make sure it is within a reasonable bound of 1.0 though.

Name: Anonymous 2012-03-29 0:07

Is the standard being held as "it normalizes itself every time, but only when, you make the function call?"

Name: Anonymous 2012-03-29 0:11

OOP TO THE RESCUE



class Vector

class NormalizedVector

normalize always returns a NormalizedVector. any operation on a NormalizedVector with a non-NormalizedVector returns a regular Vector

use python

Name: Anonymous 2012-03-29 0:29

>>2
But then the library isn't really doing much for me. For example, I'd have to know under what conditions a dot-product operation returns a normal vector.

>>4
Then every operation has to copy one entire object to another. Probably slower than the original, unoptimized vector library.

Name: Anonymous 2012-03-29 0:39

I don't get it.
Why not just leave a comment or write it in the info to say, 'hey fucker, remember this shit.'

Name: Anonymous 2012-03-29 0:46

How often does normalization happen?
How time consuming is normalization?

There are probably better things to optimize.

Name: Anonymous 2012-03-29 0:51

Always reset the flag, unless the function is const, but then you can't do anything with the flag anyway.

Name: Anonymous 2012-03-29 0:54

>>5
>But then the library isn't really doing much for me.
It should do only what you ask it to do. It should be simple and efficient.

>For example, I'd have to know under what conditions a dot-product operation returns a normal vector.
dot product returns a scalar, not a vector. And you wouldn't need to know, it would just need to be documented. Although it would be cool if the compiler could do some algebraic optimizations to your algorithm, and call better library functions. I bet lisp could do it.

>>7
sqrts are supposed to be expensive. But I don't know. If all you are otherwise doing is adding and multiplying fixnums, then you could probably feel the difference with the same thing with sqrts added in a bunch.

Name: Anonymous 2012-03-29 1:04

>>6
Because then all your code would just be full of those comments and you'd get used to ignoring them.

>>7
This is just an example to illustrate some hand-holding that I'd like my programming language to do for me. However, this particular vector-normalization-optimization thing is extremely common in 3D.

>>9
Right, sorry for the bad example... So, let's take vector addition, instead. Sometimes, adding two vectors returns a vector that just happens to be normal. If I'm the application writer, I don't want to have to do all the math to determine whether it is or isn't. I just want to use the vector in more functions, and I want my vector library to know when it needs to be re-normalized and when it doesn't.

Name: Anonymous 2012-03-29 1:05

Sneaky Internal Dubs

Name: Anonymous 2012-03-29 1:10

>>10
>If I'm the application writer, I don't want to have to do all the math to determine whether it is or isn't.
>I just want to use the vector in more functions, and I want my vector library to know when it needs to be re-normalized and when it doesn't.

You should use a programming language that allows you to specify algebraic optimizations for your library functions. Then the compiler can automatically prove the needed preconditions on your vectors in your application code to invoke the algebraic optimizations. Shit would be teight.

But the flag for checking if it is normalized is shitty. Think about all the time that will be spent updating and checking that flag.

Name: Anonymous 2012-03-29 1:17

>>12
But the flag for checking if it is normalized is shitty. Think about all the time that will be spent updating and checking that flag.
Nah, that's exactly the point of the optimization... It's a single memory comparison. Without the flag, figuring out whether or not a 3D vector is normalized involves floating-point math and a lot more than a single memory comparison.

Regardless, let me say again that this one optimization is just an example, and what I'm looking for is a programming language feature that somehow makes "sneaky internal state" like this more explicit or less error prone.

Name: Anonymous 2012-03-29 1:29

U MENA HASKALL

Name: Anonymous 2012-03-29 1:41

>>13
It's fucking shit. When adding two arbitrary vectors, they will almost never result in a unit length vector. That means that the flag will almost always be false, meaning your optimized code will almost always do an unneeded check against a boolean flag, before observing that it is false and then proceed to perform the square root you seek to avoid. By adding checks like this, you increase the cost of the worst case behavior, and if the worst case behavior almost always happens, you are just slowing shit down. It doesn't matter if it saves you time a couple times if it slows shit down 200 million other times. But if it just happens that you are always adding two parallel vectors with magnitudes that sum to one, then a really cool compiler might be able to detect that, and inline that optimization into the generated code, and throw away the normalization for that case. But until you have such a thing at your disposal, you are better off hand coding these optimizations in the code.

Name: >>15 2012-03-29 1:55

>>13
sorry for raging. I know it's just an example and all.

But my answer for the more general question about sneaky internal state is to avoid unneeded internal state, and modularize the state that is necessary.

Name: Anonymous 2012-03-29 2:16

>>15
Ok, so "add" isn't as likely to produce a normal vector, but what about the "normalize" function? It produces a normalized vector 100% of the time. Other functions are somewhere in between.

a really cool compiler might be able to detect that
I think you're confusing compile-time and run-time. A compiler can't know what vectors you're going to be adding in the "add" function before run-time.

Name: Anonymous 2012-03-29 2:37

>>17

Let me write an example here:



f(vector x, vector y)

{

  vector nx = normalize(x);

  return nx.dot(y);

}

main()

{

  vector u = normalize(readVector());

  vector v = vector(1.3, 2.3, 1.0);

  print(f(u,v));

}

If the compiler can prove that u is normalized when f is called on it, at compile time, then it can call an alternate definition of f that does not create a normalized version of the first parameter:



fZomgOptimized(vector nx, vector y)

{

  return nx.dot(y);

}



f(vector x, vector y)

{

  return fZomgOptimized(normalize(x), y)

}

main()

{

  vector u = normalize(readVector());

  vector v = vector(1.3, 2.3, 1.0);

  print(fZomgOptimized(u,v));

}

Here a normalization is omitted, and there is no need to check a flag. Because like you said, the normalize function will return a normalized vector, 100% of the time, there is no need to check to see if it is normalized or not at run time. The compiler can infer this at compile time.

With adding vectors it is a lot less intuitive. It's usually not going to be unit length, unless there is some kind of pattern. And maybe this pattern can be seen at compile time. But if it is usually not going to be unit length, then you may as well always normalize it. Even if you sometimes normalize a vector that is already normalized, you could save more time doing that in the cases that it happens than checking a flag in every single case.

Name: Anonymous 2012-03-29 2:47

C++ is shit ecksdeeeeeeeeeeeeeeeeeeeeeeeee.

Name: Anonymous 2012-03-29 2:56

>>18
If the compiler can prove that u is normalized when f is called on it, at compile time
Ah, I think you're just using awkward terminology... the time "when f is called on it" is always run-time. I'm guessing that you mean to say that the compiler might identify some essentially "const" or "idempotent" code and just remove it.

For example:



   int x = 3;

   if (x != 3)

      DoSomething();

A really sophisticated compiler might strip that code out entirely. Usually, compilers don't even attempt this sort of code analysis, because it's known to be equivalent to solving the halting problem in the general case.

Even so, if you had an ultra-sophisticated compiler that basically executed your code while compiling it, it would not help you in the case of this "normalize" optimization. The vectors you'd be working with would be generated at run-time, and they could come from anywhere (maybe user input, maybe some 3D model loaded from disk, etc...), so no "const" optimization is possible.

Name: Anonymous 2012-03-29 3:19

>>20
A lot of these types of compiler optimization depends on operations whose general solutions will reduce to the halting problem. But that doesn't mean solutions don't exist for special cases and forms, and these can still be used. For instance, in the example in >>18, it is obvious to me and you that u will be a normalized vector when f is called on it from the main function. In this particular case, an algorithm can trivially prove that u will be of unit length. This will hold true regardless of what value u gets at run time. Of all the values it could possibly take on, we know at compile time that it must have a normalized value. Thus, the compiler can generate code that will deal with u appropriately as a unit vector, without depending on a run time check.

And in the code that you posted, the transformation is trivial:



int x = 3;

if (x != 3)

  DoSomething();

replace occurrences of x with the constant 3 (up until the first statement that modifies x).



if (3 != 3)

  DoSomething();

constant folding



if (0)

  DoSomething();

if(0) found. Inline the false branch (empty in this case).

Non of this requires any notion of run time. It's all just automated proving techniques that allow for optimizations. It can all happen at compile time.

Name: Anonymous 2012-03-29 3:32

>>20
>The vectors you'd be working with would be generated at run-time, and they could come from anywhere (maybe user input, maybe some 3D model loaded from disk, etc...),

That's a good point. In such a case I would either explicitly normalize the vector, and then the compiler would know that it was normalized. Or I would trust that the file contained normalized vector, and I would need to establish a way to communicate this to the compiler. In a language like seeples, I'd probably make normalized vector a derived class of vector, and override the operations to skip the normalization part. And then have a method for directly reading a normalized vector from a file, without verifying its length (because at that point, you may as well just normalize it).

Name: >>22 2012-03-29 3:33

Actually overloading would be perfect.

Name: Anonymous 2012-03-29 3:56

>>22
I would either explicitly normalize the vector, and then the compiler would know that it was normalized.
Dude, I gave you the benefit of the doubt here for like five posts, but seriously, you are apparently not understanding what a compiler does.

It's like this:
1) I compile my program. THIS IS COMPILE-TIME
2) I run my 3D modeling program and generate a file full of data.
3) I run my program. THIS IS THE BEGINNING OF RUN-TIME
4) My program loads the file.
5) A bunch of functions from my vector library are called to manipulate the data.

Everything that the compiler does happens in step 1. I could delete the compiler from my hard drive, entirely, right after step 1 and my program would still run. The compiler cannot make any decisions during step 5.

And not every vector needs to be normalized... You seem to also be suggesting that the answer is just to normalize them all. It's just that if you call "normalize" on a vector that happens to be (0, 1, 0), you want the function to just exit without doing anything. If you call "add" with (5, 2, 7) and (10, 10, 10), you want to get a result of (15, 12, 17). No normalization happens there on the inputs or the outputs.

Name: Anonymous 2012-03-29 6:21

>>24
I think he's implying a compiler can trace all functions called upon an object and remove redundant calls, although that would require some form of indication that f x == f (f x). Either way, that would be an impossibly complex, intensive and nigh on moot optimization.

Bro, your vector will be normalize()'d twice. deal with it.

Name: Anonymous 2012-03-29 6:35

NORMALIZE MY ANUS

Name: Anonymous 2012-03-29 15:11

>>25
your vector will be normalize()'d twice
Yeah, doesn't seem like a big deal, right? But when working with arbitrary 3D data, it's usually "bro, your vectors will be normalize()'d a few dozen times". And "vectors" means gigabytes of model data. So this optimization is worthwhile.

Still, this is just an example. If it is more familiar, take the example of strings in C. If you need to know the length of a string, you can call strlen(s) every time, or you can build a struct that contains both the char *s and an unsigned int length that saves you from walking the string until you reach the NULL. This is the magic of std::string in C++. But now, if you ever want to change the way std::string works, you had better remember to manage that length field.

Name: Anonymous 2012-03-29 22:49

>>25
It's not really that difficult. It is just a matter of having a formal system of preconditions and post conditions that the compiler can be aware of. Tools for static code analysis do this to help identify bugs. There's no reason why it couldn't also be used for optimizations. OP wants to be able to do vector math and get the benefit of not having to redundantly normalize vectors, without having to be smart enough to just fucking know when a vector is normalized, and call the fucking appropriate functions that take parameters that are assumed to already be normalized. So I suggested what would be needed for a compiler to make these optimizations for the programmer. If the programmer isn't smart enough to avoid redundant calls to the normalize method, then maybe the compiler can be.

>>24
What I'm saying is that intelligently written code doesn't need to check if it is normalized or not. It can be proven to be normalized, and then handled as such. The compiler can generate code that correctly handles inputs that are assumed to be normalized. Yes, the compiler can be deleted. That doesn't matter. It generated code which then runs at run time, and processes the vectors as efficiently as possible.

>>27
If you are writing code that consistently normalizes the same vector a few dozen times, you are a fucking moron that has no business with programming. If you are writing code that consistently checks to see if a vector is normalized a few dozen times, you are still a fucking moron that has no business with programming. If you normalize the vector once, and then use that normalized vector in a context that makes sense for normalized vectors, a few, dozen times, then you are doing only what is necessary.

Name: Anonymous 2012-03-29 23:07

editing lovrary
what is vector.normalizean
Never mind, I'm too stupid to remember what I wrote and will ignore this member

Name: Anonymous 2012-04-05 21:54

>>28

So stupid.

If you are writing code that consistently normalizes the same vector a few dozen times

You should take a peek at the Blender source code and see if you can count how many different ways a vector might be normalized. You sound like you've never worked on a large software project, and I have no idea why you feel the need to participate in this discussion.

You might as well say "if you write code that checks the length of a string more than once, you're a moron." Sometimes repetitive things happen in code, if you write anything that actually has functions. It's not like programmers are worried about optimizing this case, you idiot:



func()

{

  strlen(s);

  strlen(s);

  strlen(s);

  strlen(s);

  strlen(s);

  strlen(s);

  strlen(s);

}

The problem is that lots of different functions call strlen() for lots of different strings, and those calling functions call each other... It's just not a solvable problem, and optimizations like this are just a fact of life. Fuck.

Name: Anonymous 2012-04-07 3:32

>>29
That's actually kind of the truth. With large projects, shit happens. Sometimes you want your programming language to help you.

Sneaky Internal State

1 Name: Anonymous 2012-03-28 23:59

2 Name: Anonymous 2012-03-29 0:06

3 Name: Anonymous 2012-03-29 0:07

4 Name: Anonymous 2012-03-29 0:11

5 Name: Anonymous 2012-03-29 0:29

6 Name: Anonymous 2012-03-29 0:39

7 Name: Anonymous 2012-03-29 0:46

8 Name: Anonymous 2012-03-29 0:51

9 Name: Anonymous 2012-03-29 0:54

10 Name: Anonymous 2012-03-29 1:04

11 Name: Anonymous 2012-03-29 1:05

12 Name: Anonymous 2012-03-29 1:10

13 Name: Anonymous 2012-03-29 1:17

14 Name: Anonymous 2012-03-29 1:29

15 Name: Anonymous 2012-03-29 1:41

16 Name: >>15 2012-03-29 1:55

17 Name: Anonymous 2012-03-29 2:16

18 Name: Anonymous 2012-03-29 2:37

19 Name: Anonymous 2012-03-29 2:47

20 Name: Anonymous 2012-03-29 2:56

21 Name: Anonymous 2012-03-29 3:19

22 Name: Anonymous 2012-03-29 3:32

23 Name: >>22 2012-03-29 3:33

24 Name: Anonymous 2012-03-29 3:56

25 Name: Anonymous 2012-03-29 6:21

26 Name: Anonymous 2012-03-29 6:35

27 Name: Anonymous 2012-03-29 15:11

28 Name: Anonymous 2012-03-29 22:49

29 Name: Anonymous 2012-03-29 23:07

30 Name: Anonymous 2012-04-05 21:54

31 Name: Anonymous 2012-04-07 3:32