>>215
Obviously because they will be read/written the most and thus deserve to be in registers, at least for the duration of that loop. It should also be obvious what a loop is, it's whatever corresponds to a loop in the source language. A loop within a loop is also similarly self-explanatory. The representation doesn't matter.
It sounds like you are operating on the source language directly. In the version I was thinking of, it was to take abstract assembly that uses an unbounded amount of variables, and then output the same assembly but using as few variables as possible. When operating at the assembly level, loop constructs can become very complex and general. I'm not sure which approach would be better. It feels like it would be better to do it after the compiler has done its other optimizations, but I could be wrong.
This is where it seems you're still stuck with "the old way": a variable does not ever need to be at a "permanent memory address". There doesn't even need to be any correlation between how variables are defined and used in the source and the output, as long as they are semantically equivalent. Think of a variable as just a label for a "flow" of data between two or more operations. That flow might not exist if the operations that it connects aren't important to the final result of the function.
Well that's interesting. Is there an advantage to allocating a single variable in multiple locations? Each location is essentially a variable. So this could be modeled as variable assignment. I could see this getting very interesting if each storage location had a different cost of access. But assuming each register is equally cheap to read and write to, all you have to worry about is swapping values in and out of memory and the behavior of the special purpose registers, in which case, you might not need to go there.
That's what academics think. The real world says otherwise.
The real world is satisfied with crap.
But multiple registers OR memory addresses may be allocated to one "variable" too.
I'm not sure if there is an advantage in doing that. The algorithm would be creating a new variable that is equivalent to another previously defined one. If the goal is to create an equivalent program using the fewest amount of variables, and hence, needing the fewest amount of storage locations, then creating new variables doesn't seem like a good idea. But if the goal is to increase performance, then maybe it's possible in some circumstances to use more storage locations. At that point you'd need to work with a model closely related to the processor. It wouldn't be enough to just assume that using the minimum amount of storage locations will give you the most speed, although you'd think that would be true if there was no spillage as a result.
That's exactly the point I'm making: it's too general.
Whether or not graph coloring is too general for register allocation depends the class of graphs created in the construction of the variable interference graph. If it can form an arbitrary graph, then optimal register allocation is just as hard as graph coloring. But if the class of graphs is actually more limited, then it can be possible to use a special purpose graph coloring algorithm that takes advantage of properties present in the restricted class of graphs.
According to the RA article on Wikipedia this is possible with SSA, and there's nothing about anything being NP-complete on its page. Unfortunately they're both full of academic verbiage so it's difficult to determine what they're really talking about.
Good news! In the version I am working with, I am getting lines in the assembly where each variable must be holding its current value. If in a single line there are multiple variables that must hold their values, then these variables cannot be allocated in the same place. This yields a set of cliques. Now finding a clique in a graph is NP Complete, and the encoding we get for the variable interference graph is all of its cliques. So if it is possible to optimally color a graph using knowledge of what all its cliques are, then that could be used to optimally color the variable interference graph. I got something working for one example, but once I apply it to general graphs I get lost. Maybe there's a structure to the variable interference graph that I'm unknowingly taking advantage of.
Another two examples of why more "targeted" algorithms can work better: instructions that require operands in particular registers, and those that prefer particular registers. With traditional graph colouring, you can force allocation of a certain register, but there's no way to specify a preference.
The problem becomes much more complex, but you can use utility functions to rank the optimality of a coloring. But this requires a calculation of a utility function.
An Asm programmer can think "there's a divide coming up: let's arrange things so that its input will be in the right registers when the time comes" and override that choice with something more important when necessary. Traditional RA algorithms can't do that.
They can understand that the registers used by the divide instruction must be used to store the variables resulting from the division. This can be represented by allocating variables for the result of the division and forcing their colors to represent the registers they are put in by the instruction.
tl;dr: Register allocation is one area where compilers could improve significantly, but due to a reluctance to challenge existing norms, has not. It's only recently that stuff like linear-scan (closer to what someone with common sense, but not exposure to previous biases, would come up with) and SSA started getting visibility. These basics should've been where we started.
Indeed. But that makes it more exciting doesn't it? The late start makes it more likely that you might discover something new.
To whoever started the insistence that register allocation is as general a problem as graph colouring and propagated it: FUCK YOU.
so sorry! But really graph coloring is just solving the same problem in a more abstract setting. If you want to build more restrictions from the platform into the algorithm, just add more information to your definition of the graph to reflect the extra restrictions. These additions are usually small, and the advantage of using graphs is that all research and developments that have gone into graph algorithms will be easy for you to leverage. But there is the problem of graphs being such general objects. It could be that your domain is much less general than graph coloring. In which case it may be more useful to operate within the domain itself and never create a graph.