>>177
Truth: asm is less readable and thus less maintainable.
Readability is subjective (size and speed are NOT), and to someone experienced, Asm is more readable. The format is consistent, too. The only downside is that it tends to be more voluminous.
>>183
Look at the bugtrackers of other XML libraries and you'll find much worse.
When I was talking about design, I meant decisions that are not easy to reverse, and that can have huge consequences for efficiency. For example, instead of the obvious (and inefficient) "make objects out of everything and copy data into them" parsing algorithm, do it in-place so there's no unnecessary copying of data. Simplify the design to make it more "direct". I'll use the web browser as an example here since we're all familiar with them: how long is the path (in terms of # of instructions or call depth) between Firefox getting a WM_PAINT and its first API call into the OS to draw the page content into the window? In the browser design I'm planning, it's < 5. I'm not planning to do any memory allocation/deallocation there either.
Maybe "optimisation" is the wrong word for this, but what else do you call making something more efficient? If you plan your design like this and have it carefully thought out before writing a single line of code, the chances are high that you'll be ahead of those who "optimise later", without even needing to go to inline Asm. Everyone is stuck with the same stupid compiler, but your design has already been optimised at a higher level, so the compiler's inefficient output for it is still going to be more efficient than its output for an inefficient design.
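To make the in-place parsing idea a bit more concrete, here's a rough sketch. The `slice` type and the `next_tag_name` / `slice_eq` helpers are names I made up for illustration, and this is nowhere near a conforming XML parser. The only point is that tag names are handed back as pointer+length pairs into the original buffer, with no allocation and no copying.

```c
#include <stddef.h>
#include <string.h>

/* A slice refers to bytes inside the caller's buffer; nothing is copied. */
typedef struct {
    const char *ptr;
    size_t len;
} slice;

/* Hypothetical helper: find the next start tag in [*cur, end) and return
 * its name as a slice into the same buffer, advancing *cur past the name.
 * Returns 1 on success, 0 when no further start tag is found.
 * A '<' followed by '/', '!' or '?' is simply stepped over; the contents
 * of comments and CDATA are NOT handled specially in this sketch. */
static int next_tag_name(const char **cur, const char *end, slice *out)
{
    const char *p = *cur;

    while (p < end) {
        p = memchr(p, '<', (size_t)(end - p));
        if (p == NULL)
            break;
        p++;                                  /* step over '<' */
        if (p < end && *p != '/' && *p != '!' && *p != '?') {
            const char *start = p;
            while (p < end && *p != '>' && *p != '/' && *p != ' ' &&
                   *p != '\t' && *p != '\n' && *p != '\r')
                p++;
            out->ptr = start;                 /* points into the input */
            out->len = (size_t)(p - start);
            *cur = p;
            return 1;
        }
    }
    *cur = end;
    return 0;
}

/* Compare a slice against a NUL-terminated string without copying it. */
static int slice_eq(slice s, const char *lit)
{
    size_t n = strlen(lit);
    return s.len == n && memcmp(s.ptr, lit, n) == 0;
}
```

You'd call `next_tag_name` in a loop with `cur` starting at the beginning of the document. The "make objects out of everything" approach would malloc and memcpy a string for every one of those names instead.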
>>193
It seems like in order for the compiler to truly generate assembly with necessity, it would need to have a complete high level understanding of what you are trying to do.
That's not necessary; all it needs to understand is the semantics of the source language. What we agree doesn't work well is statement-by-statement translation with optimisation applied afterwards (there's that anti-premature-optimisation again!). My idea is instead to generate code "backwards", working from the desired result. E.g. in
int foo() {
    int i = f();
    int j = k();
    int l = 2 + j;
    ... /* code that does a lot with l, but never has any effect on i, j, or any global variables */
    return i + 2*j;
}
a "dumb" or "traditional" compiler would emit code for all that stuff in the middle, and a later optimisation pass might have a chance at removing (some of) it and moving variables into registers. An "intelligent" compiler could work backwards and "think": "This function's result depends on i and j. What do i and j depend on? f() and k(). What do f() and k() depend on? ..." Eventually it might determine that f() and k() are actually constant, substitute that in, propagate the changes down, and reduce foo() itself to a constant. And of course, none of this needs to be done at all if foo() can never get called from any entry point. The same backwards walk would also have to start from every reference to a nonlocal variable, since those results are "visible" to other functions.
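Here's a toy version of that backwards walk, to show it needs nothing beyond "who reads what". Everything in it is made up for illustration: the `insn` IR, `mark_needed`, and the encoding of foo(). It only handles straight-line code, and it assumes f() and k() have already been found to be side-effect free. Strictly speaking this is just classic backward dead-code marking, which existing compilers already do as a separate pass; the idea above is to let it drive code generation from the start.

```c
#include <stddef.h>

#define NO_VAR   (-1)
#define MAX_VARS 32            /* assumption: variable ids are 0..MAX_VARS-1 */

/* One instruction of a made-up straight-line IR. */
typedef struct {
    int dest;                  /* variable written, or NO_VAR */
    int src[2];                /* variables read, NO_VAR where unused */
    int side_effect;           /* nonzero: must be kept (I/O, global store, ...) */
} insn;

/* Walk the code from the last instruction to the first, starting from the
 * variable holding the result. Sets keep[i] to 1 for every instruction the
 * result (or a side effect) depends on, 0 otherwise. Returns how many
 * instructions were kept. */
static size_t mark_needed(const insn *code, size_t n, int result_var,
                          unsigned char *keep)
{
    unsigned char wanted[MAX_VARS] = {0};
    size_t kept = 0;

    if (result_var != NO_VAR)
        wanted[result_var] = 1;

    for (size_t i = n; i-- > 0; ) {
        const insn *in = &code[i];
        int needed = in->side_effect ||
                     (in->dest != NO_VAR && wanted[in->dest]);

        keep[i] = (unsigned char)needed;
        if (!needed)
            continue;
        kept++;
        if (in->dest != NO_VAR)
            wanted[in->dest] = 0;        /* this definition satisfies the demand */
        for (int k = 0; k < 2; k++)      /* ...and creates demand for its inputs */
            if (in->src[k] != NO_VAR)
                wanted[in->src[k]] = 1;
    }
    return kept;
}

/* foo() from above; "t = l * l" stands in for the elided code that uses l. */
enum { V_I, V_J, V_L, V_T, V_R };
static const insn foo_code[] = {
    { V_I, { NO_VAR, NO_VAR }, 0 },   /* i = f()     (f assumed pure) */
    { V_J, { NO_VAR, NO_VAR }, 0 },   /* j = k()     (k assumed pure) */
    { V_L, { V_J,    NO_VAR }, 0 },   /* l = 2 + j                    */
    { V_T, { V_L,    V_L    }, 0 },   /* t = l * l                    */
    { V_R, { V_I,    V_J    }, 0 },   /* r = i + 2*j, then return r   */
};
```

Running `mark_needed(foo_code, 5, V_R, keep)` keeps only the two calls and the final sum; nothing involving l is ever asked for, so nothing is ever emitted for it.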
For things like register allocation (done after the above), the compiler could track how many variables are actually needed, and then choose how many extra "slots" are required in memory. It can alter their allocation to registers depending on their usage and loop nesting level. In size/balanced optimisation mode, if it sees certain variables having LIFO-like usage patterns, it can emit push/pop (single byte instructions) instead of explicitly doing a stack allocate and moves. If there's a long-running inner loop with frequently accessed variables, push the less frequently used variables on the stack. This is how an Asm programmer does register allocation by necessity.
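A crude sketch of the "allocate by usage and loop nesting level" part, with made-up names (`var_info`, `use_weight`, `assign_registers`). The factor of 10 per nesting level is purely my assumption, and this ignores live ranges entirely, which a real allocator (or a real Asm programmer) would not.

```c
#include <stddef.h>

#define MAX_DEPTH 4

typedef struct {
    const char *name;
    unsigned uses_at_depth[MAX_DEPTH]; /* use counts at loop depth 0..3 */
    unsigned long weight;              /* filled in by assign_registers */
    int in_register;                   /* filled in by assign_registers */
} var_info;

/* Weight a variable by how often it is probably touched at run time.
 * Assumption: every loop nesting level multiplies the cost by 10. */
static unsigned long use_weight(const var_info *v)
{
    unsigned long w = 0, scale = 1;

    for (int d = 0; d < MAX_DEPTH; d++) {
        w += v->uses_at_depth[d] * scale;
        scale *= 10;
    }
    return w;
}

/* Give the num_regs heaviest variables a register; everything else gets a
 * stack slot. Returns the number of stack slots required. */
static size_t assign_registers(var_info *vars, size_t n, size_t num_regs)
{
    for (size_t i = 0; i < n; i++) {
        vars[i].weight = use_weight(&vars[i]);
        vars[i].in_register = 0;
    }
    for (size_t r = 0; r < num_regs && r < n; r++) {
        var_info *best = NULL;

        /* pick the heaviest variable that has no register yet */
        for (size_t i = 0; i < n; i++)
            if (!vars[i].in_register &&
                (best == NULL || vars[i].weight > best->weight))
                best = &vars[i];
        best->in_register = 1;
    }
    return n > num_regs ? n - num_regs : 0;
}
```

With two registers and three variables, the one that is only touched outside the loops ends up in memory, which is the same call you'd make by hand.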
Due to the halting problem it's not possible in general to prove that f() or k() terminate, even if they do not depend on nonlocal variables, but the compiler can offer an option to simulate their execution for a limited number of cycles. If the simulation finishes within the limit, the result is substituted in; if not, the call is simply left as it is. No deep understanding (other than language semantics) is required of the compiler, just a different way of approaching the problem.
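The "simulate for a limited number of cycles" option could look roughly like this. `run_with_budget`, `step_fn`, and the Fibonacci state machine standing in for f() are all hypothetical; a real compiler would be stepping its own IR rather than calling a C callback.

```c
/* One simulated step of the function under evaluation.
 * Returns 1 once the function has finished, 0 if it needs more steps. */
typedef int (*step_fn)(void *state);

/* Run at most max_steps steps. Returns 1 if the computation finished
 * within the budget (the result is then in *state and can be folded in),
 * or 0 if the budget ran out and the call has to stay in the output. */
static int run_with_budget(step_fn step, void *state, unsigned long max_steps)
{
    for (unsigned long i = 0; i < max_steps; i++)
        if (step(state))
            return 1;
    return 0;
}

/* Stand-in for a pure function: iterative Fibonacci as a state machine.
 * Start with { n, 0, 1 }; when it finishes, a holds the n-th Fibonacci
 * number (counting fib(1) = fib(2) = 1). */
typedef struct {
    unsigned n;            /* iterations still to run */
    unsigned long a, b;    /* two consecutive Fibonacci numbers */
} fib_state;

static int fib_step(void *state)
{
    fib_state *s = state;
    unsigned long next;

    if (s->n == 0)
        return 1;          /* finished: result is in s->a */
    next = s->a + s->b;
    s->a = s->b;
    s->b = next;
    s->n--;
    return 0;
}
```

Starting from `{ 32, 0, 1 }` with a budget of 100 steps, this finishes with `a == 2178309`. With a budget of 10 it gives up, and the compiler would just emit the call as usual. Either way the compiler never has to decide whether the function halts, only whether it halted soon enough.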
Follow this process to its logical conclusion, even into library functions and such, and your
printf("%d\n", fib(32));
becomes a
fputs("2178309\n", stdout);
(taking fib(1) = fib(2) = 1). In the output binary too, there will be nothing more than what the compiler found was needed. That is the ultimate goal.