
Any decent modern general-purpose languages?

Name: Anonymous 2012-07-25 10:55

Assembly: Unportable. No standardised syntax.
Classical Visual Basic: Some good parts. Shit overall.
C: Shitty standard library. Deficient type system. Can't into Unicode. ``Unportable assembly.''
D and C++: Obfuscated boilerplate languages.
Java and C#: Forced OOP.
Common Lisp: Archaic cons-based library. Writing complex macros is a PitA due to the unlispy quotation syntaxes.
Scheme: CL without namespaces.
Clojure and Erlang: Concurrency is unneeded outside of a few very specific applications. Parallelism is where it's at.
OCaml: Great language, only one, deficient, implementation.
Haskell: Academic sex toy.
Forth: Reinventing the wheel over and over.
Ruby: Implicit declarations. Slow as fuck.
Python: Implicit declarations. FioC.
Perl: Brain damage.
PHP: Pretty much shit.
JavaScript: "" == false

It's impossible to list them all but, please, what decent modern general-purpose languages exist?

Name: Anonymous 2012-07-25 10:56

Lisp and assembly.

Name: Anonymous 2012-07-25 11:04

Try Lua I guess?

Name: Anonymous 2012-07-25 11:06

Prolog: Perhaps the slowest programming language on the planet.

Name: Anonymous 2012-07-25 11:15

C is too old, and its semantics were built around 1970s instruction sets. We need a language as close to the metal as C but with modern design, without going overboard on shitty hacks like SEPPLES.

Name: Anonymous 2012-07-25 11:16

Name: Anonymous 2012-07-25 11:42

Python + Cython

Performance, elegance, portability.

Name: Anonymous 2012-07-25 11:43

>>7
Enjoy your half-baked lambda implementation.

Name: Anonymous 2012-07-25 11:54

>>7
FIOC is not elegant.

Name: Anonymous 2012-07-25 11:56

Go.
Statically typed, duck typing, no shitty OOP without sacrificing usability, simple, easy to write, looks good, compiled, quite fast, garbage collected, good library, easy to FFI with C, great concurrency, no boilerplate or garbage syntax, easy (and fast) compilation without the need to write Makefiles, ...

Name: Anonymous 2012-07-25 11:58

>>8
You can use everything except statements in Python's lambda. Python can be fully functional (in both meanings) with only expressions.
If you really want to use statements so much, just use a closure. Functions are first class objects, lambda is not really necessary.

Name: Anonymous 2012-07-25 12:03

>>11
What's a closure?

Sorry, I'm a complete functional fagstorm.

Name: Anonymous 2012-07-25 12:26

>>12
I was talking about defining a function inside another function and then returning it.
This inner function has access to the scope of the outer function, so you can do something like this:
def a(b):
    def c(d):
        return b + d
    return c

f = a(3)
print f(2) # 5
print f(3) # 6
g = a(6)
print g(2) # 8
print f(2) # still 5

As you can see, the inner function closes over the variables in the outer one, thus ``closure''.

Basically, anything you could do with a lambda, you can also do with a standard function.

Name: Anonymous 2012-07-25 12:34

>>10
package/import aren't elegant.

>easy to FFI with C
What isn't?

Name: Anonymous 2012-07-25 12:46

Factor.

Name: Anonymous 2012-07-25 12:46

>>14
Why aren't they elegant? Packages are useful for organization and what would you use instead of import?

Many languages have FFI, I didn't say Go is special. However, there are several languages where that isn't really simple to do. Go isn't among them.

Name: Anonymous 2012-07-25 12:46

>>1
lol general purpose languages.

java is about as general purpose as it gets, it has the worst of pretty much every paradigm.

(or you could like, pick your language based on your task like everyone else)

Name: Anonymous 2012-07-25 13:02

>>13
So a closure is a nested function? Does that mean GNU C has closures?

Damn.

Thanks, >>13-san.

Name: Anonymous 2012-07-25 13:11

>>18
http://gcc.gnu.org/onlinedocs/gcc/Nested-Functions.html

GCC implements taking the address of a nested function using a technique called trampolines. This technique was described in Lexical Closures for C++ (Thomas M. Breuel, USENIX C++ Conference Proceedings, October 17-21, 1988).

yes apparently

Name: Anonymous 2012-07-25 13:24

>>5
totally agree, C and C++'s day as general-purpose programming languages is over; they will be relegated to the systems/embedded programming they were meant for. Languages like D, Go and Rust are the future

Name: Anonymous 2012-07-25 14:02

>>1
fuck you for thinking javascript is shit. you're obviously not worthy of calling yourself a programmer

Name: Anonymous 2012-07-25 14:06

>>20
>D, Go
GC is shit

>Rust
not ready yet

Name: Anonymous 2012-07-25 14:09

>>1-22
languages
terrible

Name: Anonymous 2012-07-25 14:10

>>20
>Languages like D, Go and Rust are the future
That's funny because they are all terribly useless. Absolutely no one uses them in production, for good reason. Go is a shit language; it's like C mixed with Javascript and Pascal, but crippled. Google doesn't even use it for anything important. The only reason anyone has heard of it is because of cat-v Plan 9 faggots astroturfing on the web. Rust and D are even more obscure and shitty. D is like C++ except for retards. Not to mention that none of these have compilers that produce code as fast as C++ compilers, or even the JVM for that matter.

Name: Anonymous 2012-07-25 14:12

>D is like C++ except for retards

lol'd

have an upboat

Name: Anonymous 2012-07-25 14:58

>>24
Go is hardly useless. It is enough like C to feel familiar while not bothering to implement features irrelevant to modern programming. That makes it simple to program in (unlike stuff like C++) without any major drawbacks (such as Java's forced OOP, or scripting languages' inability to do any lower-level stuff, not to mention their egregiously slow speed). Go sources are usually more portable than your standard C or C++ code, too.

For example, the lack of pointer arithmetic and manual memory management means that the programmer has a lot less stuff to worry about.

In addition to removing completely unnecessary (and outright dangerous) features, Go is also quite strict. You mustn't have unused variables, code style is enforced, there is no implicit type casting (int and int32, despite being the same underneath, are separate types requiring explicit conversion), etc. While that may seem excessively restrictive to your standard volatile C obfuscator, it inevitably results in code that is generally cleaner, safer and better.

Garbage collection should be the norm in userland applications anyway, because the tiny amount of speed gained by manual memory management is utterly negligible in all but the most performance-critical code, while it makes segfaults and memory leaks all but gone (yes, >>22, GC is actually useful).

In speed, it's true that Go is somewhat slower than C/C++, but that's only to be expected with GC, type reflection and other features. It still at least matches, if not outright exceeds, the (both programming and execution) speed of most equivalent Java programs.

Google stated that they use Go in several of their internal applications. In addition to that, Go is a good choice for Google App Engine.

Conclusively, Go is a true current-generation programming language focused on ease of use and good programs, not on irrelevant innovation that's ultimately destructive, since it precludes the good, simple and obvious ways of coding in order to make the programmer use the novelties.

Name: Anonymous 2012-07-25 15:07

>>26
Why is prague such a destination for viral marketers?

Name: Anonymous 2012-07-25 15:18

Perl. Haters gonna hate.

Truth is, if you have to ask ``can I do _ with Perl?'', the answer is most likely yes.

Name: Anonymous 2012-07-25 15:30

>>28
Can I write uniform-looking and easily-readable code (from the perspective of someone who's not a Perl wizard) that also executes fast enough (Java-tier speed, not Python-tier) with Perl?

Name: Anonymous 2012-07-25 15:36

>>28
can i write a function that accepts two arrays as arguments?

Name: Anonymous 2012-07-25 16:01

>>27
It's the same reason why there are trolls and shitposters who disparage others senselessly.

Name: Anonymous 2012-07-25 16:05

>>28
Can I reify part of the continuation up to a certain prompt contained inside of a continuation mark as a procedure and call it multiple times?

Name: Anonymous 2012-07-25 16:08

>>13
>Basically, anything you could do with a lambda, you can also do with a standard function.

Except for the fact that closures won't have the same state unless the variables are immutable.
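
This bites in a familiar way in Python: closures capture variables, not values, so mutating the variable afterwards changes what every closure sees. A minimal sketch:

```python
# Each closure captures the variable i itself, not its value at
# creation time, so all of them observe the final mutation.
fns = [lambda: i for i in range(3)]
print([f() for f in fns])  # [2, 2, 2]

# Binding the current value as a default argument freezes it.
frozen = [lambda i=i: i for i in range(3)]
print([f() for f in frozen])  # [0, 1, 2]
```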

Name: Anonymous 2012-07-25 16:14

>>33
If you change a variable afterwards, that's your fault.
Don't do it if you don't want that.

Name: Anonymous 2012-07-25 16:16

>>34
Which defeats the purpose of emulating closures in the first place.

Name: Anonymous 2012-07-25 16:19

And just for the record, Java requires variables to be final when emulating certain types of closures.

Name: Anonymous 2012-07-25 16:21

Every programming language is crap unless it's Lisp, which is shit.

Programming is a disgusting practice and you should be ashamed.

Name: Anonymous 2012-07-25 16:55

>>24
>That's funny because they are all terribly useless. Absolutely no one uses them in production, for good reason. Go is a shit language; it's like C mixed with Javascript and Pascal, but crippled. Google doesn't even use it for anything important. The only reason anyone has heard of it is because of cat-v Plan 9 faggots astroturfing on the web. Rust and D are even more obscure and shitty. D is like C++ except for retards. Not to mention that none of these have compilers that produce code as fast as C++ compilers, or even the JVM for that matter.

Languages like C and C++ had their day in the 80s, when you had to write directly to hardware on singletasking, OS-less home computers. When you wrote a program back then, the program was the OS; everyone used asm. Using an archaic language like C++ now is sheer stupidity. Manually managing memory made sense back when you were writing directly to hardware addresses and had full control of the CPU, but that kind of programming is gone with modern OSes. You can call it astroturfing all you want, but languages like Go, D and Rust are going to take over and eliminate all the stupid time-wasting errors that could be eliminated with garbage collection. Basically, you should be able to look at code and know what it does, not poke around with a debugger and static analysis to try and find out what the program might do because the code is too unsafe to trust on its own. There are a lot of people who make their living coding C/C++ whose work will be made obsolete by modern compiled languages, languages that should have caught on 15 years ago.

Name: Anonymous 2012-07-25 16:58

Objective-C.

Name: Anonymous 2012-07-25 17:07

>>39
a language that has C as a strict subset with unsafe OO bolted on
gtfo

Name: Anonymous 2012-07-25 17:38

>>39
+1 for properly spelling out the name of Objective-C

- Tremendous Faggot
  163k •31 •320 •628

Name: Anonymous 2012-07-25 17:50

>>38
>Basically you should be able to look at code and know what it does, not poke around with a debugger and static analysis to try and find out what the program might do because the code is too unsafe to trust on its own.
Oh lawd the ironing. With native code you can see every single instruction from compiled output, knowing that this copy will not change at all. And you're really delusional. Hardware gets better not to make it easier for weak programmers, but to allow for better software. It's not like everyone is going to buy expensive hardware just so that shitty bloated GCs can run at a maximum. They will buy expensive hardware so that they can run software that delivers in parity with the hardware. For example, I'm not going to buy a new graphics card and use more power just to play a game that has graphics no better than last year. What a fucking waste. Apply the same idea to enterprise servers and we're talking about billions of dollars.

Related: http://www.gotw.ca/publications/concurrency-ddj.htm

Name: Anonymous 2012-07-25 18:00

>>42
pure disinformation
As I said, the days when a program loaded into RAM and took over the computer are over. You can't track a program in memory anymore; it's all artificial segmented memory hidden from you by the OS and CPU hardware, not to mention most of the program runs in dynamic libraries.

People keep trying to connect garbage collection with slow VM languages like Java and C#, but compiled gc languages are a whole new ballgame.

Name: Anonymous 2012-07-25 18:03

Name: Anonymous 2012-07-25 18:21

>>5
x86 is a 1970s instruction set.

Name: Anonymous 2012-07-25 20:50

Ada. It's great.

Name: Anonymous 2012-07-25 20:50

>>44
Go is slow as fuck. It has a broken conservative garbage collector. Try a language with a real GC instead of something designed to be retrofitted onto a language with unsafe pointers.
IHBT

Name: Anonymous 2012-07-25 23:44

Try newLISP. It has pretty much everything you need for general-purpose programming. Strings are UTF-8, it has pattern matching, a pretty batteries-included standard library (sockets, regex, etc.), a trivial C interface, and not to mention it has true code = data, unlike Scheme and CL. The only downside is that it's GPL.

Name: Anonymous 2012-07-25 23:46

Forgot to mention, newLISP also has Cilk baked in

Name: Anonymous 2012-07-26 0:02

Name: Anonymous 2012-07-26 0:06

TI83 BASIC

Name: Anonymous 2012-07-26 0:06

>>48-49
Oh god no.

Name: Anonymous 2012-07-26 0:43

>>47
>Try a language with a real GC
Like what, you retarded fuck? No example provided, so I assume you mean JAVA.

GC is shit.

Name: Anonymous 2012-07-26 0:53

>>53
Fuck off and die already you fucking retard.

Name: Anonymous 2012-07-26 1:15

>>53
Well, Java's GC is pretty robust. But GC is indeed shit.

Name: Anonymous 2012-07-26 1:32

>>55
Fuck off, retard.

Name: Anonymous 2012-07-26 1:38

If you havin garbage problems I feel bad for you son, I got 99 problems but a GC ain't one.

Name: Anonymous 2012-07-26 2:16

>>57
And all the 99 are segmentation faults.

Name: Anonymous 2012-07-26 3:24

>>1
>Archaic cons-based library.
You don't understand Lisp.

>Writing complex macros is a PitA due to the unlispy quotation syntaxes.
What the fuck

Name: 1 2012-07-26 4:37

>>3
I'll look into Lua, thanks.

>>10,26,47
Pig disgusting! Pop infix syntax everywhere, worse than C. Terrible GC. And as I said of Clojure and Erlang, concurrency is not the next big thing; parallelism is.

>>15,50
Interesting. I'll look into Factor.

>>21
JavaScript does have some good parts, like VB6. But overall it's shit.

>>28
Perl has braindamaged syntax, ``There's more than one way to fuck up'' and it's slow.

>>39
What >>40 said.

>>46
Too verbose and antique approach to multiprocessing.

>>48-49
I don't see the point of modifying lambda expressions at run-time. Is that what you call ``true code = data''?

>>59
>Archaic cons-based library
>You don't understand Lisp

Could you elaborate? I'm fine with the special-casing of two-element records for linked lists, but I don't understand why the user is encouraged to use them for things such as association-list elements or trees.

>Writing complex macros is a PitA due to the unlispy quotation syntaxes.
>What the fuck

I don't know why I wrote that.

Name: Cudder !MhMRSATORI!FBeUS42x4uM+kgp 2012-07-26 4:51

I want a language in which you can go as high or as low-level as you like. If you just want to write programs quickly you can use lambdas, dynamic types, etc. but you should also be able to drop down to the level of inline asm for those parts where the compiler isn't good enough. The closest to that at the moment is probably C++.

>>45
And it's withstood the test of time. Besides the classic 8-bits (6502, Z80, 8051, etc.) what other CPU architectures have lasted as long?

Name: Anonymous 2012-07-26 4:53

Racket

Name: Anonymous 2012-07-26 5:08

>>60
>Could you elaborate? I'm fine with the special-casing of two-element records for linked lists, but I don't understand why the user is encouraged to use them for things such as association-list elements or trees.
Association lists are sometimes useful.  A proper Lisp (pun intended) would provide you with a library with at least hash tables and various trees (for speed, really).
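
The trade-off is easy to see outside Lisp too. A hypothetical sketch in Python, with `assoc` standing in for the Lisp idiom: association-list lookup scans linearly, while a hash table does the same job in constant time.

```python
# An association list: a sequence of (key, value) pairs.
pairs = [("a", 1), ("b", 2), ("c", 3)]

def assoc(key, alist):
    """Linear scan, like Lisp's assoc: O(n) per lookup."""
    for k, v in alist:
        if k == key:
            return v
    return None

print(assoc("b", pairs))   # 2
print(dict(pairs)["b"])    # 2, but via an O(1) hash table
```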

Name: Anonymous 2012-07-26 6:17

2^8 GET

Name: Anonymous 2012-07-26 6:38

>>61
I, too, want a winged ponycorn that shoots laser beams from its eyes, Cudder.

Name: Anonymous 2012-07-26 6:48

>>60
>concurrency is not the next big thing, parallelism is
Go's purpose is not to be ``the next big thing'', it's to include already existing functionality in a well-designed way. Concurrency is just fine for real world programming.

>>61
That's utterly ridiculous. Why would you rather have a jack-of-all-trades language like the mess that C++ is than learn a few (even two would suffice) languages that would together cover the same field without any one of them being shit because it tries to be everything at once? Wanting something like that from a programming language shows that you don't have a clue what you want.

Name: Anonymous 2012-07-26 10:44

>>64
That's not 10

Name: Anonymous 2012-07-26 10:49

>>65
>>66
It's a perfectly reasonable desire. You can master one language and use it as much as possible. The only drawback is that the language becomes more complex like C++, but no one forces you to use all or even half of the features.

Name: Anonymous 2012-07-26 17:00

>>64
You're a long ways away from 256

Name: Anonymous 2012-07-26 21:07

>>69
You're an unsigned char ways away from 256

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2012-07-27 2:54

>than learn a few (even two would suffice) languages that would together cover the same field
The point here is easy integration. For whatever reason, language designers often think their language exists in a void and no one else should be using anything other than their language. If they think otherwise, then almost always, access to "outsider" languages is "at a distance", via an API. Take Python as an example. I haven't used it, but a quick search reveals things like http://docs.python.org/extending/extending.html that shows you need to go through an elaborate process just to write a few lines in an alternate language.

An exact opposite example would be asm { } blocks in C/C++. If I'm writing C and want to manually optimise some parts, I can put in an asm { ... } right there. Trying to put some C in Python, or even Asm in Python? No easy way. You have to go through the whole process above. It's complex enough to be discouraging for simple cases, reducing the opportunities that programmers have.

I may seem biased towards C/C++, but that's naturally because I'm an experienced long-time user and understand the entire process behind its design. And yet, like >>68 mentions, I have not used all the features of C++ like advanced templating. I've found that those who complain about it don't understand why it's the way it is. Once you understand how C++ maps to C, you might find it quite interesting and enlightening. Again, it allows you to choose the level of abstraction you want. I can use cout if I feel... "classy" and want to use the (IMHO bloated) OOP output, printf if I want to be C-like, and int 128 function 4 if I want to be as thin as possible and hit the kernel directly.

True utility comes from being able to solve the problem in the way you want, not the way the language designer enforces. Also remember that programming languages are tools, and like physical tools, acquire dirt and wear from use. The only perfect, clean tool is one that is never used.

Name: Anonymous 2012-07-27 3:16

>>71

you're experience has maid you clothes minded. Going to asm for efficiency is something you should leaf to the compiler, as it can be done well by the compiler and by adding your own bits of assembly, you alienate other architectures. And hand coded asm could be sub optimal. You should only use asm to access features that are not otherwise available, like in a wrapper library that configures interrupts or something.

Name: >>72 2012-07-27 3:19

>You should only use asm to access features that are not otherwise available

although it would be appropriate to use assembly for routines that can be implemented with special assembly ops that the compiler isn't clever enough to employ. I like to think that a sufficiently clever compiler can exist, but I don't think it can for every language, in particular C. So I actually take most of >>72 back, but only for advanced opcodes that can only be used in special situations, like SMD stuff.

Name: >>73 2012-07-27 3:21

SMD → SIMD stuff

Name: Anonymous 2012-07-27 3:26

>>71
>Trying to put some C in Python, or even Asm in Python? No easy way. You have to go through the whole process above.
Do you ever do any actual research into your bullshit claims you fucking retard?

Name: Anonymous 2012-07-27 3:38

>>71
>Once you understand how C++ maps to C, you might find it quite interesting and enlightening.
And cry a little inside because you can't have your precious life back.

Name: Anonymous 2012-07-27 4:03

This thread is so bad I'm ashamed of myself. And I haven't even made a post in here yet.

Name: Anonymous 2012-07-27 4:03

C is undefined bullshit.

Name: Anonymous 2012-07-27 4:36

>>78
Does that mean by extension that every language that's written in C is undefined bullshit as well?

Name: Anonymous 2012-07-27 4:40

>>79
No matter how much Larry Wall and Guido van Rossum want you to think otherwise, the implementation is, in fact, not the language.
Else, every language is assembly and you have no excuse to write C.

Name: Anonymous 2012-07-27 5:56

>>79
So, it just means they're full of security flaws hidden in C undefined behaviour.

Name: Anonymous 2012-07-27 6:11

>>71
So you would rather use a terribly designed language than to write a few lines of code required to use the FFI? Seriously?

You shouldn't need inline assembly anywhere except in kernels and some cases of extreme optimization. Requiring a modern language to support that is outright ludicrous.

By the way, check this:
package main

// #include <stdio.h>
// int f(int a, int b) { return a * b; }
// void show(int x) { printf("%d\n", x); }
import "C"

func main() {
    // cgo can't call variadic C functions like printf directly,
    // so the printing goes through a small C helper.
    C.show(C.f(2, 3))
}

Yes, I just defined and called a C function in inline Go. See that? It's not that hard to use C stuff.
However, I see no reason at all to require my language to support anything more than that. Inline ASM? C as a subset of the language? What purpose would any of that serve in the long run except to complicate my job?

By the way, there's no such thing as an asm block in C. What you're referring to is a C extension included in whatever C-based language you're using.

Name: Anonymous 2012-07-27 13:37

>>82
Weave (part of SciPy) and Instant (part of FEniCS) are just two Python modules which can do the same thing; they also allow easy access to numpy arrays.
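
And for the simplest cases the standard library alone is enough: ctypes loads a shared C library and calls into it with no build step at all. A minimal sketch, assuming a Unix-ish system where find_library can locate libc:

```python
import ctypes
import ctypes.util

# Load the C standard library and call one of its functions
# directly: no extension module, no compilation step.
libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.abs.restype = ctypes.c_int
libc.abs.argtypes = [ctypes.c_int]
print(libc.abs(-42))  # 42
```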

Name: Anonymous 2012-07-27 14:04

>>72
Compilers don't do magic. They follow a set of rules like any software to turn common constructs into tighter code. But compilers are not aware of the context of the problem you're trying to solve, or the algorithm you're trying to express. Suppose you're in a very hot block of code. You find a more clever solution than the ASM output. That's where you use inline assembly.

Name: Anonymous 2012-07-27 14:08

>>82
By the way, there's no such thing as asm block in C.
It's not standard, but every C compiler in existence has implemented it.

Name: Anonymous 2012-07-27 14:14

Try Groovy or Scala or Erlang

Name: Anonymous 2012-07-27 14:23

>>85
In mildly incompatible ways.

Name: Anonymous 2012-07-27 14:40

>>84
However, it's extremely rare that such an optimization would be worth more than general portability. It certainly shouldn't be a requirement for a general-purpose language (note that ``general'' does not mean ``specific set of hardware''). So, when you REALLY have to use that, you can use whatever convoluted way you can think of, since it's quite unlikely that you'll be doing it regularly.

>>85
And that's exactly what I said. That block isn't a part of the C language, but is a very common compiler extension.

Name: Anonymous 2012-07-27 15:11

>>71,82
Haha, prawg, open your mouth and take my huge Lisp with inline C. http://lush.sourceforge.net/index.html

Name: Anonymous 2012-07-27 15:48

>Scheme: CL without namespaces.
lol no, Scheme: unbloated CL

Name: Anonymous 2012-07-27 15:52

Perl6 come to me

Name: Anonymous 2012-07-27 15:57

>>90
Scheme is the C of Lisp. Undefined as shit.

Name: Anonymous 2012-07-27 16:36

>>92
Say, what's undefined in Scheme?

Name: Anonymous 2012-07-27 16:59

>>93
Order of evaluation of arguments, let may evaluate its clauses in any order, and certainly some others.

Name: Anonymous 2012-07-27 17:03

>>90
More like
Scheme: CL without any standard library

Name: Anonymous 2012-07-27 18:06

>>1
>Erlang: Concurrency is unneeded outside of a few very specific applications. Parallelism is where it's at.

You do realize that Erlang can do both, right?
At the same time, too.

Name: Anonymous 2012-07-27 18:43

>>96
yea but can it do both while it's doing both?

Name: Anonymous 2012-07-27 18:49

>>97
Just because you're the robot doesn't mean you're the robot.

Did you even read the book?

Name: Anonymous 2012-07-27 19:11

>>98
No, but I read SICP.

Name: Anonymous 2012-07-27 19:42

100 GET

Name: Anonymous 2012-07-27 19:51

>>94
>Order of evaluation of arguments, let may evaluate its clauses in any order

I don't know why some people think that 'undefined behaviors' are a bad thing. Actually it's good, because it forces you to think in a correct way

Name: Anonymous 2012-07-27 19:55

Why would such a minor freckle like this matter? Are you mutating states, /prog/?

Name: Anonymous 2012-07-27 20:04

>>101
No, it's not.  Everything should be well-defined and nothing left up to the implementation.

>>102
In fact I am mutating states whenever I must, which happens about 1% of the time.  And in that 1% of the time, only 1% of the time does the side-effect escape the function.

Name: Anonymous 2012-07-27 20:14

>>103
I think that you don't understand why something is undefined behavior

Name: Anonymous 2012-07-27 20:18

JACKSON 5 + 100 GET

20 TIMES BETTER THAN JACKSON  5 GET

Name: Anonymous 2012-07-27 20:27

>>103
>Everything should be well-defined and nothing left up to the implementation.
No languages have ``nothing left up to the implementation'' unless they are ``defined'' by their implementation. See: Perl.

Name: Anonymous 2012-07-27 20:29

>>103
I guess you want Java, “run once, write everywhere”.

Name: !L33tUKZj5I 2012-07-27 23:53

The best programming language there ever was, or ever will be, is Spectrum BASIC.
Every machine out there should have an interpreter.

Name: Anonymous 2012-07-28 3:11

>>103
Who made up that rule? If I made a Shitgol and left the entire language as an exercise to the implementations and Shitgol compiler writers and Shitgol users were okay with that then it should be fine. Standard specs only exist to prevent excess incompatible extensions between vendors.

Name: Anonymous 2012-07-28 4:43

>>104
I do understand, it's because compiler writers want extra room for optimizations.  For instance, a compiler could schedule the evaluation of a procedure's arguments in parallel.

>>106
Most languages aren't Lisp, thus they're crap.  Pointing out that most languages have undefined behaviour means nothing.

>>107,109
I never thought I'd say this, but Java does have something good about it.  I find reproducibility to be fairly important -- if a program is executed on two different implementations, it should have the same result (leaving aside things like reference->integer conversions which aren't even guaranteed to be the same across two runs of the same implementation).  Leaving undefined behaviour in your language is a great way to make your users shoot themselves in the foot.

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2012-07-28 5:53

>>72,73,82
Clearly none of you are experienced with Asm, or looked at compiler output (or you have, but can't understand/see a way to do it better, because of the first point). Compilers are a good way to generate code quickly; even at the highest optimisation level none of them can reason about things like register allocation in the same way that a knowledgeable programmer can. People have been saying compilers are better at generating code than programmers, but that's only if you compare to average or below-average Asm programmers.

Go learn Asm and compile any program you want, then look at the Asm output. Find a function and see if you can do it better. Unless you have some ultra-powerful compiler I haven't seen before, chances are you can beat the compiler in size, speed, or both. Perhaps because of the source language semantics the compiler can't do something, but YOU (should) know how your program operates and can easily take advantage of things like cross-function register allocation, convenient invariants (you can prove a register will always have e.g. 0, and use it appropriately; the compiler can't), and advanced stack manipulation.

>>82
Saying a language is "terribly designed" when it has the capabilities I need (and then some) is simply opinion. And it's not "write a few lines of code required to use the FFI", it's having to do that plus all the other "administrative" cruft surrounding it. In other words the integration is not as seamless as it could be. You said it yourself -- "FFI". Not "embed code directly in the compiler/JIT's output".

>>83
This is interesting, but note that it's not a core language feature. Someone clearly had the need to do that.

>>88
True portability is a myth. You can never get that in a compiled language, no matter how hard the standards bodies try. The closest thing to it is Java, and everyone should be familiar with its characteristics. If portability is ALL I care about for an application, then that's what I'd choose. My definition of "general purpose language" is one that allows the programmer to use whatever level of abstraction he/she finds appropriate, and switch between them effortlessly. How many architectures out there, in active use, have 37-bit integers, 11-bit bytes, or other oddities that the portability advocates keep using as examples? If you were working on such an uncommon system, portability is going to be the least of your worries. Right now, if you're targeting regular desktops, there's basically two ISAs and two OSs: x86-32/x86-64, Windows/*nix.

>>89
This is the sort of stuff I was talking about.

>For instance, a compiler could schedule the evaluation of a procedure's arguments in parallel.
Unless those arguments require extensive computation, chances are that the overhead of preparing to get that done would outweigh the cost of just running them all in 1 thread serially. Modern processors are now quite good at parallel instruction execution, IIRC these 3 instructions

add b[edx], 5
sub eax, ecx
sub ebx, ecx


are executed in parallel since ~Core 2, with the second two running while the first is waiting on two memory accesses.

Name: Anonymous 2012-07-28 6:26

>>111
While you could hyper-optimize that code, do you really need to? Just when was the last time you really needed something like that, instead of code that would work at least on both x86 and ARM?

Name: Anonymous 2012-07-28 6:41

>>11
Except Python does not do tail-call elimination. Or have its own stack for function calls. Enjoy your 998 calls or
>RuntimeError: maximum recursion depth exceeded
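
The limit is easy to demonstrate, and the standard workaround is to rewrite the tail call as a loop:

```python
# No tail-call elimination: deep recursion exhausts the
# interpreter's recursion limit (about 1000 by default).
def fact_rec(n):
    return 1 if n == 0 else n * fact_rec(n - 1)

try:
    fact_rec(5000)
except RuntimeError:  # RecursionError in Python 3.5+ subclasses this
    print("maximum recursion depth exceeded")

# The same computation as an explicit loop runs fine
# at any depth.
def fact_iter(n):
    acc = 1
    while n > 1:
        acc *= n
        n -= 1
    return acc

print(fact_iter(10))  # 3628800
```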

Name: Anonymous 2012-07-28 6:51

>>113
CLISP doesn't either, yet it is still perfectly usable.

Name: Anonymous 2012-07-28 6:52

>>111
Compilers can track the state of numerous registers far more effectively than a human can. They can take a higher level approach to assigning code than a human would reasonably do with a pen.

Name: Anonymous 2012-07-28 6:59

>>114
No, just no.

Name: Anonymous 2012-07-28 8:08

>>111
This is interesting, but note that it's not a core language feature. Someone clearly had the need to do that.
If you drop numpy arrays you can also drop numpy as a dependency and then all that is left is distutils which is part of the core language.

Name: Anonymous 2012-07-28 12:33

>>84
Compilers don't do magic. They follow a set of rules like any software to turn common constructs into tighter code.

Depending on the set of optimization algorithms they employ, they can appear to be magical.

But compilers are not aware of the context of the problem you're trying to solve, or the algorithm you're trying to express.

Then you are using the wrong language. You should only express the program as a minimally described idea. The compiler can then find an implementation of the idea that obtains an optimal efficiency on that target architecture and platform.

Suppose you're in a very hot block of code. You find a more clever solution than the ASM output. That's where you use inline assembly.

I'd rather let my genetic peephole simulated-annealing optimizer run on it with test input for two days.

Name: Anonymous 2012-07-28 12:50

>>118
let my genetic peephole simulated-annealing optimizer run on it with test input for two days.
please be trolling

Name: Anonymous 2012-07-28 12:55

>>111

TRIPS

Clearly none of you are experienced with Asm, or looked at compiler output (or you have, but can't understand/see a way to do it better, because of the first point). Compilers are a good way to generate code quickly; even at the highest optimisation level none of them can reason about things like register allocation in the same way that a knowledgeable programmer can.

http://en.wikipedia.org/wiki/Register_allocation
http://en.wikipedia.org/wiki/Liveness_analysis
http://en.wikipedia.org/wiki/Category:Data-flow_analysis

People have been saying compilers are better at generating code than programmers, but that's only if you compare to average or below-average Asm programmers.

Use a better compiler.

Go learn Asm and compile any program you want, then look at the Asm output. Find a function and see if you can do it better. Unless you have some ultra-powerful compiler I haven't seen before, chances are you can beat the compiler in size, speed, or both. Perhaps because of the source language semantics the compiler can't do something,

Use a better language. The only defined behavior present should be critical to the correctness of the program.

but YOU (should) know how your program operates and can easily take advantage of things like cross-function register allocation, convenient invariants (you can prove a register will always have e.g. 0, and use it appropriately; the compiler can't), and advanced stack manipulation.

http://en.wikipedia.org/wiki/Interprocedural_optimization

Saying a language is "terribly designed" when it has the capabilities I need (and then some) is simply opinion. And it's not "write a few lines of code required to use the FFI", it's having to do that plus all the other "administrative" cruft surrounding it. In other words the integration is not as seamless as it could be. You said it yourself -- "FFI". Not "embed code directly in the compiler/JIT's output".

This is an implementation issue, not a language issue.

Unless those arguments require extensive computation, chances are that the overhead of preparing to get that done would outweigh the cost of just running them all in 1 thread serially.

This is a decision that can be made by the compiler. Although it would be annoying if, in order to take advantage of this optimization, you had to stuff large expressions to evaluate in parallel inside of function call parameter lists. In the purely functional setting, you can simply define your variables relative to other prior evaluations, and the compiler can create a dynamic evaluator that evaluates expressions as soon as their dependencies are finished, using a number of threads optimal for the target architecture.
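A toy sketch of that dependency-driven evaluator (the graph encoding and names are invented for illustration; a real scheduler would fire each node the instant its last dependency completes rather than in waves, but the bookkeeping is the same):

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate(graph, inputs):
    """Evaluate a dependency graph of pure expressions, scheduling each
    node on a thread pool as soon as all of its dependencies have results.
    graph maps name -> (function, [dependency names]); assumed acyclic."""
    results = dict(inputs)
    remaining = dict(graph)
    with ThreadPoolExecutor() as pool:
        while remaining:
            # every node whose dependencies are all available is ready
            ready = [name for name, (_, deps) in remaining.items()
                     if all(d in results for d in deps)]
            if not ready:
                raise ValueError("cycle or missing input in graph")
            futures = {name: pool.submit(remaining[name][0],
                                         *(results[d] for d in remaining[name][1]))
                       for name in ready}
            for name, fut in futures.items():
                results[name] = fut.result()
                del remaining[name]
    return results

# toy graph: i = f(), j = k(), ret = i + 2*j
graph = {
    "i":   (lambda: 3, []),
    "j":   (lambda: 4, []),
    "ret": (lambda i, j: i + 2 * j, ["i", "j"]),
}
print(evaluate(graph, {})["ret"])  # 11
```

The purity assumption is what makes the reordering legal: with side effects, evaluation order would be observable and the compiler could no longer schedule freely.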

Modern processors are now quite good at parallel instruction execution, IIRC these 3 instructions

add b[edx], 5
sub eax, ecx
sub ebx, ecx

are executed in parallel since ~Core 2, with the second two running while the first is waiting on two memory accesses.

cool stuff.

Name: Anonymous 2012-07-28 13:07

>>111
True portability is a myth.
I would say it's the least common denominator of architectures' capabilities. Obviously, when you need maximum performance/power efficiency, you'll end up writing one specialized version for each.

Name: Anonymous 2012-07-28 13:51

>>121
What exactly are you writing that needs to run on more than one architecture or platform?

Name: Anonymous 2012-07-28 16:31

>>122
Operating system, web browser, games, media player, user interface to hardware controllers

Name: Anonymous 2012-07-28 19:04

>>84
>You find a more clever solution than the ASM output
>encouraging people to be clever
get out

Name: Anonymous 2012-07-28 19:27

>>124
>encouraging people to be stupid
get out

Name: Anonymous 2012-07-28 20:22

>>125
>>124-san has a point. “Clever” designs suffer from not-invented-here syndrome more often, i.e. more bugs and less support. Masters know not only how but when to use each tool.

Name: Anonymous 2012-07-28 21:37

01111111 GET

Name: Anonymous 2012-07-28 21:46

2^7 ``GET''!

Name: Anonymous 2012-07-28 23:39

>>127
299593 GET? wtf

Name: Anonymous 2012-07-28 23:45

>>129
There are only 10 kinds of people in the world....

Name: Anonymous 2012-07-28 23:57

>>129
what does it feel like to have friends

Name: Anonymous 2012-07-29 0:03

>>130
There are only 10 kinds of people in the world, those who fully understand octal, those who understand it but feign ignorance so their peers won't call them faggots, those who confuse it with hexadecimal, those who confuse it with decimal, those who confuse it with binary, those who refuse to understand it, those who are incapable of understanding it, and those who only use it for UNIX permissions.

Name: Anonymous 2012-07-29 1:24

>>132
Too long of a delivery, but you get points for a valiant attempt.

Name: Anonymous 2012-07-29 1:41

>>133

I found it rather humorous, but I find your point assignment to be absolutely enraging, and I hope you slowly die in a fire in which you seek shelter in a pool of mud, temporarily find relief, but are inevitably slowly cooked by the surrounding flames.

Name: Anonymous 2012-07-29 4:33

>>134
Too long of a delivery, but you get points for a valiant attempt.

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2012-07-29 5:04

>>112
Actually, the majority of stuff I work on is x86, Windows or Linux. If I was using some other architecture then I'd write in Asm for that too. (Although MIPS and ARM are a bit... boring.)

>>115,118,120
I was expecting replies like this. You still put blind faith in compilers and think they're god-like and better than anyone else. That's what the academia always says. But consider that if it was true, I wouldn't be writing this today. I've probably read through more compiler output than anyone else here, and quite frankly, most of it sucks. ICC is the best I've seen, MSVC is a close second, and GCC is far behind. Delphi is... about the worst I've ever seen.

>>123
The majority of those are provided in different versions for each platform. Like I said above, if you really want portability, use Java.

>>124,125,126
If they're lazy idiots they can let the compiler hold their hand and do everything for them. If they're intelligent they'll find a way to do things more efficiently. It's about letting the programmer exploit her capabilities to the limit.

Name: Anonymous 2012-07-29 5:22

My little cudder
Compilers are magic

Name: Anonymous 2012-07-29 5:34

>>136
+1 for mentioning gcc is complete shit

- Tremendous Faggot
  163k •31 •320 •628
  96% accept rate

Name: Anonymous 2012-07-29 5:41

>>38
>crying about garbage collection
as if writing 1 line is so hard

>>26
also no pointers? have fun being a useless language

have fun with your unused "future" languages

Name: Anonymous 2012-07-29 6:20

>>138
The other ones are non-free and closed-source, thus introduce security risks.

Name: Anonymous 2012-07-29 6:23

>>136,138
I'm no x86 assembly expert but what's wrong with gcc's output?

Name: Anonymous 2012-07-29 6:28

>>141
It doesn't have NSA backdoors so it threatens national security.

Name: Anonymous 2012-07-29 6:41

>>141
It can't produce code as well as something like ICC in some cases. Then again, ICC probably has a massive budget pumped into it by comparison, so it kinda makes sense.

Name: Anonymous 2012-07-29 7:09

12 squared dubs get claimed

Name: Anonymous 2012-07-29 7:14

>>136

I was expecting replies like this. You still put blind faith in compilers and think they're god-like and better than anyone else. That's what the academia always says. But consider that if it was true, I wouldn't be writing this today. I've probably read through more compiler output than anyone else here, and quite frankly, most of it sucks. ICC is the best I've seen, MSVC is a close second, and GCC is far behind. Delphi is... about the worst I've ever seen.

Yes, we don't have a qualifying implementation yet. But someday we might. And with the benefits it would bring, I don't understand why more people aren't concerned with improving the tools they use every day. Instead, they just crank out applications using what is currently available and popular. A possible solution to a problem brought up in this thread:

* include a construct in the language that places constraints on data types. This constraint can be formally checked with static code analysis, tested dynamically with asserts in testing, and then used as a hint for optimization when building the release version. The compiler could consider a violation of the constraint as undefined behavior, and assume its truth when generating optimal assembly. This way, you can get your architecture specific optimizations without creating undocumented and undetectable restrictions on the inputs.
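A sketch of the dynamic-check half of that constraint construct, in Python for illustration (the decorator name is hypothetical; Python's assert is stripped by `python -O`, which plays the role of the release build -- the "treat violation as undefined behavior and optimize" half would need real compiler support):

```python
def constrain(check):
    """Hypothetical constraint decorator: the constraint is enforced with
    assert, so it runs during testing but `python -O` strips it in release.
    A smarter compiler could additionally use the constraint as an
    optimization hint, treating any violation as undefined behavior."""
    def wrap(fn):
        def inner(*args):
            assert check(*args), f"constraint violated in {fn.__name__}"
            return fn(*args)
        return inner
    return wrap

@constrain(lambda n: 0 <= n < 256)
def to_byte(n):
    # under the constraint the mask is provably a no-op,
    # so an optimizer trusting the constraint could elide it
    return n & 0xFF
```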

Would be nifty:

* Automate profiling of generated machine code. Select constructs that give the best performance based upon empirical data collected from a large set of test runs.

If they're lazy idiots they can let the compiler hold their hand and do everything for them. If they're intelligent they'll find a way to do things more efficiently. It's about letting the programmer exploit her capabilities to the limit.

Lazy and idiotic is all relative. Computers are idiots. But they are far from lazy. A VERY determined idiot may be able to write a better program than you if given a disproportionately larger amount of time to do it in.

Name: Anonymous 2012-07-29 7:38

>>145
I think that what you want already exists in the form of design by contract. Some systems, in fact, do statically check them, when they can.

Name: Anonymous 2012-07-29 7:47

>>145
Automate profiling of generated machine code. Select constructs that give the best performance based upon empirical data collected from a large set of test runs.
http://theory.stanford.edu/~aiken/publications/papers/asplos06.pdf

Name: Anonymous 2012-07-29 9:38

Hey faggots, try out this multiboot kernel: ArCtGwMAAQD7T1HkAAAQAAAAEAAAAAAAAAAAACAAEAAxwLkAgAsAZjEBQUEPMcHgCIH5oI8LAHLu6+c=

Name: Anonymous 2012-07-29 10:41

>>5
So, I'm not alone with such thoughts. Unfortunately one has to write his own language with such properties. At least I haven't seen one.

Name: Anonymous 2012-07-29 10:47

>>1
Haskell. Anything else is shit.

Name: Anonymous 2012-07-29 12:07

>>136
If they're lazy
Writing clean code isn't lazy, it's polite.

exploit her capabilities to the limit.
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."

Name: Anonymous 2012-07-29 15:01

>>146,147

Yeah, I didn't doubt that they already existed. Just whether they were in widespread use in industry. Thanks for the link.

Name: Anonymous 2012-07-29 15:15

>>147

The problem of emitting efficient SIMD code has confounded
compiler-authors for many years; gcc at least does not appear to
attempt to use SIMD instructions.

Oh gaawd

Name: Anonymous 2012-07-29 15:26

>>152
For contracts:
The D Programming Language has contracts by the kitchen sink law. No static checking, but they're not checked at runtime when release-building.
C# has http://msdn.microsoft.com/it-it/devlabs/dd491992.aspx
There was some work on static contracts for Haskell here: http://research.microsoft.com/en-us/um/people/simonpj/papers/verify/HaskellContract.ps but it doesn't seem to be implemented anywhere, nor do I know if there have been any new developments.

Name: Racketfan 2012-07-29 15:43

Racket-lang.org

Name: Anonymous 2012-07-29 15:53

I am offended that PHP is listed as a programming language. PHP is not a programming language. It's a miscarriage of engineering, a monstrously deformed abortion that by some cruel fluke of nature has managed to cling to life.

Name: Anonymous 2012-07-29 16:12

>>156
I agree. But not just cling to life; it has become central to the average, modern website. Such a shame..

Name: Anonymous 2012-07-29 16:22

>>156
It would be unfair to not list PHP and then list JS, D, C++ and C.

Name: Anonymous 2012-07-29 16:27

>>158
Please don't even begin to compare C to PHP.

Name: Anonymous 2012-07-29 16:46

>>156
PHP is Turing complete

Name: Anonymous 2012-07-29 16:50

>>158
i don't see why javascript SHOULDN'T be up there, even if PHP weren't listed

Name: Anonymous 2012-07-29 16:53

>>157
>>158
I realize that it mostly works and is useful, and I sometimes use it myself when I need a tiny server backend for something, but what jostles my flaps is the horrible choices in its design that result in ephemeral and obscure bugs creeping into large projects. Of the useful, widespread programming languages it has the worst intrinsic flaws and insanities, which aren't the result of deliberate design decisions so much as incompetence. And so I am buttfurious on the internet about it.

Name: Anonymous 2012-07-29 17:08

>>162
>>157
>>158

buttfurious
Gabe plots cake, ``please''!

Name: Anonymous 2012-07-29 17:31

>>162
You know what else mostly works? A rusty knife.

Name: Anonymous 2012-07-29 17:32

>>164
Exactly.

Name: Anonymous 2012-07-29 17:36

>>165
I hope you'll enjoy tetanus, then?

Name: Anonymous 2012-07-29 19:03

>>166
HAX MY TETANUS

Name: Anonymous 2012-07-29 19:51

>>162
back to /g/, ``please''!

Name: Anonymous 2012-07-29 22:27

>>72
you're experience has maid you clothes minded.
[b][i]what[/i][/b]

Name: Anonymous 2012-07-29 22:38

So /g/ finally raided /prog/. I'm out of here, again.

fuck off, ``faggots''

Name: Anonymous 2012-07-29 23:26

>>170

no, we are just pretending to be from /g/.

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2012-07-30 5:45

I don't understand why more people aren't concerned with improving the tools they use every day
Because they think compilers are somehow magical and mysterious entities. One of my (many) somewhat-sluggish projects has been to write a better C compiler, one that generates code closer to how an assembly language programmer would write it---by necessity, not by pattern.

* Automate profiling of generated machine code. Select constructs that give the best performance based upon empirical data collected from a large set of test runs.
MSVC has PGO, which is not bad, but is a bit of a pain to use and thus not used much except in special cases. Optimisation should be more "ubiquitous", not a special-case thing to be applied after you discover that things aren't quite right. I know they still teach that "premature optimisation" bullshit; but going back to change something that's been written already is a waste of resources. Optimisation should be integrated into design, thought of as "what's the best way to do this" throughout so that the final product doesn't need much more done to it.

>>151
That quote may be true for those who don't understand fundamentals like binary and machine instructions, but being clever---and thorough with the design---when writing code can in many cases completely avoid ever having to debug it, as you will have essentially proved it correct before ever running it.

>>156
Whatever you call it, it's become ubiquitous for websites for a reason. The same with all the other languages you complain about. "Purity" or whatever you want to call it has no real practical advantage. Pragmatism does.

Name: Anonymous 2012-07-30 6:30

>>172
Whatever you call it, it's become ubiquitous for websites for a reason.
The Jews at Zend bribed Web hosts.

Name: Anonymous 2012-07-30 7:35

Its only real selling point was that it had a foothold on at least 1% of all the internet's domains by 1998. In 1998 you had a choice of ASP classique, Coldfusion and PHP3 for 1. sticking your spaghetti code all over your HTML, which was the fad back then, and 2. running in-process in the web server instead of CGI. The next closest competitor, WSGI, came about by 2003; mod_python came on a CD with a book in 2000; PHP was at version 3 by 1998. There were others a little later (JSP rushed out in 1999) but by then the Code-In-HTML language market was cornered by PHP on Linux and ASP on Windows.

Its other selling point, hosting companies liked PHP because of the shitty safe_mode when it came to cramming thousands of sites on a server.

Name: Anonymous 2012-07-30 8:02

>>172
While you ``optimize'' your bitfiddling code, I'll implement a better algorithm, Cudder.

Name: Anonymous 2012-07-30 12:04

>>175
LOL
Did you understand what he talked about?

Name: Anonymous 2012-07-30 13:22

>>172
"premature optimisation" bullshit
Optimisation should be integrated into design
Truth: asm is less readable and thus less maintainable.
Ideally, programs should be executable algorithms, expressed in adequate abstractions. Optimization should be done just when necessary, because it's architecture-dependent and almost write-only.
Or are you telling me to follow master FrozenVoid?
IHBT

Name: Anonymous 2012-07-30 16:30

>>137
I wanna have a threesome with you and Cudderspace

Name: Anonymous 2012-07-30 18:27

>>172
Optimisation should be integrated into design, thought of as "what's the best way to do this" throughout so that the final product doesn't need much more done to it.

"The best way to do this" is the most readable way to do it. If the optimized way is the most readable then do that, if the nieve way is the most readable then do that. If it profiles badly, THEN rewrite it in the optimized way and leave a small novel in the comments on why you made every choice you made and what its actually accomplishing.

I'm not saying that your way of coding is wrong or bad or anything, I'm just saying that it makes people not want to work with you.

Name: Anonymous 2012-07-30 18:30

>>177,179
Typical “premature optimisation” bullshitters. It's good to plan ahead.

Name: Anonymous 2012-07-30 18:36

>>177
your post is full of shit

Name: Anonymous 2012-07-30 18:41

>>180
then go to /optimisation/, this is /prog/

Name: Anonymous 2012-07-30 19:24

>>180
Sometimes its needed. Sometimes.
But you're likely to spend much more time optimising ahead and then spend much more to fix bugs. See this: http://tibleiz.net/asm-xml/
3 in 4 releases are bugfixes.

Name: Anonymous 2012-07-30 19:46

>>179
OR I could just write it correctly the first time. Code readability is over-rated. As long as it was not intentionally obfuscated, a programmer worth his salt will be able to figure out what it does (because it's right there in front of his face).

Name: Anonymous 2012-07-30 20:18

>>177
3/10.

>>183
You're now blurring "optimizing ahead" and "writing shitty code" into one thing. You can skip optimizing your design ahead of time and still have bug-riddled code.

Of course, this partially does depend on your language. If you're gonna be writing a decently sizeable system in something like C, you will indeed need to do a fair degree of ``premature optimization'', simply because changing large chunks later takes so much time and will end up introducing more bugs in the process.

Name: Anonymous 2012-07-30 21:13

>>185
What's with these X/10 little shits coming to /prog/?

Did /g/ finally invade us?

Name: Anonymous 2012-07-30 21:15

>>186
Nice attempt to avoid my argument.
Have fun with your slow programs!

Name: Anonymous 2012-07-30 22:18

>>185
The whole point of planning ahead is when you haven't coded yet, such as the asmxml project at its beginning. From the ground up it was designed to be optimised, but it still had bugs and new feature requests. Maybe a new feature can disrupt your clever design, and you'll be walking on eggshells to implement it. And now that the design is clever and everything is coded in asm, it's a PitA.

Name: 188 2012-07-30 22:29

http://www.adacore.com/adaanswers/gems/gem-93-high-performance-multi-core-programming-part-1/

When it comes to performance, no amount of tweaking can compensate for an inherently slow design.

Name: 188 2012-07-30 22:40

But you check the "design" and see that it doesn't involve rewriting your app in asm because your compiler doesn't output optimal code.

Name: Anonymous 2012-07-30 22:48

>>184
OR I could just write it correctly the first time. Code readability is over-rated. As long as it was not intentionally obfuscated, a programmer worth his salt will be able to figure out what it does (because it's right there in front of his face).
What if you fucked up everything and are too stupid to realize it, and the superior programmer is reading your fucked up code, trying to infer what the fuck you were actually trying to do from your endless chain of tacit false assumptions?

>>186
yes

>>188
It's too late for them. Don't bother.

Name: Anonymous 2012-07-30 22:49

This thread is fail.

>>172
That quote may be true for those who don't understand fundamentals like binary and machine instructions
That quote may be Brian W. Kernighan

Name: Anonymous 2012-07-31 2:12

>>172
I don't understand why more people aren't concerned with improving the tools they use every day
Because they think compilers are somehow magical and mysterious entities.
People are probably just either satisfied with suboptimal machine code generation or they don't know any better.
One of (many) somewhat-sluggish projects has been to write a better C compiler, one that generates code closer to how an assembly language programmer would write it---by necessity, not by pattern.
This would be very nifty. The only implementation strategy for machine code generation I've seen so far is to simply convert syntactical constructs to a stream of equivalent instructions and apply a series of optimizations that maintain equivalence throughout the process. Admittedly, there can be other things going on, like using a minimum number of registers. In the whole process, the compiler doesn't really know what you are trying to do. It's just spitting out machine code it knows is equivalent to the higher level source code you gave it, and then simplifying it a bit. An optimizing compiler tries to find the highest performing program within the equivalence class of the program the programmer submitted. So it applies all these operations that it knows preserve equivalence, but will likely improve performance.

It seems like in order for the compiler to truly generate assembly with necessity, it would need to have a complete high level understanding of what you are trying to do. But as soon as I start to think about whether this is possible, or already implemented, it feels like it comes down to a philosophical question. If I turn the steering wheel of my car to the left, does my car realize that I want to go left, and turn its wheels by necessity? Or is it just doing what the mechanical engineer designed it to do, which was to allow rotation of the steering wheel to directly affect the wheels?
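That "stream of equivalent instructions plus equivalence-preserving rewrites" strategy is easy to caricature. A toy peephole pass (the instruction encoding is invented for illustration):

```python
def peephole(instrs):
    """Tiny peephole pass: each rewrite preserves the meaning of the
    instruction stream, which is all the compiler ever guarantees -- it
    knows nothing about what the program is 'trying to do'.
    Instructions are (op, dst, src) tuples, an encoding made up here."""
    out = []
    for op, dst, src in instrs:
        if op == "add" and src == 0:
            continue  # add dst, 0 is a no-op: drop it
        if op == "mul" and src == 1:
            continue  # mul dst, 1 is a no-op: drop it
        out.append((op, dst, src))
    return out

print(peephole([("add", "eax", 0), ("mul", "ebx", 1), ("sub", "ecx", 5)]))
# [('sub', 'ecx', 5)]
```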

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2012-07-31 6:27

>>177
Truth: asm is less readable and thus less maintanable.
Readability is subjective (size or speed is NOT), but to someone experienced, it's more readable. Even the format is consistent. The only thing is, it also tends to be more voluminous.

>>183
Look at the bugtrackers of other XML libraries and you'll find much worse.

When I was talking about design, I meant decisions that are not easy to reverse, and that can have huge consequences for efficiency. For example, instead of the obvious (and inefficient) "make objects out of everything and copy data into them" parsing algorithm, do it in-place so there's no unnecessary copying of data. Simplify the design to make it more "direct". I'll use the web browser as an example here since we're all familiar with them: how long is the path (in terms of # of instructions or call depth) between Firefox getting a WM_PAINT and its first API call into the OS to draw the page content into the window? In the browser design I'm planning, it's < 5. I'm not planning to do any memory allocation/deallocation there either.

Maybe "optimisation" is the wrong word for this, but what else do you call making something more efficient? If you plan your design like this and have it carefully thought out before writing a single line of code, the chances are high that you'll be ahead of those who "optimise later", without even needing to go to inline Asm. You could say that, despite the compiler's stupidity, everyone is using the same compiler but your design has already been optimised at a higher level, so its inefficient output is still going to be more efficient than the output from an inefficient design.

>>193
It seems like in order for the compiler to truly generate assembly with necessity, it would need to have a complete high level understanding of what you are trying to do.
That's not necessary, all it needs to understand is the semantics of the source language. What we agree is not statement-by-statement translation and applying optimisation afterwards (there's that anti-premature-optimisation again!), but my idea is to generate code "backwards", working from the desired result. E.g. in


int foo() {
 int i = f();
 int j = k();
 int l = 2 + j;
 ... /* code that does a lot with l, but never has effect on i, j, nor other global variables */
 return i + 2*j;
}


a "dumb" or "traditional" compiler would emit code for all that stuff in the middle, and optimisation might have a chance at removing (some of) it and moving variables into registers. An "intelligent" compiler could work backwards and "think" "This function's result depends on i and j. What do i and j depend on? f() and k(). What do f() and k() depend on? ... " Eventually it might determine that f() and k() are actually constant, and substitute that in, propagate the changes down, and reduce foo() itself to a constant. And of course, all of this might not even be done if foo() can never get called from any entry point. At every reference to a nonlocal variable, this process would need to be performed as their results are "visible" to other functions.

For things like register allocation (done after the above), the compiler could track how many variables are actually needed, and then choose how many extra "slots" are required in memory. It can alter their allocation to registers depending on their usage and loop nesting level. In size/balanced optimisation mode, if it sees certain variables having LIFO-like usage patterns, it can emit push/pop (single byte instructions) instead of explicitly doing a stack allocate and moves. If there's a long-running inner loop with frequently accessed variables, push the less frequently used variables on the stack. This is how an Asm programmer does register allocation by necessity.

Due to the halting problem it's not possible to prove that f() or k() terminate even if they do not depend on nonlocal variables, but the compiler can offer an option to simulate their execution for a limited number of cycles. No deep understanding (other than language semantics) required by the compiler, just a different way of approaching the problem.

Follow this process to its logical conclusion, even into library functions and such, and your printf("%d\n", fib(32)); becomes a fputs("2178309\n", stdout);. In the output binary too, there will be nothing more than what the compiler found was needed. That is the ultimate goal.
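That printf-to-fputs collapse is ordinary constant folding pushed through calls. A toy sketch (a Python function stands in for the compiler's folding step; fib is iterative with F(0) = 0, and fold_call/fib are names invented here):

```python
def fib(n):
    """Iterative Fibonacci, F(0) = 0, F(1) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def fold_call(fn, args):
    """If every argument is a compile-time constant and fn is known to be
    pure and terminating, evaluate the call at compile time and return the
    literal result; otherwise leave it for runtime (returns None)."""
    if all(isinstance(a, int) for a in args):
        return fn(*args)
    return None

print(fold_call(fib, [32]))  # 2178309 -- the literal the compiler would emit
```

The "known to be pure and terminating" caveat is exactly the halting-problem limit mentioned above; hence the idea of simulating execution for a bounded number of cycles.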

Name: Anonymous 2012-07-31 10:49

>>194
side-effects troubles optimizing compilers
don't want functional programming
LOL

Name: Anonymous 2012-07-31 11:25

Use Eiffel.

Name: Anonymous 2012-07-31 12:24

>>194
If you plan your design like this and have it carefully thought out before writing a single line of code, the chances are high that you'll be ahead of those who "optimise later", without even needing to go to inline Asm.
well the issue isn't that you're making things more efficient, efficiency is generally good, the issue is that it's your only priority.

take a contrived Fibonacci example. you don't want to use the recursive O(2^n) algorithm...

but you probably also don't want to use the O(1) closed-form algorithm unless you have a REALLY good reason to:


const double inverseSqrt5 = 0.44721359549995793928183473374626;
const double phi = 1.6180339887498948482045868343656;

static int Fibonacci(int n) {
    return (int)Math.Floor(Math.Pow(phi, n) * inverseSqrt5 + 0.5);
}

Name: >>197 2012-07-31 12:25

>>197
forgot to quote first line.

Name: Anonymous 2012-07-31 12:39

>>197
but you also probably also don't want to use the O(1) algorithm unless you have a REALLY good reason to
Why? The meaning of the function is captured by int Fibonacci(int n). How it is implemented is none of anyone's business.

Name: Anonymous 2012-07-31 12:42

>>197
what is the value of (int)std::numeric_limits<double>::infinity(), anyway?

Name: Anonymous 2012-07-31 13:05

>>199
it is if there's a typo.

Name: Anonymous 2012-07-31 13:09

>>199
How it is implemented is none of anyone's business.
Also you make a lot of statements like this, leading me to believe that you've never actually worked with other people before. Am I correct?

Name: Corporate Critter 2012-07-31 13:32

>>1
Why not put HTML5 on your list?

Name: Anonymous 2012-07-31 13:48

ITT: Excuses for not knowing assembly

Stay mediocre

Name: Anonymous 2012-07-31 14:43

JACKSON 5 + 200 GET

Name: Anonymous 2012-07-31 14:44

JACKSON 5 * 41 GET

Name: Anonymous 2012-07-31 14:52

>>204
Itt: i know assembly, try again.

Name: Anonymous 2012-07-31 15:02

>>206
5*41 is not 206

Name: Anonymous 2012-07-31 22:59

That's not necessary, all it needs to understand is the semantics of the source language. What we agree is not statement-by-statement translation and applying optimisation afterwards (there's that anti-premature-optimisation again!), but my idea is to generate code "backwards", working from the desired result. E.g. in


int foo() {
 int i = f();
 int j = k();
 int l = 2 + j;
 ... /* code that does a lot with l, but never has effect on i, j, nor other global variables */
 return i + 2*j;
}


a "dumb" or "traditional" compiler would emit code for all that stuff in the middle, and optimisation might have a chance at removing (some of) it and moving variables into registers. An "intelligent" compiler could work backwards and "think" "This function's result depends on i and j. What do i and j depend on? f() and k(). What do f() and k() depend on? ... " Eventually it might determine that f() and k() are actually constant, and substitute that in, propagate the changes down, and reduce foo() itself to a constant.

Yeah, this can actually be formulated as a simple graph algorithm. Every intermediate expression is a node. An expression X has a directed edge to an expression Y if and only if the evaluation of X depends on the result of the evaluation of Y. We then look at the node for the expression being returned from the subroutine and perform a depth-first traversal. In the post-order phase, emit assembly instructions for evaluating the current node. This ensures the dependencies of each expression are evaluated in time. Furthermore, expressions whose values do not contribute to the return value are not reachable from the return expression node, so the depth-first traversal will never reach them and never emit assembly for these unused expressions. And furthermore, the expression dependency graph also provides a scheme for evaluating dependencies in parallel. The disadvantage of trying to take advantage of that is the cost of synchronization: the time it takes to evaluate an expression node may vary greatly with different inputs, so effective synchronization would likely need to be dynamic, and the inherent cost of managing this would only make it advantageous when the expressions have intensive evaluation procedures.
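As a rough sketch of that traversal (the node names and the dict-based IR here are made up for illustration, not any real compiler's representation):

```python
# Sketch: expressions as nodes whose edges point at the expressions they
# depend on. Emitting by post-order DFS from the return expression emits
# every dependency before its dependent, and anything the result never
# uses is simply never reached.

def emit(node, deps, emitted=None, order=None):
    """Post-order DFS from `node`; returns emission order."""
    if emitted is None:
        emitted, order = set(), []
    for d in deps.get(node, []):
        if d not in emitted:
            emit(d, deps, emitted, order)
    emitted.add(node)
    order.append(node)
    return order

# foo() from above: 'ret' needs i and j; l and the code using it hang off j,
# but nothing on the path to 'ret' needs them, so they never get emitted.
deps = {
    "i": ["f()"], "j": ["k()"],
    "l": ["j"], "big_code_using_l": ["l"],
    "ret": ["i", "j"],
}
print(emit("ret", deps))   # 'l' and 'big_code_using_l' never appear
```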

And of course, all of this might not even be done if foo() can never get called from any entry point. At every reference to a nonlocal variable, this process would need to be performed as their results are "visible" to other functions.

Yeap.

For things like register allocation (done after the above), the compiler could track how many variables are actually needed, and then choose how many extra "slots" are required in memory. It can alter their allocation to registers depending on their usage and loop nesting level.

Yeah, register allocation can be stated as a graph coloring problem.

http://en.wikipedia.org/wiki/Graph_coloring#Register_allocation
The textbook approach to this problem is to model it as a graph coloring problem.[29] The compiler constructs an interference graph, where vertices are symbolic registers and an edge connects two nodes if they are needed at the same time. If the graph can be colored with k colors then the variables can be stored in k registers.

For general graphs, finding an optimal coloring is NP-hard, and even deciding whether a graph can be colored with k colors (for k ≥ 3) is NP-complete. But you can get your optimal register allocation if you are willing to wait. If the variable interference graph has certain characteristics, it could be possible to color it more easily.
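A minimal sketch of the textbook approach on a toy interference graph (the greedy highest-degree-first heuristic shown is one common strategy, not optimal in general, but it never uses more than max degree + 1 colors):

```python
# Vertices are variables; an edge means the two are live at the same time.
# The number of distinct colors used is the number of registers (or slots)
# needed. Greedy coloring: visit high-degree vertices first, give each the
# lowest color not taken by an already-colored neighbor.

def greedy_color(graph):
    color = {}
    for v in sorted(graph, key=lambda v: -len(graph[v])):
        taken = {color[n] for n in graph[v] if n in color}
        c = 0
        while c in taken:
            c += 1
        color[v] = c
    return color

# a interferes with b and c; b and c never live together -> 2 colors suffice
interference = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}}
coloring = greedy_color(interference)
print(len(set(coloring.values())))  # registers needed
```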

In size/balanced optimisation mode, if it sees certain variables having LIFO-like usage patterns, it can emit push/pop (single byte instructions) instead of explicitly doing a stack allocate and moves. If there's a long-running inner loop with frequently accessed variables, push the less frequently used variables on the stack. This is how an Asm programmer does register allocation by necessity.

Cool stuff. This could be stated as an optimization problem.

Due to the halting problem it's not possible to prove that f() or k() terminate even if they do not depend on nonlocal variables, but the compiler can offer an option to simulate their execution for a limited number of cycles. No deep understanding (other than language semantics) required by the compiler, just a different way of approaching the problem.

I saw a language that offered something similar. At function definitions you could specify a maximum time out for the function. This might be useful in some real time applications, but I could see it getting in the way otherwise.

http://en.wikipedia.org/wiki/Hume_(language)

Follow this process to its logical conclusion, even into library functions and such, and your printf("%d\n", fib(32)); becomes a fputs("2178309\n", stdout);. In the output binary too, there will be nothing more than what the compiler found was needed. That is the ultimate goal.

Yeap, that's all possible with function inlining and constant folding. If the printf implementation prepares the "2178309\n" string by writing into a buffer, the compiler would need to understand this and evaluate the resulting state of the buffer at compile time.

Name: Anonymous 2012-07-31 23:24

>>200
0, apparently.

Name: Anonymous 2012-07-31 23:37

>>210
Is that guaranteed by the standard, implementation defined, or undefined?

Also, what's the point of a factorial function that only works for extremely small values?

Name: Anonymous 2012-08-01 4:42

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2012-08-01 7:24

The disadvantage of trying to take advantage of that is the cost of synchronization.
???
As I mentioned above all recent CPUs automatically parallelise when there are no dependencies between instructions.

For general graphs, finding an optimal coloring is NP hard, and finding a coloring using less than k colors is NP complete.
I've never really understood this obsession with graph colouring for RA. Perhaps it's too general for register allocation, or register-memory/register-immediate/memory-immediate architectures like x86 can avoid this issue. In the flow graph, think of a wire connecting two nodes for each variable's value that needs to be preserved. Then the "thickness" of the "bundle" between each node is how many units (bytes e.g. if each wire represents 8 bits) of data need to be preserved between them. Assign the ones used in the innermost loops/most often to registers first. The rest can go in memory, or swapped with another if there's a change in use (x86 conveniently provides an xchg instruction for this.)

This algorithm is quite different from any I've seen before (except perhaps linear scan, although that one is not particularly efficient) and has some traits of the optimise-first approach: instead of attempting to allocate everything to registers and "spill" afterwards, determine the maximum storage needed first, then allocate to both memory and registers. For example you have 80 bits of "real" local variables (i.e. values that actually get used and need preservation, which may not correspond to the number of declarations in the source code) and 56 bits of registers. A dumb graph-colouring algorithm is NOT going to see the 24 bits until it's too late. Then it goes back, "spills", tries again, etc. The algorithm I described above is going to allocate the 24 bits somehow (pre-allocation via stack, push/pop, etc.), and fit everything in naturally.
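The "hottest values get registers first" idea might be sketched like this, assuming a made-up usage model where each value carries a use count and a loop-nesting depth (the names, the weighting, and the fixed register count are all illustrative):

```python
# Sketch: rank each value by estimated accesses (use count weighted heavily
# by loop depth), give registers to the top of the ranking, memory slots to
# the rest. This is the shape of the idea, not a production allocator.

NUM_REGS = 4

def allocate(vars_usage):
    """vars_usage: {name: (use_count, loop_depth)} -> {name: location}"""
    ranked = sorted(vars_usage,
                    key=lambda v: vars_usage[v][0] * 10 ** vars_usage[v][1],
                    reverse=True)
    loc = {}
    for i, v in enumerate(ranked):
        loc[v] = f"r{i}" if i < NUM_REGS else f"[sp+{(i - NUM_REGS) * 8}]"
    return loc

usage = {"i": (3, 2), "j": (8, 0), "tmp": (2, 2), "cfg": (5, 0), "sum": (9, 2)}
print(allocate(usage))   # 'sum' (hot, deeply nested) gets r0; 'cfg' spills
```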

At function definitions you could specify a maximum time out for the function.
That's not what I meant. It's a compile-time optimisation constraint to prevent compilation from taking too long, at the expense of possible lost pre-computation. E.g. you write a program that computes 64K digits of pi. This is self-contained and should terminate, so theoretically could be computed by the compiler. With a low precomputation limit, it doesn't finish so at best the compiler can save the partial state and you end up with e.g. a program that contains 4K digits already computed and another 60K to go. Set it high enough and you get a program that only contains the 64K digits you want because the computation terminated in the compiler.
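The precomputation limit could look something like this, with a "fuel" counter threaded through the evaluation; all names here are hypothetical:

```python
# Sketch of a compile-time precomputation budget: the compiler tries to fold
# a self-contained computation, but gives up after N steps so a slow (or
# non-terminating) function can't hang the build.

class OutOfFuel(Exception):
    pass

def try_fold(func, fuel):
    """Run func with a step budget; its value, or None if fuel ran out."""
    budget = [fuel]
    def tick():
        budget[0] -= 1
        if budget[0] < 0:
            raise OutOfFuel
    try:
        return func(tick)
    except OutOfFuel:
        return None   # give up: leave the call in the emitted code

def fib32(tick):
    a, b = 0, 1
    for _ in range(32):
        tick()        # one unit of compile-time fuel per step
        a, b = b, a + b
    return a

print(try_fold(fib32, 10))     # None: budget too small, emit the call as-is
print(try_fold(fib32, 1000))   # 2178309: folded to a constant
```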

If the printf implementation prepares the "2178309\n" string by writing into a buffer, the compiler would need to understand this and evaluate the resulting state of the buffer at compile time.
It's just the collection of visible state changes induced by the program. You could just collect the library function calls it made, along with their associated data, and emit code to do only that, if the program depended on no input. Programs are deterministic. If they have no input they will always produce the same output (or not halt). If they halt with no input then capture their output and that is the embodiment of the process it performs.

Name: Anonymous 2012-08-02 0:59

As I mentioned above all recent CPUs automatically parallelise when there are no dependencies between instructions.
Yes, so it would be silly to explicitly use threads to evaluate expressions whose evaluations are only a few primitive opcodes long. But if the values are returned from procedures that take a while, and whose execution time varies with their inputs, spawning child threads to compute the values and then joining them before calculating the common dependent will allow the dependencies to be calculated in parallel.
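The fork/join structure being described, sketched in Python (note CPython's GIL prevents true parallelism for pure-Python arithmetic, so this shows only the synchronization shape; f and k are stand-ins for slow procedures):

```python
# Spawn workers for expensive independent dependencies, join before
# computing the expression that depends on both.

from concurrent.futures import ThreadPoolExecutor

def f():
    return 40   # stand-in for a slow procedure with input-dependent runtime

def k():
    return 2

with ThreadPoolExecutor() as pool:
    fut_i, fut_j = pool.submit(f), pool.submit(k)
    result = fut_i.result() + fut_j.result()   # join point

print(result)  # 42
```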

I've never really understood this obsession with graph colouring for RA. Perhaps it's too general for register allocation,
To answer this question, one must ask if it is possible to create a chunk of code that will yield a variable interference graph isomorphic to an arbitrary graph.

or register-memory/register-immediate/memory-immediate architectures like x86 can avoid this issue.
Only if the number of registers is two.

In the flow graph, think of a wire connecting two nodes for each variable's value that needs to be preserved. Then the "thickness" of the "bundle" between each node is how many units (bytes e.g. if each wire represents 8 bits) of data need to be preserved between them.
Larger data types can be represented as a series of independent variables, where each variable represents a fixed-size portion of the data type. If allocated in memory, an extra restriction that each portion must be at successive memory addresses can be added.

Assign the ones used in the innermost loops/most often to registers first.
Why? And what's a loop in this context? What code representation are you operating on?

The rest can go in memory, or swapped with another if there's a change in use (x86 conveniently provides an xchg instruction for this.)
Yes, if there is spillage, there will need to be a register reserved for shuttling values between the other registers and each spilled variable's permanent memory address.

This algorithm is quite different from any I've seen before (except perhaps linear scan, although that one is not particularly efficient) and has some traits of the optimise-first approach:
Without a proof of optimality, it's crap.

instead of attempting to allocate everything to registers and "spill" afterwards, determine the maximum storage needed first, then allocate to both memory and registers. For example you have 80 bits of "real" local variables (i.e. values that actually get used and need preservation, which may not correspond to the number of declarations in the source code) and 56 bits of registers.
Multiple variables can still be allocated in a single memory address, much how multiple variables can be allocated in a single register. An optimal graph coloring will give you the smallest number of slots needed to allocate your variables. Some of these slots will be in registers, and if there are too many slots, they will be in memory as well. But in any case, a minimal number of slots, and a minimal amount of memory, will be used.

A dumb graph-colouring algorithm is NOT going to see the 24 bits until it's too late. Then it goes back, "spills", tries again, etc.
The graph coloring algorithm doesn't know about registers or spills. It just gives you colors. How you interpret and use these colors is a separate issue.

The algorithm I described above is going to allocate the 24 bits somehow (pre-allocation via stack, push/pop, etc.), and fit everything in naturally.
naturally. I wish this word were stricken from anything technical. If it ain't optimal, it's crap.

That's not what I meant. It's a compile-time optimisation constraint to prevent compilation from taking too long, at the expense of possible lost pre-computation. E.g. you write a program that computes 64K digits of pi. This is self-contained and should terminate, so theoretically could be computed by the compiler. With a low precomputation limit, it doesn't finish so at best the compiler can save the partial state and you end up with e.g. a program that contains 4K digits already computed and another 60K to go. Set it high enough and you get a program that only contains the 64K digits you want because the computation terminated in the compiler.
my bad. That's a good point to be aware of if pushing those computations to compile time.

It's just the collection of visible state changes induced by the program. You could just collect the library function calls it made, along with their associated data, and emit code to do only that, if the program depended on no input. Programs are deterministic. If they have no input they will always produce the same output (or not halt). If they halt with no input then capture their output and that is the embodiment of the process it performs.
Yeah, constant folding on function calls can go a long way. It would be most powerful if it could adapt to almost any user function, as well as library functions.


Anyways, I'm going to try to get optimal register allocation in polynomial time. Wish me luck!

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2012-08-02 7:22

why? And what's a loop in this context? what code representation are you operating on?
Obviously because they will be read/written the most and thus deserve to be in registers, at least for the duration of that loop. It should also be obvious what a loop is, it's whatever corresponds to a loop in the source language. A loop within a loop is also similarly self-explanatory. The representation doesn't matter.

Yes, if there is spillage, there will need to be registers reserved for swapping values in between the swapping register and the variables permanent memory address.
This is where it seems you're still stuck with "the old way": a variable does not ever need to be at a "permanent memory address". There doesn't even need to be any correlation between how variables are defined and used in the source and the output, as long as they are semantically equivalent. Think of a variable as just a label for a "flow" of data between two or more operations. That flow might not exist if the operations that it connects aren't important to the final result of the function.

Without a proof of optimality, it's crap.
That's what academics think. The real world says otherwise.

Multiple variables can still be allocated in a single memory address, much how multiple variables can be allocated in a single register.
But multiple registers OR memory addresses may be allocated to one "variable" too.

The graph coloring algorithm doesn't know about registers or spills. It just gives you colors. How you interpret and use these colors in a separate issue.
That's exactly the point I'm making: it's too general.

Anyways, I'm going to try to get optimal register allocation in polynomial time. Wish me luck!
According to the RA article on Wikipedia this is possible with SSA, and there's nothing about anything being NP-complete on its page. Unfortunately they're both full of academic verbiage so it's difficult to determine what they're really talking about.

Another two examples of why more "targeted" algorithms can work better: instructions that require operands in particular registers, and those that prefer particular registers. With traditional graph colouring, you can force allocation of a certain register, but there's no way to specify a preference. An Asm programmer can think "there's a divide coming up: let's arrange things so that its input will be in the right registers when the time comes" and override that choice with something more important when necessary. Traditional RA algorithms can't do that.

tl;dr: Register allocation is one area where compilers could improve significantly, but due to a reluctance to challenge existing norms, has not. It's only recently that stuff like linear-scan (closer to what someone with common sense, but not exposure to previous biases, would come up with) and SSA started getting visibility. These basics should've been where we started.

To whoever started the insistence that register allocation is as general a problem as graph colouring and propagated it: FUCK YOU.

Name: Anonymous 2012-08-02 23:36

>>215

Obviously because they will be read/written the most and thus deserve to be in registers, at least for the duration of that loop. It should also be obvious what a loop is, it's whatever corresponds to a loop in the source language. A loop within a loop is also similarly self-explanatory. The representation doesn't matter.
It sounds like you are operating on the source language directly. In the version I was thinking of, the idea was to take abstract assembly that uses an unbounded number of variables, and then output the same assembly using as few variables as possible. When operating at the assembly level, loop constructs can become very complex and general. I'm not sure which approach would be better. It feels like it would be better to do it after the compiler has done its other optimizations, but I could be wrong.

This is where it seems you're still stuck with "the old way": a variable does not ever need to be at a "permanent memory address". There doesn't even need to be any correlation between how variables are defined and used in the source and the output, as long as they are semantically equivalent. Think of a variable as just a label for a "flow" of data between two or more operations. That flow might not exist if the operations that it connects aren't important to the final result of the function.
Well that's interesting. Is there an advantage to allocating a single variable in multiple locations? Each location is essentially a variable. So this could be modeled as variable assignment. I could see this getting very interesting if each storage location had a different cost of access. But assuming each register is equally cheap to read and write to, all you have to worry about is swapping values in and out of memory and the behavior of the special purpose registers, in which case, you might not need to go there.

That's what academics think. The real world says otherwise.
The real world is satisfied with crap.

But multiple registers OR memory addresses may be allocated to one "variable" too.
I'm not sure if there is an advantage in doing that. The algorithm would be creating a new variable that is equivalent to a previously defined one. If the goal is to create an equivalent program using the fewest variables, and hence needing the fewest storage locations, then creating new variables doesn't seem like a good idea. But if the goal is to increase performance, then maybe it's possible in some circumstances to use more storage locations. At that point you'd need to work with a model closely related to the processor. It wouldn't be enough to just assume that using the minimum number of storage locations will give you the most speed, although you'd think that would be true if there was no spillage as a result.

That's exactly the point I'm making: it's too general.
Whether or not graph coloring is too general for register allocation depends on the class of graphs created in the construction of the variable interference graph. If it can form an arbitrary graph, then optimal register allocation is just as hard as graph coloring. But if the class of graphs is actually more limited, then it can be possible to use a special-purpose graph coloring algorithm that takes advantage of properties present in the restricted class of graphs.

According to the RA article on Wikipedia this is possible with SSA, and there's nothing about anything being NP-complete on its page. Unfortunately they're both full of academic verbiage so it's difficult to determine what they're really talking about.
Good news! In the version I am working with, I am getting lines in the assembly where each variable must be holding its current value. If in a single line there are multiple variables that must hold their values, then these variables cannot be allocated in the same place. This yields a set of cliques. Now finding a maximum clique in a graph is NP-complete, yet the encoding we get for the variable interference graph is a list of cliques covering all of its edges. So if it is possible to optimally color a graph using knowledge of its cliques, that could be used to optimally color the variable interference graph. I got something working for one example, but once I apply it to general graphs I get lost. Maybe there's a structure to the variable interference graph that I'm unknowingly taking advantage of.

Another two examples of why more "targeted" algorithms can work better: instructions that require operands in particular registers, and those that prefer particular registers. With traditional graph colouring, you can force allocation of a certain register, but there's no way to specify a preference.
The problem becomes much more complex, but you can use a utility function to rank the optimality of a coloring. The cost is having to define and compute that function.

An Asm programmer can think "there's a divide coming up: let's arrange things so that its input will be in the right registers when the time comes" and override that choice with something more important when necessary. Traditional RA algorithms can't do that.
They can understand that the registers used by the divide instruction must be used to store the variables resulting from the division. This can be represented by allocating variables for the result of the division and forcing their colors to represent the registers they are put in by the instruction.

tl;dr: Register allocation is one area where compilers could improve significantly, but due to a reluctance to challenge existing norms, has not. It's only recently that stuff like linear-scan (closer to what someone with common sense, but not exposure to previous biases, would come up with) and SSA started getting visibility. These basics should've been where we started.
Indeed. But that makes it more exciting doesn't it? The late start makes it more likely that you might discover something new.

To whoever started the insistence that register allocation is as general a problem as graph colouring and propagated it: FUCK YOU.
so sorry! But really graph coloring is just solving the same problem in a more abstract setting. If you want to build more restrictions from the platform into the algorithm, just add more information to your definition of the graph to reflect the extra restrictions. These additions are usually small, and the advantage of using graphs is that all research and developments that have gone into graph algorithms will be easy for you to leverage. But there is the problem of graphs being such general objects. It could be that your domain is much less general than graph coloring. In which case it may be more useful to operate within the domain itself and never create a graph.

Name: Anonymous 2012-08-03 1:06

>>213-217
tl;dr

Name: Anonymous 2012-08-03 4:07

CHECK

Name: Anonymous 2012-08-03 4:07

OUT

Name: Anonymous 2012-08-03 4:07

THESE

Name: Anonymous 2012-08-03 4:08

TRIPS

Name: Anonymous 2012-08-03 4:08

FAGGOTS

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2012-08-03 4:52

In the version I was thinking of, it was to take abstract assembly that uses an unbounded amount of variables, and then output the same assembly but using as few variables as possible.
That's basically SSA. Determining loop nesting level in Asm isn't difficult either: count the maximum number of complete cycles in the flow control graph enclosing a use of a given "variable".

Is there an advantage to allocating a single variable in multiple locations?
As I said in what you quoted, at this phase of compilation you should forget about what variables were present in the source code, and think of them as only labels for a flow of data between two computational nodes. Registers and memory serve as temporary storage locations for this flow, and thus it may occupy different locations at different points in the code. The advantage is additional freedom in code generation.
But assuming each register is equally cheap to read and write to
They're not. For example, the accumulator is preferred for many instructions. (e)DX must hold the remainder of a division. If a division is coming up but all the registers are already used for values that will be used afterwards, it makes sense to move whatever DX is storing into memory (possibly a push on the stack) and pull it back into a (possibly different) register sometime later when the compiler decides that it'll be used enough that it should get a register.

The real world is satisfied with crap.
Whatever you call it, that's the way things are.

I lost you at the graph colouring stuff, so I can't say much there.

The late start makes it more likely that you might discover something new.
It's only new to those who have buried their heads in the academic "literature" (propaganda?) all along. Try asking any Asm programmers if they colour graphs to allocate registers when they write code. I sure don't.

But really graph coloring is just solving the same problem in a more abstract setting.
Too abstract. If I were forced to choose an NP-complete problem to represent register allocation, it would be bin packing. At least that intuitively feels like the process of allocating items (values) to different bins (registers).

But there is the problem of graphs being such general objects. It could be that your domain is much less general than graph coloring.
I have a very strong feeling that it is: If all programs can be transformed into SSA form and doing so isn't NP-complete, and by doing so you can allocate registers in polynomial time, whereas doing it for the same program not in SSA is supposed to be NP-complete, then isn't that like saying P=NP? Following this to its conclusion, take any NP-complete problem, reduce it to graph colouring, reduce graph colouring to RA, derive a program from it, convert the program to SSA, RA that, and you've solved the NP-complete problem in polynomial time.

The other big drawback of thinking so abstractly is that only those who know much about graph theory (unlike me) can talk about RA like that. It creates a rather closed-minded environment and propagates the "thou shalt use graph colouring" myth. In contrast, the way I describe and think about it, it wouldn't be difficult at all for even a 10-year-old to understand and relate to the underlying problem. Getting rid of all that academic cruft and theory is a great way to get others thinking about these sorts of problems, and as a result drive innovation and progress through greater proliferation of practical, readily applicable ideas.

"Farmers don't want to learn about spherical cows. They just want the milk."

Name: Anonymous 2012-08-04 3:43

>>223
That's basically SSA. Determining loop nesting level in Asm isn't difficult either: count the maximum number of complete cycles in the flow control graph enclosing a use of a given "variable".
Cool stuff.

As I said in what you quoted, at this phase of compilation you should forget about what variables were present in the source code, and think of them as only labels for a flow of data between two computational nodes. Registers and memory serve as temporary storage locations for this flow, and thus it may occupy different locations at different points in the code. The advantage is additional freedom in code generation.
A limitation with the typical graph coloring approach is that assignments to registers are static and never change during the variable's life time. I think it could be developed to account for this though. At each time (at each line in the assembly) the live variables must be allocated in some known assignment to registers and memory. But this assignment is subject to change at any point in the program. So rather than assigning a variable v to a register r, a variable v together with a line in the assembly l will be assigned to a register. So rather than calculating a function mapping variable -> storage, it would be calculating a function (variable, time) -> storage. Although then you'd have to account for the extra instructions required for moving the values around. It's not quite the same as the register allocation problem I'm used to, which was to just relabel the registers to use as few as possible.

They're not. For example, the accumulator is preferred for many instructions. (e)DX must hold the remainder of a division. If a division is coming up but all the registers are already used for values that will be used afterwards, it makes sense to move whatever DX is storing into memory (possibly a push on the stack) and pull it back into a (possibly different) register sometime later when the compiler decides that it'll be used enough that it should get a register.
The limitation with DX can be expressed in graph coloring by assigning the variable representing the output remainder from the division instruction the color representing the DX register, and then forcing the colorer to deal with the predefined constraint. I'm not familiar enough with x86 to respond specifically about the accumulator, but if there are registers with faster access for certain instructions, variables used by those instructions should get priority for allocation in those registers. To calculate this, one can consider the instructions applied to a variable during the variable's lifetime, and store the instructions in an array. Depending on the assignment to register for the variable, each instruction will have a certain cost. The algorithm could then try to put the variable in a register that minimizes the sum cost. Now this is where the ability to move variables between registers comes in handy. If the first ten instructions would perform best if X was in R1, and then the next twenty performed best if X was in R8, it could be worth the extra mov instruction to move X from R1 to R8. This gets really complicated, especially when there are other live variables that must also share the registers. But it can be stated as an optimization problem, and then solved in a brute force manner or guessed well with heuristics.

Whatever you call it, that's the way things are.
No, I understand. People are interested in what is practical. It isn't practical to wait 2 months for your program to compile if the output program is 4% faster than a build that took 4 minutes to compile.

I lost you at the graph colouring stuff, so I can't say much there.
Sorry. The variable interference graph is a graph where the nodes represent variables and there is an undirected edge connecting two variables if they are live at the same moment at some point in the program. A clique is a set of vertices that are all mutually adjacent; in the context of register allocation, it is a set of variables that cannot share a register in any allocation. In the input to the register allocation algorithm I was working with, the algorithm would scan the assembly and calculate which variables were live for each line in the assembly. So at each line, there is a set of live variables. All variables that are live at the same line cannot share any registers in their allocation (assuming the allocation is static). So at each line of the assembly, I get a clique in the variable interference graph. Normally one gets an encoding of a graph in the form of an adjacency list, which is just a list of all the nodes, where each node contains a list of its adjacent nodes. Since in this register allocation problem you get a list of cliques instead, graph coloring might be easier.
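A toy version of that per-line liveness scan, on a made-up three-address code (the backward kill-then-gen scan is the standard way to compute live sets; each printed set is one clique of the interference graph):

```python
# Each instruction: (destination, [variables it uses]).
code = [
    ("a", ["x"]),        # a = ...x...
    ("b", ["a"]),        # b = ...a...
    ("c", ["a", "b"]),   # c = a op b   <- a and b live together here
    ("r", ["c"]),        # r = ...c...
]

def live_sets(code):
    """Backward scan: kill the destination, then add the uses."""
    live, out = set(), []
    for dest, uses in reversed(code):
        live.discard(dest)
        live |= set(uses)
        out.append(frozenset(live))
    return list(reversed(out))

for s in live_sets(code):
    print(sorted(s))   # each line's live variables form a clique
```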

It's only new to those who have buried their heads in the academic "literature" (propaganda?) all along. Try asking any Asm programmers if they colour graphs to allocate registers when they write code. I sure don't.
Graph coloring is sort of similar to sudoku, cross word puzzles, etc. It's an abstract representation of some of the basic constraint satisfaction problems people solve themselves every day. But people usually use creative methods to solve the problems. Computers are more brute force, although heuristics can be used as well.

Too abstract. If I were forced to choose an NP-complete problem to represent register allocation, it would be bin packing. At least that intuitively feels like the process of allocating items (values) to different bins (registers).
Yeah, maybe. Graph coloring is useful for a lot of constraint satisfaction problems. The register allocation problem is to allocate variables to registers with the constraint that variables live in the same moment are not allocated to the same register. So it translates well for that.

I have a very strong feeling that it is: If all programs can be transformed into SSA form and doing so isn't NP-complete, and by doing so you can allocate registers in polynomial time, whereas doing it for the same program not in SSA is supposed to be NP-complete, then isn't that like saying P=NP? Following this to its conclusion, take any NP-complete problem, reduce it to graph colouring, reduce graph colouring to RA, derive a program from it, convert the program to SSA, RA that, and you've solved the NP-complete problem in polynomial time.
I agree. And that's a nice proof right there. I was very confused when I read about that and I was wondering why someone hadn't written something that got some more consistency on this.

The other big drawback of thinking so abstractly is that only those who know much about graph theory (unlike me) can talk about RA like that. It creates a rather closed-minded environment and propagates the "thou shalt use graph colouring" myth. In contrast, the way I describe and think about it, it wouldn't be difficult at all for even a 10-year-old to understand and relate to the underlying problem. Getting rid of all that academic cruft and theory is a great way to get others thinking about these sorts of problems, and as a result drive innovation and progress through greater proliferation of practical, readily applicable ideas.
It's true. Abstraction in general is this way. The further you get from the original problem, the less intuitive it will appear. But there is the possibility that there are other problems out there with very different details, environments, and jargon, but are actually the same problem. By abstracting away the details and preserving only the concept of the problem, you can unify all these applications. And then development and solutions for the abstraction will apply to all of its applications. This can be useful sometimes, but it always creates a barrier to understanding. Especially when the abstract terminology begins to stack on top of itself and no one other than a researcher would be interested in learning it all. The abstraction can also be lossy. There might be details in the original problem that are lost in the abstraction. In this case, solutions to the abstraction will need to be extended to account for the new details before they can be applied.

"Farmers don't want to learn about spherical cows. They just want the milk."
Indeed.

Name: Damn you 2012-08-04 5:52

>.>
no Ada, Lua, AWK, or KSH93.

GTFO OP. And find your needs.

Personal: C, Brainfuck, KSH93, AWK, and RISC's ASM

Name: Anonymous 2012-08-04 7:51

OCaml actually has a second implementation: F#. Considering it's Microsoft's, it really isn't that much better than the INRIA one but it does have some pros for real world use. Like Unicode support.

Otherwise, polymorphic, strongly typed, higher order, functional, right-to-left call-by-value is the best "modern GP language" has to offer now.

Name: Anonymous 2012-08-04 7:52

>>226
F# is not OCaml.

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2012-08-04 8:54

At each time (at each line in the assembly) the live variables must be allocated in some known assignment to registers and memory. But this assignment is subject to change at any point in the program.
From what I've read, that's part of the principle behind SSA: to split up the idea of a "variable" into multiple ones, and each is only used once, so they correspond exactly to labelling the flow of data between operations. However that still clings to the notion of a "variable", which should be considered source-code-only. When generating code, it's better to think of how data flows between different operations in order to finally arrive at one or more final results.

Although then you'd have to account for the extra instructions required for moving the values around. It's not quite the same as the register allocation problem I'm used to, which was to just relabel the registers to use as few as possible.
A move is not necessary in many cases when the operation does not increase the total storage needed, especially with more flexible architectures. This is better illustrated with an example:


reduce total storage; one less slot needed:
       (C = A + B, A and B not used afterwards)
A       B          AX      BX
|       |          |       |               |       |
+---*---+     ->   *-------+         or    +-------*
    |              |    <free>          <free>     |
    C              AX
             (generates ADD AX, BX  (generates ADD BX, AX
              and frees BX)           and frees AX)

unchanged total storage:
           (C = A + B, B is unused afterwards)
A       B          AX      BX
|       |          |       |
+---*---+     ->   +-------*
|   |              |       |
A   C              AX      BX
             (generates ADD BX, AX
         and AX continues, BX is new flow)

increase total storage:
           (C = A + B, both A and B are used afterwards)
A       B          AX      BX
|       |          |       |
+---*---+     ->   +   *---+   (MOV CX, BX)
|   |   |          |   |   |
A   C   B          AX  CX  BX
                   |   |   |
                   +---*   |   (ADD CX, AX)
                   |   |   |

Because of the flexibility of x86, CX above could be replaced with a memory location (e.g. [BP+4]) if the compiler determined that CX, like the other registers, was still in use and should stay put; or, if circumstances favour using CX for the result of the addition, its contents could be moved into memory beforehand.

Using the fewest registers is also not necessarily the best policy; processors can parallelise better when multiple registers are used, and the instructions are not dependent. For example, although the minimum number of registers for memory-to-memory copies is 1, when there are several of these to be done, you'll see e.g.
mov eax, [esp+8]
mov ecx, [esp+16]
mov edx, [esp+64]
mov ebx, [esp+96]
mov [esi+4], eax
mov [esi+8], ecx
mov [esi+12], edx
mov [esi+16], ebx

instead of repeatedly using eax. Thus the algorithm should, when the minimum number of registers used is below the amount available, shuffle the order of operations (while maintaining their dependency ordering) to increase it. The optimal solution isn't the minimum number of registers, but the exact amount available.
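A sketch of that policy (Python; the helper and its names are hypothetical): rotate through the available registers for a batch of independent memory-to-memory copies, and emit all the loads before all the stores so nothing serialises on a single register:

```python
from itertools import cycle

def schedule_copies(copies, regs):
    """copies: list of (src, dst) memory operands; regs: available registers.
    Emit all loads first, then all stores, cycling through the registers
    so consecutive instructions do not depend on each other."""
    assigned = [(reg, src, dst) for (src, dst), reg in zip(copies, cycle(regs))]
    loads = [f"mov {reg}, {src}" for reg, src, _ in assigned]
    stores = [f"mov {dst}, {reg}" for reg, _, dst in assigned]
    return loads + stores

code = schedule_copies(
    [("[esp+8]", "[esi+4]"), ("[esp+16]", "[esi+8]"),
     ("[esp+64]", "[esi+12]"), ("[esp+96]", "[esi+16]")],
    ["eax", "ecx", "edx", "ebx"])
```

This reproduces the eight-instruction sequence above; with only eax in `regs` it would emit the same copies fully serialised, one load-store pair at a time depending on the previous.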

Graph coloring is sort of similar to sudoku, cross word puzzles, etc. It's an abstract representation of some of the basic constraint satisfaction problems people solve themselves every day. But people usually use creative methods to solve the problems. Computers are more brute force, although heuristics can be used as well.
Thus we should examine those "creative methods" and quantify the exact steps and decisions they take, in order to implement an algorithm that can behave like a programmer thinks, with the added advantage of computer speed.

I agree. And that's a nice proof right there. I was very confused when I read about that and I was wondering why someone hadn't written something that got some more consistency on this.
Because of the "accepted fact" that graph colouring is the way to do register allocation, and that it's proved to be NP-complete, few academics have challenged those norms and entertained the possibility that RA might not be. It's necessary to "think outside the box" here.

There might be details in the original problem that are lost in the abstraction.
By using abstraction to transform RA to graph colouring you have transformed a very specific problem into a very general one, and increased its complexity. Because I am quite convinced that RA is not NP-complete I do not believe in using this abstraction.

In this case, solutions to the abstraction will need to be extended to account for the new details before they can be applied.
Why not forget about it completely and think of the solution to the specific problem from the ground up, using first principles? It seems people are more interested in building abstractions on top of abstractions than solving the actual problem, and I do not think that is a good use of resources. By indoctrinating others in this way, it becomes harder for them to think freely and innovate. The fact that linear-scan and the interest in SSA stuff is so recent clearly illustrates this.

In summary, it's an issue of perspective. Asm programmers do not think of variables much, but more about results of computation being held in registers/memory, and probably wouldn't care about assigning names to them. It's not "eax is currently holding j", it's "eax is holding the result of ...", and they're free to move that data around without having to think about what it's called. The flow model I talk about is a lot closer to this, and SSA is a step in the right direction. It's a little sad to see all this research being spent on these traditional directions, and not gaining much, when we could've had much better compilers back then, if they had chosen to think about these things "bottom-up".

Name: Anonymous 2012-08-04 12:30

>>223
They're not. For example, the accumulator is preferred for many instructions.
There hasn't been a real "accumulator" in Intel CPUs since the Pentium Pro. x86's non-uniform accumulator architecture was obsolete before it was designed (compare PDP-11, 68K, VAX and every RISC). It helped make the 8086 cheaper than 68K back in 1978 when the accumulator was attached to the ALU but now it just makes things harder for programmers and hardware designers. Learn an assembly language other than x86, and wonder why anyone would willingly use Intel crap (especially the ugly 32-bit and 64-bit hacks) if IBM didn't choose it for the original PC (for its cheap price, not for technical merit) and the Wintel monopoly didn't shove it down OEMs' throats.

Name: Anonymous 2012-08-04 12:32

>>228
It's a little sad to see all this research being spent on these traditional directions, and not gaining much, when we could've had much better compilers back then, if they had chosen to think about these things "bottom-up".
Or if IBM had chosen a CPU with a uniform register set for their PC.

Name: Anonymous 2012-08-04 13:26

>>227
Yeah it is bruh

Name: Anonymous 2012-08-04 13:50

>>231
No, it's not, pal.

Name: Anonymous 2012-08-04 14:35

>>232
Yeah it is, ``faggot''

Name: Anonymous 2012-08-04 23:01

From what I've read, that's part of the principle behind SSA: to split up the idea of a "variable" into multiple ones, and each is only used once, so they correspond exactly to labelling the flow of data between operations. However that still clings to the notion of a "variable", which should be considered source-code-only. When generating code, it's better to think of how data flows between different operations in order to finally arrive at one or more final results.
The variables can always be closer to the assembly level than the source. For example, quotient and remainder result variables can be introduced for every divide instruction.

In response to the in-place add example
I think the best way to take advantage of this would be to generate code in an abstract assembly where addition always created a new variable to follow SSA style. Then apply the optimal register allocation algorithm, which will collapse many of the variables together. Then expand the add instructions as:


add X, Y -> C becomes:
MOV C, X
ADD C, Y

add X, Y -> Y becomes:
ADD Y, X

add X, Y -> X becomes:
ADD X, Y
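That lowering can be sketched as a small function (Python, hypothetical names), picking the in-place form whenever the allocator collapsed the result onto one of the inputs:

```python
def expand_add(x, y, dest):
    """Lower an abstract 'add x, y -> dest' to two-address x86-style code."""
    if dest == x:
        return [f"ADD {x}, {y}"]   # result collapsed onto X
    if dest == y:
        return [f"ADD {y}, {x}"]   # result collapsed onto Y
    return [f"MOV {dest}, {x}", f"ADD {dest}, {y}"]  # fresh destination needs a copy
```

The "can be both" case (X and Y collapsed to the same register) falls into the first branch and still emits a single ADD.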


Using the fewest registers is also not necessarily the best policy; processors can parallelise better when multiple registers are used, and the instructions are not dependent. For example, although the minimum number of registers for memory-to-memory copies is 1, when there are several of these to be done, you'll see e.g.

mov eax, [esp+8]
mov ecx, [esp+16]
mov edx, [esp+64]
mov ebx, [esp+96]
mov [esi+4], eax
mov [esi+8], ecx
mov [esi+12], edx
mov [esi+16], ebx


>instead of repeatedly using eax. Thus the algorithm should, when the minimum number of registers used is below the amount available, shuffle the order of operations (while maintaining their dependency ordering) to increase it. The optimal solution isn't the minimum number of registers, but the exact amount available.
That's a good point. The compiler will need to take this parallel execution into account as well. If you had an arbitrarily large amount of registers, you would never store two variables in the same register, to avoid locking up the parallelism. Sequential operations that use the same register will cost more than sequential operations that do not depend on each other. This cost can be incorporated by taking into account the previous instructions when calculating the cost of the current instruction. If the current instruction depends on the execution of close previous instructions, it should cost more.
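One way to sketch that cost model (Python; the window size and the representation are made up for illustration): represent each instruction by the register sets it reads and writes, and charge extra whenever it touches a register written within the last few instructions:

```python
def schedule_cost(instrs, window=3):
    """instrs: list of (reads, writes) register-name sets.
    Charges a base cost of 1 per instruction, plus 1 for each register it
    reads or writes that was written within the last `window` instructions,
    approximating the stall from a dependency on a close previous result."""
    cost = 0
    for i, (reads, writes) in enumerate(instrs):
        recent = set()
        for _, w in instrs[max(0, i - window):i]:
            recent |= w
        cost += 1 + len((reads | writes) & recent)
    return cost
```

A chain that keeps reusing one register then costs more than the same number of independent operations, which is exactly the pressure pushing the allocator away from the minimum register count.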

Thus we should examine those "creative methods" and quantify the exact steps and decisions they take, in order to implement an algorithm that can behave like a programmer thinks, with the added advantage of computer speed.
The problem is that people rarely have a proof of optimality for their solution if the problem is very large. People instead invest effort into the solution until it is good enough. Once it is no longer worth continued effort, they stop and use what they have.

Because of the "accepted fact" that graph colouring is the way to do register allocation, and that it's proved to be NP-complete, few academics have challenged those norms and entertained the possibility that RA might not be. It's necessary to "think outside the box" here.
If RA is not NP-complete, then one of two things is probably true: the graph to color has a restrictive structure and can be optimally colored by a polynomial-time algorithm, or the graph is arbitrary but the assembly code encodes accessible information about the structure of the graph that can be used to get a coloring in polynomial time, and the coloring becomes difficult only if you discard this information.
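The first case is essentially what linear-scan banks on: if every variable is live over a single contiguous range, the interference graph is an interval graph, and a greedy left-to-right sweep colors it optimally. A minimal sketch (hypothetical interval representation; a real allocator would spill instead of raising):

```python
def linear_scan(intervals, nregs):
    """intervals: {var: (start, end)} live ranges. Returns {var: register index}.
    Raises if more than nregs intervals overlap (a real allocator would spill)."""
    active = []                  # (end, reg) for currently live intervals
    free = list(range(nregs))
    alloc = {}
    for var, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        for ended in [a for a in active if a[0] < start]:  # expire dead intervals
            active.remove(ended)
            free.append(ended[1])
        if not free:
            raise RuntimeError("spill needed")
        reg = free.pop(0)
        alloc[var] = reg
        active.append((end, reg))
    return alloc
```

On a straight-line block where liveness really is one interval per variable, this matches the optimal coloring; branching control flow is what breaks the interval structure and reintroduces the hard cases.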

By using abstraction to transform RA to graph colouring you have transformed a very specific problem into a very general one, and increased its complexity. Because I am quite convinced that RA is not NP-complete I do not believe in using this abstraction.
Translating RA to graph coloring directly will be lossy. Kind of like taking a cryptographic hash of a string and then trying to recover the original string. But the information in RA that makes it easier than general graph coloring could also appear in other contexts. So it is worth abstracting; it's just that the conventional abstraction is misguided.

Why not forget about it completely and think of the solution to the specific problem from the ground up, using first principles? It seems people are more interested in building abstractions on top of abstractions than solving the actual problem, and I do not think that is a good use of resources. By indoctrinating others in this way, it becomes harder for them to think freely and innovate. The fact that linear-scan and the interest in SSA stuff is so recent clearly illustrates this.
The theoretical developments indirectly pay off to a lot of things. Just think, where did SSA come from?

In summary, it's an issue of perspective. Asm programmers do not think of variables much, but more about results of computation being held in registers/memory, and probably wouldn't care about assigning names to them. It's not "eax is currently holding j", it's "eax is holding the result of ...", and they're free to move that data around without having to think about what it's called. The flow model I talk about is a lot closer to this, and SSA is a step in the right direction. It's a little sad to see all this research being spent on these traditional directions, and not gaining much, when we could've had much better compilers back then, if they had chosen to think about these things "bottom-up".
A lot of things that appear to be very different can turn out to be equivalent. Graph coloring appears to have more research behind it because it is a more general problem with a broader range of applications, but even then I wouldn't consider it that popular, and register allocation is a step further into the specific. No one really cares about compilers. Just look at databases, networking, and graphics. Then look at compilers.

Name: Anonymous 2012-08-05 7:07

>>233
No it's not, ``dipshit''.

Name: Anonymous 2012-08-05 20:44

>>234
Or they could use a 3-address CPU instead of the obsolete junk Intel stole from Datapoint.

Name: Anonymous 2012-08-05 20:59

>>236

It would probably be more productive to design a better architecture and use that, or to use a better-designed architecture already in existence, because the complications in x86's design make for a more difficult code generation problem. If you are only interested in productivity, it is better if the problem never exists in the first place. But it is fine if all you want to do is solve problems, even when the best solution is not to have to solve this one.

Name: Anonymous 2012-08-05 21:15

>>237
It's already been solved. In fact, it's been solved in architectures created even before Intel licensed the 4004 from Busicom. x86 is the Malbolge of CPU architectures. It manages to be painful for both programmers and chip designers.

Name: Anonymous 2012-08-05 21:54

>>238
,,Sent from my Yeeloong-8133''?

Name: Anonymous 2012-08-05 23:26

le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le le

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2012-08-06 4:53

>>229,230
The accumulator is still preferred: while it may not be directly faster, many instructions are at least 1 byte shorter when operating on it, which translates into better code density and cache utilisation. I've worked with more than half a dozen different architectures, and x86 is one of my favourites (among others such as the Z80 and 8051); it's quite "balanced" and has many opportunities for optimisation. I have less nice things to say about the 64-bit "extensions".

The theoretical developments indirectly pay off to a lot of things. Just think, where did SSA come from?
SSA is not new; it was already around at the time of the Dragon Book. And Asm programmers had been implicitly using that sort of thinking to allocate registers since before that.

No one really cares about compilers. Just look at databases, networking, and graphics. Then look at compilers.
That is unfortunate, because unless written in Asm, all software depends on them and a highly optimising compiler has the potential to make almost all software more efficient. Then again, the "optimise later" or "forget optimisation" bullshit that CS students are brainwashed with doesn't help either.

>>236,237
You can't gain something without losing something else. What you gain in flexibility, you lose in instruction size and the delays introduced by additional datapath circuitry. Going back to the diagram above, how many of the operations in a program, on average, need to preserve both of the inputs? Or put another way, what is the average number of "outputs" needed for all these operations? This corresponds directly to the best architecture choice. If all 3 of them need to be preserved most of the time, then a 3-address architecture makes sense. If only 1 of them needs to be preserved, then 1-address architectures will be most suitable. From the general proliferation of 2-address architectures, it's probably the case that this number is around 2, and maybe slightly less. Either way it's a good middle-ground choice.

I will agree that x86 code generation is not as simple as it is with other architectures. But at the same time, it also presents more opportunities for optimisation so that the hardware can also run at its full capabilities. Consider the other extreme, a 3-address "pure RISC" with so many registers (let's say... 256) that "register allocation" becomes trivial and having to "spill" means there is something very unusual about your program. Instructions are going to have to be at least 24 bits just to hold the register specifications alone; round that up to 32, so an instruction is at least 32 bits. Forgetting for the moment how you can have that many registers at such clock speeds, consider running a typical application program on such a processor, where there are never more than ~10 live variables and the majority of the operations only need 2-address type instructions. Now each time it fetches an instruction it's wasting 8+ bits on a redundant register field, and 95%+ of the registers are going to be unused. Clearly this is a huge waste of bus bandwidth and die area.

The advantage that CISCs like x86 have is exemplified by the systems of today, where memory is much slower than the core and there are multiple cores vying for its use. Here, you want the CPU to be able to run quickly inside the core while sipping instructions from memory (including caches) at a slower rate. This is where high code density is crucial. Expand instructions into RISC-like micro-ops inside, taking less bandwidth than pulling them over the memory bus.

Name: Anonymous 2012-08-06 5:04

>>241
So the ideal architecture is MIPS64 with destination = first operand?

Name: Anonymous 2012-08-06 6:19

If you want something done, you better do it yourself. Just start making and using your own language.

Name: Anonymous 2012-08-06 6:36

This thread made me discover Go and I'm quite pleased. I realize I'm not much of a fan of promoting Google or its products, but Go seems quite nice.

Name: Anonymous 2012-08-06 6:44

>>241
mrinstr ::= mxxxxxxx | m1111111 instr
instr ::= xxxxxxxx

add r1, r2, r3 ; r1 = r2+r3 => 10000000 r1 r2 r3
add r1, r2     ; r1 = r1+r2 => 00000000 r1 r2
add r1, r1, r2 ; can be both


There, saved you're space.

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2012-08-07 5:14

>>242
No, more like IA32 with a 64-bit extension closer to how x86 was extended to 32 bits. MIPS is optimised for silicon area and not much else. Things like branch delay slots(!) and fixed-width instructions make the ISA very rigid and there isn't a whole lot the software can do. In other words the ISA is so tightly coupled to the original hardware implementation that there is little opportunity for optimisation. In contrast, with x86 where many instructions perform multiple basic operations, parallelism almost suggests itself. All these improvements can happen inside the core and immediately benefit existing applications, without changing anything else. If anyone here remembers the days of Socket 370, that's exactly what happened.

The RISCs that have huge instructions that only do 1 thing will require more memory bandwidth to do superscalar/OOE, because by necessity they will need to fetch instructions faster. A 32-bit wide core-memory bus will either need to be widened to 64-bit or have its speed doubled. Not needed with x86: 32 bits is already enough to hold up to 4 instructions and they just need to change the core itself to execute them in parallel.

>>245
Cutting off half the opcode space to do that is hardly desirable, not to mention 256 registers is still way too much for the majority of use cases.

x86 has several 3-address instructions (some of the multiplies and divides), and they were likely added because an analysis of existing code showed that those specific instructions were much more likely to benefit with them present.

Name: Anonymous 2012-08-07 5:20

>>246
What exactly is wrong with AMD64? The retarded space-eating REX prefixes?

Name: Anonymous 2012-08-07 9:05

>>246
We knew you're shit at programming, and now we know you're shit at computer architecture. Are there more things you're shit at that we should know?

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2012-08-08 6:51

>>247
That's part of it. They got rid of a whole row of instructions when one of the existing unused ones could be repurposed for a size override/high-bank register select, like they did when going from 16 to 32 bits. It was overall too disruptive of a change.

>>248
I see you've run out of arguments.

Name: Anonymous 2012-08-08 13:02

>>246
>The RISCs that have huge instructions that only do 1 thing will require more memory bandwidth to do superscalar/OOE, because by necessity they will need to fetch instructions faster. A 32-bit wide core-memory bus will either need to be widened to 64-bit or have its speed doubled. Not needed with x86: 32 bits is already enough to hold up to 4 instructions and they just need to change the core itself to execute them in parallel.
Are you seriously brain-dead enough to rehash arguments that have been dead for 20 years out of your ignorance? PROTIP, you moronic piece of shit, modern x86 can only function by breaking down the complex instructions to a reduced set for the hardware first.

It's really sad that in the time you spent showing the world how pathetically uneducated you were, you could have maybe read and learned something. Well, maybe a word or two of it with your displayed intellectual capacity so far.

Name: Anonymous 2012-08-08 13:15

>>250
Learn about the Von Neumann bottleneck and Shiichan quotes.

Name: Anonymous 2012-08-08 13:23

>>246
Cutting off half the opcode space to do that is hardly desirable,
I don't hear you bitch about the ?ax-specialized opcodes in x86, though. Or multiple encoding for the same instruction. Or the 66h/67h prefixes.
not to mention 256 registers is still way too much for the majority of use cases.
Make them 32, 5 bits for each register; add r1 r2 r3 is now 23 bits, add r1 r2 is now 18 bits. They're, saved you're space again.

Name: Anonymous 2012-08-08 13:27

C:\TEXT\KOPIPE.TXT

God says...

For years, people have been predicting--hoping for--the demise of assembly
language, claiming that the world is ready to move on to less primitive approaches to
programming...and for years, the best programs around have been written in assembly language.
Why is this? Simply because assembly language is hard to work with, but--properly used--
produces programs of unparalleled performance. Mediocre programmers have a terrible time
working with assembly language; on the other hand, assembly language is, without fail, the
language that PC gurus use when they need the best possible code.

Name: VIPPER 2012-08-08 15:19

>>253
11/10

Seriously, ASM fucking sucks(x86 atleast) and using ASM in this age for anything really serious is fucking stupid.
Its not the point that ASM is or was powerfull, ASM was and is shit, its just that programmers of the old were fucking demi-gods that yet many of us could imagine to be real.

Some DOS games back in the time were way more innovative and creative then most games of today. Plus they had much lower budgets, less computing power, less help both personel wise and computer assistance wise.
And lots of them were writen in ASM and/or some obscure shitty language.

Name: VIPPER 2012-08-08 15:20

>>254
*Could not imagine to be real.

Sorry

Name: Anonymous 2012-08-08 15:47

>>254
One of my favorite games (Tyrian) was written in Pascal

Name: Anonymous 2012-08-08 18:52

>>246
http://bitsavers.org/pdf/datapoint/2200/2200_Programmers_Man_Aug71.pdf
Why is the original description of the embedded terminal architecture that Intel licensed from Datapoint to make the 8008/8080 so much more readable than Intel's description? And why are we still using extensions to extensions to extensions to an embedded CPU architecture designed to be what could cheaply fit on one chip in the 70's?

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2012-08-09 2:42

>>250
modern x86 can only function by breaking down the complex instructions to a reduced set for the hardware first.
And that happens in the core, where clock speeds can be pushed much higher. Here's an analogy that makes the situation clearer: suppose you can choose two types of slaves (this is 4chan after all), one can do everything really quickly but also only knows really simple commands, and another that isn't that quick but you can tell them to do something in one word what would take a few dozen with the other, and they'll take their time doing it. You can only talk to one slave at a time. Which type will you choose a few dozen of if you're the master and want to get them to do something like build a house?

With the first type, all your time is going to be spent telling them one-by-one what to do in excruciating detail, while all the others are just waiting for their next little command since they finished their last one almost immediately after you told them. Not a good use of resources and overall throughput is going to suck. With the second, you'll spend much less time telling them what to do, and they'll spend much more time actually doing something rather than waiting. You might even have enough time to go spend some time with your friends. You are the memory. The slaves are cores of a multicore processor. Your friends are peripheral devices. The first type is RISC. The second is CISC. Get it now?

I'm not keen on turning this into personal insults but it appears that your definition of "educated" is more akin to "brainwashed with the beliefs of the academics", and it really shows the lack of actual thinking going on in your head when you can't even seem to understand my argument enough to formulate a defense. I hope the analogy helps.

>>252
The accumulator opcodes are designed for use in calculations involving a long chain of accumulated results, which is relatively frequent. Function return values go there too. It's quite clear that a long-lived variable or set of variables should almost always stay in (e)ax. Multiple encodings are always an issue, they're hard to get rid of but relatively speaking x86 has a lot less of them (and empty space) than some other architectures.

No idea what you're getting at about the Osize/Asize prefixes, but that's much better than x86-64's replacing an entire row of register increment/decrement instructions (one of the most common operations in loops) with 64-bit-only prefixes. I'm almost willing to bet Intel was NOT pleased with AMD doing that.

>>257
It's probably just you. I have the 8008 manual and it's not bad. Maybe you like the octal notation for opcodes.

We're still using x86 because it's what works the best for general purpose CPUs at the moment. The i432 failed to unseat it, Itanium was pretty dismal, ARM and MIPS are filling the low end of the spectrum where die size/transistor count and power consumption matter more than anything else, POWER, SPARC and the "big RISCs" are in the massively parallel TOP500 supercomputers where they belong (and in applications that can actually make use of all those registers, with specialised hardware/software that ensures all the cores don't fight each other for memory), which leaves x86 to fill in the huge general-purpose/value-performance market.

Name: Anonymous 2012-08-09 4:57

try Ada

Name: Anonymous 2012-08-09 5:58

>>258
>"educated" is more akin to "brainwashed with the beliefs of the academics"
Hahahahah oh wow. Sure, reality is brainwashing, and all the processors operate otherwise because all the engineers Intel and AMD employ are unaware of the scientific breakthroughs a special brain dead piece of shit made in his basement. Clearly no one was aware that one CISC operation could do more in a slower run on a system that couldn't scale, and no one analysed that it ended up being overall much much slower than running multiple commands on a fast architecture, they must all have put CISC->RISC subsystems on modern x86 processors out of sheer ignorance! If only they had employed a clinically retarded turd nugget who thought understanding one CISC operation doing more was an expression of genius and relevant, one that couldn't imagine these concepts are really simple to non-retarded folk and his intellectual humiliation came from not understanding the facts! All these people brainwashed by "logic" and "mathematics" will never have your vision, will they, retard? Oh, and these aren't insults, I am making factual statements as I always do.

Enjoy your cold hard ownage.

Name: Anonymous 2012-08-09 6:11

>>260
Firstly, learn about the processor-main memory bottleneck. Your notions of algorithmic complexity and such are useless at such a small level. Secondly, go back to the imageboards, ``please''!

Name: Anonymous 2012-08-09 6:39

>>260
Well, your attack is not credible since you don't provide any argument yourself; you only echo other people's beliefs like a sheep.

Name: Anonymous 2012-08-09 7:14

>>261
Irrelevant to the topic, and the fact that algorithmic complexity doesn't come into play is to the point of my argument, although I didn't even bother touching that subject as it is a topic with a conclusive answer. Are you sure you read the right post, or do you just have some pre-made answers in case of butthurt that you chug forward regardless of the content? Sorry for educating you so far.

>>262
So saying 2+2=4, and that it is recognised as such by all experts and implemented as such by all modern hardware, puts the burden of proof on me, and "BAWWWW ACADEMICS ARE BRAINWASHED!!11 ONE INSTRUCTION CAN DO TWO THINGS MEANS IT WILL BE FASTER EVEN IF THE OTHER PROCESSOR CAN DO A HUNDRED INSTRUCTIONS IN THE SAME TIME" is a credible counterargument worth discussing? Are you sure you aren't a certain brain-dead excrement leftover tripfag samefagging who would be uneducated enough to take this seriously?

Name: Anonymous 2012-08-09 7:40

>>263
>>261
>Irrelevant to the topic
Are you really sure that the issue of memory bottlenecking is irrelevant to machine code compression?

Name: Anonymous 2012-08-09 7:50

Cudder is killing /prog/ as always. I don't want to park my car in her cudder anymore.

Name: Anonymous 2012-08-09 11:19

>>265
// ==UserScript==
// @name           /prog/ cleaner
// @description    Decudder
// @namespace      http://dis.4chan.org/prog/
// @include        http://dis.4chan.org/*
// @version        1.0
// ==/UserScript==

(function() {
    var posts = document.getElementsByClassName('post');
    for (var i = posts.length - 1; i >= 0; --i) {
        var trip = posts[i].getElementsByTagName('span')[4];
        if (!trip) continue; // anonymous posts have no tripcode span
        if (trip.textContent == '!MhMRSATORI!FBeUS42x4uM+kgp' ||
            trip.textContent == '!MhMRSATORI!fR8duoqGZdD/iE5' ||
            trip.textContent == '!MhMRSATORI!vzR1SbE7g/KHqrb')
            posts[i].parentNode.removeChild(posts[i]);
    }
})();


He changes secure trips occasionally, so expect to adjust accordingly.

________________
sqlite> select count(*), trip from posts where author = 'Cudder ' group by trip order by count(*) desc;
204|!MhMRSATORI!FBeUS42x4uM+kgp
61|!MhMRSATORI!fR8duoqGZdD/iE5
31|!MhMRSATORI!vzR1SbE7g/KHqrb
9|!MhMRSATORI
1|!MhMRSATORI!cbCOMSATORIM/pr

Name: Anonymous 2012-08-09 11:39

>>265,266
He could still post as Anonymous, dumbshits.

Name: Anonymous 2012-08-09 14:07

>>267
So? If he were forced into anonymity, he'd certainly post less, because he craves attention, but it won't even come to that, because most people wouldn't be using >>266-kun's script.
Why do Americans always think that if a measure doesn't solve 100% of the problem it addresses, it's not worth doing at all?

Name: Anonymous 2012-08-09 16:58

>>267-268
Cudder is female.

Name: Anonymous 2012-08-09 17:00

>>269
He wishes.

Name: Anonymous 2012-08-09 18:47

>>268
U+1F502 CLOCKWISE RIGHTWARDS AND LEFTWARDS OPEN CIRCLE ARROWS WITH CIRCLED ONE OVERLAY
Your argument is invalid.

Name: Anonymous 2012-08-09 19:00

>>269
yeah right

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2012-08-10 7:21

>>260
>that it ended up being overall much much slower than running multiple commands on a fast architecture
Look at when those studies were published. Back then, they thought they could keep raising clock speeds indefinitely, and memory bandwidth was plentiful. The technology of the time was such that the disparity between CPU core speeds and memory speeds was much smaller. A lot has changed since then.

>CISC->RISC subsystems on modern x86 processors
I already answered that in the first part of >>258. I wouldn't call it "CISC->RISC" though, it's just a parallel microsequencer.

>All these people brainwashed by "logic" and "mathematics" will never have your vision
The problem is the academics parroting information from out-of-date or flawed studies and failing to actually look at the reality and think. We saw this from the discussion of register allocation earlier.

From the RISC wiki page:
"The goal was to make instructions so simple that they could easily be pipelined, in order to achieve a single clock throughput at high frequencies."

Anyone reading that should automatically think "what about the rest of the system? Can the memory keep up? Can the clock frequency be raised indefinitely?" The huge assumption they made there was that memory would always be fast enough --- and that assumption, while it may have been true in the 80s and early 90s, is not anymore. The laws of physics also weigh in on this, limiting clock frequencies. Someone who accepts that statement without questioning its assumptions is like one who believes in a religion; as that quote from one of your "idols" himself says, it's "unscientific and ultimately destructive".

When your opponent starts becoming half-unintelligible it's a good sign that he's unable to think anymore. P.S. calling someone a "retard" is far from "factual" :)

>>263
>EVEN IF THE OTHER PROCESSOR CAN DO A HUNDRED INSTRUCTIONS IN THE SAME TIME
But it can't, given the same memory bandwidth. That's the whole point. A simple calculation shows it even more clearly: suppose you have 10GB/s of memory bandwidth. On a classic RISC with 4-byte instructions and 1 instruction per cycle (e.g. MIPS), that corresponds to 2.5GIPS. That is the maximum speed you can execute instructions at, and would be equivalent to a clock speed of 2.5GHz. You cannot raise the clock speed further without needing to insert wait states, even if the silicon allowed going up to 4GHz. 100 instructions are going to take 40ns, no matter what.

Now consider a typical modern x86: up to 4 instructions per cycle, as little as 1 byte per instruction, and a clock speed of 2.5GHz. Assuming that 4 1-byte instructions can do as much work as the 100 RISC instructions and run in 100 clock cycles, they can be fetched in 0.4ns, leaving the memory bus idle for the remaining 39.6ns while the instructions execute in the core for 99 more clock cycles. Even ignoring the 4x superscalar speedup, here is where x86 has the advantage: the silicon can do more than 2.5GHz, so let's increase it to e.g. 4GHz. You can't do this for the RISC because the memory can't go any faster. The x86 is now fetching in 0.4ns and takes 24.75ns to execute, a total of 25.15ns for the four instructions of 100 clock cycles each, running in parallel. The core can be run at the highest frequency it can, not limited by how fast instructions can be fed into it. In practice, memory is a bit slower (the 10GB/s quoted above is a peak burst rate), so the difference is even bigger.
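The arithmetic in that post can be checked mechanically. A quick sketch using only the post's own stated assumptions (10GB/s bandwidth, 4-byte RISC instructions at 1 per cycle, 4 one-byte x86 instructions doing the same work in 100 cycles at 4GHz):

```javascript
// Back-of-envelope check of the fetch-bandwidth argument above.
// All inputs are the post's stated assumptions, not measurements.
const BW = 10e9;                      // memory bandwidth, bytes/second

// Classic RISC: 4-byte instructions, 1 per cycle, so fetch rate caps IPS.
const riscIPS = BW / 4;               // max instructions/second (2.5G)
const riscNs = 100 / riscIPS * 1e9;   // time for 100 instructions, ns

// x86 at 4GHz: fetch 4 one-byte instructions, then 99 more core cycles.
const fetchNs = 4 / BW * 1e9;         // fetch time, ns
const execNs = 99 / 4e9 * 1e9;        // remaining execution time, ns

console.log('RISC: ' + riscNs.toFixed(2) + ' ns');
console.log('x86:  ' + (fetchNs + execNs).toFixed(2) + ' ns');
```

Whether 4 short x86 instructions really do the work of 100 RISC ones is the contested assumption; the script only shows the bandwidth arithmetic is internally consistent.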

>>265
If by "killing" you mean dispelling old misconceptions and beliefs and encouraging more critical thinking, I'm in complete agreement. /prog/ isn't a church. Fuck this pseudo-religion.

Name: Anonymous 2012-08-10 9:31

>>273
Will you marry me?

Name: Anonymous 2012-08-10 10:10

>>273
>We saw this from the discussion of register allocation earlier.
Welp, it surely was someone with no background whatsoever in graph theory that noted that all the graphs generated by SSA are chordal and can be colored in polytime, right?
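For anyone following along: the polytime result works because greedy colouring along the reverse of a perfect elimination ordering is optimal on chordal graphs. A minimal sketch with a hypothetical 4-variable interference graph (the graph and ordering are made up for illustration):

```javascript
// Greedy colouring along a vertex ordering: each vertex gets the
// smallest colour not used by its already-coloured neighbours.
// On a chordal graph, visiting vertices in the reverse of a perfect
// elimination ordering makes this use the minimum number of colours.
const adj = {
  a: ['b', 'c'],
  b: ['a', 'c', 'd'],
  c: ['a', 'b'],
  d: ['b'],
};
// a, c, b, d is a perfect elimination ordering here (a and d are
// simplicial); we colour in the reverse order.
const order = ['d', 'b', 'c', 'a'];

function colour(adj, order) {
  const colours = {};
  for (const v of order) {
    const used = new Set(
      adj[v].map(n => colours[n]).filter(c => c !== undefined));
    let c = 0;
    while (used.has(c)) c++;
    colours[v] = c;
  }
  return colours;
}

console.log(colour(adj, order)); // 3 colours, matching the clique {a,b,c}
```

That is the point being made about SSA: interference graphs of SSA-form programs are chordal, so colouring needs no iterative spill-and-retry loop.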

Name: Anonymous 2012-08-10 12:36

>>273
>"HURRR SUPPOSE 2=3, THEN 2+2=6!!!11 LOOK I AM NOT RETARDED NO MORE, WHERE IS YOUR 4 NOW!1! RELIGION OF SCIENCE!!111 I HAVE ANSWERED YOUR ARGUMENT ALREADY EVEN THOUGH I NEVER DID, AND NEVER COULD UNSURPRISINGLY BECAUSE WHATEVER I CRY DOESN'T CHANGE THE FACT THAT CISC INSTRUCTIONS ARE STILL CONVERTED TO RISC IN MODERN PROCESSORS"
Well, you being clinically retarded is not up for discussion; you prove it with every sentence you manage to shit out. Notice that you are so incredibly brain dead that you are still talking about imaginary old "published studies" while the cold hard facts I've brought to the table were the implementations by profit-seeking corporations that are bound to the x86 CISC architecture. Sure, all the top-level engineers at Intel, everyone working on alternatives in CISC and RISC based devices, along with everyone in academia and anyone knowing mathematics and logic, is unaware of your scientific breakthrough that makes CISC magically relevant after the '80s; perhaps you should name your theory after yourself, "Retard Processing" (well, you actually haven't managed to provide content for it other than spouting baseless numbers that don't relate to reality, but I'm sure even a retard such as yourself has backed your notions despite not telling us, right? right?)...

Ah, also, it's funny that you are talking about "an opponent" of yours in "an argument" - I am merely putting some brainless piece of shit in his place by attempting to grind facts into his vacuous mind; there is nothing to argue here for anyone who knows basic mathematics or logic. You might even be crying "2+2=5" while bleeding out of your anus like this. You should read more instead of crying; you won't learn anything with your level of retardation, but you'll get points for effort.

Name: Anonymous 2012-08-10 12:42

Cudder, please stop arguing with the imageboard retard.

Name: Anonymous 2012-08-10 14:39

>one of your "idols" himself

λBλLSλN λ SλSSMλN

Name: Anonymous 2012-08-10 15:21

If you seriously want to make the claim that RISC is superior, you'd better have the fucking willpower to optimize the living fuck out of your code or be able to shell out for a proprietary compiler that can.
x86 was shitty as fuck when it came out, but our requirements have changed - things that impacted performance back in the day don't impact it now, and being able to code in a way that has hardware-level optimizations for certain operations is a useful innovation.

Name: Anonymous 2012-08-10 16:06

RISC is shit.

Name: Anonymous 2012-08-11 1:11

``i used the superior speed of assembly to program a lifetime supply of semen for all my holes'' -- cudder

Seriously why is this discussion still going on, it's less relevant to programming than which 2hu I'd fuk. At the very least please sage consistently so you stop bumping this crap.

Name: Anonymous 2012-08-11 1:50

bump

Name: Anonymous 2012-08-11 2:18

>>282
back to /g/

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2012-08-11 7:49

>>275
It doesn't take any knowledge of graph theory to see that register allocation just ain't that hard. The academics took this huge long detour into graph theory, SSA, etc. before finally beginning to realise that.

>>276
>CISC INSTRUCTIONS ARE STILL CONVERTED TO RISC IN MODERN PROCESSORS
That's one way to think of it, but the "RISC inside" is more of a marketing strategy than a description of what really happens, as uops are a lot more elementary than a typical RISC instruction. If you consider uops "RISC", then even the original 8086 would be doing that "conversion"; the only difference being the microsequencers back then were sequential-only, whereas they're more parallel now. In case it wasn't clear from the beginning, we're arguing about RISC vs CISC externally, i.e. instruction sets like MIPS, ARM, etc. vs x86. Whether you consider a processor to be RISC or CISC internally (and you seem to be saying that they're all RISC, fair enough) isn't the point.

>imaginary old "published studies"
I'm talking about the original papers in the 80s and early 90s. Possibly the only thing "imaginary" was you at that time.

>You might even be crying "2+2=5"
Are you going to argue against maths now? Do I have to teach you basic arithmetic first?

>baseless numbers that don't relate to reality
Then how about YOU try to come up with a counterexample, or at the very least point out exactly what you have issue with? If you can't do that, then it means you do not even have sufficient familiarity with your side to defend it!

Your repeated incessant personal attacks only show that you cannot come up with any better argument. You have lost.

(This is not unlike arguing with a religious fundamentalist about evolution. Seemingly educated, mild-mannered people quickly turn into surprisingly bad-tempered screaming insult-flinging children.)

>>279
Or at least provide some nice convincing numbers to inspect.

I'll conclude this with a few words from our sponsors, er, supporters:

http://utcc.utoronto.ca/~cks/space/blog/tech/Whyx86WonVsRISC
>Memory access speed was not seen as a big concern back in 1992, but it has become an increasingly important one since then and this favours architectures with compact instruction encodings (even if this results in them being far from regular and easy to decode).
http://utcc.utoronto.ca/~cks/space/blog/tech/Whyx86WonVsRISCII
>What people really cared about in the early 1990s was RISC ISAs, and (1992 style) RISC ISAs unambiguously lost. CPUs implementing the x86 ISA were pushed to performance and price levels that could not be matched by CPUs implementing RISC ISAs and as a result RISC ISAs have basically died out.
http://lkml.indiana.edu/hypermail/linux/kernel/0302.2/1909.html
http://lkml.indiana.edu/hypermail/linux/kernel/0302.2/1950.html
http://faculty.washington.edu/rynn/421/Articles/RISCy%20Business.pdf

Name: Anonymous 2012-08-11 7:59

x86 is crap. If it ain't x86, it's worse than crap.

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2012-08-11 8:01

>>285
It's "crap" that beats everything else.

See http://dis.4chan.org/read/prog/1317072394/126- too.

Name: 285 2012-08-11 8:53

>>286
Congratulations on your Intel GET. Also, did you read my whole post? I explicitly said that the other ISAs were worse.

Name: Anonymous 2012-08-11 9:18

>>285-287
in during intel shills
Shalom!

Name: Anonymous 2012-08-11 9:24

>>288
implying

Name: Anonymous 2012-08-11 15:17

>>285-288
Jew86 ist Judenscheisse. Congratulations on your Intel shit.

Name: Anonymous 2012-08-11 15:19

>>286
Uh huh. Then why is virtually every mobile OS now targeting ARM?

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2012-08-12 6:50

>>291
ARM has low transistor count and power consumption, and is easily embedded as an IP core into an SoC. Memory bandwidth doesn't matter as much (they run the core at a low enough speed, for power consumption reasons.) These characteristics are more suited to a mobile application.

Name: Anonymous 2012-08-12 13:01

>>292
Care to qualify your overly broad statement about x86 "beating" everything else then?

Name: Anonymous 2012-08-12 14:54

>>293
Meh, not like mobile shit matters when they do everything in complete slow-as-fuck bloated shit like Java and Obj-C.

Name: Anonymous 2012-08-13 3:57

OOP because of IntelliSense!
