>>39
I've looked briefly at Clay; I did think about it when posting that. It is very interesting, but it has some problems. The syntax for some things is terrible: postfix ^ is used for dereferencing, so you write foo^.bar to access a struct's fields through a pointer. The fact that it primarily uses statements instead of expressions (e.g. no conditional or comma operators) is kind of a throwback to Fortran and COBOL, and it makes the language verbose and awkward for functional/concurrent programming. And I really would like a GC during experimental development, as long as I can pull it out later.
Basically it doesn't look fun. It looks like Fortran without type declarations. I will probably try writing a serious program in it at some point soon to get a feel for how it works.
I disagree here. I will admit to being very C-minded and this is probably a symptom of that. If you already have by-ref and by-value types then it's no big deal, but switching over to that model for no other reason is probably not great. I'll pretend you said something about safety and I'll drop it, deal?
No deal, I actually do want to talk about it! I will fully admit that this is the most controversial of my points. I'm really not 100% behind it myself; mostly I just want to believe that it's possible to do it without sacrificing anything.
I just find that a lot of my time in C and C++ is spent not just writing syntax for pointers, but *thinking* about it. If I take a pointer to a struct as a function argument, I refer to its fields through dereferencing (->). But if I instead declare this struct on the stack, now I access it with regular member syntax (.). Now why exactly does it matter where this object lives? Why do I need to think about where it lives when I access it? Why isn't its location in memory a mere part of its declaration, nothing more?
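The annoyance can be shown in a few lines of C++ (the names here are mine, purely for illustration): the same logical operation gets two different spellings depending only on where the object happens to live.

```cpp
#include <cassert>

// Hypothetical struct for illustration.
struct Point { int x, y; };

// Same field access, two syntaxes, decided by how the object is passed:
int sum_by_pointer(const Point* p) { return p->x + p->y; } // dereference: ->
int sum_by_value(Point p)          { return p.x + p.y; }   // member access: .
```

At the call site it's the same story: a stack `Point a` is `a.x`, but take its address and suddenly it's `(&a)->x`. Nothing about the *meaning* changed, only the location.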
Many languages *partially* solve this in a variety of ways. For instance in Python, everything is a binding to an object. Writing "x += 5" creates a new object which is the sum of x and 5, and then binds x to the new object. It does not actually modify any objects though, because the number objects themselves are immutable. This happy restriction means the compiler is free to optimize it by copying the numbers directly by value instead of allocating them from the heap. (I'm not sure if CPython actually does this, but CPython sucks. Haskell probably does this.) The unfortunate downside is that there is no way to have a value modified by reference; single-element lists are a common workaround in Python.
Scheme is similar in that everything is a binding, except that the bindings themselves are mutable. So a variable can effectively be modified by reference. However, someone might set! the variable holding your int, so the compiler always has to treat the binding as a mutable cell. Someone might capture the binding in a closure expecting it to be modified later, so it has to be heap-allocated and garbage collected. The only way to optimize this into by-value semantics is through whole-program analysis proving that the value is never changed or retained.
Clearly I'm not the only one who thinks pointer syntax is a pain. Just look at Apple's libraries like CoreFoundation. For most objects, stack allocation is impossible (the struct definitions are opaque), and every type has a typedef'd pointer with "Ref" on the end, which you obtain through Create functions and dispose of with Release. Thus it is always a reference. A few types are value types (raw structs), such as CGRect. These are *never* passed as a pointer; they are always copied by value. I don't think I've ever seen a single *, &, or -> when dealing with Apple's C code. They are essentially emulating a syntax-free semantic distinction between by-value and by-reference types.
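Here's a toy sketch of the pattern in plain C++ (Widget/WidgetRef/Size are my own names, not Apple's; in real CoreFoundation the struct body would be hidden in the implementation file, but I show it here so the example is self-contained):

```cpp
#include <cassert>
#include <cstdlib>

// Reference type: clients can't see inside, so they can't stack-allocate it.
// All they ever hold is a typedef'd pointer with "Ref" on the end.
struct Widget { int refcount; int value; };
typedef Widget* WidgetRef;

WidgetRef WidgetCreate(int value) {
    WidgetRef w = static_cast<WidgetRef>(std::malloc(sizeof(Widget)));
    w->refcount = 1;
    w->value = value;
    return w;
}
void WidgetRelease(WidgetRef w) {
    if (--w->refcount == 0) std::free(w);
}
int WidgetGetValue(WidgetRef w) { return w->value; }

// Value type: a raw struct, always copied, never handed out as a pointer.
struct Size { double width, height; };
double SizeArea(Size s) { return s.width * s.height; }
```

Client code calling this never writes *, &, or ->; whether a type is by-reference or by-value is baked into the type itself, not into every use site.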
If Apple gets rid of pointer syntax in C with some typedefs, clearly it's worth it to do so as a fundamental part of any new language. Isn't it?
I'm not clear on this. Do you mean the compiler should perform analysis and insert malloc/free or equivalent calls in your code automatically?
No, I mean specifying the memory location of objects through annotations. For example, the scope keyword in D means the object should be allocated directly in the current struct or stack frame. You can just add this to a variable and the object is "scoped". Or you can make a class manually memory managed by overriding operator new and operator delete. The point is that you do not need to significantly change code in order to remove the garbage collector; it is not as though you are rewriting the program, just annotating it. Unfortunately this is very limited in D. It's hard to allow an object to be allocated and constructed from different allocators. There are no smart pointers to clean up this sort of special memory with RAII.
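D's trick of retargeting allocation without rewriting call sites has a rough C++ analogue: class-level operator new/delete. This is my own sketch, with a trivial counting allocator standing in for a real pool or arena:

```cpp
#include <cstdlib>
#include <new>

// Stand-in bookkeeping; a real allocator might be a pool or arena.
static int live_allocations = 0;

struct Node {
    int value;
    explicit Node(int v) : value(v) {}

    // Call sites keep writing plain `new Node(x)` / `delete n`;
    // only this class-local "annotation" changes where memory comes from.
    static void* operator new(std::size_t size) {
        void* p = std::malloc(size);
        if (!p) throw std::bad_alloc{};
        ++live_allocations;
        return p;
    }
    static void operator delete(void* p) noexcept {
        --live_allocations;
        std::free(p);
    }
};
```

The point is the same as with D's scope: the allocation policy is attached to the declaration, and the rest of the program doesn't change.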
About null pointers, I'm not quite convinced on the performance comment. If you can prove certain things you can eliminate the need to perform checks in certain cases, however.
The idea is that 99% of the time, you would need to do the check for nil anyway (otherwise why would it be a union with nil?), so you're not losing performance. You also aren't losing space; any decent compiler of course handles a tagged union between a reference and the nil type just by using the zero address for nil. If you know for a fact that a possibly-nil pointer is actually not nil, but the compiler is unable to verify it, it could supply a simple construct like __assume(x != nil). The MSVC compiler actually already supports this in C++ for optimization purposes, so it's not much of a stretch.
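The "tagged union between a reference and nil costs nothing" point can be emulated today. Here's a minimal sketch (NonNil/Maybe are my own names): the nil tag is just the zero address, and the only way to dereference a possibly-nil pointer is through a checked unwrap that yields a never-nil type:

```cpp
#include <cassert>
#include <cstddef>

// A pointer that has been proven non-nil; using it needs no check.
template <typename T>
struct NonNil {
    T* p;  // invariant: never null
    T& operator*() const { return *p; }
};

// A possibly-nil pointer. The "tag" is simply the zero address, so it
// takes no extra space; the unwrap is the check you'd have written anyway.
template <typename T>
struct Maybe {
    T* p;
    bool is_nil() const { return p == nullptr; }
    NonNil<T> unwrap() const { assert(p != nullptr); return NonNil<T>{p}; }
};
```

A compiler with this distinction built in could let __assume-style annotations convert a Maybe to a NonNil without the runtime check, exactly as described above.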
That's a tall order.
Doesn't seem that tall. What are the holes in my reasoning? Couldn't returning safe unique_ptrs from new solve this problem, barring reinterpret_casts and such?
In fact, I will try it. The next C++ project I do, I hereby resolve to turn on sepplesox, and to exclusively use a wrapper around new that returns unique_ptrs for all allocation (and, if possible, to disable reset() and all of unique_ptr's constructors, or perhaps write my own version of it). We'll see how that turns out.
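The wrapper I have in mind would look something like this (the name `make` is my own; it's essentially a thin forwarding wrapper so that a raw owning pointer never escapes into user code):

```cpp
#include <memory>
#include <utility>

// The only sanctioned way to allocate: the raw result of `new` goes
// straight into a unique_ptr and is never visible to the caller.
template <typename T, typename... Args>
std::unique_ptr<T> make(Args&&... args) {
    return std::unique_ptr<T>(new T(std::forward<Args>(args)...));
}
```

Usage is just `auto p = make<int>(7);` — the object is freed automatically when `p` goes out of scope, and ownership transfers only via std::move.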