/prog/ - Pascal vs C

Name: Anonymous 2010-04-05 17:12

If you were to give a language like Python static typing and make it a compiled language instead of interpreted, you basically end up with something like Pascal. So how would you rate Pascal's abilities compared to C? I know dynamic typing requires runtime checking, but would it be possible to have dynamic typing in a compiled language like Pascal?

Name: Anonymous 2010-04-07 10:38

>>32
Okay, first, Lisp doesn't have a linked list as a native type, instead all linked lists are built out of conses, this gives you the possibility to build nested lists which share structure, have circularities, etc.
A proper linked list is one which is either:
1)A Cons, the CAR points to the first element of the list
the CDR points to another proper linked list
2)NIL (Symbol in CL, empty '() list in Scheme)
This is rather simple to describe in C, but I'll refrain from giving trivial translations/examples.
Now, the question is what is a CONS. A CONS is a structure of any 2 elements. In C, it could be represented as just 2 random void* pointers. Pretty much most modern Lisp compilers translate CAR and CDR as simple pointer dereferences as that's what they are, there's really no magic here. It's 1-3 instructions on x86 (for example, if a CONS is made of 2 DWORDS, and ECX points to a CONS, you could get the CDR in EAX by doing MOV EAX,DWORD PTR:[EDX+4], simple enough?). CONS itself can be thought of as in the more complex case, a simple malloc(sizeof(CONS)) call, and setting up the structure further(I'll explain in a few). All 3 operations can be inlined, and most list operations can be easily shown to revolve around these ops quite nicely. I could also show how one could represent just about any Lisp object in C or ASM if you need examples, altough I feel you would be better off reading the source code of the implementation.

What really sets Lisp apart though, is that each object is usually tagged internally with a base type, so a CONS, a FUNCTION, a STRING, an ARRAY, a BIGNUM, a STRUCTURE, a CLASS instance, and so on, are all basically data structures, but they have some space reserved for an object tag (be it, 8 bits, 4 bits, an int or whatever). There are many possible encodings, so there's no reason to discuss them all here, but just imagine that each object contains a field (for example a char) containing its type id before all the data. In some implementations, the tag bits may be part of the pointer... Other implementations may optimize fixnums(ints) to be unboxed under certain circumstances, and so on.

The thing is, you can still implement this in C, maybe with some slight hacks if more advanced/space conserative encodings are used, and without much trouble at all in ASM.

This tagging system is needed for most truly dynamically typed languages, and ensures operations work as they should, and do so in a relatively safe manner which allows for proper error handling, it also allows specializing by type and many other niceties.

The real point that should be made though, is that all these things are mere conventions an implementation chooses to obey, and once it comes down to metal, these are just simple instructions that get added by your compiler, and they function rather standalone. You could do the same in ASM or C, although, I can't imagine things would be very pleasant if you couldn't cover up all the redundancies - things which Lisp compilers and Lisp macros do an excellent job of hiding, and things which the programmer can extent as he wishes, without being bogged by the details more than once(or a few times when he maintains some specific module) - that's one basis of abstraction... but I'm getting offtopic here.

What I wanted to say is that CONS, CAR and CDR are congruent to machine ops quite nicely, and they're usually 1-2 ops. Maybe on a 64bit system, you could even represent a CONS as an unboxed register, if you wished, also, did you ever look at the origin of the names? They come from IBM 704 Assembler, and they were rather direct macros over that machine's assembler, and very low-level constructs. It's why we're still using these names today (of course, you're free to use FIRST/REST if you don't like CAR/CDR). I don't like linking to Wikipedia, but here, go ahead and read it: http://en.wikipedia.org/wiki/CAR_and_CDR

So not only is CAR/CDR translatable to 1-3 instructions on modern CPUs, it was always a low-level operation even during its inception, and you don't need a runtime to support it.

So what could a special purpose CPU make easier for Lisp? Tag bits! Those CPUs had dedicated tag bits which allowed typing the pointers and other data, which makes things much nicer for Lisp, while nowadays we don't really type our data, but rely on the compiler to set up the conventions as it pleases, as long as there's enough space in the register(if you plan on storing it in a reg-sized value, otherwise, if you're storing it part of the struct, there's no arch specific-trickery at all).

tl;dr: Lisp runtimes can sometimes be large, but they're large for different reasons than what someone in this thread thought - they're large because they tend to include compilers/interpreters, FFIs, and many other complex libraries with them. Things which can be stripped if not needed. Lisp code can be self-containing, just like ASM or C code, and doesn't have to depend on the runtime, more than any other kind of code - as long as the code that interacts with it follows proper conventions, or you have an FFI do the translation for you.

Name: 34 2010-04-07 12:10

I forgot to mention that since each object's type is known, and you can traverse all the data in the memory given some root object(s) (for example, a list of packages). This makes precise garbage collection possible. Such GCs tend to be written in C most of the time, and they're usually not too large (~100KB seems right). That's usually the major part of a runtime of an useful Lisp, but of course GC-less Lisps are perfectly possible, although probably not the smartest things around. The other low-level parts of the runtime usually deal with loading and mapping the image to memory form disk and vice-versa, and maybe some system call thunks. For example, SBCL's main executable, stripped of the bloated debug symbols is only 94KB here, and that includes all the load/save/gc functionality. If you used a C-compiled lisp, things would be slightly different, but not by that much. It might also be worth mentioning that on Lisp Machines, the GC, image fasl load/save were part of the OS(which was written in Lisp as well, like everything else), so those parts didn't have to be implemented in the executable, as the OS itself took care of those things. Of course, if you really look at what the low-level runtime usually contains, you'll see that if you wanted, you could do away with it and still have working code (with some changes to how images are written).

Well, whatever, I'm feeling this discussion is somewhat pointless as anyone can do whatever they want in any language, and everything is just limited to their imagination as long as they understand the architecture as well as how to design software.

Dynamic languages come with some overhead, but that overhead is in inherent extra typechecking operations, not "bloated runtimes" or "bloated libraries" (things which can exist, but are not required).

Pascal vs C

1 Name: Anonymous 2010-04-05 17:12

34 Name: Anonymous 2010-04-07 10:38

35 Name: 34 2010-04-07 12:10

Name: Anonymous 2010-04-05 17:12

Name: Anonymous 2010-04-07 10:38

Name: 34 2010-04-07 12:10