So I've heard a couple arguments for this and I'd like to see what you think.
The allocation procedure under a copying collector is most definitely faster than malloc: malloc has to search through fragments of memory for a sufficiently large block to reserve, whereas allocation under a CC is as simple as incrementing the free heap pointer, much like allocating a stack frame.
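In other words, something like this minimal bump-pointer sketch (all names here are made up for illustration, and a real collector would trigger a GC here instead of just returning NULL):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch of copying-collector allocation: a pointer bump
 * plus a limit check.  Names and sizes are invented for illustration. */
enum { HEAP_SIZE = 1 << 20 };

static uint8_t heap[HEAP_SIZE];
static size_t  free_ptr = 0;   /* next free byte in the current semispace */

void *gc_alloc(size_t n)
{
    n = (n + 7) & ~(size_t)7;      /* keep allocations 8-byte aligned */
    if (free_ptr + n > HEAP_SIZE)
        return NULL;               /* a real collector would run a GC here */
    void *p = &heap[free_ptr];
    free_ptr += n;                 /* the whole "search" is one increment */
    return p;
}
```

Compare that to malloc, which typically has to walk free lists or size bins before it can hand anything back.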
When the garbage collector is triggered, it must trace the live set of objects and copy each one over to the new heap. This takes time, but it is limited to the objects in the live set, so as long as that stays reasonably small, it shouldn't take very long. Also, no operation at all is needed on the objects that are no longer referenced. Under malloc and free, free must be called on each and every one of those objects the moment they become dead to the program.
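As a toy illustration, here's a rough Cheney-style semispace sketch (every name is invented and real collectors are far more involved): only reachable objects get copied, and garbage is never touched at all.

```c
#include <stddef.h>

/* Toy Cheney-style copying collector.  Objects are fixed-size pairs of
 * pointers; forwarding is recorded by setting a flag and overwriting
 * field 0 with the object's new address.  All names are made up. */
enum { CELLS = 1024 };

typedef struct Obj { struct Obj *f[2]; int forwarded; } Obj;

static Obj space_a[CELLS], space_b[CELLS];
static Obj *from = space_a, *to = space_b;
static size_t from_top = 0;             /* bump pointer for new objects */

Obj *new_obj(Obj *a, Obj *b)            /* allocation = pointer bump */
{
    Obj *o = &from[from_top++];
    o->f[0] = a; o->f[1] = b; o->forwarded = 0;
    return o;
}

static Obj *copy(Obj *o, size_t *alloc)
{
    if (o == NULL) return NULL;
    if (o->forwarded) return o->f[0];   /* already moved: follow forward */
    Obj *n = &to[(*alloc)++];
    *n = *o;                            /* move the object to to-space */
    o->forwarded = 1;
    o->f[0] = n;                        /* leave a forwarding pointer */
    return n;
}

size_t collect(Obj **roots, size_t nroots)
{
    size_t scan = 0, alloc = 0;
    for (size_t i = 0; i < nroots; i++)
        roots[i] = copy(roots[i], &alloc);
    while (scan < alloc) {              /* Cheney scan: breadth-first */
        Obj *o = &to[scan++];
        o->f[0] = copy(o->f[0], &alloc);
        o->f[1] = copy(o->f[1], &alloc);
    }
    Obj *t = from; from = to; to = t;   /* flip semispaces */
    from_top = alloc;
    return alloc;     /* cost is proportional to live objects only */
}
```

Note this particular sketch scans breadth-first, which is the classic Cheney queue trick; dead objects are simply abandoned in the old semispace.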
I could see garbage collection performing poorly in comparison to malloc and free when the live set of objects is very large and many objects stay live for a very long time. The long-lived objects would require minimal time for management under malloc and free, whereas the garbage collector may traverse them many times. I suppose this is where a generational copying collector comes in handy, however. So /prog/, what do you think?
copy collectors have a tendency to thrash up the cache something awful, especially the longer it has been running.
yeah, I could see its toll when doing the depth first search through the live set of objects in the heap, but probably gives the program better locality of reference otherwise due to the heap being so compact. If I recall correctly, malloc uses different sections of the heap for large objects and small ones, to reduce fragmentation, so it intentionally puts distance between the objects.
err nevermind, as the heap starts to fill with objects, the objects in the live set could get quite far from each other. And I'm sure moving every live object on the heap to a totally new region of memory isn't so great for the cache.
>>5
>copy collectors have a tendency to thrash up the cache something awful, especially the longer it has been running.
False. malloc allocates memory in essentially random places, copying GC at least ensures that temporal locality of allocations implies spatial locality of allocated memory.
>yeah, I could see its toll when doing the depth first search through the live set of objects in the heap,
They don't do it. They use memory barriers (not the instruction reordering kind) to mark the older-generation objects that had their fields assigned. And a lot of clever tricks, I presume, to make the search for live objects over the generation being collected as linear as possible.
I am no expert, but this doesn't sound like something for opinions, but rather benchmarks.
Well, it's quite easy to make a benchmark that allocates and deallocates a lot of objects of varying sizes. I did that once, .NET code turned out to be about 50 times faster than C code using the default msvcrt malloc. Making a benchmark that is at least somewhat related to anything you might encounter in a real-world task is much harder, of course.
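For a rough idea of what the C side of such a micro-benchmark looks like, something like the sketch below (counts and sizes are arbitrary; note that freeing each block immediately doesn't stress fragmentation the way holding batches of objects live would, so a more honest version would keep some allocations around):

```c
#include <stdlib.h>
#include <time.h>

/* Hedged sketch of an allocate-and-free micro-benchmark against malloc.
 * The iteration count and the size pattern are arbitrary choices. */
double bench_malloc(size_t iters)
{
    clock_t t0 = clock();
    for (size_t i = 0; i < iters; i++) {
        void *p = malloc(16 + (i % 509));   /* varying block sizes */
        if (!p)
            abort();
        ((char *)p)[0] = 1;                 /* touch the block */
        free(p);
    }
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```

The managed-heap counterpart would run the same loop with `new` and let the collector reclaim the objects.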
Name: Anonymous 2011-09-16 16:19
>>9
>Well, it's quite easy to make a benchmark that allocates and deallocates a lot of objects of varying sizes. I did that once, .NET code turned out to be about 50 times faster than C code using the default msvcrt malloc.
When you need to do that, the right thing is to use a dynamic array. If implemented properly, that solution should make the GC look ridiculous.
Name: Anonymous 2011-09-16 16:44
The thing maybe some of you are overlooking is that many C and C++ programmers who are concerned about memory allocation performance and fragmentation will often use specialized allocators, falling back to malloc only for the edge cases.
You can implement arena allocators which just increment a pointer on allocation, and then free everything at once at a predetermined time, like when unloading a module, file or dataset. Way faster than GC and it's not hard to do.
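A minimal sketch of such an arena (names invented; no block chaining or growth, and nothing fancier than 8-byte alignment):

```c
#include <stdlib.h>
#include <stddef.h>

/* Arena allocator sketch: allocation is a pointer bump into one big
 * block, and freeing everything at once is a single free(). */
typedef struct {
    unsigned char *base;
    size_t used, cap;
} Arena;

Arena arena_new(size_t cap)
{
    Arena a = { malloc(cap), 0, cap };
    return a;
}

void *arena_alloc(Arena *a, size_t n)
{
    n = (n + 7) & ~(size_t)7;            /* keep 8-byte alignment */
    if (!a->base || a->used + n > a->cap)
        return NULL;                     /* could chain a new block here */
    void *p = a->base + a->used;
    a->used += n;
    return p;
}

void arena_free(Arena *a)                /* release everything at once */
{
    free(a->base);
    a->base = NULL;
    a->used = a->cap = 0;
}
```

You'd call arena_free when the module, file or dataset it backs gets unloaded, exactly as described above.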
>They don't do it. They use memory barriers (not the instruction reordering kind) to mark the older-generation objects that had their fields assigned. And a lot of clever tricks, I presume, to make the search for live objects over the generation being collected as linear as possible.
I think that only works when every data type on the heap is immutable. That's a very interesting technique though.
>You can implement arena allocators which just increment a pointer on allocation, and then free everything at once at a predetermined time, like when unloading a module, file or dataset. Way faster than GC and it's not hard to do.
That's true, although freeing individual slots in the array can get a little hairy. It can still be done efficiently there too, though: one solution would be to keep a stack of free indices into that array, I believe.
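A tiny sketch of that free-index-stack idea (names are made up, and a fixed-size pool stands in for the dynamic array): alloc pops a recycled index if one exists, otherwise it hands out a fresh slot, and free is O(1).

```c
#include <stddef.h>

/* Slot pool with a stack of freed indices.  POOL is an arbitrary size
 * chosen for the sketch; a dynamic array would grow instead of failing. */
enum { POOL = 256 };

static double slots[POOL];              /* the actual element storage */
static size_t next_fresh = 0;           /* slots never handed out yet */
static size_t free_stack[POOL];         /* indices returned by pool_free */
static size_t free_top = 0;

ptrdiff_t pool_alloc(void)
{
    if (free_top > 0)
        return (ptrdiff_t)free_stack[--free_top];   /* reuse a freed slot */
    if (next_fresh < POOL)
        return (ptrdiff_t)next_fresh++;
    return -1;                                      /* pool exhausted */
}

void pool_free(size_t i)
{
    free_stack[free_top++] = i;         /* O(1): just remember the index */
}
```

Reusing the lowest-numbered freed slot first also keeps the live elements packed toward the front of the array, which doesn't hurt locality either.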