>>118
Reference counting has a few pitfalls you need to be aware of. If you don't use it correctly, you will pay the price.
To do reference counting correctly in C or C++, you need to make the release operation thread-safe. If two threads hold a reference to the same object and release it at around the same time, an unprotected reference count can be corrupted: you can get a double-free if both threads think they dropped the last reference, or a leak if neither does.
Generally, this is done using CPU-specific atomic read-modify-write instructions, such as an atomic decrement or a compare-and-swap. These instructions lock the cache line or, in the worst case, the memory bus. On x86/x86-64, for example, a locked instruction costs a few tens of cycles at best and can reach 150-200 cycles when the cache line is contended between cores, whereas a simple dec and jnz would take just a couple of cycles, provided the reference count is in the CPU cache and the branch predictor doesn't mispredict.
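Here's a rough sketch of what a thread-safe release can look like with std::atomic. The names (RefCounted, retain, release, Blob, race_release) are made up for illustration, and the memory orderings are one common choice, not the only valid one.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical intrusive base: the count lives inside the object itself.
struct RefCounted {
    std::atomic<int> refs{1};      // the creator holds the first reference
    virtual ~RefCounted() = default;
};

inline void retain(RefCounted* p) {
    // Taking another reference only needs atomicity, not ordering.
    p->refs.fetch_add(1, std::memory_order_relaxed);
}

inline void release(RefCounted* p) {
    // fetch_sub returns the previous value, so exactly one thread sees 1
    // and becomes responsible for the delete.
    int prev = p->refs.fetch_sub(1, std::memory_order_acq_rel);
    assert(prev > 0);
    if (prev == 1) delete p;
}

// Illustration only: counts destructor calls so the result can be checked.
static std::atomic<int> g_destroyed{0};
struct Blob : RefCounted {
    ~Blob() override { g_destroyed.fetch_add(1); }
};

// Shares one Blob among n threads that all release it concurrently and
// returns how many times the destructor ran.
int race_release(int n) {
    g_destroyed = 0;
    Blob* b = new Blob;            // refs == 1
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) {
        retain(b);                 // one reference per thread, taken before it starts
        threads.emplace_back([b] { release(b); });
    }
    release(b);                    // drop the creator's reference
    for (auto& t : threads) t.join();
    return g_destroyed.load();
}
```

No matter how the threads interleave, race_release should report exactly one destructor call. Swap the atomic for a plain int and that guarantee is gone.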
That overhead adds up, so you should avoid lots of fine-grained reference-counted objects. You don't need to use reference counting for everything. Use it only where you need to, and on coarser-grained objects which explicitly manage the finer-grained ones. That way you can set up strong ownership rules: when the coarse-grained object goes out of scope or is released, it cleans up and frees all of the fine-grained objects it owns, with no expensive reference count release operation per object.
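To make the ownership idea concrete, here's a minimal sketch where only the coarse-grained owner is reference counted. Scene, Node and nodes_alive_after_release are hypothetical names, and the static live counter exists only so the cleanup can be observed.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Fine-grained object: no reference count of its own.
struct Node {
    static int live;               // illustration only: how many Nodes exist
    std::string name;
    explicit Node(std::string n) : name(std::move(n)) { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// Coarse-grained owner: the only thing that gets reference counted.
class Scene {
public:
    Node* add(std::string name) {
        assert(!name.empty());
        nodes_.push_back(std::make_unique<Node>(std::move(name)));
        return nodes_.back().get();   // non-owning pointer, valid while the Scene lives
    }
private:
    std::vector<std::unique_ptr<Node>> nodes_;  // strong, exclusive ownership
};

// Builds a Scene with n Nodes, shares it through two handles, releases both,
// and returns how many Nodes are still alive afterwards.
int nodes_alive_after_release(int n) {
    auto scene = std::make_shared<Scene>();
    for (int i = 0; i < n; ++i) scene->add("node" + std::to_string(i));
    std::shared_ptr<Scene> other = scene;   // second handle: one atomic increment
    assert(Node::live == n);
    scene.reset();                          // count 2 -> 1, nothing is freed
    other.reset();                          // count 1 -> 0, Scene frees every Node
    return Node::live;
}
```

Sharing the Scene costs one atomic operation per handle regardless of how many Nodes it holds, and the Nodes themselves are freed by plain destructors.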
There are also different ways of implementing reference counting: intrusive and external. Intrusive is less expensive in practice (less memory fragmentation, a single malloc and free per object), but you can't easily have weak references, which are useful for breaking cycles. External reference counting is more usable, but often allocates the reference count in its own block of memory, although there are ways to put the count and the object in a single allocation (see C++'s std::make_shared function).
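For the external flavour, here's a small sketch using the standard library's std::shared_ptr and std::weak_ptr. Parent, Child and parent_freed_despite_cycle are made-up names. The single-allocation behaviour of std::make_shared is what implementations typically do rather than something the standard strictly requires.

```cpp
#include <cassert>
#include <memory>

struct Parent;

struct Child {
    std::weak_ptr<Parent> parent;    // weak back-pointer: does not keep Parent alive
};

struct Parent {
    std::shared_ptr<Child> child;    // strong forward pointer
};

// Builds a parent <-> child pair, drops the only strong reference to the
// parent, and reports whether the parent was actually destroyed.
bool parent_freed_despite_cycle() {
    auto p = std::make_shared<Parent>();   // object and count typically share one block
    p->child = std::make_shared<Child>();
    p->child->parent = p;                  // back-pointer closes the loop, weakly

    std::weak_ptr<Parent> watch = p;       // observer, adds no strong reference
    std::shared_ptr<Child> c = p->child;   // keep the child alive past the parent
    assert(p.use_count() == 1);            // weak references don't count as owners

    p.reset();                             // last strong reference gone
    return watch.expired() && c->parent.lock() == nullptr;
}
```

If Child::parent were a shared_ptr instead, the two objects would keep each other alive and neither would ever be freed.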