The article is most useful as a very high-level introduction to the topic to a junior C programmer, from a Linux viewpoint. Factually, it's not particularly accurate; in particular:
* Precise garbage collection is listed as having moderate allocation speed (precise GC normally uses just a bounds-check and increment - it ought to be the fastest), moderate deallocation speed (deallocation is free in GC - you pay for collection, but that collects live objects, not freed memory), and "rarely" SMP friendly - but collection is a trivially parallelizable operation in a way that manual allocation never is.
* Reference counting is described as having "excellent" cache locality, when this is one of the weakest areas of reference counting; every reference copy operation comes with an indirection and, usually, a memory-bus assertion to ensure correct multi-threaded reference count updates.
* Manual allocators such as malloc etc. are described as being "easy" to use, when this is definitely not the case - if it were easy, most programs using it would be free of access violations and memory leaks. Meanwhile, if reference counting or precise (i.e. memory-safe) GC is built into the language in question, allocation is actually quite hard to get wrong. To describe malloc/etc as "easy" and most of the others as "moderate" is a severe misrepresentation.
This is a fine overview, but for more depth on alloc/free implementations, I've always really loved "Dynamic Memory Allocation, A Survey And Critical Review", which you can grab online at ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps (I'm amazed this link still works).
On most of the large programs I've worked on, one of the easiest systems programming wins has been replacing general-purpose malloc() with something tuned to the workload.
Incidentally I'm writing an article now on our graph-database which I'll be submitting later this week and our code for allocating records in files is based on algorithms in glibc's malloc, so there will be some goodies in there about that too.
TCMalloc has excellent performance. I was fighting a losing performance battle with it, and eventually dropped an old project on mulithreaded memory allocation.
The only drawback it had at the time was that it never returned memory to the OS. But if you know that doesn't matter for you (as I doubt it mattered for Google), then it's probably the best choice.
* Precise garbage collection is listed as having moderate allocation speed (precise GC normally uses just a bounds-check and increment - it ought to be the fastest), moderate deallocation speed (deallocation is free in GC - you pay for collection, but that collects live objects, not freed memory), and "rarely" SMP friendly - but collection is a trivially parallelizable operation in a way that manual allocation never is.
* Reference counting is described as having "excellent" cache locality, when this is one of the weakest areas of reference counting; every reference copy operation comes with an indirection and, usually, a memory-bus assertion to ensure correct multi-threaded reference count updates.
* Manual allocators such as malloc etc. are described as being "easy" to use, when this is definitely not the case - if it were easy, most programs using it would be free of access violations and memory leaks. Meanwhile, if reference counting or precise (i.e. memory-safe) GC is built into the language in question, allocation is actually quite hard to get wrong. To describe malloc/etc as "easy" and most of the others as "moderate" is a severe misrepresentation.