TL;DR: Mixing custom memory allocation with placement new is non-portable. Either override all new/delete combinations and let the compiler take care of array counters or ditch any new usage altogether and do everything by yourself.
Custom memory allocation is a very important aspect of a C++ library. By custom I mean the ability of the user to specify the functions for memory allocation and de-allocation. A library that directly uses the CRT facilities can cause serious headaches for the embedder as its memory is will essentially be ‘invisible’ to her.
There are several ways to write a library with custom memory allocators in C++.
- Allocate everything through custom ‘my_malloc’/’my_free’ functions for C libraries.
- Allocate everything through specific functions and use placement new for construction. (C++)
- Override new/delete and all their flavors. (C++).
The first option is great but is relegated to C libraries. In C++ we must take care of object construction/destruction.
Option 3 is a good one but care should be taken. It is very important that the new/delete overriding doesn’t ‘leak’ in any way in the public interface of the library. Otherwise embedders might include the overridden functions and get their own allocations altered!
Most libraries I write work with a combination of 2 and 3. Allocations I do almost always need additional parameters – memory tags and eventually memory arenas where they should happen.
I override new/delete with additional params and make the default ones assert so that they are not used by mistake. All this is relegated in the internals of the library and the user never sees these declarations.
Recently I had to adapt a popular library – Skia to use custom memory allocations. Skia is a great 2D rendering library that is used in Chrome and Android. It doesn’t by default have a way to change the memory allocation functions though. The whole memory management thing is a bit messy in the library. There are functions called sk_malloc/sk_free, which is great but are use sporadically. Instead all C++ new/delete calls run through apposite macros SkNEW, SkDELETE etc.
By default they are defined:
# define SkNEW(type_name) (new type_name)
# define SkNEW_ARGS(type_name, args) (new type_name args)
# define SkNEW_ARRAY(type_name, count) (new type_name[(count)])
# define SkNEW_PLACEMENT(buf, type_name) (new (buf) type_name)
# define SkNEW_PLACEMENT_ARGS(buf, type_name, args) (new (buf) type_name args)
# define SkDELETE(obj) (delete (obj))
# define SkDELETE_ARRAY(array) (delete (array))
There is also a SK_OVERRIDE_GLOBAL_NEW switch that can enable overriding new/delete. As Skia developers had already put the macros in place I decided to modify them to use sk_malloc/sk_free and then modify these functions to go through my own memory allocators.
For the non-array flavors things were pretty straightforward:
# define SkNEW(type_name) (new(sk_malloc_flags(sizeof(type_name), 0)) type_name)
# define SkDELETE(obj) (SkAllocDestroy(obj))
Where SkAllocDestroy is defined in this way:
inline void SkAllocDestroy(const T* ptr)
if (!ptr) return;
I place-new the object in the memory allocated by sk_malloc_flags and the manually destruct it and free the memory.
Things get more complicated with array allocations. My first attempt was something like this:
# define SkNEW_ARRAY(type_name, count) (new (SkAllocArray(count)) type_name[count])
# define SkDELETE_ARRAY(array) (SkFreeArray(array))
I allocate the memory for the array and placement-new it so that all constructors get called. Actually I allocate a bit more space and put in front of the array a counter with the actual size of the array. The SkFreeArray call reads the counter that is always before the actual array pointer, calls all destructors and frees the memory. Seems legit.
I started the application -> crash!
After inspecting the addresses of allocations and the ones used by the application I noticed that when I return pointer 0x100, the pointer actually set to the variable holding it was 0x108. This was also the pointer that eventually arrived in SkFreeArray(). In assembly it was clear that the placement new was bumping my pointer 8 bytes – the size of the counter it uses most likely.
Long story short – placement new will bump the pointer by some bytes to account for a counter… but not always. This doesn’t happen if the array members are of type with trivial or no destructor. It’s clear that this is an optimization. In that case you don’t need a counter as you can just delete the whole array without calling any destructor.
An excerpt from the standard explains the behaviour:
new T results in a call of operator new(sizeof(T)*5+x), and
(14.4) new(2,f) T results in a call of operator new(sizeof(T)*5+y,2,f)
Here, x and y are non-negative unspecified values representing array allocation overhead; the result of the new-expression will be offset by this amount from the value returned by operator new. This overhead may be applied in all array new-expression s, including those referencing the library function operator new(std::size_t, void*) and other placement allocation functions. The amount of overhead may vary from one invocation of new to another.
Indeed if we override new the amounts of storage required are calculated by the compiler. While in theory my SkAllocArray code can be fixed to account for the compiler pointer fixups, it will not be portable. The standard doesn’t specify the size of ‘x’ and ‘y’ and exactly when they are used – it is up to the compiler. Something that works today on MSVC might not work on Clang or even change in future versions.
In the end I decided to just re-write my SkAllocArray to do everything – it allocates memory + counter, calls constructors with ‘single’ placement new when necessary and returns the adjusted pointer.
The gist of this post is that mixing custom memory allocation with placement new is non-portable. Either override all new/delete combinations and let the compiler take care of array counters or ditch any new usage altogether and do everything by yourself.