Different... Quite a few years back I started developing a custom STL implementation which has eventually be adopted and used throughout our company. One of the most important things to get right when developing any kind of generic container library is the way memory is allocated. It needs to be lightweight, highly flexible and above all easy to understand so people are willing to experiment it.

But STL Allocators have a bad reputation and it’s for a good reason. They are complex, hard to understand and have some interesting behaviour that seems designed to confuse. As a result, I needed to look at how we were going to provide a custom allocation system that was both easy to use and simple to understand without restricting what people could do with them.

 

A Bit of History
A while back Bitsquid published a post entitled Custom Memory Allocation in C++. This excellent post covered how the BitSquid engine used an allocator model to perform memory allocation throughout their entire engine.

But the FTL requires a slightly different approach so I won’t be treading over already covered ground here.

FTL allocators are well hidden inside containers, the only control you have is specifying the allocator type which means it’s use, and how and when objects are allocated and created is completely fixed with only the allocator itself being able to effect the outcome. Because of this the allocator behaviour needs to be as customisable as possible without requiring any changes the container itself.

When the FTL was original started, it was quite small scale and only used by a couple of developers, so allocators were not a priority. The flexibility wasn’t needed so in-container malloc and free allowed us to concentrate on container creation but obviously this wasn’t going to last.

The follow just describes the various approaches we took, why we dismissed them and what we eventually ended up with.

 

The Initial Approach
Allocators should have a minimal over-head. What we didn’t want to do was increase the size of every container dramatically when we rolled out customisable allocators. As a result, we initially used an approach used by a couple of vendors and defined the allocator specification using only static members – removing the need for an internal allocator object.

1
2
3
4
5
6
7
8
9
10
// Completely made up code showing the use of a static based allocator
template <typename T, typename Alloc>
class vector
{
  void push_back(void)
  {
    void* new_memory = Alloc::allocate( sizeof(T) );
    T* new_object = new(new_memory) T;
  }
};

I knew this would limit the flexibility of the allocators, but it had minimal over-head (especially using default template parameters) and wouldn’t effect those containers already being used. And besides, programmers are a creative bunch and I wanted to see what people did with this before resigning it to the scrap heap.

But while programmers were able to work around the main limitation of not having allocator state per container, they were having to jump through hoops to get the behaviour they wanted. Which made it less likely that other programmers would feel confident enough writing their own allocators and their ability to really customise their behaviour was too limited.

So we ended up adding allocator state on a per container basis, making it much easier for developers to do what they wanted though it did add at least 4 bytes per container. But I felt that flexibility and simplicity were much more important.

1
2
3
4
5
6
7
8
9
10
11
12
// Completely made up code showing the use of an instanced allocator
template <typename T, typename Alloc>
class vector
{
  Alloc m_allocator;
 
  void push_back(void)
  {
    void* new_memory = m_allocator.allocate( sizeof(T) );
    T* new_object = new(new_memory) T;
  }
};

 

Allocator Specification
The allocator specification is complicated. While I’m sure there are good reasons for some of the decisions I wanted ours it to be much simpler. So removing the type information (since our allocators work on raw memory and nothing else), removing the rebind(!) and exception handling (which we don’t use) we ended up with the following.

1
2
3
4
5
6
7
8
typedef ftl::pair<void*, size_t> allocated_memory;
 
class Alloc
{
public:
  allocated_memory allocate(const size_t alloc_size);
  void deallocate(void* ptr);
};

And for the basic allocator, that’s it, it doesn’t require anything else to work, but it does have one interesting aspect.

1
typedef ftl::pair<void*, size_t>  allocated_memory;

As we can have all sorts of allocator types, what it allocates might not be exactly what you asked for. If it can’t allocate all the memory you requested, that’s fine, it simply returns allocated_memory(nullptr, 0) but it can also return more than was requested (for example, fixed sized block allocators will do this). This return type simply allows the allocator to return not only the allocated memory, but also how much was allocated, which allows calling objects to take advantage of this additional memory if possible.

Most of the time this isn’t queried and only what was asked for is given, but it offers an additional level of information which might allow better memory usage in some containers.

So a container will most likely end up with something like the following when adding and removing elements.

1
2
3
4
5
6
7
8
// Creating a new object in a container
allocated_memory new_memory = m_allocator.allocate( sizeof(T) );
if (new_memory.first)
  T* new_object = new(new_memory.first) T;
 
// Losing the object now we’re done with it
old_object->~T();
m_allocator.deallocate(old_object);

This is fine and gives us a very simple entry point for allocators. But by forcing the use of placement new and the destructor call in the container itself, we are limiting what an allocator can do. While the allocators are required to return raw memory, that doesn’t mean that it has to internally. Some allocators might pre-create the objects before returning them so creation is front loaded but the forced placement new could mean we’re over-riding an object that has already been created.

 

Construction Functions
As a result, we want developers to be able to not only over-ride the memory allocation, but also the object creation. 99% of allocators won’t need to do this so we don’t want to add additional requirements to the allocator so instead we can create non-member, non-friend functions, specialised on the allocator, which will do the creation for us.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
template <typename TConstruct, typename TAlloc>
TConstruct* construct(TAlloc& allocator);
 
template <typename TConstruct, typename TAlloc>
TConstruct* construct(TAlloc& allocator, const TConstruct& obj);
 
template <typename TConstruct, typename TAlloc>
TConstruct* construct(void* allocated_mem);
 
template <typename TConstruct, typename TAlloc>
TConstruct* construct(void* allocated_mem, const TConstruct& obj);
 
template <typename TConstruct, typename TAlloc>
void destroy(TConstruct* ptr);
 
template <typename TConstruct, typename TAlloc>
void destroy(TConstruct* ptr, TAlloc& allocator);

So our point of allocation/creation now becomes something much simpler and much more powerful.

1
2
3
4
5
// Creating a new object in a container
T* new_object = alloc::construct<T>(m_alloc);
 
// Lose our existing object and return it back to the allocator
alloc::destroy(old_object, m_alloc);

The default version of construct performs the allocation and placement new within the construct function, but should the allocator need something more (or less), simply over-loading the function on the allocator type allows complete control over both memory allocation and object creation. The same goes for the destroy function and the automatic call of the objects destructor.

 

No Base Allocator
One thing I didn’t add to the allocators was a base allocator interface. The allocators are created within the container, with their type defined at the point of the containers declaration. The addition of a base interface (and the associated virtual functions) would have increase the size over-head of the allocator which is something I wanted to avoid and something I thought didn’t add enough to warrant the overhead. I was less worried about the overhead of calling a virtual function as that would be insignificant compared to the overhead of everything else what was going on.

 

Conclusion
So by introducing an allocator model that is much simpler (only 2 required functions) with more extensible customisation should an allocator should need it, developers have complete control over all the memory allocation and importantly the object creation itself. Due to it’s initial simplicity, developers have no problem creating new allocators that improve both memory and container use in a project and can start to experiment with better and more efficient allocators quickly.

 

Title image by Squeezy.  Used with permission.