Memory Allocators and Memory Pool

Tags: optimization, memory, memory allocation, memory pool, object pool

Why do I want to study memory allocation

Virtually all programmers are obsessed with this thing called "optimization". We all like to do better, but the right way is often not the instinctive way. That's when I realized that I need to do some real learning before digging into the optimization part of a program.

With memory allocators such as tcmalloc and jemalloc available nowadays, together with larger memories (which, as a side effect, lower the number of memory fragments per unit of memory used, given proper policies[1]), the problem of memory fragmentation is less serious than before. In most cases, the default settings of a general-purpose allocator are sufficient.

However, there is still a need to optimize for specific cases, particularly those with frequent allocation and deallocation of numerous objects. For example, many games create and destroy thousands or even millions of small objects in a short period, so it is common to implement a memory pool to reduce memory usage and improve responsiveness[2].

In addition, from a fundamental view, memory is one of the few most vital resources that we programmers deal with every day. If I ever want to understand what's going on under the hood when I request memory, or where an efficiency bottleneck is, then learning how a memory allocator is implemented, what clever algorithms have been devised, and what trade-offs there are is a great way to grasp a comprehensive picture of memory.

Most materials I found were written long ago (by "long ago" I mean more than five years ago). This is, perhaps, because research in memory allocation is quite mature now (my guess). I encountered three well-known allocators: tcmalloc, jemalloc and mimalloc. I also found information on ptmalloc, but most sources recommend against it.

As for essays, I did find one discussing how much we had misunderstood the problem of memory fragmentation[1]. I've not finished reading it yet. Another is mainly an implementation of a fixed-size memory pool; it can be found in the reference links of the Wikipedia page on memory pools[4][5].

My simple fixed size memory pool

In the book Game Coding Complete (4th Edition)[2], the authors mention that it can be helpful to implement a simple memory pool in the game engine. I decided this was a good chance to practice, and that's why I set off to write my own fixed-size memory pool.

This memory pool is mixed with a little flavor of a slab allocator[3]. I added a cache of freed and unallocated items for later reuse. An item, in this context, is a fixed-size block that this memory pool hands out. I believe a cache (implemented as a stack) should speed up the whole memory pool compared with the linked free lists used in classic implementations; a sketch of that alternative follows.
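
For contrast, here is a minimal sketch of my own (loosely after the idea in [5], not code from it) of the intrusive free-list alternative: each free item stores a pointer to the next free item inside its own bytes, so no separate container is needed. It assumes the item size is a multiple of 8 bytes and at least sizeof(void*).

#include <cstddef>
#include <cstdlib>

// Illustration only: a fixed-capacity pool whose free items form an
// intrusive linked list threaded through the items themselves.
class FreeListPool {
public:
  FreeListPool(size_t item_size, size_t item_count)
      : m_buffer(static_cast<char*>(std::malloc(item_size * item_count))) {
    // Thread every item onto the free list.
    for (size_t i = 0; i < item_count; ++i) {
      free(m_buffer + i * item_size);
    }
  }
  ~FreeListPool() { std::free(m_buffer); }

  FreeListPool(FreeListPool const&) = delete;
  FreeListPool& operator=(FreeListPool const&) = delete;

  void* alloc() {
    if (m_head == nullptr) { return nullptr; }  // exhausted; no extend() here
    Node* item = m_head;
    m_head = item->next;
    return item;
  }

  void free(void* item) {
    auto node = static_cast<Node*>(item);
    node->next = m_head;  // push onto the front of the list
    m_head = node;
  }

private:
  struct Node { Node* next; };

  char* m_buffer;
  Node* m_head = nullptr;
};

Walking such a list touches scattered memory, whereas a vector-backed cache keeps all free pointers contiguous; that locality is the intuition behind my preference for a stack-like cache, though it remains to be profiled.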

Initially, I wanted to implement it in Rust, the language I'm fond of. However, I know too little of it to write proper Rust code for that, so I resorted to C++. My first version of the memory pool was over-designed, and the code soon became too bloated to manage. So I restructured the memory pool, this time using a much clearer approach but keeping the idea of a cache. The code below is the second version. It still has a big drawback in that its size never decreases even when items are freed, but I don't see this as a big problem right now. However, instincts are often misleading; the true performance awaits further profiling.

#include "internal/memory_pool_helper.hpp"
#include <vector>
#include <memory>

namespace seideun::memory {

/**
 * Singleton memory allocator for fixed size memory blocks
 */
template <size_t kItemSizeInBytes, size_t kItemsPerBlock>
class FixedSizeMemoryPool {
  static_assert(kItemSizeInBytes != 0, "Cannot allocate zero sized type");
  static_assert(kItemsPerBlock != 0, "Cannot allocate zero items per block");
  
public:
  // Access the unique instance for this (item size, items per block) pair.
  static FixedSizeMemoryPool& instance() {
    static FixedSizeMemoryPool pool;
    return pool;
  }

  FixedSizeMemoryPool(FixedSizeMemoryPool const& other) = delete;
  FixedSizeMemoryPool& operator=(FixedSizeMemoryPool const& other) = delete;

  // Pop an item off the cache, growing the pool by one block if needed.
  void* alloc() {
    if (m_cached_items.empty()) { extend(); }
    auto ret = m_cached_items.back();
    m_cached_items.pop_back();
    return ret;
  }

  // Return an item to the cache for reuse; blocks are never released.
  void free(void* item) { m_cached_items.push_back(static_cast<uint64_t*>(item)); }

private:
  // Singleton per template instantiation; construct via instance().
  FixedSizeMemoryPool() = default;

  // Allocate one more block and push every item in it onto the cache.
  void extend() {
    auto new_block = std::make_unique<Arr>();
    // new_block->data() yields a uint64_t*; stepping by kItemSize / 8 words
    // visits the start of each item in the block.
    for (auto i = new_block->data(), i_end = i + kArraySize; i != i_end; i += kItemSize / 8) {
      m_cached_items.push_back(i);
    }
    m_blocks.emplace_back(std::move(new_block));
  }

  // The real item size is rounded up to a multiple of 8 bytes, i.e. one
  // 64-bit word is the minimal unit.
  size_t constexpr static kItemSize = internal::round_up_to_multiples_of_8(kItemSizeInBytes);
  size_t constexpr static kArraySize = kItemsPerBlock * kItemSize / 8;
  using Arr = std::array<uint64_t, kArraySize>;

  /** An item is in one of two states:
   *  1. Allocated (handed out by alloc())
   *  2. Cached (in m_cached_items, waiting for reuse)
   */
  std::vector<std::unique_ptr<Arr>> m_blocks;
  std::vector<uint64_t*> m_cached_items;
};

}  // namespace seideun::memory
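
To show how the pool is meant to be used, here is a hypothetical usage sketch. The Bullet type, the block count of 256, and the header name are made up for illustration; since the pool hands out raw memory, object lifetime is managed with placement new and an explicit destructor call.

#include "memory_pool.hpp"  // hypothetical header holding the pool above
#include <new>

struct Bullet {
  float x, y, vx, vy;
};

int main() {
  // One pool instance exists per (item size, items per block) pair.
  auto& pool = seideun::memory::FixedSizeMemoryPool<sizeof(Bullet), 256>::instance();

  void* raw = pool.alloc();
  auto bullet = new (raw) Bullet{0.0f, 0.0f, 1.0f, 1.0f};  // placement new into the pool

  bullet->~Bullet();   // destroy manually; the pool only manages raw memory
  pool.free(bullet);   // the item goes back to the cache for reuse
}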

There is a helper function internal::round_up_to_multiples_of_8 which, as its name suggests, rounds a number up to the next multiple of 8.
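
That header is not shown in this post; a minimal sketch of what the helper might look like (my assumption, using the usual bit trick on size_t):

#include <cstddef>

namespace seideun::memory::internal {

// Round n up to the next multiple of 8, e.g. 13 -> 16, 16 -> 16.
constexpr size_t round_up_to_multiples_of_8(size_t n) {
  return (n + 7) & ~static_cast<size_t>(7);
}

}  // namespace seideun::memory::internal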

References

  1. Mark S. Johnstone et al., "The Memory Fragmentation Problem: Solved?", 1997
  2. Mike McShaffry, David Graham, Game Coding Complete (4th Edition), 2016, Ch. 3
  3. "Slab allocation", Wikipedia, https://en.wikipedia.org/wiki/Slab_allocation , cited 2020-09-03
  4. "Memory pool", Wikipedia, https://en.wikipedia.org/wiki/Memory_pool , cited 2020-09-03
  5. Ben Kenwright, "Fast Efficient Fixed-Size Memory Pool: No Loops and No Overhead", 2012
posted @ 2020-09-05 21:46  seideun