Never too late

2016-04-17 18:51

http://www.cnblogs.com/edwardlost/archive/2010/10/11/1848009.html

http://www.boost.org/doc/libs/1_60_0/doc/html/boost_asio/reference/asio_handler_allocate.html

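The documentation for nearly every asynchronous operation contains the sentence "Copies will be made of the handler as required." In other words, memory is allocated behind the scenes.

Custom memory allocation for asynchronous I/O operations

When asio performs an asynchronous I/O operation, it allocates memory dynamically through the system allocator and releases it as soon as it is done with it. In I/O-intensive applications this allocation strategy can noticeably hurt overall performance. To avoid the problem, the application can create a memory block for asio's asynchronous I/O operations to use; the operations reach that block through the customization hooks asio_handler_allocate and asio_handler_deallocate.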

Example: custom_allocation_server

The example above uses boost::aligned_storage<1024> storage_ to manage the raw memory.
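To make the idea concrete, here is a minimal sketch of an allocator built around one preallocated block. It is not the code from the Boost example: the class layout, the in_use_ flag, the heap fallback and the owns() helper are assumptions for illustration, and a plain aligned byte array stands in for boost::aligned_storage<1024> so that the sketch compiles without Boost.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for the example's allocator. A plain aligned byte
// array replaces boost::aligned_storage<1024> so no Boost headers are needed.
class handler_allocator {
public:
    handler_allocator() : in_use_(false) {}
    handler_allocator(const handler_allocator&) = delete;
    handler_allocator& operator=(const handler_allocator&) = delete;

    // Hand out the preallocated block if it is free and large enough,
    // otherwise fall back to the global heap.
    void* allocate(std::size_t size) {
        if (!in_use_ && size <= sizeof(storage_)) {
            in_use_ = true;
            return storage_;
        }
        return ::operator new(size);
    }

    // Mark the block free again, or release memory from the fallback path.
    void deallocate(void* pointer) {
        if (pointer == static_cast<void*>(storage_)) {
            assert(in_use_);
            in_use_ = false;
        } else {
            ::operator delete(pointer);
        }
    }

    // Illustration-only helper: does this pointer refer to the fixed block?
    bool owns(const void* p) const { return p == static_cast<const void*>(storage_); }

private:
    alignas(std::max_align_t) unsigned char storage_[1024];
    bool in_use_;
};
```

With a single block, a second request made while the first is still outstanding simply goes to the heap, so the sketch stays correct even when the reuse assumption does not hold. It is not thread-safe.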

＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝

http://www.voidcn.com/blog/a809146548/article/p-5045780.html

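Custom memory allocation

Many asynchronous operations need to allocate an object to store state associated with the operation. For example, a Win32 implementation needs OVERLAPPED-derived objects to pass to Win32 API functions.

Fortunately, programs typically contain easily identifiable chains of asynchronous operations. A half-duplex protocol implementation (such as an HTTP server) has a single chain of operations per client. A full-duplex protocol implementation has two chains executing in parallel. A program can use this knowledge to reuse memory for all asynchronous operations in a chain.

The handlers of the asynchronous operations on one connection then use the custom allocation instead of the default new/delete.

Given a copy of a user-defined handler object h, if the implementation needs to allocate memory associated with that handler, it executes the following code: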

void* pointer = asio_handler_allocate(size, &h);

Similarly, to deallocate the memory:

asio_handler_deallocate(pointer, size, &h);

These functions are located using argument-dependent lookup. The asio namespace provides default implementations of them:

void* asio_handler_allocate(size_t, ...);

void asio_handler_deallocate(void*, size_t, ...);

These are implemented in terms of ::operator new() and ::operator delete() respectively.
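The lookup rule can be sketched without Boost. In the code below, the namespace toy and the helpers allocate_for and deallocate_for are hypothetical stand-ins for what the library does internally; only the hook names and signatures come from the text above. The variadic defaults are the weakest possible match, so a handler-specific overload found through argument-dependent lookup always wins.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

namespace toy {  // hypothetical stand-in for the asio namespace

// Default hooks: the "..." parameter loses overload resolution to any
// overload that names the handler type explicitly.
inline void* asio_handler_allocate(std::size_t size, ...) {
    return ::operator new(size);
}
inline void asio_handler_deallocate(void* p, std::size_t, ...) {
    ::operator delete(p);
}

// Simplified view of the library side: an unqualified call, so overloads
// declared next to the handler type are found by argument-dependent lookup.
template <typename Handler>
void* allocate_for(std::size_t size, Handler& h) {
    using toy::asio_handler_allocate;
    return asio_handler_allocate(size, &h);
}
template <typename Handler>
void deallocate_for(void* p, std::size_t size, Handler& h) {
    using toy::asio_handler_deallocate;
    asio_handler_deallocate(p, size, &h);
}

}  // namespace toy

// No hooks of its own: the defaults in toy are used.
struct plain_handler {
    void operator()() const {}
};

// Provides its own hooks as friends; here they only count calls.
struct counting_handler {
    int* hook_calls;
    void operator()() const {}

    friend void* asio_handler_allocate(std::size_t size, counting_handler* h) {
        ++*h->hook_calls;
        return ::operator new(size);
    }
    friend void asio_handler_deallocate(void* p, std::size_t, counting_handler* h) {
        assert(p != nullptr);
        ++*h->hook_calls;
        ::operator delete(p);
    }
};
```

Calling toy::allocate_for with a plain_handler reaches the variadic default, while the same call with a counting_handler reaches the friend overload, with no change on the calling side.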

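The implementation guarantees that the deallocation occurs before the associated handler is invoked, so the memory can be reused by any new asynchronous operation that the handler starts. (You can let all the asynchronous handlers on one socket share a single piece of memory, as long as that memory is used efficiently; if the global new still ends up being called frequently, consider dividing the related handlers into finer groups that use separate memory.)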

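When an asynchronous operation is initiated, asio_handler_allocate is called to allocate memory for the handler object.
When the operation's condition is met, the handler is copied out to a local variable and asio_handler_deallocate releases the handler object's memory before the handler runs.
The handler is executed.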
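The three steps above can be sketched with a toy operation object. Everything here except the two hook names is hypothetical (slot, op, start_op, complete_op, chained_handler and run_chain are illustration names, and no real I/O happens); the point is only the ordering, which lets a handler that starts the next operation get the same block back.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-connection state: one reusable block plus an event log.
struct slot {
    alignas(std::max_align_t) unsigned char bytes[256];
    bool in_use = false;
    void* pending = nullptr;  // the single outstanding toy operation
    std::vector<std::string> log;
};

// Toy stand-in for the object a library keeps while an operation is pending.
template <typename Handler>
struct op {
    Handler handler;
};

// Step 1, initiation: allocate through the handler's hook, store a copy.
template <typename Handler>
op<Handler>* start_op(Handler h) {
    void* mem = asio_handler_allocate(sizeof(op<Handler>), &h);
    return new (mem) op<Handler>{std::move(h)};
}

// Steps 2 and 3, completion: move the handler out, free the block, invoke.
template <typename Handler>
void complete_op(op<Handler>* o) {
    Handler local(std::move(o->handler));
    o->~op<Handler>();
    asio_handler_deallocate(o, sizeof(op<Handler>), &local);
    local();
}

struct chained_handler {
    slot* s;
    int remaining;  // operations left in the chain, including this one

    void operator()();

    friend void* asio_handler_allocate(std::size_t size, chained_handler* h) {
        assert(!h->s->in_use && size <= sizeof(h->s->bytes));
        h->s->in_use = true;
        h->s->log.push_back("allocate");
        return h->s->bytes;
    }
    friend void asio_handler_deallocate(void*, std::size_t, chained_handler* h) {
        h->s->in_use = false;
        h->s->log.push_back("deallocate");
    }
};

// The handler starts the next operation; the block is already free again.
inline void chained_handler::operator()() {
    s->log.push_back("invoke");
    if (remaining > 1)
        s->pending = start_op(chained_handler{s, remaining - 1});
}

// Drives a chain of n operations and returns the recorded event order.
inline std::vector<std::string> run_chain(int n) {
    slot s;
    s.pending = start_op(chained_handler{&s, n});
    while (s.pending) {
        op<chained_handler>* o = static_cast<op<chained_handler>*>(s.pending);
        s.pending = nullptr;
        complete_op(o);
    }
    return s.log;
}
```

If complete_op invoked the handler before deallocating, the assert in asio_handler_allocate would fire on the second operation, because the block would still be marked as in use.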

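The custom memory allocation functions may be called from any user-created thread that is calling a library function. The implementation guarantees that, for the asynchronous operations included in the library, it will not make concurrent calls to the memory allocation functions for a given handler. The implementation inserts appropriate memory barriers to ensure correct memory visibility should the allocation functions be called from different threads.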

===============

http://bbs.rosoo.net/home.php?mod=space&uid=2&do=blog&classid=1&view=me

http://bbs.rg4.net/thread-86-1-1.html

boost_asio/example/allocation/server.cpp

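This sample demonstrates how to customize the allocation and deallocation of handlers.

A handler is simply a function object (functor).

asio uses handlers extensively: almost every asynchronous function takes a handler parameter, for example io_service::post and async_read_some. When such a function is called, asio allocates a block of memory and stores a copy of the handler in it so that the handler can be invoked when the asynchronous operation completes. A typical socket program goes through many asynchronous operations, for example calling async_read/write in a loop, which leads to frequent allocation and release of small blocks (a handler is usually no larger than 128 bytes, holding just a few arguments and a shared_ptr). This lowers efficiency and fragments memory.

To avoid this, the allocation can be customized: preallocate one or a few small blocks in each session (TCP connection) and have asio's internal code use them when it allocates memory for and copies a handler.

To achieve this, the handler object passed in must support, in addition to operator()(...), the following two functions: asio_handler_allocate and asio_handler_deallocate.

With these two functions in place, asio calls them to allocate and release memory whenever it copies the handler.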

template <typename Handler>
inline custom_alloc_handler<Handler> make_custom_alloc_handler(
    handler_allocator& a, Handler h)
{
  return custom_alloc_handler<Handler>(a, h);
}

socket_.async_read_some(boost::asio::buffer(data_),
    make_custom_alloc_handler(allocator_,
      boost::bind(&session::handle_read,
        shared_from_this(),
        boost::asio::placeholders::error,
        boost::asio::placeholders::bytes_transferred)));


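The code above calls make_custom_alloc_handler to construct a custom_alloc_handler and passes it to asio's async_read_some. async_read_some calls the asio_handler_allocate defined for custom_alloc_handler to allocate a block of memory and stores the custom_alloc_handler there. This custom_alloc_handler is the class object mentioned earlier, the one that must support operator()(...) as well as asio_handler_allocate and asio_handler_deallocate.

When a strand is used:

Consider the following code: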

void connection::start()
{
  socket_.async_read_some(boost::asio::buffer(buffer_),
      make_custom_alloc_handler(allocator_,
        strand_.wrap(
          boost::bind(&connection::handle_read, shared_from_this(),
            boost::asio::placeholders::error,
            boost::asio::placeholders::bytes_transferred))));
}


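First, bind constructs a handler; strand_.wrap then wraps it to produce a new handler, which make_custom_alloc_handler wraps once more. When the result is passed to async_read_some, asio uses the custom allocation function to copy this custom_handler, and releases the memory with the custom deallocation function when the operation completes, just before the custom_handler is invoked. Executing the custom_handler actually executes strand.dispatch(), which causes the handler built by bind to be allocated and copied one more time.

So a more thorough approach is: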

void connection::start()
{
  socket_.async_read_some(boost::asio::buffer(buffer_),
      make_custom_alloc_handler(allocator_,
        strand_.wrap(
          make_custom_alloc_handler(allocator_,
            boost::bind(&connection::handle_read, shared_from_this(),
              boost::asio::placeholders::error,
              boost::asio::placeholders::bytes_transferred)))));
}


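Note that if asynchronous operations may be initiated on multiple threads (allocating and copying handlers), or handlers may be executed on multiple threads (releasing handlers), asio may call asio_handler_allocate and asio_handler_deallocate from multiple threads, and both functions must handle synchronization internally. The strand-based code above does not guarantee that operations are initiated from a single thread or that custom_handler runs on a single thread; a strand only serializes the execution of the handlers. In fact it works by wrapping a strand_handler: the strand_handlers themselves run concurrently, and each one dispatches once more inside the strand to serialize the wrapped handlers.

In a multithreaded environment, as I understand it, a lock should be used if allocation and release of the same block may happen on several threads at the same time. If allocation and release of a block are guaranteed to happen on the same thread, a simpler scheme is to allocate a batch of small blocks in each thread's TLS and always use make_custom_alloc_handler(tls_allocator_,boost::bind(func,param)). This scheme has very high memory utilization and is very simple. It may be hard to achieve when several threads are bound to the same io_service, but it is easy with one io_service per thread, because a completion handler posted to an io_service is always invoked by that io_service on the same thread. In that case no strand is needed either, so make_custom_alloc_handler never has to be applied twice.

The drawback of this approach is that balancing work across several io_service objects is hard: it is not obvious which post should go to which io_service, so some threads may be very busy while others sit idle. By contrast, several threads bound to one io_service do not have this problem, because the system balances the load across them well, but allocation and release may then need lock protection. Note that the lock in this case should be created with InitializeCriticalSectionAndSpinCount rather than as an ordinary critical section: allocation and release are both very fast, so it is better to spin than to put the thread into a wait state. Putting a thread to sleep and waking it up again are both expensive by comparison.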
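For the shared io_service case, the following is a portable sketch of a lock-protected pool. It is an assumption-laden analogue, not the Win32 code the paragraph refers to: spin_lock is a pure busy-wait lock built on std::atomic_flag (a critical section with a spin count would eventually fall back to waiting, this one never does), and locked_handler_pool, slot_size, slot_count and owns() are hypothetical names chosen for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <new>

// Minimal busy-wait lock; usable with std::lock_guard.
class spin_lock {
public:
    void lock() {
        while (flag_.test_and_set(std::memory_order_acquire)) {
            // spin: the critical sections below are only a few instructions
        }
    }
    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Hypothetical pool of small fixed-size slots shared by several threads.
class locked_handler_pool {
public:
    static constexpr std::size_t slot_size = 128;
    static constexpr std::size_t slot_count = 8;

    // Take the first free slot, or fall back to the heap when the request
    // is too large or every slot is busy.
    void* allocate(std::size_t size) {
        if (size <= slot_size) {
            std::lock_guard<spin_lock> guard(lock_);
            for (std::size_t i = 0; i < slot_count; ++i) {
                if (!used_[i]) {
                    used_[i] = true;
                    return slots_[i];
                }
            }
        }
        return ::operator new(size);
    }

    // Return a slot to the pool, or free heap memory from the fallback path.
    void deallocate(void* p) {
        const int i = index_of(p);
        if (i < 0) {
            ::operator delete(p);
            return;
        }
        std::lock_guard<spin_lock> guard(lock_);
        assert(used_[i]);
        used_[i] = false;
    }

    // Illustration-only helper: does this pointer refer to a pool slot?
    bool owns(const void* p) const { return index_of(p) >= 0; }

private:
    // Slot addresses never change, so this lookup needs no lock.
    int index_of(const void* p) const {
        for (std::size_t i = 0; i < slot_count; ++i)
            if (p == static_cast<const void*>(slots_[i])) return static_cast<int>(i);
        return -1;
    }

    alignas(std::max_align_t) unsigned char slots_[slot_count][slot_size];
    bool used_[slot_count] = {};
    spin_lock lock_;
};
```

A pure spin lock is only reasonable here because the protected work is a short scan over a handful of flags; under heavy contention or with more threads than cores, a lock that can block may behave better.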

http://www.boost.org/doc/libs/1_60_0/doc/html/boost_asio/reference/asio_handler_allocate.html

http://www.boost.org/doc/libs/1_60_0/doc/html/boost_asio/reference/asynchronous_operations.html

＝＝＝＝＝＝＝＝

http://m.blog.chinaunix.net/uid-28905468-id-4151037.html

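asio uses handlers extensively; almost every asynchronous function takes a handler parameter. When such a function is called, asio allocates a piece of memory to store the handler so that it can be invoked when the asynchronous operation completes. These handlers tend to be small, because they hold only a few arguments.
The following code comes from boost_1_45/libs/asio/example/allocation/server.cpp
First, implement a class template custom_alloc_handler, which lets a handler object use a custom memory allocator. custom_alloc_handler must provide operator() as well as the asio_handler_allocate and asio_handler_deallocate functions.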

template <typename Handler>
class custom_alloc_handler
{
public:
  custom_alloc_handler(handler_allocator& a, Handler h)
    : allocator_(a), handler_(h)
  {
  }

  // Forward the call to the wrapped handler (one-argument form).
  template <typename Arg1>
  void operator()(Arg1 arg1)
  {
    handler_(arg1);
  }

  // Forward the call to the wrapped handler (two-argument form).
  template <typename Arg1, typename Arg2>
  void operator()(Arg1 arg1, Arg2 arg2)
  {
    handler_(arg1, arg2);
  }

  // Found by argument-dependent lookup when asio allocates for this handler.
  friend void* asio_handler_allocate(std::size_t size,
      custom_alloc_handler<Handler>* this_handler)
  {
    return this_handler->allocator_.allocate(size);
  }

  friend void asio_handler_deallocate(void* pointer, std::size_t /*size*/,
      custom_alloc_handler<Handler>* this_handler)
  {
    this_handler->allocator_.deallocate(pointer);
  }

private:
  handler_allocator& allocator_; // the custom memory allocator
  Handler handler_;
};

Then implement a function template

template <typename Handler>
inline custom_alloc_handler<Handler> make_custom_alloc_handler(
    handler_allocator& a, Handler h)
{
  return custom_alloc_handler<Handler>(a, h);
}

With this in place, wherever an asio asynchronous function is called, which in most cases is where boost::bind() is passed in, change the argument to

make_custom_alloc_handler(allocator_, boost::bind(...));

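allocator_ is always held by reference, so the memory behind that reference stays held by the session object the whole time.
Afterword:

In practice this code is not all that necessary. Repeated calls do produce many small allocations, but in real applications STL containers such as list, map and vector, together with their iterators, cause exactly the same problem. If you had to write std::list<int, allocator> list; every time, it would surely give you a headache, and a memory pool is most likely something added late in a project as an optimization anyway.
So the advice here is to simply use new/delete early in the project and, once the system is stable, consider Google's memory pool solution, google-perftools: http://code.google.com/p/google-perftools/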

＝＝＝＝＝＝＝＝＝

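Custom memory allocation
Asio needs to copy handlers in many places. By default it uses new/delete; if a handler provides

void* asio_handler_allocate(size_t, ...);
void asio_handler_deallocate(void*, size_t, ...);
then these two functions are called to allocate and release the memory instead.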

The implementation guarantees that the deallocation will occur before the associated handler is invoked, which means the memory is ready to be reused for any new asynchronous operations started by the handler.

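If the completion handler starts another asynchronous request, this memory can be reused. In other words, if there is never more than one asynchronous request outstanding, a single block of memory is enough for asio's handler copies.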
