01. Designing a Lock-Free Stack

The three types of nonblocking

In fact, nonblocking != lock-free. Nonblocking algorithms fall into 3 types:

Obstruction-Free: If all other threads are paused, then any given thread will complete its operation in a bounded number of steps.

Lock-Free: If multiple threads are operating on a data structure, then after a bounded number of steps one of them will complete its operation.

Wait-Free: Every thread operating on a data structure will complete its operation in a bounded number of steps, even if other threads are also operating on the data structure.

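Roughly ranked, Obstruction-Free is the weakest guarantee: with all other threads paused, any given thread completes in a bounded number of steps. Lock-Free says that when several threads operate on the same data structure, at least one of them is guaranteed to complete in a bounded number of steps. Wait-Free is a stricter form of Lock-Free, because it requires every thread to complete in a bounded number of steps even while other threads are operating on the structure.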

Lock-free algorithms with these loops can result in one thread being subject to starvation. If another thread performs operations with the “wrong” timing, the other thread might make progress but the first thread continually has to retry its operation.

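The important qualifier in all of these definitions is "a bounded number of steps". There has to be an upper bound on the work, so an operation that may have to retry an unlimited number of times does not count as Wait-Free. That is why Wait-Free is so hard to achieve. For every thread to finish in a bounded number of steps, no thread may be forced to redo its work because of what another thread did. An algorithm that rules out this kind of starvation is Wait-Free, and getting there is very difficult.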
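A small sketch may make the difference between the two guarantees more concrete. The function names and the helper run_increments are made up for illustration. The first increment uses a compare/exchange retry loop, which is the typical Lock-Free shape: some thread always succeeds, but a given thread may keep losing the race. The second uses a single fetch_add with no retry loop in the source. Whether that is truly Wait-Free depends on the hardware, so treat it as an assumption rather than a guarantee.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Lock-free increment: a CAS retry loop. Some thread always succeeds, but a
// particular thread may in principle have to retry many times.
inline void increment_with_cas_loop(std::atomic<int>& counter)
{
    int expected = counter.load();
    while (!counter.compare_exchange_weak(expected, expected + 1))
    {
        // On failure, expected now holds the current value; just retry.
    }
}

// A single read-modify-write with no retry loop in the source.
// Assumption: the target has a native atomic add; otherwise the library may
// still implement this with an internal retry loop.
inline void increment_with_fetch_add(std::atomic<int>& counter)
{
    counter.fetch_add(1);
}

// Hypothetical helper: each thread performs both kinds of increment
// per_thread times; returns the final counter value.
inline int run_increments(int threads, int per_thread)
{
    std::atomic<int> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t)
    {
        workers.emplace_back([&counter, per_thread] {
            for (int i = 0; i < per_thread; ++i)
            {
                increment_with_cas_loop(counter);
                increment_with_fetch_add(counter);
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter.load();
}
```

Both versions give the same final count. The difference is only in the progress guarantee each thread gets while the others are running.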

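Pros and cons of Lock-Free

Lock-Free data structures are generally used for 2 benefits:

enable maximum concurrency - to get as much concurrency as possible
robustness - to tolerate a thread failing partway through an operation

Because the structure is Lock-Free, some thread makes progress at every step. Robustness is about failure: with a lock, if a thread runs into trouble while holding the lock and never releases it, the whole data structure is broken forever. With a Lock-Free structure, a thread that fails partway through does not affect the others in the same way.

At the same time, Lock-Free brings plenty of problems. An implementation relies on atomics plus memory ordering to make changes visible to other threads. Without locks the logic gets more complex and needs more atomic operations, and the memory barriers and visibility work behind those operations cost performance. In some scenarios a Lock-Free structure therefore costs more than a lock. Lock-Free cannot simply be equated with fast; it only pays off in specific scenarios.

Also, although deadlocks cannot happen, live-lock is possible. An intuitive picture of live-lock is a passage wide enough for only one person. One person wants to go from left to right, another from right to left, and they happen to meet in the middle. Both go back to where they started and try again, and on the retry they may well get stuck again, so they keep retrying. Live-locks are typically short-lived, though, because they depend on a specific scheduling of the threads.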

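Designing a lock-free stack

When designing a Lock-Free stack, the one thing we have to guarantee is this: once a value has been added to the stack, another thread can safely retrieve it immediately, and only one thread gets to retrieve it.

The simplest Lock-Free stack is a linked list: head points to the top node, and the order of the elements is maintained through the next pointers.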

Push() without lock

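Adding an element normally takes 3 steps:

1. Create a Node object
2. Set the Node's next pointer to the current head
3. Point head at the new Node

This logic is fine on a single thread. With multiple threads, though, if thread b successfully pushes an element while thread a is between steps 2 and 3, then a's node still points at the old head, and when a overwrites head the element b added is lost.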
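The lost update can be replayed on a single thread, which makes it easier to see. This is only a sketch: Node and replay_lost_push are hypothetical names, head is a plain pointer, and the "threads" are just statements executed in the problematic order.

```cpp
// Hypothetical minimal node, for illustration only.
struct Node
{
    int value;
    Node* next;
};

// Replays the bad interleaving with a plain (non-atomic) head pointer and
// returns how many nodes are still reachable from head afterwards.
inline int replay_lost_push()
{
    Node* head = nullptr;

    Node a{1, nullptr};
    Node b{2, nullptr};

    // Thread a, step 2: read head and link the new node to it.
    a.next = head;

    // Thread b runs its entire push in between.
    b.next = head;
    head = &b;

    // Thread a, step 3: blindly overwrite head. Node b is no longer reachable.
    head = &a;

    int reachable = 0;
    for (Node* p = head; p != nullptr; p = p->next) ++reachable;
    return reachable;
}
```

Two nodes were pushed, but only one can be reached from head, because a's next pointer was taken from a stale read of head.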

The way to avoid this is to do step 3 with an atomic compare/exchange operation, so that head is only changed if it has not been modified since it was read.

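#include <atomic>

template<typename T>
class lock_free_stack
{
  private:
    struct node
    {
      T data;
      node* next;
      node(T const& data_) : data(data_), next(nullptr)
      {}
    };
    std::atomic<node*> head{nullptr};
  public:
    void push(T const& data)
    {
      node* const new_node = new node(data);
      // Link the new node to the current top of the stack.
      new_node->next = head.load();
      // On failure, new_node->next is refreshed with the current head; retry.
      while (!head.compare_exchange_weak(new_node->next, new_node));
    }
};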

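One more thing to watch: the Node has to be fully constructed before it becomes head. Only then does the guarantee hold that once a value has been added to the stack, another thread can safely retrieve it immediately.

A detail: when !head.compare_exchange_weak(new_node->next, new_node) fails, new_node->next is overwritten with the current value of head, so there is no need to reload it before retrying.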
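The "expected value is refreshed on failure" behaviour can be checked in isolation. The sketch below uses a plain std::atomic<int> instead of the stack's head pointer, and both function names are made up for the example.

```cpp
#include <atomic>

// The atomic holds 5 but the caller expects 3, so the compare/exchange fails
// and writes the current value back into expected.
inline int expected_after_failed_cas()
{
    std::atomic<int> value{5};
    int expected = 3;
    bool ok = value.compare_exchange_strong(expected, 10);
    // ok is false, value is unchanged, expected has been overwritten.
    return ok ? -1 : expected;
}

// The same idea as push(): start from a stale expected value and loop.
// No explicit reload is needed because each failure refreshes expected.
inline int value_after_retry_loop()
{
    std::atomic<int> value{5};
    int expected = 3;   // deliberately stale
    while (!value.compare_exchange_weak(expected, expected + 1))
    {
        // expected now holds the current value; the next attempt uses it.
    }
    return value.load();
}
```

The first function uses compare_exchange_strong so that the single call cannot fail spuriously. The loop uses compare_exchange_weak, as push() does, since a spurious failure just costs one more iteration.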

So, you might not have a pop() operation yet, but you can quickly check push() for compliance with the guidelines. The only place that can throw an exception is the construction of the new node, but this will clean up after itself, and the list hasn’t been modified yet, so that’s perfectly safe. Because you build the data to be stored as part of the node, and you use compare_exchange_weak() to update the head pointer, there are no problematic race conditions here. Once the compare/exchange succeeds, the node is on the list and ready for the taking. There are no locks, so there’s no possibility of deadlock, and your push() function passes with flying colors.

I find this passage interesting as a checklist for a Lock-Free implementation. It covers:

exception-safe
race conditions
deadlock

Pop() that leaks nodes

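Taking an element off normally takes 5 steps:

1. Read the current value of head
2. Read head->next
3. Point head at head->next
4. Return the data from the Node obtained in step 1
5. Delete the Node obtained in step 1

The first problem: because there are no locks, several threads may read the same head in step 1. If one of them then reaches step 5 and deletes the node, the others end up dereferencing a dangling pointer. So for now we do not delete at all and simply drop step 5. (CppCon 2017: Fedor Pikus “Read, Copy, Update, then what? RCU for non-kernel programmers” seems to make the same point when discussing RCU: many of the problems show up at delete time.)

The memory management side is covered in "03. Memory Management for the Lock-Free Stack", which discusses two approaches: reference counting and hazard pointers.

As with push(), head is modified with an atomic compare/exchange. This handles the same lost-update problem as in push, and it also guarantees that each value is returned by exactly one thread. Only one compare/exchange on a given head can succeed in step 3, so only one thread reaches step 4 and returns the value. Multiple threads can never take the same data from head and return it.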

// listing 7.3 A lock-free stack that leaks nodes
#include <atomic>
#include <memory>

template<typename T>
class lock_free_stack
{
  private:
    struct node
    {
      std::shared_ptr<T> data;
      node* next;
      node (T const& data_) : data(std::make_shared<T>(data_)), next(nullptr)
      {}
    };
    std::atomic<node*> head{nullptr};

  public:
    void push(T const& data)
    {
      node* const new_node = new node(data);
      new_node->next = head.load();
      while (!head.compare_exchange_weak(new_node->next, new_node));
    }
    std::shared_ptr<T> pop()
    {
      node* old_head = head.load();
      // Stop on an empty stack; otherwise retry until this thread unlinks old_head.
      while (old_head &&
        !head.compare_exchange_weak(old_head, old_head->next));
      // The unlinked node is never deleted: this version leaks it on purpose.
      return old_head ? old_head->data : std::shared_ptr<T>();
    }
};
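To check the "each value is returned by exactly one thread" claim, here is a rough test sketch. The class leaky_stack repeats the listing above under a different name so the example is self-contained, and pop_counts is a hypothetical helper. Like the listing, it never deletes popped nodes, so it leaks by design.

```cpp
#include <atomic>
#include <memory>
#include <thread>
#include <vector>

// Same shape as the leaking stack above; popped nodes are never deleted.
template<typename T>
class leaky_stack
{
    struct node
    {
        std::shared_ptr<T> data;
        node* next;
        explicit node(T const& d) : data(std::make_shared<T>(d)), next(nullptr) {}
    };
    std::atomic<node*> head{nullptr};
public:
    void push(T const& data)
    {
        node* const new_node = new node(data);
        new_node->next = head.load();
        while (!head.compare_exchange_weak(new_node->next, new_node)) {}
    }
    std::shared_ptr<T> pop()
    {
        node* old_head = head.load();
        while (old_head &&
               !head.compare_exchange_weak(old_head, old_head->next)) {}
        return old_head ? old_head->data : std::shared_ptr<T>();
    }
};

// Hypothetical check: push 0..n-1, pop from several threads at once, and
// report how many times each value was returned.
inline std::vector<int> pop_counts(int n, int threads)
{
    leaky_stack<int> s;
    for (int i = 0; i < n; ++i) s.push(i);

    std::vector<std::atomic<int>> seen(n);
    for (auto& c : seen) c.store(0);

    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t)
    {
        workers.emplace_back([&s, &seen] {
            // Keep popping until the stack reports empty.
            while (std::shared_ptr<int> p = s.pop())
                seen[*p].fetch_add(1);
        });
    }
    for (auto& w : workers) w.join();

    std::vector<int> counts;
    for (auto& c : seen) counts.push_back(c.load());
    return counts;
}
```

If the compare/exchange in pop() works as described, every entry in the result is 1: no value is returned twice and none is skipped.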

Exception-safety issue

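This part is about how to return a value safely from a multithreaded data structure, especially when an exception may be thrown.

The problem goes like this:

If the object is returned by value and copying it throws an exception, the value is lost.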

std::optional<Data> pop() {
    std::lock_guard<std::mutex> lock(mutex);
    if (stack.empty()) return std::nullopt;

    Data value = stack.top();   // 1
    stack.pop();                // the value has already been removed here
    return value;               // the copy made for the return may throw
}

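If the exception is thrown at 1, the value is not lost. The usual trouble is an exception while copying for the return, and that is a problem.

If the result is instead passed out through a reference parameter (not returned), then an exception during the copy happens before the element is popped. The state of the stack has not changed, so the value is not lost.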

bool pop(Data& out) {
    std::lock_guard<std::mutex> lock(mutex);
    if (stack.empty()) return false;

    out = stack.top();     // if this throws, the exception propagates and stack.pop() never runs
    stack.pop();           // only reached after the copy succeeded
    return true;
}

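In the lock-free case this trick is not available. You can only return the value once you are sure you are the only thread that got it, and being sure means your atomic compare/exchange succeeded. By then head has already been updated, i.e. the value has already been popped, so if the copy throws at this point the value is lost.

Conclusion: passing by reference no longer buys anything here, so you might as well return by value.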
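The loss can be reproduced with a payload type whose copy assignment throws on demand. Everything in this sketch is hypothetical (fragile, ref_pop_stack, value_lost_on_throw), and it is single-threaded on purpose. It frees the unlinked node right away, which is only acceptable because no other thread can be looking at it here.

```cpp
#include <atomic>
#include <memory>
#include <stdexcept>

// Hypothetical payload: copy-assigning FROM a poisoned object throws.
struct fragile
{
    int value;
    bool poison;
    fragile(int v, bool p) : value(v), poison(p) {}
    fragile(fragile const&) = default;
    fragile& operator=(fragile const& other)
    {
        if (other.poison) throw std::runtime_error("copy failed");
        value = other.value;
        poison = other.poison;
        return *this;
    }
};

// Lock-free stack whose pop() hands the value out through a reference.
template<typename T>
class ref_pop_stack
{
    struct node
    {
        T data;
        node* next;
        explicit node(T const& d) : data(d), next(nullptr) {}
    };
    std::atomic<node*> head{nullptr};
public:
    ~ref_pop_stack()
    {
        for (node* p = head.load(); p != nullptr; )
        {
            node* next = p->next;
            delete p;
            p = next;
        }
    }
    void push(T const& data)
    {
        node* const new_node = new node(data);
        new_node->next = head.load();
        while (!head.compare_exchange_weak(new_node->next, new_node)) {}
    }
    bool pop(T& out)
    {
        node* old_head = head.load();
        while (old_head &&
               !head.compare_exchange_weak(old_head, old_head->next)) {}
        if (!old_head) return false;
        // The node is already unlinked. Single-threaded demo only: freeing it
        // here would be unsafe with concurrent poppers.
        std::unique_ptr<node> owner(old_head);
        out = owner->data;   // may throw, and the value is already off the stack
        return true;
    }
    bool empty() const { return head.load() == nullptr; }
};

// Returns true if the copy threw, the stack is empty, and out was never set.
inline bool value_lost_on_throw()
{
    ref_pop_stack<fragile> s;
    s.push(fragile(42, true));
    fragile out(0, false);
    bool threw = false;
    try { s.pop(out); }
    catch (std::runtime_error const&) { threw = true; }
    return threw && s.empty() && out.value == 0;
}
```

Unlike the mutex version, the copy cannot be moved in front of the removal, because the compare/exchange is what decides which thread owns the value.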

If you want to return the value safely, you have to use the other option from chapter 3: return a (smart) pointer to the data value. If you return a smart pointer, you can return nullptr to indicate that there’s no value to return, but this requires that the data be allocated on the heap. If you do the heap allocation as part of pop(), you’re still no better off, because the heap allocation might throw an exception. Instead, you can allocate the memory when you push() the data onto the stack—you have to allocate memory for the node anyway. Returning std::shared_ptr<> won’t throw an exception, so pop() is now safe. Putting all this together gives the following listing.

In other words, if you really want to return the value safely, you can return a smart pointer.

template<typename T>
class lock_free_stack
{
private:
  struct node
  {
    std::shared_ptr<T> data;   
    node* next;
    node(T const& data_): data(std::make_shared<T>(data_)) {}
  };
  std::atomic<node*> head;
public:
  void push(T const& data)
  {
    node* const new_node=new node(data);
    new_node->next=head.load();
    while(!head.compare_exchange_weak(new_node->next,new_node));
  }
  std::shared_ptr<T> pop()
  {
    node* old_head=head.load();
    while(old_head &&                      
        !head.compare_exchange_weak(old_head,old_head->next));
    return old_head ? old_head->data : std::shared_ptr<T>();     
  }
};

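A few thoughts here. I understand that the value has to live on the heap, and that allocating in push() is the right place, because if new node fails in push() nothing is harmed: the stack has not been modified yet. But reading the code, I noticed that it changes the node's data member from T to shared_ptr<T> and creates the shared_ptr in push(). That I did not quite get. If all that matters is that the data is on the heap, the node itself was already on the heap, so why not create the shared_ptr in pop()?

Is it to cover the case where make_shared fails? (Creating it in pop() would allocate, and could throw, after the node has already been removed.)
Another reason I can think of is that creating a shared_ptr several times for the same object would give each one its own control block.
And copying a shared_ptr is noexcept: it only increments the reference count, and does not allocate memory or do anything else that can throw.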
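The noexcept point can be verified at compile time, and the shared control block can be observed through use_count(). This is just a small sketch; shared_owner_count is a made-up helper name.

```cpp
#include <memory>
#include <type_traits>

// Copying or moving a std::shared_ptr is declared noexcept, which is what
// lets pop() return one after the node has been unlinked.
static_assert(std::is_nothrow_copy_constructible<std::shared_ptr<int>>::value,
              "shared_ptr copy construction must not throw");
static_assert(std::is_nothrow_move_constructible<std::shared_ptr<int>>::value,
              "shared_ptr move construction must not throw");

// The allocation (which can throw) happens once, in make_shared. Copies made
// later share that control block and only bump its reference count.
inline long shared_owner_count()
{
    std::shared_ptr<int> original = std::make_shared<int>(7);  // may throw std::bad_alloc
    std::shared_ptr<int> copy = original;                      // noexcept
    return copy.use_count();
}
```

In the stack, the make_shared call sits in the node constructor, so any std::bad_alloc surfaces in push() before the stack is touched, and pop() only performs the non-throwing copy.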

posted @ 2025-03-29 08:41 rustic-stream