Reading Notes on Doug Lea's Paper - JUC Series

3.3 Queues

The heart of the framework is maintenance of queues of blocked threads, which are restricted here to FIFO queues. Thus, the framework does not support priority-based synchronization.

The core is a FIFO queue of blocked threads. This is the heart of the whole framework!

These days, there is little controversy that the most appropriate choices for synchronization queues are non-blocking data structures that do not themselves need to be constructed using lower-level locks. And of these, there are two main candidates: variants of Mellor-Crummey and Scott (MCS) locks [9], and variants of Craig, Landin, and Hagersten (CLH) locks [5][8][10]. Historically, CLH locks have been used only in spinlocks. However, they appeared more amenable than MCS for use in the synchronizer framework because they are more easily adapted to handle cancellation and timeouts, so were chosen as a basis. The resulting design is far enough removed from the original CLH structure to require explanation.

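these days: nowadays, at present

little controversy: hardly any dispute

most appropriate choices: the best-suited options

candidate: an option under consideration

variant: a modified version

amenable: readily adaptable, well suited

adapted to: modified to fit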

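The paper opens by declaring that MCS queues and CLH queues are the only serious choices for the synchronizer queue's data structure. Although CLH had been used only in spinlocks, it handles cancellation and timeouts more easily, so it was chosen as the basis. The final implementation, of course, differs a lot from the original CLH queue.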

A CLH queue is not very queue-like, because its enqueuing and dequeuing operations are intimately tied to its uses as a lock. It is a linked queue accessed via two atomically updatable fields, head and tail, both initially pointing to a dummy node.

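intimately tied to: closely bound up with

atomically: as a single indivisible step

The linked queue is accessed through two atomically updated fields, head and tail. The atomicity is critical, since this is a concurrent setting. The linked structure is also what preserves ordering.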

Structure diagram (image not included):

A new node, node, is enqueued using an atomic operation:

do { 
  pred = tail; 
} while(!tail.compareAndSet(pred, node)); 

The release status for each node is kept in its predecessor node. So, the "spin" of a spinlock looks like:

while (pred.status != RELEASED) ; // spin

A dequeue operation after this spin simply entails setting the head field to the node that just got the lock:

head = node;

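predecessor: the node immediately before

Yes, the paper goes straight to code. It first stresses that enqueuing is an atomic operation, which is evidently backed by CAS. It then shows where the spinning happens: repeatedly checking the predecessor node's status. Finally comes dequeuing, which only needs to set the head field to the node that just got the lock.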
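To make the three snippets above concrete, here is a minimal sketch of a classic CLH spinlock in Java. It is my own illustration, not code from the paper or the JDK: the names ClhSpinLock, Node, myNode, myPred and countWithLock are made up, and it uses getAndSet in place of the CAS loop. There is no explicit head field; the node of the current lock holder plays that role.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Minimal CLH spinlock sketch; all names here are illustrative. */
class ClhSpinLock {
    private static final class Node {
        volatile boolean locked; // true while the owning thread holds or wants the lock
    }

    // tail starts at a dummy node whose status is already "released"
    private final AtomicReference<Node> tail = new AtomicReference<>(new Node());
    private final ThreadLocal<Node> myNode = ThreadLocal.withInitial(Node::new);
    private final ThreadLocal<Node> myPred = new ThreadLocal<>();

    void lock() {
        Node node = myNode.get();
        node.locked = true;
        Node pred = tail.getAndSet(node); // atomic enqueue, same effect as the CAS loop
        myPred.set(pred);
        while (pred.locked) {             // spin on the predecessor's status
            Thread.onSpinWait();
        }
    }

    void unlock() {
        Node node = myNode.get();
        node.locked = false;              // release: the successor's spin observes this
        myNode.set(myPred.get());         // reuse the predecessor's node next time
    }

    /** Demo helper: several threads increment a plain counter under the lock. */
    static int countWithLock(int threads, int perThread) {
        ClhSpinLock lock = new ClhSpinLock();
        int[] counter = new int[1];
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    lock.lock();
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }
}
```

Calling ClhSpinLock.countWithLock(4, 250) should return 1000 if mutual exclusion holds. The lock is not reentrant, and because waiters spin it only makes sense for very short critical sections.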

Among the advantages of CLH locks are that enqueuing and dequeuing are fast, lock-free, and obstruction free (even under contention, one thread will always win an insertion race so will make progress); that detecting whether any threads are waiting is also fast (just check if head is the same as tail); and that release status is decentralized, avoiding some memory contention.

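obstruction: blockage, hindrance

contention: competition for a shared resource

decentralized: spread out rather than held in one place

This part lists the benefits of the design. Enqueuing and dequeuing in a CLH lock are fast and lock-free: even under heavy contention one thread always wins the insertion race, so progress continues. Detecting whether any threads are waiting is also fast, since it only requires checking whether head and tail are the same node. Finally, the status each thread spins on is spread across nodes, which avoids some memory contention.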

In the original versions of CLH locks, there were not even links connecting nodes. In a spinlock, the pred variable can be held as a local. However, Scott and Scherer[10] showed that by explicitly maintaining predecessor fields within nodes, CLH locks can deal with timeouts and other forms of cancellation: If a node's predecessor cancels, the node can slide up to use the previous node's status field.

The main additional modification needed to use CLH queues for blocking synchronizers is to provide an efficient way for one node to locate its successor. In spinlocks, a node need only change its status, which will be noticed on next spin by its successor, so links are unnecessary. But in a blocking synchronizer, a node needs to explicitly wake up (unpark) its successor.

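explicitly: in a clearly stated, direct way

Here the paper starts to introduce its own changes to the original CLH. The original CLH needs no explicit links between nodes, but Scott and Scherer [10] showed that keeping explicit predecessor fields lets it handle timeouts and cancellation: if a node's predecessor cancels, the node slides up and uses the previous node's status.

The main modification is an efficient way for a node to locate its successor. The reason is that a spinlock node only has to change its own status, which in effect notifies the successor, whereas a node in a blocking synchronizer has to wake its successor explicitly. This finally brings up the key operation: unpark.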
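To see what "explicitly wake up the successor" means in practice, here is a small blocking FIFO mutex adapted from the usage example in the LockSupport documentation. It is only a sketch: it keeps waiting threads in a ConcurrentLinkedQueue rather than a CLH queue, and the class name ParkingFifoMutex and the helper countWithLock are my own. The point is the pairing of park in lock() with unpark in unlock().

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

/** Blocking FIFO mutex sketch built on park/unpark; names are illustrative. */
class ParkingFifoMutex {
    private final AtomicBoolean locked = new AtomicBoolean(false);
    private final Queue<Thread> waiters = new ConcurrentLinkedQueue<>();

    void lock() {
        boolean wasInterrupted = false;
        Thread current = Thread.currentThread();
        waiters.add(current);
        // Block while not first in line or while the lock cannot be taken
        while (waiters.peek() != current || !locked.compareAndSet(false, true)) {
            LockSupport.park(this);       // may return spuriously, hence the loop
            if (Thread.interrupted()) {   // remember interrupts, but keep waiting
                wasInterrupted = true;
            }
        }
        waiters.remove();                 // we are at the front: dequeue ourselves
        if (wasInterrupted) {
            current.interrupt();          // restore interrupt status on exit
        }
    }

    void unlock() {
        locked.set(false);
        LockSupport.unpark(waiters.peek()); // explicit wake-up; no effect if nobody waits
    }

    /** Demo helper: several threads increment a plain counter under the lock. */
    static int countWithLock(int threads, int perThread) {
        ParkingFifoMutex mutex = new ParkingFifoMutex();
        int[] counter = new int[1];
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    mutex.lock();
                    try {
                        counter[0]++;
                    } finally {
                        mutex.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }
}
```

The loop condition has the same shape as the paper's acquire loop: a thread only tries to take the lock when it is at the front of the queue, and otherwise parks until the releasing thread unparks it.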

An AbstractQueuedSynchronizer queue node contains a next link to its successor. But because there are no applicable techniques for lock-free atomic insertion of double-linked list nodes using compareAndSet, this link is not atomically set as part of insertion; it is simply assigned:

pred.next = node; 

after the insertion. This is reflected in all usages. The next link is treated only as an optimized path. If a node's successor does not appear to exist (or appears to be cancelled) via its next field, it is always possible to start at the tail of the list and traverse backwards using the pred field to accurately check if there really is one.

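applicable: suitable for the purpose

traverse: to walk through (a list)

After all this groundwork, the paper can finally state that the queue in AbstractQueuedSynchronizer is an explicitly linked queue, and a doubly linked one. No applicable technique exists for lock-free atomic insertion of a node into a doubly linked list with compareAndSet, so the next link is not part of the atomic insertion; it is simply assigned afterwards. For that reason next is only an optimized path, and a backward traversal from tail through pred gives the reliable answer.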
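The insertion order described here (set the backward link, CAS the tail, then assign next) and the backward fallback can be sketched as follows. This is a simplified illustration under my own assumptions, not the AQS source: LinkedWaitQueue, Node, enqueue and successorOf are hypothetical names, and cancellation is left out.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Sketch of CLH-style insertion with a lazily assigned next link; names are illustrative. */
class LinkedWaitQueue {
    static final class Node {
        volatile Node prev; // reliable: set before the node becomes reachable from tail
        volatile Node next; // optimization only: assigned after the CAS, may lag behind
        final String name;

        Node(String name) {
            this.name = name;
        }
    }

    private final Node head = new Node("dummy");
    private final AtomicReference<Node> tail = new AtomicReference<>(head);

    void enqueue(Node node) {
        for (;;) {
            Node pred = tail.get();
            node.prev = pred;                     // backward link first
            if (tail.compareAndSet(pred, node)) { // atomic part of the insertion
                pred.next = node;                 // plain assignment afterwards
                return;
            }
        }
    }

    /**
     * Returns the successor of a node that is already in the queue, or null if it is the tail.
     * Fast path: the next field. Fallback: walk backwards from tail via prev.
     */
    Node successorOf(Node node) {
        Node s = node.next;
        if (s != null) {
            return s;
        }
        Node found = null;
        for (Node p = tail.get(); p != null && p != node; p = p.prev) {
            found = p; // the last node visited before reaching 'node' is its successor
        }
        return found;
    }
}
```

Even if another thread observes the queue between the CAS and the assignment of pred.next, successorOf still finds the right node, because prev was written before the node was published through tail.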

A second set of modifications is to use the status field kept in each node for purposes of controlling blocking, not spinning. In the synchronizer framework, a queued thread can only return from an acquire operation if it passes the tryAcquire method defined in a concrete subclass; a single "released" bit does not suffice. But control is still needed to ensure that an active thread is only allowed to invoke tryAcquire when it is at the head of the queue; in which case it may fail to acquire, and (re)block. This does not require a per-node status flag because permission can be determined by checking that the current node's predecessor is the head. And unlike the case of spinlocks, there is not enough memory contention reading head to warrant replication. However, cancellation status must still be present in the status field.

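concrete: specific, actually implemented (as opposed to abstract)

suffice: to be enough

The second major modification is to use the status field kept in each node to control blocking rather than spinning. In the source code this status field is Node#waitStatus.

This is where concrete implementation comes in. AQS uses the template method pattern, leaving hook methods such as tryAcquire for subclasses to implement, and tells implementers that the synchronization state is a volatile int. Whether a node may attempt an acquire is decided by checking whether its predecessor is the head node. So no per-node "released" flag is needed, only the head check, which is already quite different from a spinlock. What, then, are the values stored in each node's Node#waitStatus used for: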

/** waitStatus value to indicate thread has cancelled */
static final int CANCELLED =  1;
/** waitStatus value to indicate successor's thread needs unparking */
static final int SIGNAL    = -1;
/** waitStatus value to indicate thread is waiting on condition */
static final int CONDITION = -2;
/**
 * waitStatus value to indicate the next acquireShared should
 * unconditionally propagate
 */
static final int PROPAGATE = -3;

The queue node status field is also used to avoid needless calls to park and unpark. While these methods are relatively fast as blocking primitives go, they encounter avoidable overhead in the boundary crossing between Java and the JVM runtime and/or OS. Before invoking park, a thread sets a "signal me" bit, and then rechecks synchronization and node status once more before invoking park. A releasing thread clears status. This saves threads from needlessly attempting to block often enough to be worthwhile, especially for lock classes in which lost time waiting for the next eligible thread to acquire a lock accentuates other contention effects. This also avoids requiring a releasing thread to determine its successor unless the successor has set the signal bit, which in turn eliminates those cases where it must traverse multiple nodes to cope with an apparently null next field unless signalling occurs in conjunction with cancellation.

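relatively: in comparison with other things

accentuate: to make more pronounced

Checking the node's status before calling park avoids needless park calls. The related comment in the source code: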

Non-negative values mean that a node doesn't need to signal.

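The releasing thread maintains the node status (it clears it on release), so there is no need to inspect thread state; checking the node status is enough.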

Perhaps the main difference between the variant of CLH locks used in the synchronizer framework and those employed in other languages is that garbage collection is relied on for managing storage reclamation of nodes, which avoids complexity and overhead. However, reliance on GC does still entail nulling of link fields when they are sure never to be needed. This can normally be done when dequeuing. Otherwise, unused nodes would still be reachable, causing them to be uncollectable.

Some further minor tunings, including lazy initialization of the initial dummy node required by CLH queues upon first contention, are described in the source code documentation in the J2SE1.5 release.

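reclamation: taking back for reuse (here, of memory)

Java has built-in GC, so the implementation is simpler than in languages without it. Even so, link fields are set to null on dequeue; otherwise the remaining references would keep the nodes from being collected.

One more tuning: the initial dummy node required by the CLH queue is lazily initialized on first contention.

Doug Lea says that for the many other differences you should go read his code, and indeed the source contains extensive comments that read like a paper.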

Omitting such details, the general form of the resulting implementation of the basic acquire operation (exclusive, noninterruptible, untimed case only) is:

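// Before enqueuing, try once to grab the lock; enqueue only if that fails
if (!tryAcquire(arg)) {
  // Create a new node and append it at the tail
  node = create and enqueue new node;
  // The effective predecessor: the nearest preceding node that has not cancelled
  pred = node's effective predecessor;
  // Only a node whose predecessor is head may call tryAcquire;
  // keep looping while it is not at the front or the attempt fails
  while (pred is not head node || !tryAcquire(arg)) {
    // pred has already been asked to wake us on release, so it is safe to block
    if (pred's signal bit is set)
      park();
    else
      // Otherwise ask pred to unpark us, then go around and recheck before parking
      compareAndSet pred's signal bit to true;
    // Re-read the predecessor, skipping nodes that cancelled in the meantime
    pred = node's effective predecessor;
  }
  // Acquired: this node becomes the new head (the dequeue step)
  head = node;
}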

And the release operation is:

if (tryRelease(arg) && head node's signal bit is set) {
  compareAndSet head's signal bit to false;
  // Wake the node right after head
  unpark head's successor, if one exists
}

The number of iterations of the main acquire loop depends, of course, on the nature of tryAcquire. Otherwise, in the absence of cancellation, each component of acquire and release is a constant-time O(1) operation, amortized across threads, disregarding any OS thread scheduling occurring within park.

Cancellation support mainly entails checking for interrupt or timeout upon each return from park inside the acquire loop. A cancelled thread due to timeout or interrupt sets its node status and unparks its successor so it may reset links. With cancellation, determining predecessors and successors and resetting status may include O(n) traversals (where n is the length of the queue). Because a thread never again blocks for a cancelled operation, links and status fields tend to restabilize quickly.

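omit: to leave out

the nature of: the essential character of

amortized cost: cost averaged over a sequence of operations

The key point here is why the paper calls the acquire pseudocode O(1). I think it has to be read in a concurrent setting: suppose 100 threads try to take the lock at once. In each round of the enqueue CAS, one thread is guaranteed to succeed, so all 100 are enqueued within 100 rounds, that is, one node per constant-time round when amortized across threads. Note that counting every failed attempt in the worst case gives 1+2+3+...+100 attempts in total, and (1+2+3+...+n)/n = (n+1)/2 attempts per thread, which is not constant. So the O(1) claim is about progress per round, not about retries per thread.

As for cancellation, the worst case requires traversing the whole queue, for example when every node in it is cancelled, so the complexity is O(n).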
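To tie the acquire and release pseudocode back to the template methods, here is a minimal sketch of a non-reentrant mutex built on the real java.util.concurrent.locks.AbstractQueuedSynchronizer. The class name SimpleMutex and the helper countWithLock are made up for illustration; tryAcquire, tryRelease, acquire, release and hasQueuedThreads are actual AQS methods. Treat it as a sketch under those assumptions, not as how ReentrantLock itself is written.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

/** Non-reentrant mutex sketch on top of AQS; state 0 = free, 1 = held. */
class SimpleMutex {
    private static final class Sync extends AbstractQueuedSynchronizer {
        @Override
        protected boolean tryAcquire(int ignored) {
            // The only lock-specific logic: flip state 0 -> 1 atomically
            if (compareAndSetState(0, 1)) {
                setExclusiveOwnerThread(Thread.currentThread());
                return true;
            }
            return false;
        }

        @Override
        protected boolean tryRelease(int ignored) {
            if (getState() == 0) {
                throw new IllegalMonitorStateException();
            }
            setExclusiveOwnerThread(null);
            setState(0); // volatile write publishes the critical section
            return true; // tells AQS it may unpark a successor
        }
    }

    private final Sync sync = new Sync();

    void lock() {
        sync.acquire(1); // tryAcquire, then enqueue and park on failure
    }

    void unlock() {
        sync.release(1); // tryRelease, then unpark the successor if needed
    }

    boolean hasQueuedThreads() {
        return sync.hasQueuedThreads();
    }

    /** Demo helper: several threads increment a plain counter under the mutex. */
    static int countWithLock(int threads, int perThread) {
        SimpleMutex mutex = new SimpleMutex();
        int[] counter = new int[1];
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    mutex.lock();
                    try {
                        counter[0]++;
                    } finally {
                        mutex.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }
}
```

All queueing, parking and waking lives in acquire and release; the subclass only says what it means for the state to be acquired or released, which is the division of labour the paper describes.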

3.4 Condition Queues

The synchronizer framework provides a ConditionObject class for use by synchronizers that maintain exclusive synchronization and conform to the Lock interface. Any number of condition objects may be attached to a lock object, providing classic monitor-style await, signal, and signalAll operations, including those with timeouts, along with some inspection and monitoring methods.

The ConditionObject class enables conditions to be efficiently integrated with other synchronization operations, again by fixing some design decisions. This class supports only Java-style monitor access rules in which condition operations are legal only when the lock owning the condition is held by the current thread (See [4] for discussion of alternatives). Thus, a ConditionObject attached to a ReentrantLock acts in the same way as do built-in monitors (via Object.wait etc), differing only in method names, extra functionality, and the fact that users can declare multiple conditions per lock.

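inspection: examination, checking

attached to: associated with, belonging to

integrate with: to combine with

exclusive: held by only one owner at a time

Next the paper introduces Condition. A ConditionObject is provided for synchronizers to use, and every condition must belong to a lock. As with Object's wait, notify, and notifyAll, the lock must be held before calling await, signal, or signalAll. A single lock can have several conditions attached.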

A ConditionObject uses the same internal queue nodes as synchronizers, but maintains them on a separate condition queue. The signal operation is implemented as a queue transfer from the condition queue to the lock queue, without necessarily waking up the signalled thread before it has re-acquired its lock.

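ConditionObject uses the same internal queue nodes described earlier, but maintains them on a separate queue. A signal operation transfers a node from the condition queue to the lock queue. This is the key mechanism behind the Condition implementation.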

The basic await operation is:

create and add new node to condition queue; 
release lock;
block until node is on lock queue; 
re-acquire lock;

And the signal operation is:

transfer the first node from condition queue to lock queue;

Because these operations are performed only when the lock is held, they can use sequential linked queue operations (using a nextWaiter field in nodes) to maintain the condition queue. The transfer operation simply unlinks the first node from the condition queue, and then uses CLH insertion to attach it to the lock queue.

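sequential: in order, one after another (here, not concurrent)

The above is the pseudocode for await and signal. Because condition queue operations happen only while the thread holds the lock, the nextWaiter field that links the nodes does not need to be volatile.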
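From the user's side, the await and signal operations above look like the following bounded buffer, which follows the well-known pattern of one lock with two conditions. It is only a usage sketch: BoundedBuffer and the helper transfer are my own names, while ReentrantLock, Condition, await and signal are the real java.util.concurrent.locks API.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Bounded buffer sketch: one lock, two condition queues. */
class BoundedBuffer<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notFull = lock.newCondition();
    private final Condition notEmpty = lock.newCondition();
    private final Object[] items;
    private int putIndex, takeIndex, count;

    BoundedBuffer(int capacity) {
        items = new Object[capacity];
    }

    void put(T item) throws InterruptedException {
        lock.lock();
        try {
            while (count == items.length) {
                notFull.await();   // releases the lock, waits, re-acquires
            }
            items[putIndex] = item;
            putIndex = (putIndex + 1) % items.length;
            count++;
            notEmpty.signal();     // moves one waiting taker to the lock queue
        } finally {
            lock.unlock();
        }
    }

    @SuppressWarnings("unchecked")
    T take() throws InterruptedException {
        lock.lock();
        try {
            while (count == 0) {
                notEmpty.await();
            }
            T item = (T) items[takeIndex];
            items[takeIndex] = null;
            takeIndex = (takeIndex + 1) % items.length;
            count--;
            notFull.signal();      // moves one waiting putter to the lock queue
            return item;
        } finally {
            lock.unlock();
        }
    }

    /** Demo helper: a producer sends 1..n through the buffer; returns the sum received. */
    static long transfer(int n, int capacity) {
        BoundedBuffer<Integer> buffer = new BoundedBuffer<>(capacity);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= n; i++) {
                    buffer.put(i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        long sum = 0;
        try {
            for (int i = 0; i < n; i++) {
                sum += buffer.take();
            }
            producer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return sum;
    }
}
```

Both conditions share the same lock, so every await and signal here runs with the lock held, which is exactly the restriction that lets the condition queue use plain sequential linking.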

The main complication in implementing these operations is dealing with cancellation of condition waits due to timeouts or Thread.interrupt. A cancellation and signal occurring at approximately the same time encounter a race whose outcome conforms to the specifications for built-in monitors. As revised in JSR133, these require that if an interrupt occurs before a signal, then the await method must, after reacquiring the lock, throw InterruptedException. But if it is interrupted after a signal, then the method must return without throwing an exception, but with its thread interrupt status set.

To maintain proper ordering, a bit in the queue node status records whether the node has been (or is in the process of being) transferred. Both the signalling code and the cancelling code try to compareAndSet this status. If a signal operation loses this race, it instead transfers the next node on the queue, if one exists. If a cancellation loses, it must abort the transfer, and then await lock re-acquisition. This latter case introduces a potentially unbounded spin. A cancelled wait cannot commence lock reacquisition until the node has been successfully inserted on the lock queue, so must spin waiting for the CLH queue insertion compareAndSet being performed by the signalling thread to succeed. The need to spin here is rare, and employs a Thread.yield to provide a scheduling hint that some other thread, ideally the one doing the signal, should instead run. While it would be possible to implement here a helping strategy for the cancellation to insert the node, the case is much too rare to justify the added overhead that this would entail. In all other cases, the basic mechanics here and elsewhere use no spins or yields, which maintains reasonable performance on uniprocessors.

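complication: a difficulty that makes something harder

approximately: roughly

entail: to involve as a necessary part

reasonable: acceptable, fair

To restate the JSR133 rule: if an interrupt occurs before a signal, await must throw InterruptedException after reacquiring the lock; if the interrupt occurs after a signal, await must return without throwing, but with the thread's interrupt status set.

Doug Lea explains that the hard part of implementing conditions is handling a signal and a cancellation that happen concurrently. This is worth revisiting when analyzing the source code.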
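The two outcomes of the signal/interrupt race can be observed with a small experiment. This is a sketch of my own, assuming the AQS-based ConditionObject behind ReentrantLock; AwaitInterruptDemo, run and the result strings are hypothetical names. The main thread takes the lock (which is only possible once the waiter has released it inside await), then either interrupts only, or signals and then interrupts, so the ordering of the two events is fixed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Observes how await() reports an interrupt relative to a signal; names are illustrative. */
class AwaitInterruptDemo {
    /**
     * Returns "exception" if await() threw InterruptedException,
     * "flag" if it returned normally with the interrupt status set,
     * and "none" if it returned normally without it.
     */
    static String run(boolean signalFirst) {
        ReentrantLock lock = new ReentrantLock();
        Condition cond = lock.newCondition();
        CountDownLatch waiterHoldsLock = new CountDownLatch(1);
        String[] outcome = new String[1];

        Thread waiter = new Thread(() -> {
            lock.lock();
            try {
                waiterHoldsLock.countDown();
                cond.await();
                outcome[0] = Thread.currentThread().isInterrupted() ? "flag" : "none";
            } catch (InterruptedException e) {
                outcome[0] = "exception";
            } finally {
                lock.unlock();
            }
        });
        waiter.start();

        try {
            waiterHoldsLock.await();
            lock.lock();              // succeeds only after the waiter released the lock in await()
            try {
                if (signalFirst) {
                    cond.signal();    // node moves to the lock queue before the interrupt
                }
                waiter.interrupt();
            } finally {
                lock.unlock();
            }
            waiter.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return outcome[0];
    }
}
```

With this setup I would expect run(false) to give "exception" (interrupt before any signal) and run(true) to give "flag" (interrupt after the signal), matching the rule quoted above.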

posted on 2022-01-21 20:41 by 每当变幻时
