JDK Source Code Reading: Basic Collection Classes (java.util)

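I read through the collection classes a long time ago but never took notes, so this post is the overdue write-up. It covers the collection types in the java.util package and does not include the java.util.concurrent package.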

[Figure: the customary class diagram of the collection classes]

Vector
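Vector implements a growable array of objects. Like an array, it contains components that can be accessed using an integer index. The size of a Vector can grow or shrink as needed to accommodate adding and removing items after the Vector has been created.
Each vector tries to optimize storage management by maintaining a capacity and a capacityIncrement. The capacity is always at least as large as the vector's size; it is usually larger, because as components are added the storage grows in chunks of capacityIncrement (or doubles when capacityIncrement is not positive). An application can increase the capacity of a vector before inserting a large number of components, which reduces the amount of incremental reallocation. Vector is thread-safe: its public methods are synchronized.
Below are the methods related to growing its capacity.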

protected Object[] elementData;
protected int capacityIncrement; // if not set (<= 0), the capacity doubles on each growth
public synchronized void ensureCapacity(int minCapacity) {
    if (minCapacity > 0) {
        modCount++;
        ensureCapacityHelper(minCapacity);
    }
}

private void ensureCapacityHelper(int minCapacity) {
    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}

private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + ((capacityIncrement > 0) ? capacityIncrement : oldCapacity);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    elementData = Arrays.copyOf(elementData, newCapacity);
}

private static int hugeCapacity(int minCapacity) {
    if (minCapacity < 0) // overflow
        throw new OutOfMemoryError();
    return (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE :
        MAX_ARRAY_SIZE;
}
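
To see the two growth modes from the outside, here is a small sketch that uses only the public Vector API (the two-argument constructor and capacity()). The class and method names are made up for this illustration.

```java
import java.util.Vector;

class VectorGrowthDemo {
    // Adds `count` elements to a Vector created with the given initial
    // capacity and capacityIncrement, then reports the resulting capacity.
    static int capacityAfter(int initialCapacity, int capacityIncrement, int count) {
        Vector<Integer> v = new Vector<>(initialCapacity, capacityIncrement);
        for (int i = 0; i < count; i++) {
            v.add(i);
        }
        return v.capacity();
    }

    public static void main(String[] args) {
        // capacityIncrement 0: the fifth add doubles the capacity from 4
        System.out.println(capacityAfter(4, 0, 5));
        // capacityIncrement 3: the fifth add grows the capacity from 4 by 3
        System.out.println(capacityAfter(4, 3, 5));
    }
}
```

A positive capacityIncrement gives linear growth, which means more reallocations for large vectors; leaving it unset keeps the doubling behavior.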

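The round-up-to-a-power-of-two algorithm that shows up later in this post (in HashMap.tableSizeFor and ArrayDeque.allocateElements) is also covered on my blog; see "The round-up-to-the-nth-power-of-2 method in ConcurrentHashMap".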

ArrayList
ArrayList is the array-backed implementation of List: its elements are stored in an Object[]. Below is the method ArrayList uses to grow that array.

private void grow(int minCapacity) {
    // overflow-conscious code: the comparisons are written as subtractions
    // so that they still behave correctly after an int overflow
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1); // grow by 50%
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    elementData = Arrays.copyOf(elementData, newCapacity);
}

private static int hugeCapacity(int minCapacity) {
    if (minCapacity < 0) // overflow
        throw new OutOfMemoryError();
    return (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
}
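
The subtraction-style comparisons are easy to skim past, so here is a sketch that isolates the capacity arithmetic as a pure function. It mirrors the logic of grow() and hugeCapacity() but allocates nothing; the class and method names are invented for this example.

```java
class ArrayListGrowthSketch {
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // Same arithmetic as grow()/hugeCapacity(), but returns the chosen
    // capacity instead of reallocating an array.
    static int newCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1); // may overflow to a negative int
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0) {
            if (minCapacity < 0) // the request itself overflowed
                throw new OutOfMemoryError();
            newCapacity = (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
        }
        return newCapacity;
    }

    public static void main(String[] args) {
        System.out.println(newCapacity(10, 11));   // ordinary case: 10 grows by half
        System.out.println(newCapacity(10, 100));  // request larger than 1.5x wins
        // 2_000_000_000 * 1.5 overflows int; the result is clamped to MAX_ARRAY_SIZE
        System.out.println(newCapacity(2_000_000_000, 2_000_000_001));
    }
}
```

In the last call the intermediate newCapacity is negative. A plain `newCapacity > MAX_ARRAY_SIZE` test would be false for it, whereas `newCapacity - MAX_ARRAY_SIZE > 0` wraps around to a positive value and routes the call to the clamping branch.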

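In addition, subList() returns a view of the ArrayList rather than a copy. Conflicting modifications are detected with the modCount counter: every structural modification increments it, and an Iterator (or a subList view) keeps a snapshot of its value. If the snapshot no longer matches modCount when the iterator is used, the list has been structurally modified behind its back and the operation throws ConcurrentModificationException.
ArrayList is not thread-safe.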
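
Both behaviors can be observed with a few lines of client code. The sketch below uses only public java.util API; the class and method names are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

class FailFastDemo {
    // A write through the subList view is visible in the backing list.
    static List<Integer> writeThroughView() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        List<Integer> view = list.subList(1, 3); // view of the elements 2 and 3
        view.set(0, 99);
        return list;
    }

    // Removing from the list while a for-each loop iterates over it bumps
    // modCount, so the iterator's next() call fails fast.
    static boolean removalDuringIterationFails() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        try {
            for (Integer x : list) {
                if (x == 1) {
                    list.remove(x); // remove(Object): a structural modification
                }
            }
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThroughView());
        System.out.println(removalDuringIterationFails());
    }
}
```

The safe way to remove while iterating is Iterator.remove(), which updates the iterator's snapshot together with modCount.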

LinkedList

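LinkedList is the doubly linked list implementation of List. It holds two Node references, first and last, which point to the head and the tail of the list.
The Node data structure looks like this.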

private static class Node<E> {
    E item;
    Node<E> next;
    Node<E> prev;
    Node(Node<E> prev, E element, Node<E> next) {
        this.item = element;
        this.next = next;
        this.prev = prev;
    }
}

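Because LinkedList is backed by a linked list, it has no array to grow. The rest of the implementation resembles ArrayList, with the array operations replaced by the corresponding linked-list operations.
Below is the operation that fetches the node at a given index; it includes a small optimization: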

/**
 * Returns the (non-null) Node at the specified element index.
 */
Node<E> node(int index) {
    // assert isElementIndex(index);
    // If index is in the first half, walk forward from the head;
    // otherwise walk backward from the tail
    if (index < (size >> 1)) {
        Node<E> x = first;
        for (int i = 0; i < index; i++)
            x = x.next;
        return x;
    } else {
        Node<E> x = last;
        for (int i = size - 1; i > index; i--)
            x = x.prev;
        return x;
    }
}
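
The effect of that optimization is that a lookup never follows more than about size / 2 links. A tiny sketch of the cost, counting link traversals with the same branch condition (the class and method names are hypothetical):

```java
class NodeLookupCost {
    // Number of next/prev links node(index) follows in a list of `size`
    // elements, using the same branch condition as the method above.
    static int hops(int index, int size) {
        if (index < (size >> 1)) {
            return index;            // walk forward from first
        } else {
            return size - 1 - index; // walk backward from last
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10; i++) {
            System.out.println("index " + i + " -> " + hops(i, 10) + " hops");
        }
    }
}
```

Indexed access is still O(n), so get(i) inside a loop over a LinkedList remains quadratic overall; iterating with an Iterator avoids that.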

LinkedList is not thread-safe either.

HashMap

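HashMap uses an array of Node as its buckets. When keys collide, the collision is resolved with a linked list or a red-black tree. While a bucket's chain is shorter than TREEIFY_THRESHOLD (8), collisions are chained in a linked list; once an insertion trips the TREEIFY_THRESHOLD check in putVal, the list is converted to a red-black tree (or, if the table is still smaller than MIN_TREEIFY_CAPACITY, 64, the table is resized instead). When a resize leaves a tree bin with UNTREEIFY_THRESHOLD (6) or fewer entries, the tree is converted back to a linked list.
A red-black tree is an approximately balanced binary search tree, so a bucket that holds many entries can still be searched in O(log n) time instead of by walking a long linked list.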

/**
 * For the requested capacity, computes the smallest power of two that can
 * hold it. This neat bit trick is also described in Hacker's Delight (《算法心得》).
 */
static final int tableSizeFor(int cap) {
    int n = cap - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}
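
A quick way to convince yourself of how tableSizeFor works is to copy it into a standalone class and try a few inputs. The shifts smear the highest set bit of cap - 1 into every lower position, so adding 1 yields the next power of two. The wrapper class below is only for illustration; MAXIMUM_CAPACITY is set to 1 << 30, the value HashMap uses.

```java
class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    static int tableSizeFor(int cap) {
        int n = cap - 1;   // so that an exact power of two maps to itself
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;     // every bit below the highest set bit is now 1
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 16, 17, 100};
        for (int cap : samples) {
            System.out.println(cap + " -> " + tableSizeFor(cap));
        }
    }
}
```

For example, 100 - 1 = 99 is 1100011 in binary; after the shifts it becomes 1111111 (127), and 127 + 1 = 128.
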
/**
 * Inserts an entry
 */
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode) // the bin is a red-black tree
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) { // end of chain: no existing node with an equal key
                    p.next = newNode(hash, key, value, null);
                    // has the chain reached the treeify threshold?
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) // found a node with an equal key
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}

/**
 * Grows the map
 */
final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        } else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                   oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // twice the old threshold
    } else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {               // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    if (oldTab != null) {
        // rehash
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null) // the bucket holds a single entry
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode) // the bucket is a red-black tree
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // preserve order
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    // The clever part: oldCap has exactly one bit set, so
                    // e.hash & oldCap tests that single bit of the hash and
                    // tells whether the entry belongs in the lower or the
                    // upper half of the new bucket array
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    // the "lo" list stays in the lower half of the new array
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    // the "hi" list moves to the upper half of the new array
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
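
The lo/hi split in resize() works because the table length is always a power of two: doubling it adds exactly one more bit to the index mask. The sketch below checks, for a range of hashes, that "keep index j or move to j + oldCap" gives the same result as recomputing the index against the new length. The class and method names are made up for the example.

```java
class ResizeSplitDemo {
    // Index in the doubled table, computed the way resize() does it:
    // keep the old index j, or move to j + oldCap if the oldCap bit is set.
    static int indexAfterResize(int hash, int oldCap) {
        int j = hash & (oldCap - 1);
        return ((hash & oldCap) == 0) ? j : j + oldCap;
    }

    // The straightforward recomputation against the new table length.
    static int indexByMask(int hash, int newCap) {
        return hash & (newCap - 1);
    }

    public static void main(String[] args) {
        int oldCap = 16;
        // 5 is 00101 and 21 is 10101: same old bucket, different oldCap bit
        System.out.println(indexAfterResize(5, oldCap));
        System.out.println(indexAfterResize(21, oldCap));
        for (int hash = -1000; hash <= 1000; hash++) {
            if (indexAfterResize(hash, oldCap) != indexByMask(hash, oldCap * 2)) {
                throw new IllegalStateException("mismatch for hash " + hash);
            }
        }
        System.out.println("split rule matches the mask for all tested hashes");
    }
}
```

Since each old bucket feeds at most two new buckets, the entries can be relinked in one pass without recomputing any index, and their relative order is preserved.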

In addition, the KeySet, Values, and EntrySet of a HashMap are only views over the map; they are traversed through their corresponding Iterators. HashMap is not thread-safe.
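
Because these collections are views, changes made through them show up in the map itself. A minimal illustration using only the public Map API (the class and method names are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

class KeySetViewDemo {
    static Map<String, Integer> removeThroughKeySet() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        // keySet() copies nothing; removing a key from the view
        // removes the whole mapping from the map
        map.keySet().remove("a");
        return map;
    }

    public static void main(String[] args) {
        System.out.println(removeThroughKeySet());
    }
}
```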

Hashtable

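Hashtable is a thread-safe Map implementation that relies on method-level synchronized for thread safety.
Its default initial capacity is 11, and unlike HashMap its capacity is not a power of two. It also resolves collisions by chaining. The key methods below are enough to understand how Hashtable manipulates its data structure.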

private transient Entry<?,?>[] table;

protected void rehash() {
    int oldCapacity = table.length;
    Entry<?,?>[] oldMap = table;

    // overflow-conscious code
    // The new capacity is 2x + 1, unlike HashMap
    int newCapacity = (oldCapacity << 1) + 1;
    if (newCapacity - MAX_ARRAY_SIZE > 0) {
        if (oldCapacity == MAX_ARRAY_SIZE)
            // Keep running with MAX_ARRAY_SIZE buckets
            return;
        newCapacity = MAX_ARRAY_SIZE;
    }
    Entry<?,?>[] newMap = new Entry<?,?>[newCapacity];

    modCount++;
    threshold = (int)Math.min(newCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
    table = newMap;

    for (int i = oldCapacity ; i-- > 0 ;) {
        for (Entry<K,V> old = (Entry<K,V>)oldMap[i] ; old != null ; ) {
            Entry<K,V> e = old;
            old = old.next;
            int index = (e.hash & 0x7FFFFFFF) % newCapacity;
            e.next = (Entry<K,V>)newMap[index];
            newMap[index] = e;
        }
    }
}

private void addEntry(int hash, K key, V value, int index) {
    modCount++;
    Entry<?,?> tab[] = table;
    if (count >= threshold) {
        // Rehash the table if the threshold is exceeded
        rehash();
        tab = table;
        hash = key.hashCode();
        // the familiar modulo operation is used here
        index = (hash & 0x7FFFFFFF) % tab.length;
    }
    // Creates the new entry.
    @SuppressWarnings("unchecked")
    Entry<K,V> e = (Entry<K,V>) tab[index];
    tab[index] = new Entry<>(hash, key, value, e);
    count++;
}

@Override
public synchronized boolean remove(Object key, Object value) {
    Objects.requireNonNull(value);
    Entry<?,?> tab[] = table;
    int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % tab.length;
    @SuppressWarnings("unchecked")
    Entry<K,V> e = (Entry<K,V>)tab[index];
    for (Entry<K,V> prev = null; e != null; prev = e, e = e.next) {
        if ((e.hash == hash) && e.key.equals(key) && e.value.equals(value)) {
            modCount++;
            if (prev != null) {
                prev.next = e.next;
            } else {
                tab[index] = e.next;
            }
            count--;
            e.value = null;
            return true;
        }
    }
    return false;
}
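
Two details of this code are worth isolating: the sign-bit mask in front of the modulo, and the 2x + 1 growth. The sketch below reproduces just those two expressions; the class and method names are invented, and the MAX_ARRAY_SIZE clamp from rehash() is deliberately left out.

```java
class HashtableIndexSketch {
    // Same expression Hashtable uses: clear the sign bit, then take the
    // remainder, so the index is never negative.
    static int indexFor(int hash, int tableLength) {
        return (hash & 0x7FFFFFFF) % tableLength;
    }

    // Growth rule from rehash(), without the MAX_ARRAY_SIZE clamp.
    static int nextCapacity(int oldCapacity) {
        return (oldCapacity << 1) + 1;
    }

    public static void main(String[] args) {
        // In Java the % operator keeps the sign of the dividend, so an
        // unmasked negative hash would produce a negative array index.
        System.out.println(-1 % 11);
        System.out.println(indexFor(-1, 11));
        int capacity = 11;
        for (int i = 0; i < 4; i++) {
            System.out.println(capacity);
            capacity = nextCapacity(capacity);
        }
    }
}
```

Starting from the default 11, the capacities stay odd (11, 23, 47, 95), which is why Hashtable needs a real modulo, while HashMap can get away with a bit mask.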

ArrayDeque

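ArrayDeque is the array-backed implementation of the double-ended queue interface Deque. It is not thread-safe, stores its elements in an array, and grows automatically as needed.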

transient Object[] elements; // non-private to simplify nested class access
// index of the head of the deque
transient int head;
// index of the tail of the deque
transient int tail;
/**
 * The same bit trick again.
 * Allocates empty array to hold the given number of elements.
 */
private void allocateElements(int numElements) {
    int initialCapacity = MIN_INITIAL_CAPACITY;
    // Find the best power of two to hold elements.
    // Tests "<=" because arrays aren't kept full.
    if (numElements >= initialCapacity) {
        initialCapacity = numElements;
        initialCapacity |= (initialCapacity >>>  1);
        initialCapacity |= (initialCapacity >>>  2);
        initialCapacity |= (initialCapacity >>>  4);
        initialCapacity |= (initialCapacity >>>  8);
        initialCapacity |= (initialCapacity >>> 16);
        initialCapacity++;
        if (initialCapacity < 0)   // Too many elements, must back off
            initialCapacity >>>= 1;// Good luck allocating 2 ^ 30 elements
    }
    elements = new Object[initialCapacity];
}

private void doubleCapacity() {
    assert head == tail;
    int p = head;
    int n = elements.length;
    int r = n - p; // number of elements to the right of p
    int newCapacity = n << 1;
    if (newCapacity < 0)
        throw new IllegalStateException("Sorry, deque too big");
    Object[] a = new Object[newCapacity];
    System.arraycopy(elements, p, a, 0, r);
    System.arraycopy(elements, 0, a, r, p);
    elements = a;
    head = 0;
    tail = n;
}
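
Keeping the array length a power of two is what lets the deque wrap its indices with a bit mask instead of a modulo. The following sketch reproduces the capacity computation from allocateElements and the index arithmetic on their own. The class and method names are made up, and MIN_INITIAL_CAPACITY = 8 is the value from the JDK 8 source this post reads.

```java
class CircularIndexSketch {
    static final int MIN_INITIAL_CAPACITY = 8;

    // Same computation as allocateElements, returning the capacity
    // instead of allocating the array.
    static int capacityFor(int numElements) {
        int initialCapacity = MIN_INITIAL_CAPACITY;
        if (numElements >= initialCapacity) {
            initialCapacity = numElements;
            initialCapacity |= (initialCapacity >>> 1);
            initialCapacity |= (initialCapacity >>> 2);
            initialCapacity |= (initialCapacity >>> 4);
            initialCapacity |= (initialCapacity >>> 8);
            initialCapacity |= (initialCapacity >>> 16);
            initialCapacity++;
            if (initialCapacity < 0)
                initialCapacity >>>= 1;
        }
        return initialCapacity;
    }

    // Circular successor and predecessor for a power-of-two length.
    static int next(int i, int length) { return (i + 1) & (length - 1); }
    static int prev(int i, int length) { return (i - 1) & (length - 1); }

    public static void main(String[] args) {
        System.out.println(capacityFor(3));  // below the minimum
        System.out.println(capacityFor(8));  // strictly greater power of two
        System.out.println(next(7, 8));      // wraps forward to the start
        System.out.println(prev(0, 8));      // wraps backward to the end
    }
}
```

Unlike tableSizeFor, this version does not subtract 1 first, so an exact power of two is rounded up to the next one. That matches the invariant that the array is never completely full.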

To see how an ArrayDeque is operated on, look at the inner class DeqIterator and the delete method it relies on. Part of the code follows.

private class DeqIterator implements Iterator<E> {
    // head and tail indices
    private int cursor = head;
    private int fence = tail;

    // position returned by the last call to next(); reset to -1 after an element is removed
    private int lastRet = -1;

    public boolean hasNext() {
        return cursor != fence;
    }

    public E next() {
        if (cursor == fence)
            throw new NoSuchElementException();
        @SuppressWarnings("unchecked")
        E result = (E) elements[cursor];
        if (tail != fence || result == null)
            throw new ConcurrentModificationException();
        lastRet = cursor;
        cursor = (cursor + 1) & (elements.length - 1); // equivalent to a modulo
        return result;
    }

    public void remove() {
        if (lastRet < 0)
            throw new IllegalStateException();
        if (delete(lastRet)) { // if left-shifted, undo increment in next()
            cursor = (cursor - 1) & (elements.length - 1);
            fence = tail;
        }
        lastRet = -1;
    }
}

/**
 * A method of ArrayDeque itself (not of DeqIterator).
 * Note the optimization that moves as few elements as possible.
 */
private boolean delete(int i) {
    checkInvariants();
    final Object[] elements = this.elements;
    final int mask = elements.length - 1;
    final int h = head;
    final int t = tail;
    final int front = (i - h) & mask; // distance from i to head
    final int back  = (t - i) & mask; // distance from i to tail
    // Invariant: head <= i < tail mod circularity
    if (front >= ((t - h) & mask))
        throw new ConcurrentModificationException();
    // optimize for the least element motion
    if (front < back) {
        // closer to the head
        if (h <= i) {
            // the ordinary case
            System.arraycopy(elements, h, elements, h + 1, front);
        } else { // Wrap around
            // layout: o i o o o tail * * * * head o  (o = occupied slot, * = empty slot)
            System.arraycopy(elements, 0, elements, 1, i);
            elements[0] = elements[mask];
            System.arraycopy(elements, h, elements, h + 1, mask - h);
        }
        elements[h] = null;
        head = (h + 1) & mask;
        return false;
    } else {
        // closer to the tail
        if (i < t) { // Copy the null tail as well
            System.arraycopy(elements, i + 1, elements, i, back);
            tail = t - 1;
        } else { // Wrap around
            // layout: o tail * * * * head o o i o  (o = occupied slot, * = empty slot)
            System.arraycopy(elements, i + 1, elements, i, mask - i);
            elements[mask] = elements[0];
            System.arraycopy(elements, 1, elements, 0, t);
            tail = (t - 1) & mask;
        }
        return true;
    }
}
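
From the caller's side, all of this machinery sits behind Iterator.remove(). The sketch below uses only the public Deque API and removes elements in the middle of a deque while iterating; the class and method names are made up. The comment about wrap-around assumes the JDK 8 layout quoted above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

class DequeIteratorRemoveDemo {
    static List<Integer> removeEvens() {
        Deque<Integer> deque = new ArrayDeque<>();
        for (int i = 1; i <= 6; i++) {
            deque.addLast(i);
        }
        // In the JDK 8 layout, addFirst on a deque whose head is at index 0
        // stores the element at the end of the array, so the contents wrap.
        deque.addFirst(0);
        Iterator<Integer> it = deque.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove(); // interior removal through the iterator
            }
        }
        return new ArrayList<>(deque);
    }

    public static void main(String[] args) {
        System.out.println(removeEvens());
    }
}
```

Removing through the iterator keeps its cursor and fence consistent with the deque, whereas calling deque.remove(Object) inside the loop would make the next call to next() throw ConcurrentModificationException.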

LinkedHashMap

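Among the Map implementations, HashMap is unordered. LinkedHashMap is one of the ordered Maps: its iteration order is either access order or insertion order, depending on a constructor argument, and defaults to insertion order.
LinkedHashMap maintains a doubly linked list of Entry objects. It keeps the list up to date by overriding the post-operation hook methods (afterNodeAccess, afterNodeInsertion, afterNodeRemoval) and the node factory methods (such as newTreeNode and replacementNode) of its parent class HashMap.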

Node<K,V> replacementNode(Node<K,V> p, Node<K,V> next) {
    // ...
    transferLinks(q, t);
    return t;
}

TreeNode<K,V> newTreeNode(int hash, K key, V value, Node<K,V> next) {
    // ...
    linkNodeLast(p);
    return p;
}

TreeNode<K,V> replacementTreeNode(Node<K,V> p, Node<K,V> next) {
    // ...
    transferLinks(q, t);
    return t;
}

void afterNodeRemoval(Node<K,V> e) { // unlink
    LinkedHashMap.Entry<K,V> p =
        (LinkedHashMap.Entry<K,V>)e, b = p.before, a = p.after;
    p.before = p.after = null;
    if (b == null)
        head = a;
    else
        b.after = a;
    if (a == null)
        tail = b;
    else
        a.before = b;
}

void afterNodeInsertion(boolean evict) { // possibly remove eldest
    LinkedHashMap.Entry<K,V> first;
    if (evict && (first = head) != null && removeEldestEntry(first)) {
        K key = first.key;
        removeNode(hash(key), key, null, false, true);
    }
}

void afterNodeAccess(Node<K,V> e) { // move node to last
    LinkedHashMap.Entry<K,V> last;
    if (accessOrder && (last = tail) != e) {
        LinkedHashMap.Entry<K,V> p = (LinkedHashMap.Entry<K,V>)e, b = p.before, a = p.after;
        p.after = null;
        if (b == null)
            head = a;
        else
            b.after = a;
        if (a != null)
            a.before = b;
        else
            last = b;
        if (last == null)
            head = p;
        else {
            p.before = last;
            last.after = p;
        }
        tail = p;
        ++modCount;
    }
}
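
These hooks are what make the well-known LinkedHashMap-based LRU cache work: with accessOrder = true, afterNodeAccess moves a touched entry to the tail, and afterNodeInsertion asks removeEldestEntry whether the head should be evicted. A minimal sketch follows; LruCache is a name chosen for this example, not a JDK class.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // third argument: accessOrder = true
        this.maxEntries = maxEntries;
    }

    // Consulted by afterNodeInsertion after every put.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    static List<String> demo() {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // moves "a" to the tail, so "b" is now the eldest
        cache.put("c", 3); // exceeds maxEntries, so the head ("b") is evicted
        return new ArrayList<>(cache.keySet());
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that in access-order mode even get() is a structural modification (it increments modCount), so such a cache needs external synchronization even for concurrent reads.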

TreeMap

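No discussion of ordered Maps would be complete without TreeMap. It is a NavigableMap implementation based on a red-black tree. The map is sorted according to the natural ordering of its keys, or by a Comparator provided at map creation time, depending on which constructor is used. This implementation provides guaranteed log(n) time cost for the containsKey, get, put, and remove operations. The algorithms are adaptations of those in Cormen, Leiserson, and Rivest's Introduction to Algorithms.
Most of TreeMap's operations come down to the red-black tree implementation, so I will not go into detail here (honestly, I could not explain it clearly anyway, haha). For details, see the Wikipedia article on the Red Black Tree.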
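
Even without going into the tree internals, the ordering guarantees are easy to see from the public API. The sketch below shows natural ordering, a Comparator-based ordering, and one NavigableMap query; the class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

class TreeMapOrderDemo {
    private static void fill(TreeMap<String, Integer> map) {
        map.put("pear", 3);
        map.put("apple", 1);
        map.put("fig", 2);
    }

    // Keys come back in their natural (lexicographic) order.
    static List<String> naturalOrder() {
        TreeMap<String, Integer> map = new TreeMap<>();
        fill(map);
        return new ArrayList<>(map.keySet());
    }

    // The same keys, sorted by a Comparator passed to the constructor.
    static List<String> reverseOrder() {
        TreeMap<String, Integer> map = new TreeMap<>(Comparator.<String>reverseOrder());
        fill(map);
        return new ArrayList<>(map.keySet());
    }

    // NavigableMap query: the greatest key less than or equal to `key`.
    static String floorOf(String key) {
        TreeMap<String, Integer> map = new TreeMap<>();
        fill(map);
        return map.floorKey(key);
    }

    public static void main(String[] args) {
        System.out.println(naturalOrder());
        System.out.println(reverseOrder());
        System.out.println(floorOf("grape"));
    }
}
```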

WeakHashMap

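WeakHashMap is a hash-table-based Map with weak keys. An entry in a WeakHashMap is removed automatically when its key is no longer in ordinary use. More precisely, the presence of a mapping for a given key does not prevent the key from being discarded by the garbage collector, that is, made finalizable, finalized, and then reclaimed. When a key has been discarded its entry is effectively removed from the map, so this class behaves somewhat differently from other Map implementations.
Each key is held through a weak reference: the key-value Entry object itself extends WeakReference and is registered with a ReferenceQueue. A weak reference does not stop the GC from reclaiming the key, and once the key has been reclaimed the Entry is enqueued on the ReferenceQueue.
WeakHashMap suits memory-sensitive application scenarios.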

/**
 * Reference queue for cleared WeakEntries
 */
private final ReferenceQueue<Object> queue = new ReferenceQueue<>();

private static class Entry<K,V> extends WeakReference<Object> implements Map.Entry<K,V> {
    V value;
    final int hash;
    Entry<K,V> next;

    /**
     * Creates new entry.
     */
    Entry(Object key, V value, ReferenceQueue<Object> queue, int hash, Entry<K,V> next) {
        super(key, queue);  // calls the WeakReference constructor
        this.value = value;
        this.hash  = hash;
        this.next  = next;
    }
}

/**
 * Removes from the table the stale entries that have been enqueued
 */
private void expungeStaleEntries() {
    for (Object x; (x = queue.poll()) != null; ) {
        synchronized (queue) {
            @SuppressWarnings("unchecked")
            Entry<K,V> e = (Entry<K,V>) x;
            int i = indexFor(e.hash, table.length);

            Entry<K,V> prev = table[i];
            Entry<K,V> p = prev;
            while (p != null) {
                Entry<K,V> next = p.next;
                if (p == e) {
                    if (prev == e)
                        table[i] = next;
                    else
                        prev.next = next;
                    // Must not null out e.next;
                    // stale entries may be in use by a HashIterator
                    e.value = null; // Help GC
                    size--;
                    break;
                }
                prev = p;
                p = next;
            }
        }
    }
}
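
The weak-key behavior can be observed from client code, with the caveat that garbage collection timing is never guaranteed. In the sketch below only the first part is deterministic: an entry whose key is still strongly reachable must survive a collection. The second part usually shows the entry disappearing, but that depends on the JVM. The class and method names are made up for the example.

```java
import java.util.Map;
import java.util.WeakHashMap;

class WeakKeyDemo {
    // While a strong reference to the key exists, the entry must survive
    // a garbage collection.
    static boolean entrySurvivesWhileKeyIsReachable() {
        Map<Object, String> cache = new WeakHashMap<>();
        Object key = new Object();
        cache.put(key, "value");
        System.gc();                   // only a hint to the JVM
        return cache.containsKey(key); // key is still strongly reachable here
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(entrySurvivesWhileKeyIsReachable());

        Map<Object, String> cache = new WeakHashMap<>();
        cache.put(new Object(), "value"); // no strong reference to the key remains
        System.gc();
        Thread.sleep(100);
        // Often 0 after a collection, but the JVM makes no promise about when
        // (or whether) the key is reclaimed. size() runs expungeStaleEntries().
        System.out.println(cache.size());
    }
}
```

One common pitfall: if a value holds a strong reference to its own key, the key stays reachable through the map and the entry is never cleared.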

That's all for now. Next up should be the collection classes in the concurrent package.

Posted 2017-02-26 23:53 by Katsura
