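HashMap Data Structure (Part 1)

Before JDK 1.8, HashMap was implemented as an array plus linked lists.

Starting with JDK 1.8, HashMap is implemented as an array plus linked lists plus red-black trees, as shown in the figure below: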

 

The HashMap class defines two constants:

static final int TREEIFY_THRESHOLD = 8;
static final int UNTREEIFY_THRESHOLD = 6;

  

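When an element is added to a bin whose linked list already holds at least TREEIFY_THRESHOLD nodes, the list is converted to a red-black tree, provided the table capacity is at least MIN_TREEIFY_CAPACITY (64). With a smaller table, HashMap resizes instead of treeifying.

When a resize splits a tree bin and one of the resulting halves holds UNTREEIFY_THRESHOLD nodes or fewer, that half is converted back to a linked list. Removals can also turn a tree back into a list once it becomes very small.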
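The conversion rules can be sketched as a small decision helper. This is only an illustrative model, not the JDK code: the class BinConversionSketch, the Action enum, and the method names are hypothetical, while the three constants mirror the values defined in java.util.HashMap (MIN_TREEIFY_CAPACITY = 64 is the minimum table capacity for treeification).

```java
// Illustrative model of HashMap's bin conversion rules (hypothetical helper,
// not the actual JDK implementation).
class BinConversionSketch {
    static final int TREEIFY_THRESHOLD = 8;
    static final int UNTREEIFY_THRESHOLD = 6;
    static final int MIN_TREEIFY_CAPACITY = 64;

    enum Action { KEEP_LIST, RESIZE_TABLE, TREEIFY }

    // Decide what happens when a node is appended to a list bin that
    // held nodesBeforeInsert nodes, in a table of the given capacity.
    static Action onInsert(int nodesBeforeInsert, int tableCapacity) {
        if (nodesBeforeInsert < TREEIFY_THRESHOLD) {
            return Action.KEEP_LIST;
        }
        // A small table is resized instead of building a tree.
        return tableCapacity < MIN_TREEIFY_CAPACITY
                ? Action.RESIZE_TABLE
                : Action.TREEIFY;
    }

    // After a resize splits a tree bin, a half that is small enough
    // goes back to a plain linked list.
    static boolean untreeifyAfterSplit(int nodesInSplitBin) {
        return nodesInSplitBin <= UNTREEIFY_THRESHOLD;
    }

    public static void main(String[] args) {
        System.out.println(onInsert(7, 64));        // list stays a list
        System.out.println(onInsert(8, 16));        // table too small: resize
        System.out.println(onInsert(8, 64));        // list becomes a tree
        System.out.println(untreeifyAfterSplit(6)); // tree half becomes a list
    }
}
```

Note that a bin with exactly 7 nodes triggers neither rule, so it stays in whatever form it already has.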

Why is the default value of TREEIFY_THRESHOLD set to 8?

The HashMap source contains the following comment:

    /* Because TreeNodes are about twice the size of regular nodes, we
     * use them only when bins contain enough nodes to warrant use
     * (see TREEIFY_THRESHOLD). And when they become too small (due to
     * removal or resizing) they are converted back to plain bins.  In
     * usages with well-distributed user hashCodes, tree bins are
     * rarely used.  Ideally, under random hashCodes, the frequency of
     * nodes in bins follows a Poisson distribution
     * (http://en.wikipedia.org/wiki/Poisson_distribution) with a
     * parameter of about 0.5 on average for the default resizing
     * threshold of 0.75, although with a large variance because of
     * resizing granularity. Ignoring variance, the expected
     * occurrences of list size k are (exp(-0.5) * pow(0.5, k) /
     * factorial(k)). The first values are:
     *
     * 0:    0.60653066
     * 1:    0.30326533
     * 2:    0.07581633
     * 3:    0.01263606
     * 4:    0.00157952
     * 5:    0.00015795
     * 6:    0.00001316
     * 7:    0.00000094
     * 8:    0.00000006
     * more: less than 1 in ten million
     */

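In other words, when hash codes are well distributed, the number of nodes per bin follows a Poisson distribution. The table lists, for each list length, the probability computed from the Poisson formula. The probability of a bin holding 8 nodes is already tiny, roughly 6 in 100 million.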
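The table in the comment can be reproduced directly from the formula exp(-0.5) * pow(0.5, k) / factorial(k). The following is a minimal sketch. The class name PoissonBinSizes and the method probability are made up for illustration, and the mean of 0.5 is the parameter stated in the comment.

```java
import java.util.Locale;

// Recomputes the bin-size probabilities quoted in the HashMap source comment
// (hypothetical helper class for illustration).
class PoissonBinSizes {
    // Probability that a bin holds exactly k nodes, assuming bin sizes
    // are Poisson-distributed with mean lambda.
    static double probability(int k, double lambda) {
        double factorial = 1.0;
        for (int i = 2; i <= k; i++) {
            factorial *= i;
        }
        return Math.exp(-lambda) * Math.pow(lambda, k) / factorial;
    }

    public static void main(String[] args) {
        // Print one row per list size, in the same layout as the comment.
        for (int k = 0; k <= 8; k++) {
            System.out.printf(Locale.ROOT, "%d: %.8f%n", k, probability(k, 0.5));
        }
    }
}
```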

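From another angle, a red-black tree lookup takes about log2(n) steps on average. With 8 nodes that is 3 steps, whereas a linked list of 8 nodes averages 8/2 = 4, so converting to a tree starts to pay off. For a list of 6 nodes or fewer the average is at most 6/2 = 3, which is already fast, and converting between a list and a tree has a cost of its own. Choosing 6 and 8 also leaves a gap at 7, which keeps a bin that hovers around the threshold from flipping back and forth between list and tree.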
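The arithmetic behind this comparison can be checked with a few lines of code. This is a rough model under the same simplifying assumptions as the paragraph above (n/2 comparisons for a list, log2(n) for a balanced tree). The class LookupCostComparison and its methods are hypothetical names, and the numbers are not measurements of real HashMap performance.

```java
import java.util.Locale;

// Back-of-the-envelope comparison of average lookup cost in a list bin
// versus a tree bin (hypothetical helper class for illustration).
class LookupCostComparison {
    // Rough average number of nodes visited in a linked list of n nodes.
    static double listCost(int n) {
        return n / 2.0;
    }

    // Rough lookup depth in a balanced tree of n nodes: log base 2 of n.
    static double treeCost(int n) {
        return Math.log(n) / Math.log(2);
    }

    public static void main(String[] args) {
        // Compare the two estimates around the thresholds 6 and 8.
        for (int n = 4; n <= 10; n++) {
            System.out.printf(Locale.ROOT, "n=%d list=%.2f tree=%.2f%n",
                    n, listCost(n), treeCost(n));
        }
    }
}
```

Under this model the two estimates are close for small bins, and the list only falls clearly behind as the bin grows past 8 nodes.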

 

The next part walks through the flow of HashMap's put method in detail.

 

posted @ 2018-10-23 15:09  shileishmily