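37. Collection - Set

1. Overview

The java.util.Set interface extends the Collection interface. Its methods are essentially the same as those of Collection: it adds no new functionality, but it imposes a stricter contract. Unlike List, every Set implementation follows some rule that guarantees stored elements are never duplicated.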

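Characteristics of the Set interface:

The Set interface itself does not guarantee the order in which elements are added and retrieved (unordered)
Elements have no index
Elements are unique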
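
The properties listed above can be observed directly. The following is a minimal sketch; the class name SetContractDemo and the helper toSet are invented for illustration and are not part of the JDK.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SetContractDemo {
    // Hypothetical helper: adds every value to a new HashSet and returns it.
    static Set<String> toSet(String... values) {
        Set<String> set = new HashSet<>();
        for (String v : values) {
            set.add(v);
        }
        return set;
    }

    public static void main(String[] args) {
        Set<String> set = new HashSet<>();
        System.out.println(set.add("java")); // true: the element was not present yet
        System.out.println(set.add("java")); // false: the duplicate is rejected
        System.out.println(set.size());      // 1

        // Set has no get(int index); copy into a List when positional access is needed.
        List<String> copy = new ArrayList<>(toSet("a", "b", "a"));
        System.out.println(copy.size());     // 2
    }
}
```

The boolean returned by add is the simplest way to find out whether an element was already in the set.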

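2. HashSet

2.1 Overview

java.util.HashSet is an implementation of the Set interface.

Internally, java.util.HashSet is backed by a java.util.HashMap.

HashSet uses an object's hash value to decide where the element is stored, which gives it good insertion and lookup performance. Element uniqueness relies on two methods: hashCode and equals.

Let's first store some elements in a Set and observe the result: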

import java.util.HashSet;

public class HashSetDemo {
    public static void main(String[] args) {
        // Create the set
        HashSet<String> set = new HashSet<String>();

        // Add elements
        set.add(new String("cba"));
        set.add("abc");
        set.add("bac");
        set.add("cba"); // equal to the first element, so it is not stored again
        // Iterate
        for (String name : set) {
            System.out.println(name);
        }
    }
}

/*
cba
abc
bac
The result shows that the string "cba" is stored only once: a Set does not store duplicate elements.
*/
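
Why are new String("cba") and the literal "cba" treated as one element even though they are two separate objects? The sketch below suggests the answer: HashSet compares by hashCode and equals, not by reference. The class name StringDuplicateDemo and the helper storedOnce are made up for this example.

```java
import java.util.HashSet;
import java.util.Set;

public class StringDuplicateDemo {
    // Hypothetical helper: true when the two strings end up as a single element.
    static boolean storedOnce(String a, String b) {
        Set<String> set = new HashSet<>();
        set.add(a);
        set.add(b);
        return set.size() == 1;
    }

    public static void main(String[] args) {
        String literal = "cba";
        String created = new String("cba");

        System.out.println(literal == created);                       // false: two distinct objects
        System.out.println(literal.equals(created));                  // true: same characters
        System.out.println(literal.hashCode() == created.hashCode()); // true: the hash is computed from the characters
        System.out.println(storedOnce(created, literal));             // true
    }
}
```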

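2.2 The data structure behind HashSet

HashSet stores its data in a hash table.

Before JDK 1.8, the hash table was implemented as [array + linked list]: collisions are resolved with linked lists, and all elements that map to the same bucket are chained together in one array slot (an element equal to one already in the chain is not stored again). When a bucket holds many elements, that is, when many elements hash to the same slot, finding a key by walking the chain node by node is slow.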

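Since JDK 1.8, the hash table is implemented as [array + linked list + red-black tree]. When a chain grows beyond the threshold (8) and the table has at least 64 slots, the chain is converted into a red-black tree, which reduces the worst-case lookup within that bucket from linear to logarithmic time. If the table is still smaller than 64 slots, it is resized instead.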
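
To make the [array + linked list] layout concrete, here is a toy sketch of a chained hash set for strings. It is not the JDK implementation: the class ToyHashSet is hypothetical, the table has a fixed size of 16, null is not supported, and there is no resizing or tree conversion.

```java
public class ToyHashSet {
    // One node of a bucket's singly linked list.
    private static final class Node {
        final String value;
        Node next;

        Node(String value, Node next) {
            this.value = value;
            this.next = next;
        }
    }

    private final Node[] buckets = new Node[16]; // fixed size; the real table grows
    private int size;

    private int indexFor(String value) {
        // Math.floorMod keeps the index non-negative even for negative hash codes.
        return Math.floorMod(value.hashCode(), buckets.length);
    }

    // Returns true if the value was added, false if an equal value was already present.
    public boolean add(String value) {
        int i = indexFor(value);
        for (Node n = buckets[i]; n != null; n = n.next) {
            if (n.value.equals(value)) {
                return false; // same bucket and equal content: reject the duplicate
            }
        }
        buckets[i] = new Node(value, buckets[i]); // prepend to the chain
        size++;
        return true;
    }

    public boolean contains(String value) {
        for (Node n = buckets[indexFor(value)]; n != null; n = n.next) {
            if (n.value.equals(value)) {
                return true;
            }
        }
        return false;
    }

    public int size() {
        return size;
    }

    public static void main(String[] args) {
        ToyHashSet set = new ToyHashSet();
        System.out.println(set.add("Aa")); // true
        System.out.println(set.add("BB")); // true: same hashCode as "Aa", so it shares the bucket
        System.out.println(set.add("Aa")); // false
        System.out.println(set.size());    // 2
    }
}
```

Every lookup walks one chain, so a bucket with many colliding elements degrades toward a linear scan. That is the case the red-black tree is meant to address.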

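2.3 How HashSet stores elements

The red-black tree introduced in JDK 1.8 considerably improves HashMap's performance when many keys collide. If we store objects of a custom class in the set and want them to be unique, the class must override hashCode and equals to define how its instances are compared.

2.4 In code

Create a custom Student class: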

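import java.util.Objects;

public class Student {
    private String name;
    private int age;

    // Constructor used by the test class below
    public Student(String name, int age) {
        this.name = name;
        this.age = age;
    }

    // getters and setters omitted

    @Override
    public boolean equals(Object o) {
        if (this == o)
            return true;
        if (o == null || getClass() != o.getClass())
            return false;
        Student student = (Student) o;
        return age == student.age &&
               Objects.equals(name, student.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, age);
    }

    // Produces the format shown in the output below
    @Override
    public String toString() {
        return "Student [name=" + name + ", age=" + age + "]";
    }
}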

Create a test class:

import java.util.HashSet;

public class HashSetDemo2 {
    public static void main(String[] args) {
        // Create a set that stores Student objects
        HashSet<Student> stuSet = new HashSet<Student>();
        // Add elements
        Student stu = new Student("于谦", 43);
        stuSet.add(stu);
        stuSet.add(new Student("郭德纲", 44));
        stuSet.add(new Student("于谦", 43)); // equal to stu by content: not stored
        stuSet.add(new Student("郭麒麟", 23));
        stuSet.add(stu);                     // same object again: not stored

        for (Student stu2 : stuSet) {
            System.out.println(stu2);
        }
    }
}
Output (HashSet does not guarantee the iteration order):
Student [name=郭德纲, age=44]
Student [name=于谦, age=43]
Student [name=郭麒麟, age=23]

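Summary:

If a class does not override hashCode, the hash value comes from the default Object.hashCode, which is based on object identity (often described as address-based), so two distinct objects with identical contents are treated as different elements.
If a class overrides hashCode, objects with different contents can still produce the same hash value. In that case equals is needed to compare the contents.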
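
Both points of the summary can be checked with a small experiment. This is only a sketch: PlainPoint and ValuePoint are hypothetical classes invented for the comparison. The strings "Aa" and "BB" are a well-known pair with different contents and the same hash code (2112).

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class HashCodeEqualsDemo {
    // No overrides: identity-based hashCode and equals inherited from Object.
    static final class PlainPoint {
        final int x, y;
        PlainPoint(int x, int y) { this.x = x; this.y = y; }
    }

    // Both methods overridden: equality is defined by the field values.
    static final class ValuePoint {
        final int x, y;
        ValuePoint(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof ValuePoint)) return false;
            ValuePoint other = (ValuePoint) o;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    static int plainCount() {
        Set<PlainPoint> set = new HashSet<>();
        set.add(new PlainPoint(1, 2));
        set.add(new PlainPoint(1, 2));
        return set.size();
    }

    static int valueCount() {
        Set<ValuePoint> set = new HashSet<>();
        set.add(new ValuePoint(1, 2));
        set.add(new ValuePoint(1, 2));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(plainCount()); // 2: same contents, but treated as different elements
        System.out.println(valueCount()); // 1: recognized as a duplicate

        // Different contents, same hash code: equals tells them apart.
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        Set<String> strings = new HashSet<>();
        strings.add("Aa");
        strings.add("BB");
        System.out.println(strings.size()); // 2
    }
}
```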

2.5 HashSet source code analysis

Fields and constructor of HashSet

public class HashSet<E> extends AbstractSet<E>
        implements Set<E>, Cloneable, java.io.Serializable {

    // The backing HashMap: HashSet is implemented on top of HashMap
    private transient HashMap<E,Object> map;
    // Dummy value stored for every key in the backing map
    private static final Object PRESENT = new Object();
    /**
     * Constructs a new HashSet;
     * internally this creates a HashMap.
     */
    public HashSet() {
        map = new HashMap<>();
    }
}

The add method of HashSet

public class HashSet{
    //......
    public boolean add(E e) {
       return map.put(e, PRESENT)==null; // delegates to the map; key: the element being added, value: the shared PRESENT object
    }
    //......
}
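
The return value of add follows from the contract of Map.put, which returns the previous value for the key, or null if there was none. The sketch below rebuilds that idea in a few lines; MapBackedSet is a hypothetical class written for illustration, not the JDK source.

```java
import java.util.HashMap;
import java.util.Map;

public class MapBackedSet<E> {
    // Shared placeholder value, mirroring the PRESENT field quoted above.
    private static final Object PRESENT = new Object();
    private final Map<E, Object> map = new HashMap<>();

    // put returns null for a new key and the old value (PRESENT) for an existing one.
    public boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    public boolean contains(E e) {
        return map.containsKey(e);
    }

    public int size() {
        return map.size();
    }

    public static void main(String[] args) {
        MapBackedSet<String> set = new MapBackedSet<>();
        System.out.println(set.add("a")); // true: put returned null
        System.out.println(set.add("a")); // false: put returned the previous PRESENT
        System.out.println(set.size());   // 1
    }
}
```

Reusing a single PRESENT object for every key means the set only pays for the keys; the values take no extra memory per element.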

The put method of HashMap

public class HashMap{
    //......
    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }
    //......
    static final int hash(Object key) { // derives a hash value from the key
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }
    //......
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; // local variable for the hash table; this shows the hash table is a Node[] array
        Node<K,V> p; // local variable for the Node read from the hash table
        int n, i; // n: table length; i: index into the table
        
        if ((tab = table) == null || (n = tab.length) == 0) // the table has not been created yet
            n = (tab = resize()).length; // resize() creates the table (default length 16); its length is assigned to n
        if ((p = tab[i = (n - 1) & hash]) == null) // (n-1)&hash maps the hash to an array index; it equals Math.floorMod(hash, n) because n is always a power of two
            tab[i] = newNode(hash, key, value, null); // the slot is empty: store the new node directly
        else { // the slot already holds a node
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k)))) // same hash and same key (by reference or by equals)
                e = p; // remember the existing node as e
            else if (p instanceof TreeNode) // the bucket is a tree
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else { // otherwise the bucket is a linked list
                for (int binCount = 0; ; ++binCount) { // walk the list
                    if ((e = p.next) == null) { // reached the last node
                        p.next = newNode(hash, key, value, null); // append a new node to the list
                        if (binCount >= TREEIFY_THRESHOLD - 1) // the chain now holds more than 8 nodes
                            treeifyBin(tab, hash); // treeify (treeifyBin resizes instead while the table is shorter than 64)
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k)))) // the visited node has the same hash and an equal key
                        break; // stop: the key already exists
                    p = e; // advance p to the node just visited
                }
            }
            if (e != null) { // the key already exists
                V oldValue = e.value; // read the old value
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value; // overwrite with the new value
                afterNodeAccess(e); // no-op in HashMap; LinkedHashMap overrides it
                return oldValue; // return the old value
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }
}
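
The index computation in putVal can be tried in isolation. The sketch below copies the bit-mixing from the hash method quoted above into a hypothetical class BucketIndexDemo and applies it to the three strings from section 2.1, assuming the default table length of 16.

```java
public class BucketIndexDemo {
    // Same bit-mixing as the hash(Object) method quoted above.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index for a table of length n; n is assumed to be a power of two.
    static int indexFor(Object key, int n) {
        return (n - 1) & spread(key);
    }

    public static void main(String[] args) {
        int n = 16; // default table length
        for (String key : new String[] {"cba", "abc", "bac"}) {
            int hash = spread(key);
            // For a power-of-two n, masking with (n - 1) matches a non-negative modulo.
            System.out.println(key + " -> bucket " + indexFor(key, n)
                    + ", floorMod = " + Math.floorMod(hash, n));
        }
    }
}
```

With a table of 16 slots, "cba" and "abc" both land in bucket 3 and "bac" lands in bucket 5. That is consistent with the order cba, abc, bac printed in section 2.1, although that order is an implementation detail and should not be relied on.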

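3. LinkedHashSet

3.1 Overview

We know that HashSet keeps elements unique, but it stores them in no particular order. What if we also need a predictable order?

HashSet has a subclass, java.util.LinkedHashSet, whose storage structure combines a linked list with a hash table and therefore iterates in insertion order.

In code: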

import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

public class LinkedHashSetDemo {
    public static void main(String[] args) {
        Set<String> set = new LinkedHashSet<String>();
        set.add("bbb");
        set.add("aaa");
        set.add("abc");
        set.add("bbb"); // duplicate: not stored again
        Iterator<String> it = set.iterator();
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
/*
bbb
aaa
abc
*/

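3.2 The LinkedHashSet data structure

Underlying data structure: linked list (ordered) + hash table (unique)

It guarantees both the uniqueness of the elements and their insertion order.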
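
A common use of this combination is removing duplicates from a list while keeping the first occurrence of each element in place. The following is a minimal sketch; the class InsertionOrderDemo and the helper dedupe are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class InsertionOrderDemo {
    // Hypothetical helper: drops duplicates, keeping the first occurrence of each element.
    static List<String> dedupe(List<String> input) {
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    public static void main(String[] args) {
        Set<String> set = new LinkedHashSet<>();
        set.add("bbb");
        set.add("aaa");
        set.add("abc");
        set.add("bbb"); // already present: rejected, and its position does not change
        System.out.println(set); // [bbb, aaa, abc]

        System.out.println(dedupe(Arrays.asList("x", "y", "x", "z", "y"))); // [x, y, z]
    }
}
```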

posted @ 2021-06-27 16:09 火烧云Z
