MySQL中的exists与in的使用

该文章总结各种博客文章，再加上自己的理解形成的。

结论：

子查询表大的用exists，子查询表小的用in。

1、exists的用法。

（1）exists子语句返回的是true或者false，当exists中子语句能查询出任意记录行时返回true，查不到任何记录时返回false。exists子语句查询出来的记录行没有任何用，因此很多时候用select 1就行，只需要判断是否能查询出记录即可。

（2）对于外表查询出来的记录，每条记录都会当作exists的条件去查询exists中的子语句，当exists子语句为false时，该条数据则会被丢弃。因此exists子语句中可以有外表的字段和表名。

例如：

mysql> select * FROM MYSQL.help_keyword where help_keyword_id < 5;
+-----------------+--------------+
| help_keyword_id | name         |
+-----------------+--------------+
|               0 | HELP_DATE    |
|               1 | HELP_VERSION |
|               2 | DEFAULT      |
|               3 | SERIAL       |
|               4 | VALUE        |
+-----------------+--------------+
5 rows in set (0.00 sec)
-- 首先help_keyword表查询出来的5条记录

mysql> select * from mysql.help_keyword hk where hk.help_keyword_id < 5 and exists(select 1 from mysql.help_relation hr where hr.help_keyword_id = hk.help_keyword_id);
+-----------------+--------------+
| help_keyword_id | name         |
+-----------------+--------------+
|               0 | HELP_DATE    |
|               1 | HELP_VERSION |
|               2 | DEFAULT      |
|               3 | SERIAL       |
|               4 | VALUE        |
+-----------------+--------------+
5 rows in set (0.00 sec)
-- 加上exists语句后，每条记录都要当作exists的条件去执行select 1 from mysql.help_relation hr where hr.help_keyword_id = hk.help_keyword_id。如第一条记录hk.help_keyword_id = 0，则查询select 1 from mysql.help_relation hr where hr.help_keyword_id = 0，如下。可以查询出数据，因此exists返回true，因此在help_keyword表中help_keyword_id = 0的这条记录是合法的。
mysql> select 1 from mysql.help_relation hr where hr.help_keyword_id = 0;
+---+
| 1 |
+---+
| 1 |
+---+
1 row in set (0.00 sec)

-- 同样对于剩余四条记录都要执行exists子语句，都能查询出记录，因此这四条记录都是合法的。
-- 若将exists子语句中的条件修改为如下的100000，而不是外表的字段值，则exists子语句永远不能查出记录，exists子语句返回false，因此这五条记录都会被过滤掉。
mysql> select * from mysql.help_keyword hk where hk.help_keyword_id < 5 and exists(select 1 from mysql.help_relation hr where hr.help_keyword_id = 100000);
Empty set (0.00 sec)

（3）not exists和exists正好相反。当子语句能查询出任意记录时，exists返回true，此时not exists就是false；当子语句不能查询出任意记录时，exists返回false，而not exists就是true；

-- 还是上面那个例子，换成not exists后，对于每条外表查询出来的数据，not exists都为false，则外表五条记录都会被过滤掉。
mysql> select * from mysql.help_keyword hk where hk.help_keyword_id < 5 and not exists(select 1 from mysql.help_relation hr where hr.help_keyword_id = hk.help_keyword_id);
Empty set (0.00 sec)

（4）从以上可以看出，外表查询出来的记录条数即为exists子语句的查询次数。

2、in的用法

（1）in和exists一样返回true或者false，用于判断某个或某几个字段是否存在于in的子语句查询记录中。如下：

mysql> select * from mysql.help_keyword hk where hk.help_keyword_id < 5 and hk.help_keyword_id in (select hr.help_keyword_id from mysql.help_relation hr);
+-----------------+--------------+
| help_keyword_id | name         |
+-----------------+--------------+
|               0 | HELP_DATE    |
|               1 | HELP_VERSION |
|               2 | DEFAULT      |
|               3 | SERIAL       |
|               4 | VALUE        |
+-----------------+--------------+
5 rows in set (0.00 sec)

（2）in和exists不一样的地方在于，无论外表查询记录为多少条，in子语句只会查询一次，并将结果缓存起来，然后对于遍历外表查询出来的记录，判断in语句是否成立。如下：

mysql> select hr.help_keyword_id from mysql.help_relation hr; -- 首先会把该条语句查询出来的记录缓存起来
+-----------------+
| help_keyword_id |
+-----------------+
|               0 |
|               1 |
......
|             710 |
|             710 |
+-----------------+
1635 rows in set (0.00 sec)
-- 然后遍历外表查询出来的五条数据，依次判断hk.help_keyword_id是否存在于缓存记录中。

（3）in相当于用or连接的=判断。因此in前面的字段是可以使用索引的。如下key字段都为primarykey。

-- mysql.help_category表有如下索引
mysql> show index from mysql.help_category;
+---------------+------------+----------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table         | Non_unique | Key_name | Seq_in_index | Column_name      | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+---------------+------------+----------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| help_category |          0 | PRIMARY  |            1 | help_category_id | A         |          44 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| help_category |          0 | name     |            1 | name             | A         |          44 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
+---------------+------------+----------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
2 rows in set (0.00 sec)

-- 因此以下两条语句都使用了key_name为name的索引。
mysql> explain select * from mysql.help_category hc where hc.name in ('Contents', 'Help Metadata', 'Data Types');
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | hc    | NULL       | range | name          | name | 192     | NULL |    3 |   100.00 | Using where |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain select * from mysql.help_category hc where hc.name = 'Contents' or hc.name = 'Help Metadata' or hc.name = 'Data Types';
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | hc    | NULL       | range | name          | name | 192     | NULL |    3 |   100.00 | Using where |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

3、exists和in的区别

区别在于：in子语句只查询了一次，但是in的子语句会遍历整张help_relation表（遍历了1635次）；exists子语句会查询五次，但是每次查询都有where条件，where条件中可能会使用到索引，因此每次查询可能不用遍历整张表。例如：

-- 前提假设：假设true
-- 情景1：
-- 假设A表：一共100条记录，有主键字段aid，通过主键aid查询记录只需要遍历索引树1次。
-- 假设B表：一共10000000条记录，有主键bid，字段aid，字段aid设置了索引，假设通过该索引进行查询时平均需要遍历索引树10次。
select * from A where exists (select 1 from B where B.aid = A.aid);
-- exists子语句遍历次数：100 * 10 = 1000，整条语句遍历次数：1000 + 100（判断100次ture or false） = 1100。
select * from A where aid in (select aid from B);
-- in子语句遍历次数：10000000，整条语句遍历次数：10000000 + 1（通过主键aid只需要遍历1次） = 10000001。

-- 情景2：
-- 假设A表：一共10000000条记录，有主键字段aid，通过主键aid查询记录只需要遍历索引树10次。
-- 假设B表：一共100条记录，有主键bid，字段aid，字段aid设置了索引，假设通过该索引进行查询时平均需要遍历索引树2次。
select * from A where exists (select 1 from B where B.aid = A.aid);
-- exists子语句遍历次数：10000000 * 2 = 20000000，整条语句遍历次数：20000000 + 10000000（判断这么多次true or false）
select * from A where aid in (select aid from B);
-- in子语句遍历次数：100，整条语句遍历次数：100 + 10（通过主键aid只需要遍历10次） = 110。

情景1可以发现in语句虽然只查询一次，但是会遍历整张表，当B表很大时，in的效率会非常低，也更占用空间。情景2可以发现当A表很大时，虽然exists子语句只需要遍历2次，但是由于外表很大，导致整个查询过程中exists会遍历很多次，导致exists的效率很低。因此，当子查询表远大于外表时用exists，子查询表远小于外表时用in。无论时外表还是子查询里面的内表都要善用索引。

posted @ 2020-10-18 19:30 yury757 阅读(508) 评论(0) 收藏举报

刷新页面返回顶部

yury757

这个人很懒，什么都没写......

MySQL中的exists与in的使用

MySQL中的exists与in的使用

公告