MySQL之索引

索引（在MySQL中也叫键（key））是存储引擎用于快速找到记录的一种数据结构。

索引类型有：Normal，Unique，FullText。

索引方法有：BTREE、HASH。

我有一个user_info的测试表

里面随机生成了300个姓名

一、索引基础

创建一个普通索引：

mysql> create index myindex on user_info(username);
Query OK, 0 rows affected
Records: 0  Duplicates: 0  Warnings: 0

也可以指定索引的长度：

create index myindex on user_info (username(3));

创建一个唯一索引：

mysql> create unique index uindex on user_info (id);
Query OK, 0 rows affected
Records: 0  Duplicates: 0  Warnings: 0

创建一个全文索引：（InnoDB引擎对全文索引的支持是5.6版本引入的）

mysql> create fulltext index findex on user_info (username);
Query OK, 0 rows affected
Records: 0  Duplicates: 0  Warnings: 1

到目前为止创建了三个索引：

可以看出，默认的索引方法是BTREE。

删除索引：

mysql> drop index findex on user_info;
Query OK, 0 rows affected
Records: 0  Duplicates: 0  Warnings: 0
mysql> drop index uindex on user_info;
Query OK, 0 rows affected
Records: 0  Duplicates: 0  Warnings: 0

关键词解释：

主键：当我们创建主键的时候，同时分配了一个唯一索引，也就是“主索引”，不过与唯一索引不一样的是，主键用关键字Primary 而不是 Unique。

外键：一般外键字段为某个表的主键，关键字为Foreign Key

普通索引：这种索引是最基本的索引，作用就是加快数据的访问速度。

唯一索引：普通索引允许数据列包括重复的数据，而唯一索引不允许。

全文索引：该索引可以用于全文搜索。

二、索引设计的原则

1. 最适合索引的列是出现在where子句后面的列，或连接子句中出现的列。

2. 使用唯一索引。考虑列中值的分布。索引的列基数越大，索引的效果越好。

3. 使用短索引。如果对字符串列进行索引，应该指定一个前缀长度。

4. 利用最左前缀。

5. 不要过度索引。

这里解释一下最左前缀：

我创建了一个组合索引：

mysql> create index nindex on user_info (username(3),password(6));
Query OK, 0 rows affected
Records: 0  Duplicates: 0  Warnings: 0

实际上我相当于建立了两个索引，分别是：

username，password

username

而没有password。

因为这是MySQL组合索引“最左前缀”的结果。简单理解就是只从最左边开始组合。

例如：

mysql> explain select * from user_info where username = '茹芬慧' and password = '123456';
+----+-------------+-----------+------+----------------+---------+---------+-------+------+------------------------------------+
| id | select_type | table     | type | possible_keys  | key     | key_len | ref   | rows | Extra                              |
+----+-------------+-----------+------+----------------+---------+---------+-------+------+------------------------------------+
|  1 | SIMPLE      | user_info | ref  | myindex,nindex | myindex | 767     | const |    1 | Using index condition; Using where |
+----+-------------+-----------+------+----------------+---------+---------+-------+------+------------------------------------+
1 row in set

mysql> explain select * from user_info where password = '123456';
+----+-------------+-----------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table     | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+-----------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | user_info | ALL  | NULL          | NULL | NULL    | NULL |  300 | Using where |
+----+-------------+-----------+------+---------------+------+---------+------+------+-------------+
1 row in set

第一个使用了索引，d第二个没有。

三、语句什么时候执行索引，什么时候不执行？

为什么要写这个，这个标题的意思不是告诉你什么时候用索引，而是告诉你当你使用索引的时候，索引什么时候不干活。

例子1：

mysql> select * from user_info where username like '%茹%';
+-----+----------+----------+------------+
| id  | username | password | mydate     |
+-----+----------+----------+------------+
|  24 | 茹芬慧   | 123456   | 2017-04-28 |
| 144 | 茹瑛炫   | 123456   | 2017-04-28 |
+-----+----------+----------+------------+
2 rows in set

mysql> select * from user_info where username like '茹%';
+-----+----------+----------+------------+
| id  | username | password | mydate     |
+-----+----------+----------+------------+
| 144 | 茹瑛炫   | 123456   | 2017-04-28 |
|  24 | 茹芬慧   | 123456   | 2017-04-28 |
+-----+----------+----------+------------+
2 rows in set

这两个查询语句都查出了结果，可是到底有没有好好干活呢？

mysql> explain select * from user_info where username like '%茹%';
+----+-------------+-----------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table     | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+-----------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | user_info | ALL  | NULL          | NULL | NULL    | NULL |  300 | Using where |
+----+-------------+-----------+------+---------------+------+---------+------+------+-------------+
1 row in set

mysql> explain select * from user_info where username like '茹%';
+----+-------------+-----------+-------+---------------+---------+---------+------+------+-----------------------+
| id | select_type | table     | type  | possible_keys | key     | key_len | ref  | rows | Extra                 |
+----+-------------+-----------+-------+---------------+---------+---------+------+------+-----------------------+
|  1 | SIMPLE      | user_info | range | myindex       | myindex | 767     | NULL |    2 | Using index condition |
+----+-------------+-----------+-------+---------------+---------+---------+------+------+-----------------------+
1 row in set

看吧，第一个扫描了全部行，而第二个就令我们满意了。

mysql> set profiling = 1;
Query OK, 0 rows affected

mysql> select * from user_info where username like '%茹%';
+-----+----------+----------+------------+
| id  | username | password | mydate     |
+-----+----------+----------+------------+
|  24 | 茹芬慧   | 123456   | 2017-04-28 |
| 144 | 茹瑛炫   | 123456   | 2017-04-28 |
+-----+----------+----------+------------+
2 rows in set

mysql> select * from user_info where username like '茹%';
+-----+----------+----------+------------+
| id  | username | password | mydate     |
+-----+----------+----------+------------+
| 144 | 茹瑛炫   | 123456   | 2017-04-28 |
|  24 | 茹芬慧   | 123456   | 2017-04-28 |
+-----+----------+----------+------------+
2 rows in set

mysql> show profiles;
+----------+-----------+----------------------------------------------------+
| Query_ID | Duration  | Query                                              |
+----------+-----------+----------------------------------------------------+
|        1 | 0.0005955 | select * from user_info where username like '%茹%' |
|        2 |  0.000492 | select * from user_info where username like '茹%'  |
+----------+-----------+----------------------------------------------------+
2 rows in set

用了索引的快。

结论一：MySQL能在索引中做最左前缀匹配的LIKE比较，因为该操作可以转换为简单的比较操作，但是！如果是以通配符开头的LIKE查询，就完蛋了。

例子2：

今天重新生成了一下数据，一共798条

我想测试一下group by。

mysql> select count(*),mydate from user_info group by mydate;
+----------+---------------------+
| count(*) | mydate              |
+----------+---------------------+
|      300 | 2017-04-30 10:07:04 |
|      199 | 2017-04-30 10:37:19 |
|      299 | 2017-04-30 10:37:34 |
+----------+---------------------+
3 rows in set

把mydate字段加上索引，并再次执行上面的语句

mysql> create index tindex on user_info (mydate);
Query OK, 0 rows affected
Records: 0  Duplicates: 0  Warnings: 0
mysql> select count(*),mydate from user_info group by mydate;
+----------+---------------------+
| count(*) | mydate              |
+----------+---------------------+
|      300 | 2017-04-30 10:07:04 |
|      199 | 2017-04-30 10:37:19 |
|      299 | 2017-04-30 10:37:34 |
+----------+---------------------+
3 rows in set

最后，比较加上索引与不加索引的区别

mysql> show profiles;
+----------+------------+-----------------------------------------------------------+
| Query_ID | Duration   | Query                                                     |
+----------+------------+-----------------------------------------------------------+
|        1 |  0.0007455 | select count(*),username from user_info group by username |
|        2 |   0.003184 | select count(*),password from user_info group by password |
|        3 |   0.000651 | select * from user_info where password = 'aedd902d'       |
|        4 |  0.0002475 | select count(*) from user_info where group by mydate      |
|        5 |  0.0182485 | select count(*) from user_info group by mydate            |
|        6 |  0.0012275 | select count(*),mydate from user_info group by mydate     |
|        7 | 0.00123675 | select count(*),username from user_info group by username |
|        8 |  0.2569555 | create index tindex on user_info (mydate)                 |
|        9 | 0.00111575 | select count(*),mydate from user_info group by mydate     |
+----------+------------+-----------------------------------------------------------+
9 rows in set

只需要看6和9，虽然索引快了一些，但是基本上可以忽略了，这是因为我mydate字段重复的太多，不利于索引。

结论2：索引的选择性越高则查询效率越高，因为选择性高的索引可以让MySQL在执行的时候过滤掉更多的行。（选择性=不重复的索引值（也称为基数）/数据表的记录总数）

例如我的：

username的选择性为1，效率最高

mydate的选择性为6/798，效率非常差

例子3：

mysql> select * from user_info where id + 1 = 798;
+-----+----------+----------+---------------------+
| id  | username | password | mydate              |
+-----+----------+----------+---------------------+
| 797 | 燕淞可   | 3e68fcdd | 2017-04-30 10:37:34 |
+-----+----------+----------+---------------------+
1 row in set

mysql> select * from user_info where id = 797;
+-----+----------+----------+---------------------+
| id  | username | password | mydate              |
+-----+----------+----------+---------------------+
| 797 | 燕淞可   | 3e68fcdd | 2017-04-30 10:37:34 |
+-----+----------+----------+---------------------+
1 row in set

这两条查询语句代表的意思一样，但是执行起来就一样吗？效率差太多了

mysql> show profiles;
|       11 | 0.00116025 | select * from user_info where id + 1 = 798                |
|       12 |   0.000483 | select * from user_info where id = 797                    |

查看Explain

mysql> explain select * from user_info where id + 1 = 798;
+----+-------------+-----------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table     | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+-----------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | user_info | ALL  | NULL          | NULL | NULL    | NULL |  798 | Using where |
+----+-------------+-----------+------+---------------+------+---------+------+------+-------------+
1 row in set

mysql> explain select * from user_info where id = 797;
+----+-------------+-----------+-------+---------------+---------+---------+-------+------+-------+
| id | select_type | table     | type  | possible_keys | key     | key_len | ref   | rows | Extra |
+----+-------------+-----------+-------+---------------+---------+---------+-------+------+-------+
|  1 | SIMPLE      | user_info | const | PRIMARY       | PRIMARY | 4       | const |    1 | NULL  |
+----+-------------+-----------+-------+---------------+---------+---------+-------+------+-------+
1 row in set

这是因为如果查询的列而不是独立的，就不会使用索引。“独立的列”指的是：不能是表达式的一部分，也不能是函数的一部分。

结论三：查询的列需要独立

例子4：

前缀索引，什么时候用呢。对于BLOB，TEXT或者很长的Varchar，我们就不得不使用前缀索引了，并且MySQL不允许索引这些列的完整长度。

但是！对于前缀索引，多长才合适呢？

首先，计算完整列的选择性：

mysql> select count(distinct password)/count(*) from user_info;
+-----------------------------------+
| count(distinct password)/count(*) |
+-----------------------------------+
| 1.0000                            |
+-----------------------------------+
1 row in set

(⊙o⊙)…这是个失误，都怪我这个生成的都好唯一啊。那就干讲吧。

完整列的选择性 = （目标列不重复数目）/（数据总数）

然后计算前缀的选择性，并且让前缀的选择性接近完整列的选择性。

前缀的选择性 = count(distinct left(【列名】,【长度】))/（数据总数）

前缀的选择性需要去换数值一个一个试的。

等你找到了合适的长度，就可以添加这个索引了

create index yourindex on yourtable (yourcol(yourlength));

不过无法用前缀索引做Group By 和Order By。

结论四：如果需要索引很长的字符列，那就使用前缀索引

例子5：

最近比较忙，未完待续...

posted @ 2017-05-13 20:15 露娜妹阅读(377) 评论(0) 收藏举报

刷新页面返回顶部

露娜妹

MySQL之索引

公告