05【执行计划、SQL语句调优、海量数据插入】

一、获取SQL瓶颈

1.1 执行计划

项目开发中，性能往往都是是我们重点关注的问题，其实很多时候一个SQL往往是整个请求中瓶颈最大的地方，因此我们必须了解SQL语句的执行过程、数据库中是如何扫描表、如何使用索引的、是否命中索引等信息来帮助我们做SQL语句的优化。MySQL提供了explain/desc语句，来显示这条SQL语句的执行计划，执行计划可以帮助我们查看SQL语句的执行情况，我们可以根据反馈的结果来进行SQL的优化。

准备数据

use test;
CREATE TABLE `role`  (
  `id` int(11) NOT NULL,
  `name` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
  PRIMARY KEY (`id`) USING BTREE
) ENGINE = InnoDB CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Dynamic;
CREATE TABLE `user`  (
  `id` int(11) NOT NULL,
  `name` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
  `role_id` int(11) NULL DEFAULT NULL,
  PRIMARY KEY (`id`) USING BTREE
) ENGINE = InnoDB CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Dynamic;
insert into role values(1,'保洁');
insert into role values(2,'保安');
insert into role values(3,'厨师');
insert into user values(1,'zs',1);

1）id 字段

id字段存在三种情况：

1）id相同：id越靠前的表越先执行

explain select * from user u left join role r on u.role_id=r.id;

2）id不同：id越大的表越先执行

explain select * from user u where u.role_id=(select id from role r where r.id=1);

3）id有相同，也有不同：id越大的表越先执行，在id相同的表中，id越靠前的表越先执行

2）select_type 字段

SIMPLE：简单的 select 查询，查询中不包含子查询或者 union

explain select * from user;

PRIMARY：查询条件中包含有子查询时最外层的表（u1）

explain select * from user u1 where u1.id =(select id from user u2 where u2.id=1);

UNION（u2）：使用到union关联时，union关联的表
UNION RESULT（<union1,2>）：使用union时，最终的结果集表

explain select * from user u1 union select * from user u2;

SUBQUERY（u2）：条件子查询中的表

explain select * from user u1 where u1.id =(select id from user u2 where u2.id=1);

SUBQUERY（u2,u3）：条件中的子查询中的表（包括多重层级）

explain select * from user u1 
where u1.name =(
	select name from user u2 where u2.name=(select name from user u3 where u3.name='zs')
);

DEPENDENT SUBQUERY（r1）: 子查询中的条件依赖于外部的查询（r1的条件是u1表中的数据）

explain select * from user u1 where u1.role_id=(select id from role r1 where u1.id=1);

3）table 字段

表示该SQL语句是作用于那张表的，取值为：表名、表别名、衍生表名等。

explain select * from user;
explain select * from user u1;

4）partitions 字段

涉及到分区的表

准备数据：

create table goods_partitions (
	id int auto_increment, 
	name varchar(12),
    primary key(id)
)
partition by range(id)(
		partition p0 values less than(10000),
		partition p1 values less than MAXVALUE
);

查看mysql的物理存储路径：

show variables like '%dir%';

查看物理存储文件，发现多了不同的文件来存储

查看查询语句所使用到的分区：

explain select * from goods_partitions;

explain select * from goods_partitions where id<1000;

5）type 字段

反应一段sql性能指标的重要参数，可以通过此参数来判断是否使用到了索引、是否全表扫描等。

null：代表不访问任何表

explain select 1;

system：表中只有一条记录，并且此表为系统表(一般很少出现)，const类型的特例；

use mysql;			-- 切换到mysql数据库
explain select * from db where host='localhost';

const：通过唯一索引或者是主键查询到的数据（极高）

分别根据name和id查询，发现只有id的type为const。

explain select * from user where id=1;
explain select * from user where name='zs';

给name字段加上唯一索引（必须要是唯一索引，普通索引不行）：

create unique index user_name_unique on user(name);

测试完毕删除唯一索引：

drop index user_name_unique on user;

eq_ref：使用主键的关联查询，并且表中只有一条记录与主表匹配；

explain select * from user u left join role r on u.role_id=r.id;

代表r表使用到了主键来关联查询

ref：通过非唯一索引查询到的数据

创建普通索引：

create index user_name_index on user(name);

查询执行计划：

explain select * from user where name='zs';

测试完毕删除索引：

drop index user_name_index on user;

range：使用索引的范围查询（普通列的范围查询不会是range）

我们执行如下两句sql查看执行计划：

explain select * from user u where u.id>20;				-- 使用索引列进行范围查询
explain select * from user u where u.role_id>20;		-- 使用普通列进行范围查询

给role_id列添加索引，再次执行sql，查看执行计划：

create index user_role_id_index on user(role_id);			
explain select * from user u where u.role_id>20;

测试完毕，删除索引：

drop index user_role_id_index on user;

由于我们的表建立的是逻辑外键，不是真实的外键，因此role_id不是索引字段，如果是真实外键那么该列是索引字段

index：查询的是索引列，遍历了索引树

explain select id from user;

ALL：效率最低,遍历全表

explain select * from user;

查询效率从高到底的取值为：

NULL > system > const > eq_ref > ref > fulltext > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL
system > const > eq_ref > ref > range > index > ALL

tips：一般我们在开发中，应该尽量达到range级别，避免all，最好优化成ref级别。

6）possible_keys 字段

查询语句中，可能应用到的索引，并非实际使用到的索引。实际使用到的索引根据key字段来反应。

7）key 字段

key字段反应sql语句实际使用的索引，为null代表没有使用索引

我们知道MySQL底层是有优化器的，具体是否走索引MySQL会根据优化器来决定，有时候我们指定了索引查询，MySQL优化器认为不走索引效率可能会更高，因此执行计划possible_keys 字段显示走了索引，但实际上key字段为null（没走索引）

8）key_len 字段

表示索引中使用的字节数

查看如下sql的执行计划：

explain select * from user u where id=1;

possible_keys ：PRIMARY，代表此sql语句可能使用到了主键索引。

key：PRIMARY，代表此sql语句使用到了主键索引。

key_len：此主键索引的长度为4个字节（int占4个字节）。

9）ref 字段

表示某表的某个字段引用到了本表的索引字段，并且表（r表）中只有一条记录与主表匹配；

explain select * from user u,role r where u.role_id=r.id;

表示u表的role_id引用了本表（r表）的索引字段（PRIMARY）

使用其他索引列关联表：

create index role_name_index on role(name);		-- 给name列加索引。
explain select * from user u ,role r where u.name=r.name;   -- 使用name列来关联

表示u表的name字段引用了本表（r表）的索引字段（role_name_index）

测试完毕删除索引：

drop index role_name_index on role;

ref值为const：代表使用索引查询

explain select * from user u where u.id=1;

无论是主键索引还是唯一索引或者是普通索引ref的值都为const

create index user_index_name on user(name);  -- 创建普通索引
explain select * from user u where u.name='1';

测试完毕删除索引：

drop index user_index_name on user;

10）rows 字段

根据表统计信息及索引选用情况，大致估算出找到所需的记录或所需读取的行数。

explain select * from user;
explain select * from role;

user表中有1条记录，role表中有3条记录。

11）filtered 字段

表示返回结果的行数占需读取行数的百分比， filtered 的值越大越好。

在user表中，插入一条新的记录，这条记录在role中关联不到，然后使用表管理查询数据

insert into user values(2,'ls','4');
explain select * from user u inner join role r on u.role_id=r.id;

r表实际记录3条，上述sql语句关联查询出来的结果只能得出一条结果集，因此命中率为33.33%。

select * from user u inner join role r on u.role_id=r.id;

12）extra 字段

显示其他扩展信息

Using filesort：排序时无法使用到索引时，就会出现这个。常见于order by和group by语句中。效率低

explain select name from user order by name;

Using temporary：表示SQL语句的操作使用到了临时表。

explain select name from user group by name;

Using index：代表使用到了索引，效率高

create index user_name_index on user(name);			-- 创建索引
explain select name from user order by name;
explain select name from user where name='1';

测试完毕删除索引：

drop index user_name_index on user;

Using where：扫描全表。通常是查询条件中不是索引字段。

explain select * from user where name='zs';

NULL：没有用到额外的附加条件

explain select * from user where id=1;

性能对比：

Using index>NULL>Using where>Using temporary>Using filesort

1.2 SQL监控

1.2.1 show processlist

show processlist：显示连接当前数据库的一些信息，包括连接者的ip、在做什么操作、连接的数据库等。

show processlist;

注意：如果当前用户是root账号，则可以查阅到所有用户的连接信息，否则只能查询到自己的连接信息。

id：用户登录mysql时，系统自动分配的一个用于唯一标识此用户的id号，可以通过connection_id()函数查询

select connection_id();

user：连接mysql数据库的用户，如果不是root用户，那么只显示自己
host：连接mysql数据库用户的ip、端口
db：显示对方连接的数据库，如果没有选择数据库，默认为NULL
command：显示当前连接正在执行的命令，取值如下：
- sleep：表示当前连接并没有做任何操作，处于休眠状态
- query：表示当前连接正在执行查询操作
- connect：表示当前连接正在连接mysql数据库
- Locked：线程正在等待表锁的释放
- Sorting result：线程正在对结果进行排序
- Sending data：线程正在向请求端返回数据
time：显示command的某个状态连续持续了多长时间，单位：秒。
state：显示当前连接执行的sql语句状态。
info：显示当前连接执行的sql语句。

注意：show processlist只会显示100条记录，如果记录超过100条，那么请使用show full processlist

1.2.2 查询SQL消耗时间

数据准备：准备300W的数据做测试。

CREATE TABLE `userinfo` (
  `id` int(10) NOT NULL COMMENT '用户id',
  `username` varchar(50) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '用户名',
  `age` int(3) NULL DEFAULT NULL COMMENT '年龄',
  `phone` varchar(15) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '手机号',
  `gender` char(1) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '性别: ‘0’-男   ‘1’-女',
  `desc` text CHARACTER SET utf8 COLLATE utf8_general_ci NULL COMMENT '自我介绍',
  `register_time` datetime(0) NULL DEFAULT NULL COMMENT '注册时间',
  `login_time` datetime(0) NULL DEFAULT NULL COMMENT '上一次登录时间',
  `pic` varchar(250) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '头像地址',
  `look` int(10) NULL DEFAULT NULL COMMENT '查看数',
  PRIMARY KEY (`id`)
) ENGINE = InnoDB CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Compact;
create procedure test_insert(count int)
begin
		declare i int default 1;
		while i<=count do 
			INSERT INTO userinfo values(
			i,								-- id
			uuid(),							-- username
			CEILING(RAND()*90+10),			-- age
			FLOOR(RAND()*100000000000),		-- phone
			round(FORMAT(rand(),1)),		-- gender
			uuid(),							-- desc
			now(),							-- register_time
			now(),							-- login_time
			uuid(),							-- pic
			CEILING(RAND()*90+10)			-- look
			);
			set i=i+1;
		end while;
end;
start transaction;
call test_insert(5000000);			-- 模拟500W的数据
commit;

注意：插入海量数据时，尽量在同一个事务中，因为频繁的开启、提交事务会不停的消耗锁资源，效率非常低。

show profiles：用于查询SQL语句执行时，具体步骤消耗的时间，以及资源消耗的情况。

show profiles是MySQL5.0版本推出的款新的SQL语句调优工具，可以通过select @@have_profiling来查看当前数据库是否支持show profiles命令。

可以看出我们当前安装的数据库是支持show profiles命令的。

show profiles命令默认是关闭的，我们可以通过命令查看是否开启：

show variables like 'profiling';

开启show profiles命令：

set profiling=1;

我们随便执行几条SQL语句之后执行show profiles命令：

select count(*) from userinfo;
select * from userinfo where phone='26635084791';
select * from userinfo where id=1000;
show profiles;

Query_ID：系统给此SQL语句派发的查询ID号。

Duration：执行SQL语句所消耗的时间。

Query：执行的SQL语句。

根据查询ID号获取具体某条SQL的详细执行情况：

show profile for query query_id;
show profile for query 5;

经排查发现，SQL语句大部分时间都消耗在Sending data阶段了。

Tips：Sending data代表的是执行SQL语句，到服务端完全响应结果的整个过程消耗的时间。而不是单单发送这条SQL语句所消耗的时间。

1.2.3 查询SQL执行频率

在mysql的系统变量中，维护着许多我们执行的记录，包括基本的增、删、改、查，索引、改表操作、执行存储过程等，我们可以查询这些系统变量来翻阅我们执行某些SQL语句的频次。

show status like '%com%';

我们可以通过字符个数筛选我们需要的数据：

show status like 'Com_______';

默认查询的是本次连接执行的次数，可以通过加上global来查询全局的执行情况，即数据库服务器开启到现在执行SQL的频次。

show global status like 'Com_______';

只查询InnoDB存储情况的SQL语句执行情况：

show status like 'Innodb_rows_%';

二、索引使用原则

1.1 索引匹配规则

1.1.1 全值匹配

全值匹配是针对建立了复合索引的索引列，在查询条件中，复合索引的所有列都指定具体的列。

为了提升查询效率，我们把表清空：

truncate userinfo;

创建复合索引：

create index idx on userinfo(username,age,phone,gender);

执行全值匹配SQL：

explain select * from userinfo where username='1' and age=1 and phone='1' and gender='1';

1.1.2 最左前缀法则

最左前缀法则指的是，查询条件中如果包含有复合索引，必须从索引的最左列开始查询。并且不跳过中间列。

测试最左前缀法则（1）：

explain select * from userinfo where age=20;			-- 不走索引
explain select * from userinfo where phone='1';			-- 不走索引
explain select * from userinfo where gender='1';		-- 不走索引
explain select * from userinfo where username='1';		-- 走索引

测试最左前缀法则（2）：

最左匹配原则，必须要匹配最左的索引列，不可跳过最左边索引列

-- 不走索引
explain select * from userinfo where phone='1' and age=20 and gender='1';		
-- 不走索引
explain select * from userinfo where gender='1' and age=20 and phone='1';		
-- 不走索引
explain select * from userinfo where phone='1' and gender=20 and age='1';
-- 命中1个字段(username)
explain select * from userinfo where username='1' and phone='1' and gender='1';
-- 命中2个字段(username、age)
explain select * from userinfo where username='1' and age=1 and gender='1';
-- 命中3个字段(username、age、phone)
explain select * from userinfo where username='1' and age=20 and phone='1';

注意：最左前缀法则指的是条件匹配时，必须匹配上创建索引时最左边的列，跟where指定的条件顺序无关！

-- 命中1个字段(username)
explain select * from userinfo where gender='1' and phone='1' and username='1';
-- 命中2个字段(username、age)
explain select * from userinfo where age=1 and gender='1' and username='1';
-- 命中3个字段(username、age、phone)
explain select * from userinfo where age=1 and phone='1' and username='1';

如果四个值都匹配了，既符合最左前缀法则，也符合全值匹配

-- 全部命中
explain select * from userinfo where username='1' and age=1 and phone='1' and gender='1';

测试完毕删除复合索引：

drop index idx on userinfo;

1.1.3 复合索引底层原理

我们刚刚了解到使用复合索引时，要遵循最左前缀法则，即查询时必须要匹配最左的索引列，不可跳过最左边索引列，但具体是为什么呢？我们必须从复合索引底层的B+Tree说起

假设我们建立一个以gender和age列的复合索引：

create index idx on userinfo(gender,age);

假设我们插入的数据是：

insert into userinfo(gender,age) values
(1,20),(0,18),(0,21),(1,25),(0,26),(1,17),(1,20);

底层B+Tree构建情况：

从复合索引的底层B+Tree上可以看出来gender列是有顺序的，age列是没有顺序的，但是细心观察发现在gender列值相同的情况下，age列是有顺序的；这是因为MySQL在构建复合索引的B+Tree时默认根据最左边的第一个字段进行排序，在第一个字段的基础上然后对第二个字段进行排序，以此类推......；因此我们在查询的过程中如果跳过了最左边的列那么这颗B+Tree是不能起作用的；

测试完毕删除索引：

drop index idx on userinfo;

1.1.4 复合索引的好处

1.1.4.1 使用覆盖索引

假设此时有复合索引（username，age，gender)，再查询如下的SQL：

创建复合索引：

create index idx_name on userinfo(username,age,gender);

执行SQL：

explain select username,age,gender from userinfo where username='' and age=20;

MySQL可以直接通过遍历索引取得数据，而无需回表操作（使用覆盖索引）；

删除复合索引：

drop index idx_name on userinfo;

分别为username、age、gender列创建索引：

create index idx_name on userinfo(username);
create index idx_age on userinfo(age);
create index idx_gender on userinfo(gender);

执行SQL：

explain select username,age,gender from userinfo where username='';
explain select username,age,gender from userinfo where username='' and age=1;
explain select username,age,gender from userinfo where username='' and age=1 and gender='1';

发现使用单列索引多个值匹配时，会造成索引失效；即使用到了idx_name索引做username条件的检索，剩下的数据并没有使用idx_age和idx_gender索引进行数据的检索；

测试完毕删除索引：

drop index idx_name on userinfo;
drop index idx_age on userinfo;
drop index idx_gender on userinfo;

1.1.4.2 减少磁盘开销

我们通过前面分析的复合索引底层原理时不难发现，当我们建立了一列（name，age，gender）复合索引时，其实是建立了（name）、（name、age）、（name、age、gender）三个索引，每增加一个索引都会增加写操作的开销。尤其在大数量的表中，使用复合索引可以大大减少IO的开销，提升效率！

1.1.4.3 查询效率高

索引的列越多，通过索引筛选出来的数据就会越少；假设此时表中有1000W的数据，有如下SQL：select from table where col1=1 and col2=2 and col3=3;，假设每个条件可以筛选出10%的数据；

单列索引：通过第一个索引筛选出100W条数据，然后进行二次筛选查询（不使用索引）剩下10W数据，然后再进行第三次查询（不使用索引），剩下1W数据
复合索引：三次筛选都是用到索引，速度快；

1.2 索引失效

1.2.1 单列索引失效

创建单列索引：

create index idx_name on userinfo(username);

1）失效情况一：

不要在索引列上做任何的操作，否则索引将失效

explain select * from userinfo where username='1';		-- 走索引
explain select * from userinfo where concat(username,'1')='1';	-- 不走索引

2）失效情况二：

字符串类索引在条件查询时，条件值没有加双引号，索引失效

explain select * from userinfo where username='1';			-- 走索引
explain select * from userinfo where username=1; 			-- 不走索引

3）失效情况三：

尽量避免**select ***，以及多余的字段。由于索引树上没有直接的数据，需要回表查询（但比Using where 要快的多）

explain select * from userinfo where username='1';				-- 回表
explain select username,age from userinfo where username='1';	-- 回表
explain select username from userinfo where username='1';		-- 不需要回表

4）失效情况四：

使用不等于（**!=**** 或者**<>**）不会命中索引**

测试SQL：

explain select * from userinfo where username='1';			-- 走索引
explain select * from userinfo where username!='1';			-- 不走索引

5）失效情况五：

like，前通配符会导致索引失效，没有使用通配符或者使用后通配符会使用索引，但是会造成回表

explain select * from userinfo where username like '%1%';		-- 不走索引
explain select * from userinfo where username like '%1';		-- 不走索引
explain select * from userinfo where username like '1%';		-- 走索引,回表
explain select * from userinfo where username like '1';			-- 走索引,回表

tips：Using index condition和NULL都是走了索引，但是会回表查询。

因此一般需要频繁模糊查询并且数据量比较大的情况下，一般会取业务前缀；比如数据前面加某个前缀符。

explain select * from userinfo where username like 'T%1';

我们插入300W条数据，观察一下加前缀和未加前缀的效率：

start transaction;			-- 开启事务(控制在一个事务中，避免频繁开启/提交事务)
		
set unique_checks=0;		-- 关闭唯一性校验
call test_insert(3000000);	-- 插入300W测试数据
commit;						-- 提交事务

执行查询：

-- 索引生效
select * from userinfo where username like 'T%1';
-- 索引失效
select * from userinfo where username like '%1';
-- 索引生效
select * from userinfo where username like 'T%1%';

测试完毕清空表：

truncate userinfo;

6）失效情况六：

is null会走索引，is not null不会走索引。

explain select * from userinfo where username is null;
explain select * from userinfo where username is not null;

7）失效情况七

使用not in会导致索引失效

explain select * from userinfo where username in ('1');
explain select * from userinfo where username not in ('1');

测试完毕删除索引：

drop index idx_name on userinfo;

8）失效情况八：

使用or来拼接条件会导致复合索引失效。

我们知道MySQL要想命中复合索引，必须按照最左前缀法则，回顾最左前缀法则：

create index idx_name_age_phone on userinfo(username,age,phone);	-- 创建复合索引
show index from userinfo; 				-- 查询当前索引有哪些

只有一个复合索引（username、age、phone）；

执行如下SQL语句，分析执行计划

explain select * from userinfo where username='1';				-- 命中一个索引
explain select * from userinfo where username='1' and age=1;	-- 命中两个索引
explain select * from userinfo where username='1' and phone='1';-- 命中一个索引(username)
explain select * from userinfo where age=1 and phone='1';		-- 不符合最左前缀法则

上述条件都是使用and拼接，如果一旦使用了or，索引会立即失效，不管是否符合最左前缀法则。也不管是否复合全值匹配

-- 符合最左前缀法则(不走索引)
explain select * from userinfo where username='1' or age=1;
-- 复合最左前缀法则也复合全值匹配(不走索引)
explain select * from userinfo where username='1' or age=1 or phone='1';

测试完毕删除索引：

drop index idx_name_age_phone on userinfo;

1.2.2 复合索引解决部分单列索引失效问题

当多个列都创建了索引（不是复合索引），同时使用这几个索引列查询时，MySQL只会选择其中一个索引去查询，查询到结果之后，根据另一个索引条件进行条件筛选。

给name、age列创建索引：

create index idx_name on userinfo(username);
create index idx_age on userinfo(age);

测试条件使用到多个单列索引：

explain select * from userinfo where username='1';				-- 走索引
explain select * from userinfo where age=1;						-- 走索引
-- 先走idx_name索引查询，然后根据age筛选剩余的数据(这个时候不走索引)
explain select * from userinfo where age=1 and username='1';

其实这条SQL语句的效率也是非常高的，因为使用了idx_name索引先检索出来一部分数据，这个时候是非常快的，然后剩下的数据再通过age条件筛选，但是通过第一轮的筛选剩下来的数据已经非常少了。此时即使不通过索引筛选剩下的数据也是非常快的，除非第一轮筛选过后的的数据量还是非常多（十万、百万），那么第二次不使用索引筛选效率就会比较低了。

针对于上述情况，我们可以根据username和age两列建立复合索引，来解决这个问题。

建立复合索引之前删除前面的两个索引（不删除不行，因为idx_name索引先创建的，如果不删那么会默认选择idx_name索引来查询数据，而不会使用后面创建的复合索引）：

drop index idx_name on userinfo;
drop index idx_age on userinfo;

建立复合索引：

create index idx_name_age on userinfo(username,age);

执行SQL：

explain select * from userinfo where age=1 and username='1';

1.3 ICP索引条件下推

1.3.1 什么是ICP

ICP（Index Condition Pushdown）：即索引条件下推；是MySQL 5.6版本中的新特性，是一种在存储引擎层使用索引过滤数据的一种优化方式。不支持ICP之前，当进行索引查询时，首先根据索引来查找数据，然后再根据where条件来过滤（没有用到索引），扫描了大量不必要的数据，增加了数据库IO操作。

索引条件下推是对使用索引从表中检索行的一种优化。

不使用icp：存储引擎先根据索引查询然后去表（聚集索引）中去定位所需要的行，并将其返回给MySQL服务器，然后服务器进行where条件的过滤。
使用icp：如果where的部分列可以使用索引中的列来过滤，则MySQL服务器会将这部分条件下推到存储引擎，存储引擎使用索引条目来计算已推入的索引条件，只有满足这个条件，才从表中读取行。从而减少存储引擎访问表的次数和mysql服务器访问存储引擎的次数

1.3.2 没有启用ICP

使用二级索引查询到记录（假设有4行数据），之后回表读取完整的记录行（4行），然后MySQL服务器在返回的记录行中根据where条件进行过滤（在4行数据中进行过滤），最终得到数据结果集；

Tips：没有启用ICP的情况下，第一次根据二级索引查询出来的数据，第二次查询没有使用到索引，效率低；

1.3.3 启用了ICP：

使用二级索引查询到记录（4条记录），然后再根据二级索引的列在这4条数据里面进行筛选出符合条件的数据（条件下推到存储引擎），最终得到数据结果集；

Tips：开启了ICP之后，其他列的条件不能通过索引来检索数据时，MySQL服务器会将条件下推到存储引擎再次进行条件过滤，最终命中一条结果集；

1.3.4 ICP案例

开启ICP：

MySQL是默认开启ICP的，我们可以通过优化器开关参数查看ICP的状态：

-- 查询优化器相关参数
show variables like 'optimizer_switch';
-- 关闭ICP
set optimizer_switch='index_condition_pushdown=off'
-- 开启ICP
set optimizer_switch='index_condition_pushdown=on'

案例SQL：

CREATE TABLE `t_emp`  (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `last_name` varchar(255) CHARACTER SET latin1 COLLATE latin1_swedish_ci NULL DEFAULT NULL COMMENT '姓',
  `first_name` varchar(255) CHARACTER SET latin1 COLLATE latin1_swedish_ci NULL DEFAULT NULL COMMENT '名',
  PRIMARY KEY (`id`) USING BTREE,
  INDEX `idx_last`(`last_name`) USING BTREE,
  INDEX `idx_first`(`first_name`) USING BTREE			-- 两个索引
) ENGINE = InnoDB AUTO_INCREMENT = 9 CHARACTER SET = latin1 COLLATE = latin1_swedish_ci ROW_FORMAT = Dynamic;
INSERT INTO `t_emp` VALUES (1, 'li', 'ming');
INSERT INTO `t_emp` VALUES (2, 'wang', 'gang');
INSERT INTO `t_emp` VALUES (3, 'zhang', 'xin');
INSERT INTO `t_emp` VALUES (4, 'wang', 'xiaoxiao');
INSERT INTO `t_emp` VALUES (5, 'chen', 'fei');
INSERT INTO `t_emp` VALUES (6, 'wang', 'xie');
INSERT INTO `t_emp` VALUES (7, 'zhang', 'long');
INSERT INTO `t_emp` VALUES (8, 'wang', 'jun');

开启ICP的执行计划：

desc select * from t_emp where last_name='wang' and first_name like 'gang%';

关闭ICP的执行计划：

set optimizer_switch='index_condition_pushdown=off';
desc select * from t_emp where last_name='wang' and first_name like 'gang%';

三、SQL语句优化

3.1 海量数据插入优化

3.1.1 load 加载数据

当我们有海量数据需要插入时（几百万、几千万），我们可以借助load命令，帮我们加载数据

准备表：

CREATE TABLE `userinfo_1` (
  `id` int(10) NOT NULL COMMENT '用户id',
  `username` varchar(50) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '用户名',
  `age` int(3) NULL DEFAULT NULL COMMENT '年龄',
  `phone` varchar(15) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '手机号',
  `gender` char(1) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '性别: ‘0’-男   ‘1’-女',
  `desc` text CHARACTER SET utf8 COLLATE utf8_general_ci NULL COMMENT '自我介绍',
  `register_time` datetime(0) NULL DEFAULT NULL COMMENT '注册时间',
  `login_time` datetime(0) NULL DEFAULT NULL COMMENT '上一次登录时间',
  `pic` varchar(250) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '头像地址',
  `look` int(10) NULL DEFAULT NULL COMMENT '查看数',
  PRIMARY KEY (`id`)
) ENGINE = InnoDB CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Compact;

查看原始数据：

使用load命令加载数据：

load data local infile '/root/orderly.txt' into table userinfo_1 fields terminated by ',' lines terminated by '\n';

/root/orderly.sql：要加载的数据脚本

,：数据的分隔符

\n：数据的换行符

从上面可以看出，插入100W的数据只需要7.2秒左右的时间。

3.1.2 顺序优化

InnoDB存储引擎的按照主键的顺序来排列行的，如果主键的顺序大量不一致，插入数据时需要频繁更新索引结构，影响插入效率。因此我们可以将数据按照主键的顺序进行排列，来提高插入速度。需要注意的是，如果表中没有主键，InnoDB会自动创建一个默认的内部列作为主键

刚刚在上面看到了顺序插入时，100W数据的插入时间为7.2秒左右，我们重新创建一张表，测试插入100W无序的数据所消耗的时间。

先查看两个数据文件的大小是否一致：

完全一致；

创建表：

CREATE TABLE `userinfo_2` (
  `id` int(10) NOT NULL COMMENT '用户id',
  `username` varchar(50) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '用户名',
  `age` int(3) NULL DEFAULT NULL COMMENT '年龄',
  `phone` varchar(15) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '手机号',
  `gender` char(1) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '性别: ‘0’-男   ‘1’-女',
  `desc` text CHARACTER SET utf8 COLLATE utf8_general_ci NULL COMMENT '自我介绍',
  `register_time` datetime(0) NULL DEFAULT NULL COMMENT '注册时间',
  `login_time` datetime(0) NULL DEFAULT NULL COMMENT '上一次登录时间',
  `pic` varchar(250) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '头像地址',
  `look` int(10) NULL DEFAULT NULL COMMENT '查看数',
  PRIMARY KEY (`id`)
) ENGINE = InnoDB CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Compact;

原始数据：

主键无序

执行load命令：

load data local infile '/root/disorder.txt' into table userinfo_2 fields terminated by ',' lines terminated by '\n';

同样的数据量，顺序排列的数据插入时间为7.6秒，无序排列插入的数据为46.73秒。

3.1.3 锁优化

在海量数据插入时，保证插入的数据在同一个事务，或者改为手动提交事务。否则MySQL将每一个插入操作都包含这三个操作：开启事务、执行插入、提交事务。频繁的开启锁、释放锁，给性能上带来的极大的消耗。

清空userinfo表：

truncate userinfo;

使用我们前面创建的存储过程插入100W条记录（保证在同一个事务）：

start transaction;
call test_insert(1000000);
commit;

花费1分03秒。

直接执行存储过程，不在同一个事务（频繁开启、释放锁、记录事务日志等）：效率极低

call test_insert(1000000);

花费时间超高，有兴趣的可以测试一下；

3.1.4 唯一性校验

在MySQL插入时，会有数据的唯一性校验操作，我们在导入海量数据时，关闭唯一性校验，提高插入速度。

我们先将表清空：

truncate userinfo;

关闭唯一性校验：

show variables like "UNIQUE_CHECKS";
SET UNIQUE_CHECKS=0;

调用存储过程插入100W数据：

start transaction;
call test_insert(1000000);
commit;

花费24.32秒。

3.1.5 insert优化

我们在插入的时候，尽量把多个insert语句优化成一个insert语句进行插入，这样可以有效提高插入的速度

insert into xx values(col1,col2);
insert into xx values(col1,col2);
insert into xx values(col1,col2);
优化成:
insert into xx values(col1,col2),(col1,col2),(col1,col2);

我们做个测试：

准备两个SQL文件（各100W数据），一个是优化前的，一个是优化后的，执行优化后的sql语句导入：

mysql -uroot -padmin < /root/test.sql

执行test.sql脚本（优化后的），导入100W数据，大约花了14s。

执行优化前的SQL：

mysql -uroot -padmin < /root/test2.sql

有兴趣的可以测试一下，速度超慢

3.2 语句优化

3.2.1 排序优化

1）索引优化：

在排序时，尽量使用索引字段进行排序，否则会采用文件排序（filesort），效率低。

根据普通字段排序：

show index from userinfo;								-- 查询当前表有多少索引
explain select * from userinfo order by username;		-- 根据普通字段排序

给排序字段创建索引：

create index idx_name on userinfo(username);							-- 创建索引
explain select * from userinfo order by username;						-- filesort
explain select username from userinfo order by username;				-- Using index
explain select username,age from userinfo order by username;			-- filesort

根据索引字段排序，之后查询的数据必须是索引数上的数据，不可查询额外字段，更不能查询全表字段

2）算法优化：

MySQL在排序时，如果不能够借助索引直接完成排序，那么将会使用文件排序（filesort）。如果使用了filesort，那么MySQL会将数据在内存中进行排序，排序内存由系统变量sort_buffer_size控制。默认为256KB。

注意，排序缓冲区是每个线程是独享的。

MySQL的排序算法分为两种：

多扫描排序：首先根据排序条件取出排序字段的行指针信息，然后在排序缓冲区（sort_buffer_size）中进行排序。排序完毕之后会根据排序缓冲区中的行指针回表查询。操作磁盘次数多（两次），效率较低。
单扫描排序：根据条件取出所有字段的信息（不仅仅是排序字段），然后在排序缓冲区中进行排序，排序完毕之后直接将结果集返回。这一步对排序缓冲区要求比较大，但排序效率高。

1、如果排序缓冲区大小不足，那么则会采用临时表（temporary table）存储排序结果。之后临时表的行指针信息重新回表查询记录。效率低

2、每个线程都有自己独自的排序缓冲区，如果排序缓冲区设置过大，会浪费内存。

那MySQL到底采用哪种排序算法？

MySQL4.1版本之前只有多扫描排序算法，单扫描排序是MySQL4.1版本推出的新排序算法，用于优化多扫描排序。MySQL主要根据系统变量max_length_for_sort_data的大小和此次Query语句所取出的所有字段类型大小之后对比，如果max_length_for_sort_data大，则使用单扫描排序，反之使用多扫描排序。

查询max_length_for_sort_data的默认值：

select @@max_length_for_sort_data;

max_length_for_sort_data默认1KB。

3.2.2 分组优化

在MySQL中group by语句会触发一次默认的order by排序操作，造成不必要的性能浪费。我们可以手动的禁止分组操作带来的排序操作。

测试：

CREATE TABLE `student`  (				-- 学生表
  `id` int(11) NOT NULL,
  `name` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
  `age` int(11) NULL DEFAULT NULL,
  `c_id` int(11) NULL DEFAULT NULL,		-- 班级id
  PRIMARY KEY (`id`) USING BTREE
) ENGINE = InnoDB CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Dynamic;
CREATE TABLE `class`  (					-- 班级表
  `id` int(11) NOT NULL,
  `name` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
  PRIMARY KEY (`id`) USING BTREE
) ENGINE = InnoDB CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Dynamic;
INSERT INTO `test`.`class`(`id`, `name`) VALUES (1, 'Java01');
INSERT INTO `test`.`class`(`id`, `name`) VALUES (2, 'Java02');
INSERT INTO `test`.`class`(`id`, `name`) VALUES (3, 'Java03');
INSERT INTO `test`.`student`(`id`, `name`, `age`, `c_id`) VALUES (1, '小红', 28, 3);
INSERT INTO `test`.`student`(`id`, `name`, `age`, `c_id`) VALUES (2, '小军', 25, 1);
INSERT INTO `test`.`student`(`id`, `name`, `age`, `c_id`) VALUES (3, '小明', 24, 2);
INSERT INTO `test`.`student`(`id`, `name`, `age`, `c_id`) VALUES (4, '小龙', 19, 1);
INSERT INTO `test`.`student`(`id`, `name`, `age`, `c_id`) VALUES (5, '小蓝', 29, 3);
INSERT INTO `test`.`student`(`id`, `name`, `age`, `c_id`) VALUES (6, '小刚', 25, 2);

根据班级id（c_id）计算每个班级平均年龄：

mysql> select avg(age),c_id from student group by c_id;
+----------+------+
| avg(age) | c_id |
+----------+------+
|  22.0000 |    1 |
|  24.5000 |    2 |
|  28.5000 |    3 |
+----------+------+
3 rows in set (0.00 sec)
mysql>

发现默认根据c_id排序了。

如果我们自己手动去除排序结果应该是这样的：

mysql> select avg(age),c_id from student group by c_id order by null;
+----------+------+
| avg(age) | c_id |
+----------+------+
|  28.5000 |    3 |
|  22.0000 |    1 |
|  24.5000 |    2 |
+----------+------+
3 rows in set (0.00 sec)
mysql>

查询如下两个SQL的执行计划：

explain select avg(age),c_id from student group by c_id;
explain select avg(age),c_id from student group by c_id order by null;

发现每次的group by 分组操作都会触发一次默认的排序操作，如果我们没有这样的需求，无疑是增加了SQL语句的响应时间

3.2.3 分页优化

通常使用分页查询是来提高我们的查询效率的，因为通常用户不希望一下子查询到那么多的数据，一般是查询前几条数据，此时的分页效率会比较高。但是在分页查询中也会经常遇到一个非常麻烦的问题，那就是limit N,10，即前面跳过N多条记录，只查询N多条记录的后面几条记录。如果N的值非常大，那么效率必然就会很低。

此时数据库有300W记录：

select count(*) from userinfo;

假设我们需要查询2900000-2900010记录：

explain select * from userinfo limit 2900000,10;			-- 获取sql的执行计划
select * from userinfo limit 2900000,10;					-- 执行sql,查看消耗的时间

从执行计划可以看出，进行了全表扫描，花费的时间为：

显然，效率非常低。

1）优化一

我们可以借助索引，在索引上面排序，然后通过索引关联表查询。

explain select * from userinfo u1,(select id from userinfo order by id limit 2900000,10) t where u1.id=t.id;

查看执行消耗时间：

2）优化二

如果id是顺序排列的话，我们可以先根据id进行排序，然后取后面10条。

explain select * from userinfo where id>2900000 limit 10;

查看执行消耗时间：

3.2.4 表优化

3.2.4.1 数据空洞

当我们对数据库表里面的数据进行删除（delete）时，这些行只是被标记为“已删除”，而不是真的从磁盘中物理删除了，因而空间也没有真的被释放回收。而这些被标记的行就是数据空洞。

我们进入/var/lib/mysql/${db_name}目录下查看当前数据库的一些磁盘存储信息；

300W记录大概花了536M的空间。

我们删除200W数据再次查看磁盘占用：

start transaction;							-- 控制在一个事务中(效率高)
delete from userinfo where id > 1000000;
commit;

再次查看磁盘占用空间，发现还是536M。并没有释放空间。

查看表的数据空洞信息：

mysql> show table status like 'userinfo'\G;
*************************** 1. row ***************************
           Name: userinfo
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 2893050
 Avg_row_length: 184
    Data_length: 533725184
Max_data_length: 0
   Index_length: 50937856
      Data_free: 4194304
 Auto_increment: NULL
    Create_time: 2020-05-13 18:39:36
    Update_time: 2020-05-13 18:38:39
     Check_time: NULL
      Collation: utf8_general_ci
       Checksum: NULL
 Create_options: row_format=COMPACT
        Comment: 
1 row in set (0.00 sec)

Index_length：索引的长度
Data_free：碎片的数量

3.2.4.2 optimize优化表

当表中存在有大量的数据空洞时，我们可以使用optimize命令来优化表。即删除数据空洞。

1）InnoDB表优化

InnoDB表优化后会做一个重新构建索引+分析的一个过程。因为我们知道InnoDB引擎将数据与索引是放在一个文件中的，名为.idb，对数据进行整理后，必定会对索引造成影响。而MyISAM是将数据与索引文件分开存储的（.MYD、.MYI），因此MyISAM整理空洞不会对索引造成影响。

执行表优化操作：

optimize table userinfo;

InnoDB表优化后，还会对表中的索引进行重构分析。花费时间比myisam表要多一点。

再次查看表，发现占用磁盘空间变小。

2）MyISAM表优化

创建一张MyISAM表：

CREATE TABLE `userinfo_myisam` (
  `id` int(10) NOT NULL COMMENT '用户id',
  `username` varchar(50) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '用户名',
  `age` int(3) NULL DEFAULT NULL COMMENT '年龄',
  `phone` varchar(15) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '手机号',
  `gender` char(1) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '性别: ‘0’-男   ‘1’-女',
  `desc` text CHARACTER SET utf8 COLLATE utf8_general_ci NULL COMMENT '自我介绍',
  `register_time` datetime(0) NULL DEFAULT NULL COMMENT '注册时间',
  `login_time` datetime(0) NULL DEFAULT NULL COMMENT '上一次登录时间',
  `pic` varchar(250) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '头像地址',
  `look` int(10) NULL DEFAULT NULL COMMENT '查看数',
  PRIMARY KEY (`id`)
) ENGINE = MyISAM CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Compact;

编写存储过程，批量插入300W数据：

CREATE PROCEDURE `test_myisam`(count int)
begin
		declare i int default 1;
		while i<=count do 
			INSERT INTO userinfo_myisam values(
			i,
			uuid(),
			CEILING(RAND()*90+10),
			FLOOR(RAND()*100000000000),
			round(FORMAT(rand(),1)),
			uuid(),
			now(),
			now(),
			uuid(),
			CEILING(RAND()*90+10)
			);
			set i=i+1;
		end while;
end
call test_myisam(3000000);			-- 插入300W数据

删除200W数据：

delete from userinfo_myisam where id>1000000;

执行表优化：

optimize table userinfo_myisam;

花费2.65s，比InnoDB效率高，因为InnoDB需要重新构建索引。

再次查看表磁盘占用情况，发现占用空间减少。

Tips：optimize优化命令会进行锁表操作；

3.2.5.3 临时表

MySQL临时表在很多场景中都会用到，比如用户自己创建的临时表用于保存临时数据，以及MySQL内部在执行复杂SQL时，需要借助临时表进行分组、排序、去重等操作。临时表的作用仅限于本次会话，等连接关闭后重新打开连接临时表将不存在。

临时表类型：

外部临时表：通过create temporary table语句创建的临时表，在创建时可以手动指定临时表的存储引擎

create temporary table temp_table(
	id int,
	name varchar(10)
) ENGINE = InnoDB;
insert into temp_table values (1,'1');
select * from temp_table ;

内部临时表：通常在执行复杂SQL语句时，比如group by，distinct，union等语句，执行计划中如果包含Using temporary，那么MySQL内部将使用自动生成的临时表，以辅助SQL的执行。

explain select username from userinfo group by username;
explain select distinct age from userinfo ;
explain select * from userinfo union select * from userinfo;

临时表相关系统参数变量

show variables like '%tmp%';

tmpdir：临时表存储路径
tmp_table_size：内部临时表内存最大值，超过此值将转移到磁盘中存储。
max_temp_tables：外部临时表内存最大值，超过此值将转移到磁盘中存储。
default_tmp_storage_engine：外部临时表默认采用的存储引擎。
internal_tmp_disk_storage_engine：内部临时表默认采用的存储引擎

临时表相关状态参数

show status like '%tmp%';

Created_tmp_disk_tables：MySQL创建内部磁盘临时表次数。
Created_tmp_tables：MySQL创建内部临时表次数。
Created_tmp_files：创建的临时表文件数（系统内部维护，不用我们管）。

3.3 SQL Hint

SQL Hint也叫SQL提示，是帮助我们优化SQL的一个重要的工具。Hint 作为一种 SQL 补充语法，在关系型数据库中扮演着非常重要的角色。它允许用户通过相关的语法影响 SQL 的执行方式，对 SQL 进行特殊的优化。

1）use index & force index

当SQL语句使用索引列查询时，可能针对这个列创建了多个索引，我们可以让MySQL强制使用某个索引，不考虑其他索引。

创建两个索引：

show index from userinfo;								-- 查看索引
create index idx_name on userinfo(username);			-- 给username创建单列索引
create index idx_name_age on userinfo(username,age);	-- 给username、age创建复合索引

执行如下SQL，观察执行计划：

-- 使用先创建的那个索引(idx_name)
explain select * from userinfo where username='1';
-- 强制使用索引idx_name
explain select * from userinfo use index(idx_name) where username='1';
-- 强制使用索引idx_name_age
explain select * from userinfo use index(idx_name_age) where username='1';

使用use、force都可以让MySQL强制使用某个索引，效果是一样的：

-- 强制使用idx_name索引
explain select * from userinfo force index(idx_name) where username='1';
-- 强制使用idx_name_age索引
explain select * from userinfo force index(idx_name_age) where username='1';

2）ignore index

和use index作用相反，让MySQL忽略某个索引：

-- 忽略idx_name_age索引
explain select * from userinfo ignore index(idx_name_age) where username='1';
-- 忽略idx_name索引
explain select * from userinfo ignore index(idx_name) where username='1';

3）sql_cache

让此sql结果集存入查询缓存中（前提是开启了查询缓存），默认是关闭的，需要我们开启。

查询当前MySQL服务器是否开启查询缓存：（我已经提前开启）

select @@query_cache_type;

查询缓存使用情况：

show global status like '%qcache%';

执行SQL：

select SQL_CACHE * from student; 				-- 将SQL的查询结果放入查询缓存
select SQL_CACHE * from class; 					-- 将SQL的查询结果放入缓存

4）sql_no_cache

和sql_cache相反，查询结果不放入缓存；

select SQL_NO_CACHE * from student;

5）high_priority

用于提升语句的优先级。我们都知道在MyISAM引擎中，写的操作权限远远比读的要大多了，即使读请求先到达执行队列，写请求后到达，那么MySQL也是默认有限执行写请求，如果在大量的写请求来到MySQL服务器时，会造成读请求不可用（排它锁），因此我们可以设置语句的优先级；

查看当前MySQL是否有降低写的优先级：

select @@LOW_PRIORITY_UPDATES;

0（OFF）：表示没有降低写的优先级（写优先级更高）；

1（NO）：代表降低了写的优先级（读的优先级更高）；

可以通过SQL语句改变MySQL写的优先级：

set LOW_PRIORITY_UPDATES=1;

但是修改之后是本次会话全部的写优先级都修改了（如果想修改全局的加上global）。

我们可以通过high_priority命令来提升本次SQL语句的优先级

select high_priority * from userinfo;							-- 提升本次查询操作的优先级
insert high_priority into userinfo(id,username) values(1,'1');	-- 提升本次插入操作的优先级

update high_priority userinfo set username='2' where id=1;		-- 语法错误	
delete high_priority from userinfo where id=1;					-- 语法错误

注意：只有查询（select）和插入（insert）语句才可以提高优先级，修改（delete、update）语句不行。

6）low_priority

降低SQL语句的优先级，只有修改（delete、update、insert）语句才可以降低优先级，查询语句是不能降低优先级的。

update low_priority userinfo set username='1' where id=1;		-- 降低本次修改操作的优先级
delete low_priority from userinfo where id=1;					-- 降低本次删除操作的优先级
insert low_priority into userinfo(id,username) values(2,'1');	-- 降低本次插入操作的优先级
select low_priority from userinfo where id=1;					-- 语法错误

7）insert_delayed

SQL的延时插入；当执行完插入语句时，MySQL并不是立即插入，而是将数据存储在内存队列中，等到MySQL进程有空余时间时再插入，其次这些SQL被集中在一起，执行这些集中的SQL，比执行多个独立的SQL效率要高很多，在实时性要求不高的场景下会给MySQL带来一定的性能提升。

insert delayed into userinfo(id,username) values(3,'3');

8）straight_join

强制指定表的连接顺序，我们知道，所有的SQL语句都会经过MySQL的优化器，进行优化，比如，如下的SQL语句：

explain select * from class c join student s on s.c_id=c.id;
explain select * from student s join class c on c.id=s.c_id;

在上述的内连接中，MySQL的优化器根据"小表驱动大表"的原则，始终选择c表作为驱动表。

我们可以使用straight_join来强制指定驱动表。

explain select * from student s straight_join class c on s.c_id=c.id;
explain select * from class c straight_join student s on s.c_id=c.id;

强制指定左边的表为驱动表

注意：left join 和 right join已经指定了左/右表为驱动表了，因此不能使用straight_join来指定，straight_join一般是应用于内连接查询需要我们来指定驱动表的应用场景。

9）sql_buffer_result

强制将SQL语句的结果集放入内部临时表。

查询当前临时表数量：

mysql> show status like 'Created_tmp_tables';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| Created_tmp_tables | 6     |
+--------------------+-------+
1 row in set (0.00 sec)

执行SQL语句：

explain select sql_buffer_result * from userinfo;

再次查询临时表数量：

mysql> show status like 'Created_tmp_tables';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| Created_tmp_tables | 7     |
+--------------------+-------+
1 row in set (0.00 sec)

10）sql_big_result & sql_small_result

sql_small_result：告诉优化器结果集会很小，可以将结果集放在内存中的临时表；
sql_big_result：告诉优化器结果集会很大，建议使用磁盘临时表做排序操作；

以上两步操作是我们提示优化器，并不是让优化器强制执行；

explain select sql_small_result username from userinfo group by username;
explain select sql_big_result username from userinfo group by username;
explain select username from userinfo group by username;

posted @ 2023-12-08 21:35 绿水长流*z 阅读(38) 评论(0) 收藏举报

刷新页面返回顶部

緑水長流*z

05【执行计划、SQL语句调优、海量数据插入】

一、获取SQL瓶颈

1.1 执行计划

1）id 字段

2）select_type 字段

3）table 字段

4）partitions 字段

5）type 字段

6）possible_keys 字段

7）key 字段

8）key_len 字段

9）ref 字段

10）rows 字段

11）filtered 字段

12）extra 字段

1.2 SQL监控

1.2.1 show processlist

1.2.2 查询SQL消耗时间

1.2.3 查询SQL执行频率

二、索引使用原则

1.1 索引匹配规则

1.1.1 全值匹配

1.1.2 最左前缀法则

1.1.3 复合索引底层原理

1.1.4 复合索引的好处

1.1.4.1 使用覆盖索引

1.1.4.2 减少磁盘开销

1.1.4.3 查询效率高

1.2 索引失效

1.2.1 单列索引失效

1）失效情况一：

2）失效情况二：

3）失效情况三：

4）失效情况四：

5）失效情况五：

6）失效情况六：

7）失效情况七

8）失效情况八：

1.2.2 复合索引解决部分单列索引失效问题

1.3 ICP索引条件下推

1.3.1 什么是ICP

1.3.2 没有启用ICP

1.3.3 启用了ICP：

1.3.4 ICP案例

三、SQL语句优化

3.1 海量数据插入优化

3.1.1 load 加载数据

3.1.2 顺序优化

3.1.3 锁优化

3.1.4 唯一性校验

3.1.5 insert优化

3.2 语句优化

3.2.1 排序优化

1）索引优化：

2）算法优化：

3.2.2 分组优化

3.2.3 分页优化

1）优化一

2）优化二

3.2.4 表优化

3.2.4.1 数据空洞

3.2.4.2 optimize优化表

1）InnoDB表优化

2）MyISAM表优化

3.2.5.3 临时表

3.3 SQL Hint

1）use index & force index

2）ignore index

3）sql_cache

4）sql_no_cache

5）high_priority

6）low_priority

7）insert_delayed

8）straight_join

9）sql_buffer_result

10）sql_big_result & sql_small_result

公告