【MySQL】order by引起的慢查询问题总结

最近遇到了一个SQL没有走索引导致出现慢查询的问题，SQL本身很简单，两张表联合查询然后进行排序和分页，由于涉及到一些业务，这里以用户表和订单表为例，用户表数据在35W左右，订单表数据只有8条，订单表中有外键userid可以与用户表关联，此时查询在订单表中不存在数据的用户，根据用户的姓名、性别和创建时间排序然后分页取前十条，慢查询SQL写法如下：

SELECT * 
FROM t_user user
LEFT JOIN t_order order ON user.id = order.userid
WHERE order.id IS NULL  
ORDER BY 
  user.name DESC, 
  user.gender DESC, 
  user.createtime DESC  
LIMIT 0, 10;

MySQL版本为5.7，在需要排序的字段name、gender和createtime上添加了联合索引I_USERINFO_MIX，订单表的userid也添加了索引I_USERID，SQL执行时间在5到6秒之间，查看分析计划，并没有用到联合索引，在user表在使用了临时表和文件排序，并且使用了Block Nested-Loop Join算法与订单表进行关联：

解决方式

一、强制使用索引

MySQL可以使用FORCE INDEX强制使用索引，但是此种方式后期不便于维护，比如说索引名称发生了变化，或者更换了数据库，不支持此种写法：

SELECT * 
FROM t_user user FORCE INDEX(I_USERINFO_MIX)
LEFT JOIN t_order order ON user.id = order.userid
WHERE order.id IS NULL  
ORDER BY 
  user.name DESC, 
  user.gender DESC, 
  user.createtime DESC  
LIMIT 0, 10;

FORCE INDEX的方式SQL执行时间在0.4s左右，从查询计划中可以看出没有使用临时表进行排序，对订单表进行了全表扫描：

二、使用子查询

既然是查询在order表不存在数据的用户，那么可以换一种写法，不使用JOIN，使用子查询的方式将在订单表存在数据的用户的ID查出来之后，使用NOT IN过滤：

SELECT * 
FROM t_user user 
WHERE user.id NOT IN (select userid from t_order order where user.id = order.userid)
ORDER BY 
  user.name DESC, 
  user.gender DESC, 
  user.createtime DESC  
LIMIT 0, 10;

子查询SQL执行时间同样在0.4s左右，从查询计划中可以看出没有使用临时表进行排序，并且在订单表上也使用了索引进行查询，由于订单表此时的数据量太小，所以与全表扫描的性能差异不是很大：

总结
从查询计划上看，比较推荐使用子查询的方式。

未走索引分析

name、gender和createtime上的联合索引是有序的，可以省去排序的过程，为什么使用LEFT JOIN的方式没有选择走索引？

个人猜测有可能是因为走索引需要进行回表（也可能有联表的性能损耗？），MySQL认为这个过程的消耗不如全表扫描，所以放弃索引进行全表扫描，然后对结果进行排序，这个过程比较耗时导致了慢SQL。

接下来通过OPTIMIZER_TRACE验证下是否与猜测的一致：

开启OPTIMIZER_TRACE

set OPTIMIZER_TRACE='enabled=on';

执行业务SQL
查看OPTIMIZER_TRACE

SELECT * FROM information_schema.OPTIMIZER_TRACE;

首先看一下未走索引时的TREACE，这里主要关注considered_execution_plans，considered_execution_plans表示MySQL认为可能的执行计划，并选出最优的计划：

计算了用户表全表扫描的成本，扫描行数为354351，为用户表的数据总量，成本值为77151
用户表需要与订单表联表查询，订单表中的userid字段创建了索引，所以MySQL有两种选择：

（1）通过userid上建立的索引进行查询，扫描行数为1.1667，这里为预估的扫描行数（因为订单表存在几条垃圾数据，所以不是所有的数据都可以与用户表关联上），个人猜测MySQL可能认为使用userid索引时需要回表，所以这里成本值较大，成本值为496091

（2）直接对订单表进行全表扫描，扫描函数为5，因为不需要回表成本值为114766
计算总成本，第2步中选择成本较小的执行计划，也就是对订单表进行全表扫描成本值，加上第一步中对用户表全表扫描的成本值：

总成本：扫描user表的77151 + 114766 = 191917，与实际的总成本值191918差了1，猜测可能还要加上一些其他的成本值，这里先不关注。

{
  "steps": [
    {
      "join_preparation": {
        // 预处理
        // ...
      }
    },
    {
      "join_optimization": {
        "select#": 1,
        "steps": [
          // ...
          {
            "considered_execution_plans": [
              {
                "plan_prefix": [
                ],
                "table": "`t_user` `user`",
                "best_access_path": {
                  "considered_access_paths": [
                    {
                      "rows_to_scan": 354351,  // 扫描行数
                      "access_type": "scan",
                      "resulting_rows": 354351,
                      "cost": 77151,  // 扫描成本
                      "chosen": true
                    }
                  ]
                },
                "condition_filtering_pct": 100,
                "rows_for_plan": 354351,
                "cost_for_plan": 77151, // 成本
                "rest_of_plan": [
                  {
                    "plan_prefix": [
                      "`t_user` `user`"
                    ],
                    "table": "`t_order` `order`",
                    "best_access_path": {
                      "considered_access_paths": [
                        {
                          "access_type": "ref",
                          "index": "I_USERID", // 使用userid索引
                          "rows": 1.1667,  // 扫描行数，这里是预估的扫描行数
                          "cost": 496091, // 扫描成本，猜测是因为需要回表所以成本较大
                          "chosen": true
                        },
                        {
                          "rows_to_scan": 8,  // 全表扫描
                          "access_type": "scan",
                          "using_join_cache": true,
                          "buffers_needed": 18290,
                          "resulting_rows": 1,
                          "cost": 114766, // 成本
                          "chosen": true
                        }
                      ]
                    },
                    "condition_filtering_pct": 100,
                    "rows_for_plan": 354351, // 扫描行数
                    "cost_for_plan": 191918, // 总成本：扫描user表的77151 + 114766 = 191917，与191918差了1，待研究
                    "chosen": true
                  }
                ]
              }
            ]
          },
          // ...
        ]
      }
    },
    {
      "join_execution": {
          // ...
      }
    }
  ]
}

接下来看一下使用了FORCE INDEX的TRACE：

可以看到用户表扫描成本发生了变化，相比之前的全表扫描成本值增加了，成本值为425222：

                    {
                      "rows_to_scan": 354351, // 扫描行数
                      "access_type": "scan", // 全表扫描
                      "resulting_rows": 354351,
                      "cost": 425222, // 成本值
                      "chosen": true
                    }

成本值增加的原因可能是强制使用索引添加了回表的成本消入，所以值变大了。---待验证。

订单表的处理与之前一致
由于用户表的扫描成本增加，所以总成本值增加为539989：

"steps": [
    {
      "join_preparation": {
        // 预处理
        // ...
       }
    },
    {
      "join_optimization": {
        "select#": 1,
        "steps": [
          // ...
          {
            "considered_execution_plans": [
              {
                "plan_prefix": [
                ],
                "table": "`t_user` `user` FORCE INDEX (`I_USERINFO_MIX`)", // 用户表
                "best_access_path": {
                  "considered_access_paths": [
                    {
                      "rows_to_scan": 354351, // 扫描行数
                      "access_type": "scan", // 全表扫描
                      "resulting_rows": 354351,
                      "cost": 425222, // 成本值
                      "chosen": true
                    }
                  ]
                },
                "condition_filtering_pct": 100,
                "rows_for_plan": 354351,
                "cost_for_plan": 425222, // 成本值
                "rest_of_plan": [
                  {
                    "plan_prefix": [
                      "`t_user` `user` FORCE INDEX (`I_USERINFO_MIX`)"
                    ],
                    "table": "`t_order`  `order`",
                    "best_access_path": {
                      "considered_access_paths": [
                        {
                          "access_type": "ref",
                          "index": "I_USERID", // 使用userid索引
                          "rows": 1.1667, // 预估扫描行数
                          "cost": 496091, // 成本值
                          "chosen": true // 是否选择
                        },
                        {
                          "rows_to_scan": 8,
                          "access_type": "scan", // 全表扫描
                          "using_join_cache": true,
                          "buffers_needed": 18290,
                          "resulting_rows": 1,
                          "cost": 114766, // 成本值
                          "chosen": true // 是否选择
                        }
                      ]
                    },
                    "condition_filtering_pct": 100,
                    "rows_for_plan": 354351,
                    "cost_for_plan": 539989, // 总成本
                    "chosen": true
                  }
                ]
              }
            ]
          },
          // ...
        ]
      }
    },
    {
      "join_execution": {
        // ...
      }
    }
  ]
}

可能对MySQL的研究还不够深入，以上分析的正确性还有待验证，这里将遇到的问题记录下来，以后有时间再研究。

MySQL 5.7和5.6版本的执行差异

除了索引的问题外，还发现了另外一个问题，同样的SQL在5.7和5.6版本上的执行时间不一致，在5.7上执行时间在5到6秒之间，在5.6上面只需要1秒左右，然后使用了MySQL自带分析工具Query Profiler统计了SQL执行耗费时间：

Query Profiler的开启步骤

开启profiling

 set profiling = 1;

或者

 set profiling = 'ON';

执行业务SQL

SELECT * 
FROM t_user user
LEFT JOIN t_order order ON user.id = order.userid
WHERE order.id IS NULL  
ORDER BY 
  user.name DESC, 
  user.gender DESC, 
  user.createtime DESC  
LIMIT 0, 10;

查看每个SQL的执行信息

 show profiles;

这里找到你执行的SQL的QUERY_ID

查看SQL的详细执行信息

show profile for query 85;

MySQL 5.7下的耗费时间统计

可以看出比较耗时的地方在Sending data，由之前的分析可知，MySQL并没借助联合索引来排序，而是使用全表扫描来查询数据，借助于TRACE的数据可以看出，通过临时表对数据进行排序，由于数据量过大超出了临时表的限制大小，所以将临时表的数据转到磁盘在磁盘中进行排序converting HEAP to ondisk，这个拷贝数据的过程比较慢，最终导致SQL的总执行时间过长。

 {
      "join_execution": {
        "select#": 1,
        "steps": [
          {
            "creating_tmp_table": { // 创建临时表
              "tmp_table_info": {
                "table": "intermediate_tmp_table",
                "row_length": 13530,
                "key_length": 0,
                "unique_constraint": false,
                "location": "memory (heap)", // 临时表位于内存
                "row_limit_estimate": 1240
              }
            }
          },
          {
            "converting_tmp_table_to_ondisk": { // 将临时表数据转到磁盘
              "cause": "memory_table_size_exceeded", // 原因为大小超过限制
              "tmp_table_info": {
                "table": "intermediate_tmp_table",
                "row_length": 13530,
                "key_length": 0,
                "unique_constraint": false,
                "location": "disk (InnoDB)", // 磁盘
                "record_format": "packed"
              }
            }
          }，
           {
            "filesort_information": [  // 排序信息
              {
                "direction": "desc",
                "table": "intermediate_tmp_table",
                "field": "name" 
              },
              {
                "direction": "desc",
                "table": "intermediate_tmp_table",
                "field": "gender"
              },
              {
                "direction": "desc",
                "table": "intermediate_tmp_table",
                "field": "creattime"
              }
            ],
            "filesort_priority_queue_optimization": { 
              "limit": 10,
              "rows_estimate": 697,
              "row_size": 1214,
              "memory_available": 262144,
              "chosen": true  // 表示使用优先队列优化排序
            },
            "filesort_execution": [
            ],
            "filesort_summary": {
              "rows": 11,
              "examined_rows": 360451,
              "number_of_tmp_files": 0,  // 使用的临时文件个数
              "sort_buffer_size": 13448,
              "sort_mode": "<sort_key, rowid>"  // 使用rowid排序
            }
          }
        ]
      }
  }

临时表的大小可以通过tmp_table_size和max_heap_table_size参数来控制。
# 查看临时表大小
show variables like ‘tmp_table_size’; 
show variables like ‘max_heap_table_size’;

MySQL5.6下的耗费时间统计

在5.6中的执行步骤与5.7基本一致，只不过converting HEAP to ondisk变成了converting HEAP to MyISAM，MySQL 5.6的默认临时表是myisam表，这里为什么会比5.7的速度快有待研究：

{
      "join_execution": {
        "select#": 1,
        "steps": [
          {
            "creating_tmp_table": { // 创建临时表
              "tmp_table_info": {
                "table": "intermediate_tmp_table",
                "row_length": 13530,
                "key_length": 0,
                "unique_constraint": false,
                "location": "memory (heap)",
                "row_limit_estimate": 1240
              }
            }
          },
          {
            "converting_tmp_table_to_myisam": { // 将临时表转为myisam
              "cause": "memory_table_size_exceeded",
              "tmp_table_info": {
                "table": "intermediate_tmp_table",
                "row_length": 13530,
                "key_length": 0,
                "unique_constraint": false,
                "location": "disk (MyISAM)",  // myisam引擎
                "record_format": "packed"
              }
            }
          }
          // ...
        ]
    }
}

posted @ 2022-09-03 17:01 shanml 阅读(5830) 评论(0) 收藏举报

刷新页面返回顶部

SHAN_CODE_NOTE

【MySQL】order by引起的慢查询问题总结

未走索引分析

MySQL 5.7和5.6版本的执行差异

公告