使用表连接

truncate t_target;  
insert into t_target  (http://www.my516.com)
select distinct t1.* from t_source t1,  
(select min(item_id) item_id,created_time,item_name from t_source group by created_time,item_name) t2  
where t1.item_id = t2.item_id;
        这种方法用时14秒,查询计划如下:

mysql> explain select distinct t1.* from t_source t1,   (select min(item_id) item_id,created_time,item_name from t_source group by created_time,item_name) t2   where t1.item_id = t2.item_id;
+----+-------------+------------+------------+------+---------------+-------------+---------+-----------------+--------+----------+------------------------------+
| id | select_type | table      | partitions | type | possible_keys | key         | key_len | ref             | rows   | filtered | Extra                        |
+----+-------------+------------+------------+------+---------------+-------------+---------+-----------------+--------+----------+------------------------------+
|  1 | PRIMARY     | t1         | NULL       | ALL  | NULL          | NULL        | NULL    | NULL            | 997282 |   100.00 | Using where; Using temporary |
|  1 | PRIMARY     | <derived2> | NULL       | ref  | <auto_key0>   | <auto_key0> | 5       | test.t1.item_id |     10 |   100.00 | Distinct                     |
|  2 | DERIVED     | t_source   | NULL       | ALL  | NULL          | NULL        | NULL    | NULL            | 997282 |   100.00 | Using temporary              |
+----+-------------+------------+------------+------+---------------+-------------+---------+-----------------+--------+----------+------------------------------+
3 rows in set, 1 warning (0.00 sec)
内层查询扫描t_source表的100万行,建立临时表,找出去重后的最小item_id,生成导出表derived2,此导出表有50万行。
MySQL会在导出表derived2上自动创建一个item_id字段的索引auto_key0。
外层查询也要扫描t_source表的100万行数据,在与导出表做链接时,对t_source表每行的item_id,使用auto_key0索引查找导出表中匹配的行,并在此时优化distinct操作,在找到第一个匹配的行后即停止查找同样值的动作。
---------------------

posted @ 2019-08-06 19:18  水至清明  阅读(146)  评论(0)    收藏  举报