您当前的位置：首页 > 计算机 > 软件应用 > 数据库 > MySQL

MySQL索引失效全部场景（看完这个你还不会使用索引？）

时间：09-14来源：作者：点击数：56

我们知道提升MySQL性能的两大利器，分别为 索引、锁。本片文章主要围绕索引的使用原则进行深入讲解，包括索引匹配原则、索引失效等。

看完本篇文章你能学到什么？

1、索引匹配的基本原则

2、最左前缀法则

3、索引的全值匹配

4、索引的最佳优化方案

5、以后在公司写SQL不会那么随意了~~~~~~

注：阅读本片文章需要具备MySQL执行计划相关知识

MySQL索引失效全部场景

1.1 数据准备

创建测试表：

CREATE TABLE `userinfo` (
  `id` int(10) NOT NULL COMMENT '用户id',
  `username` varchar(50) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '用户名',
  `age` int(3) NULL DEFAULT NULL COMMENT '年龄',
  `phone` varchar(15) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '手机号',
  `gender` char(1) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '性别: ‘0’-男   ‘1’-女',
  `desc` text CHARACTER SET utf8 COLLATE utf8_general_ci NULL COMMENT '自我介绍',
  `register_time` datetime(0) NULL DEFAULT NULL COMMENT '注册时间',
  `login_time` datetime(0) NULL DEFAULT NULL COMMENT '上一次登录时间',
  `pic` varchar(250) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '头像地址',
  `look` int(10) NULL DEFAULT NULL COMMENT '查看数',
  PRIMARY KEY (`id`)
) ENGINE = InnoDB CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Compact;

1.2 索引匹配规则

1.2.1 全值匹配

全值匹配是针对建立了复合索引的索引列，在查询条件中，复合索引的所有列都指定具体的列。

创建复合索引：

create index idx on userinfo(username,age,phone,gender);

执行全值匹配SQL：

explain select * from userinfo where username='1' and age=1 and phone='1' and gender=1 ;

命中索引idx_name_age_phone_gender

全值匹配的概念很简单，在复合索引中，必须匹配复合索引的所有字段，索引才会生效。

1.2.2 最左前缀法则

复合属于匹配规则可不止全值匹配一种，使用索引检索时间时，如果复合最左前缀法则，那么也是可以命中索引的。

最左前缀法则指的是，查询条件中如果包含有复合索引，必须从索引的最左列开始查询。并且不跳过中间列。

测试最左前缀法则（1）：

explain select * from userinfo where age=20;			-- 不走索引

explain select * from userinfo where phone='1';			-- 不走索引

explain select * from userinfo where gender='1';		-- 不走索引

explain select * from userinfo where username='1';		-- 走索引

发现前面3个是没有命中索引，最后一条SQL命中索引了。

测试最左前缀法则（2）：
最左匹配原则，必须要匹配最左的索引列，不可跳过最左边索引列

-- 不走索引
explain select * from userinfo where phone='1' and age=20 and gender='1';		

-- 不走索引
explain select * from userinfo where gender='1' and age=20 and phone='1';		

-- 不走索引
explain select * from userinfo where phone='1' and gender=20 and age='1';

-- 命中1个字段(username)
explain select * from userinfo where username='1' and phone='1' and gender='1';

-- 命中2个字段(username、age)
explain select * from userinfo where username='1' and age=1 and gender='1';

-- 命中3个字段(username、age、phone)
explain select * from userinfo where username='1' and age=20 and phone='1';

注意：最左前缀法则指的是条件匹配时，必须匹配上创建索引时最左边的列，跟where指定的条件顺序无关！这里是很多人入的坑！

-- 命中1个字段(username)
explain select * from userinfo where gender='1' and phone='1' and username='1';

-- 命中2个字段(username、age)
explain select * from userinfo where age=1 and gender='1' and username='1';

-- 命中3个字段(username、age、phone)
explain select * from userinfo where age=1 and phone='1' and username='1';

如果四个值都匹配了，既符合最左前缀法则，也符合全值匹配

-- 全部命中
explain select * from userinfo where username='1' and age=1 and phone='1' and gender='1';

测试完毕删除复合索引：

drop index idx_name_age_phone_gender on userinfo;

1.3 索引失效问题

1.3.1 单列索引失效

创建单列索引：

create index idx_name on userinfo(username);

1.3.1.1 不要在索引列上做任何操作

1）不要在索引列上做任何的操作，否则索引将失效

explain select * from userinfo where username='1';		-- 走索引

explain select * from userinfo where concat(username,'1')='1';	-- 不走索引

1.3.1.2 字符串索引必须加引号

2）字符串类索引在条件查询时，条件值没有加双引号，索引失效

explain select * from userinfo where username='1';			-- 走索引

explain select * from userinfo where username=1;			-- 不走索引

1.3.1.3 不要使用`!=`或者`<>`

3）使用不等于（!= 或者<>）不会命中索引

explain select * from userinfo where username='1';			-- 走索引

explain select * from userinfo where username!='1';			-- 不走索引

1.3.1.4 模糊查询时不要使用前通配符匹配

4）like，前通配符会导致索引失效，没有使用通配符或者使用后通配符会使用索引，但是会造成回表

explain select * from userinfo where username like '%1%';		-- 不走索引

explain select * from userinfo where username like '%1';		-- 不走索引

explain select * from userinfo where username like '1%';		-- 走索引,回表

explain select * from userinfo where username like '1';			-- 走索引,回表

Using index condition和NULL都是走了索引，但是会回表查询。

在使用like模糊查询的时候，前模糊查询会导致索引失效，因此一般需要频繁模糊查询并且数据量比较大的情况下，一般会取业务前缀；比如数据前面加某个前缀符。

explain select * from userinfo where username like 'T%1';

我们插入300W条数据，观察一下加前缀和未加前缀的效率：

编写存储过程进行批量插入：

create procedure test_insert(count int)
begin
		declare i int default 1;
		while i<=count do 
			INSERT INTO userinfo values(
			i,								-- id
			uuid(),							-- username
			CEILING(RAND()*90+10),			-- age
			FLOOR(RAND()*100000000000),		-- phone
			round(FORMAT(rand(),1)),		-- gender
			uuid(),							-- desc
			now(),							-- register_time
			now(),							-- login_time
			uuid(),							-- pic
			CEILING(RAND()*90+10)			-- look
			);
			set i=i+1;
		end while;
end;

执行插入：

start transaction;			-- 开启事务(控制在一个事务中，避免频繁开启/提交事务)
		
set unique_checks=0;		-- 关闭唯一性校验

call test_insert(3000000);	-- 插入300W测试数据

commit;						-- 提交事务

注意：批量插入的时候控制在一个事务中，否则插入速度极低！同时关闭唯一性校验，插入速度会明显提升，有兴趣的同学可以按照我的方法插入，看看花费的时间是多少~~~~

执行查询：

-- 索引生效
select * from userinfo where username like 'T%1';

-- 索引失效
select * from userinfo where username like '%1';

-- 索引生效
select * from userinfo where username like 'T%1%';

测试完毕清空表：

truncate userinfo;

记住这里不要使用delete，否则速度巨慢，牵扯到一个记录日志的问题。

1.3.1.5 `is not null`不会走索引

5）is null会走索引，is not null不会走索引。

explain select * from userinfo where username is null;

explain select * from userinfo where username is not null;

仔细一看两个都是命中了索引的。不是说is not null不会走索引吗？

注意：这里有个坑

执行上述的执行计划的时候userinfo表中的记录已经被情空了的！

如果userinfo表中有记录的情况下，又会出现不同的情况！

往下看：

call test_insert(20);	-- 往userinfo表中插入20条记录

explain select * from userinfo where username is null;		-- 走了索引

explain select * from userinfo where username is not null;	-- 根本没走索引！！！

当表中有记录时，is not null 会导致索引失效！这个可能是MySQL底层优化器的选择，我们记住is not null不会走索引即可；

测试完毕清空表：

truncate userinfo;

1.3.1.6 `not int`会导致索引失效

使用in会走索引，not in则不会走索引。

-- 清空表
truncate userinfo;

-- in走索引
explain select * from userinfo where id in (1,2);

-- not int不走索引
explain select * from userinfo where id not in (1,2);

发现in走索引，not in没有索引，万事大吉，请你继续往下看！

注意：这里又有一个坑！！！！

请注意看我在上面的SQL语句故意加了一个清空表的操作，那是因为如果表中有数据，那么情况又不一样了！

-- 插入10条记录
call test_insert(10);

-- in走索引
explain select * from userinfo where id in (1,2);

-- 
explain select * from userinfo where id not in (1,2);

表中有数据时，发现not int语句又走了索引，这下有点晕了吧（这个有可能是MySQL执行计划的Bug），到这里咱们先这样认为：in 会走索引，not int 不走索引！具体咱们最后一章再测试。

1.3.2 `or` 导致复合索引失效

我们知道MySQL要想命中复合索引，必须按照最左前缀法则，回顾最左前缀法则：

create index idx_name_age_phone on userinfo(username,age,phone);	-- 创建复合索引
show index from userinfo; 				-- 查询当前索引有哪些

只有一个复合索引（username、age、phone）；

执行如下SQL语句，分析执行计划

explain select * from userinfo where username='1';				-- 命中一个索引

explain select * from userinfo where username='1' and age=1;	-- 命中两个索引

explain select * from userinfo where username='1' and phone='1';-- 命中一个索引(username)

explain select * from userinfo where age=1 and phone='1';		-- 不符合最左前缀法则

上述条件都是使用and拼接，如果一旦使用了or，索引会立即失效，不管是否符合最左前缀法则。也不管是否复合全值匹配

-- 符合最左前缀法则(不走索引)
explain select * from userinfo where username='1' or age=1;

-- 符合最左前缀法则也复合全值匹配(不走索引)
explain select * from userinfo where username='1' or age=1 or phone='1';

测试完毕删除索引：

drop index idx_name on userinfo;
drop index idx_name_age_phone on userinfo;

1.3.3 复合索引解决部分索引失效问题

当多个列都创建了索引（不是复合索引），同时使用这几个索引列查询时，MySQL只会选择其中一个索引去查询，查询到结果之后，根据另一个索引条件进行条件筛选。

给name、age列创建索引：

create index idx_name on userinfo(username);
create index idx_age on userinfo(age);

测试使用到多个单列索引：

explain select * from userinfo where username='1';				-- 走索引

explain select * from userinfo where age=1;						-- 走索引

-- 先走idx_name索引查询，然后根据age筛选剩余的数据(这个时候不走索引)
explain select * from userinfo where age=1 and username='1';

其实这条SQL语句的效率也是非常高的，因为使用了idx_name索引先检索出来一部分数据，这个时候是非常快的，然后剩下的数据再通过age条件筛选，但是通过第一轮的筛选剩下来的数据已经非常少了。此时即使不通过索引筛选剩下的数据也是非常快的，除非第一轮筛选过后的的数据量还是非常多（十万、百万），那么第二次不使用索引筛选效率就会比较低了。

针对于上述情况，我们可以根据username和age两列建立复合索引，来解决这个问题。

建立复合索引之前删除前面的两个索引（不删除不行，因为idx_name索引先创建的，如果不删那么会默认选择idx_name索引来查询数据，而不会使用后面创建的复合索引）：

drop index idx_name on userinfo;

drop index idx_age on userinfo;

建立复合索引：

create index idx_name_age on userinfo(username,age);

执行SQL：

explain select * from userinfo where age=1 and username='1';

发现走了两个索引

1.3.4 诡异的`or`语句

首先说明一下：or不会导致索引失效

执行SQL：

truncate userinfo;

explain select * from userinfo where username='1' or username='2';

发现没有走索引

我们在表中插入10条数据测试一下：

call test_insert(10);

explain select * from userinfo where username='1' or username='2';

此时发现又走了索引。那or到底是走索引还是不走呢？答案是走索引的，具体我们看下一篇测试。

1.4 索引性能测试

咱们前面测试时发现有如下问题：

1）is not null：表中没有数据时走索引，有数据时不走索引
2）not in：表中没有数据时不走索引，走索引
3）or：表中没有数据时不走索引，有数据时走索引

实际上是is not null不会走索引，not in不会走索引，or拼接会走索引

关于为什么会出现这样的问题，应该是MySQL底层的优化器自己根据某种规则计算出来的，或者在某些情况下，not in、is not null等又使用到了索引；还有一个原因是，可能执行计划的显示出现了bug（但这个在官网暂时没找到）

这一章节主要是围绕前面出的几个小问题来，主要有以下三个问题：

1.4.1 `is null`走索引，`is not null`不走索引

在1.3.1.5小节中，在表中没有数据的情况下，is not null 会走索引，一旦表中有数据了，is not null则不走索引，实际是不走索引的。

我们来实际测试一下，看SQL语句的执行时间

在表中准备300W数据，测试如下SQL语句：

select * from userinfo where username is null;				-- 0.00s

select * from userinfo where username is not null;			-- 6.76s（但凡使用到了索引也不可能这么慢）

1.4.2 `in`走索引，`not in`不走索引

在1.3.1.6小节中，在表中没有数据的情况下，not in 不走索引，有数据的情况下，走索引（和is not null 那边刚好相反）。实际是不走索引。

测试如下SQL（表中有300W记录）：

select * from userinfo where id in (1,2);			-- 0.00s

select * from userinfo where id not in (1,2);		-- 6.72s（但凡使用到了索引也不可能这么慢）

由于条件会查询出来结果集，不方便截图

1.4.3 `or`条件拼接不会导致索引失效（单列索引）

在1.3.2小节中，在表中没有数据的情况下不走索引，有数据的情况下走索引，实际是走索引的。

测试如下SQL语句（表中有300W记录）：

-- 创建索引
create index idx_name on userinfo(username);

-- 查看userinfo表中有多少个索引
show index from userinfo;

-- 查看此SQL的执行效率（有索引的情况下）
select * from userinfo where username='1' or username='2';		-- 0.00s

-- 删除索引
drop index idx_name on userinfo;

-- 查看此SQL的执行效率（没有索引的情况下）
select * from userinfo where username='1' or username='2';		-- 2.61s