MDL元数据锁

MetaData Lock即元数据锁,MetaData Lock主要为了保证元数据的一致性,用于处理不同线程操作同一数据对象的同步与互斥问题。MySQL 5.5版本开始,引入了MDL锁,但是因为MDL锁,会导致表级别的锁,无论是读或者写操作,都无法进行,导致SQL的阻塞。


[TOC]

MDL锁与实现

MySQL5.5版本引入了MDL锁(metadata lock),用于解决或者保证DDL操作与DML操作之间的一致性。例如下面的这种情形:

会话1 会话2
BEGIN;
SELECT * FROM xxx;
DROP TABLE xxx;
SELECT * FROM xxx;

若没有MDL锁的保护,则事务2可以直接执行DDL操作,并且导致事务1出错,5.1版本即是如此。5.5版本加入MDL锁就在于保护这种情况的发生,由于事务1开启了查询,那么获得了MDL锁,锁的模式为SHARED_READ,事务2要执行DDL,则需获得EXCLUSIVE锁,两者互斥,所以事务2需要等待。

目前MDL有如下锁模式:

锁模式 对应SQL
MDL_INTENTION_EXCLUSIVE GLOBAL对象、SCHEMA对象操作会加此锁
MDL_SHARED FLUSH TABLES with READ LOCK
MDL_SHARED_HIGH_PRIO 仅对MyISAM存储引擎有效
MDL_SHARED_READ SELECT查询
MDL_SHARED_WRITE DML语句
MDL_SHARED_WRITE_LOW_PRIO 仅对MyISAM存储引擎有效
MDL_SHARED_UPGRADABLE ALTER TABLE
MDL_SHARED_READ_ONLY LOCK xxx READ
MDL_SHARED_NO_WRITE FLUSH TABLES xxx,yyy,zzz READ
MDL_SHARED_NO_READ_WRITE FLUSH TABLE xxx WRITE
MDL_EXCLUSIVE ALTER TABLE xxx PARTITION BY …

MDL锁的诊断

MySQL 5.7版本之前并没有提供一个方便的途径来查看MDL锁,而在MySQL 5.7中的 performance_schea 库下新增了一张表 metadata_locks,用其来查看MDL锁。

前提:开启 performance_schema(对性能影响有限),并重启实例生效。

[mysqld]
performance_schema=ON

不过默认 performance_schema 并没有打开此功能,需要手工将wait/lock/metadata/sql/mdl监控给打开:

show variables like '%performance_schema%';
SELECT * FROM performance_schema.setup_instruments;
UPDATE performance_schema.setup_consumers SET ENABLED = 'YES' WHERE NAME ='global_instrumentation';
UPDATE performance_schema.setup_instruments SET ENABLED = 'YES' WHERE NAME ='wait/lock/metadata/sql/mdl';
SELECT * FROM performance_schema.metadata_locks;

准备数据

drop table if exists test;
create table test(id int,name varchar(4));
insert into test values(1,'a'),(2,'b');

会话1:不提交

begin;
select * from test;
delete from test where id=2;
update test set name='c' where id=1;
SELECT * FROM performance_schema.metadata_locks;

主:重点关注 lock_status,"PENDING"代表线程在等待MDL,而"GRANTED"则代表线程持有MDL。

会话2:通过metadata_locks查看mdl锁

alter table test add c1 int;
SELECT * FROM performance_schema.metadata_locks;

会话3:可看到 Waiting for table metadata lock

show processlist;
SELECT * FROM performance_schema.metadata_locks;

如何找出引起阻塞的会话

select object_type,object_schema,object_name,lock_type,lock_duration,lock_status,owner_thread_id from performance_schema.metadata_locks;

结果如下:

object_type object_schema object_name lock_type lock_duration lock_status owner_thread_id
TABLE slowtech test SHARED_WRITE TRANSACTION GRANTED 27
GLOBAL NULL NULL INTENTION_EXCLUSIVE STATEMENT GRANTED 29
SCHEMA slowtech NULL INTENTION_EXCLUSIVE TRANSACTION GRANTED 29
TABLE slowtech test SHARED_UPGRADABLE TRANSACTION GRANTED 29
TABLE slowtech test EXCLUSIVE TRANSACTION PENDING 29
TABLE performance_schema metadata_locks SHARED_READ TRANSACTION GRANTED 28

结合owner_thread_id,可以可到,是29号线程在等待27号线程的MDL,此时,可kill掉xx号线程。

但需要注意的是,owner_thread_id给出的只是线程ID,并不是show processlist中的ID。如果要查找线程对应的processlist id,需查询performance_schema.threads表。

select * from performance_schema.threads where thread_id in (27,29);

将这两张表结合,借鉴sys.innodb_lock _waits的输出,实际上我们也可以直观地呈现MDL的等待关系。

SELECT
    a.OBJECT_SCHEMA AS locked_schema,
    a.OBJECT_NAME AS locked_table,
    "Metadata Lock" AS locked_type,
    c.PROCESSLIST_ID AS waiting_processlist_id,
    c.PROCESSLIST_TIME AS waiting_age,
    c.PROCESSLIST_INFO AS waiting_query,
    c.PROCESSLIST_STATE AS waiting_state,
    d.PROCESSLIST_ID AS blocking_processlist_id,
    d.PROCESSLIST_TIME AS blocking_age,
    d.PROCESSLIST_INFO AS blocking_query,
    concat('KILL ', d.PROCESSLIST_ID) AS sql_kill_blocking_connection
FROM
    performance_schema.metadata_locks a
JOIN performance_schema.metadata_locks b ON a.OBJECT_SCHEMA = b.OBJECT_SCHEMA
AND a.OBJECT_NAME = b.OBJECT_NAME
AND a.lock_status = 'PENDING'
AND b.lock_status = 'GRANTED'
AND a.OWNER_THREAD_ID <> b.OWNER_THREAD_ID
AND a.lock_type = 'EXCLUSIVE'
JOIN performance_schema.threads c ON a.OWNER_THREAD_ID = c.THREAD_ID
JOIN performance_schema.threads d ON b.OWNER_THREAD_ID = d.THREAD_ID;

输出结果:

*************************** 1. row ***************************
               locked_schema: slowtech
                locked_table: t1
                 locked_type: Metadata Lock
      waiting_processlist_id: 4
                 waiting_age: 259
               waiting_query: alter table slowtech.t1 add c1 int
               waiting_state: Waiting for table metadata lock
     blocking_processlist_id: 2
                blocking_age: 301
              blocking_query: NULL
sql_kill_blocking_connection: KILL 2

输出一目了然,DDL操作如果要获得MDL,执行kill 2即可。

遇到的问题

2020.07.14.dms工单执行ddl语句报错"无法获取到原表上的MDL锁"

工单里的ddl语句如下:

ALTER TABLE one_activity ADD collect_user_num int(10) DEFAULT 0 NOT NULL COMMENT '线索数,默认0' after collect_type

工单的调度详情里看到如下报错,重试依旧报错。

------[2020-07-14 15:03:14]------
任务执行失败:
iDB-OnlineDDL Execute Error:无法获取到原表上的MDL锁,已尝试3次,请排查原表上的大事务、大查询,或到业务低峰期时重试任务
TraceId : 0b736e9816262416409597186d428a

------[2020-07-14 15:03:14]------
jobFail callback End.

开发事后反馈如下报错日志:

responseBody error  
org.springframework.dao.CannotAcquireLockException: 
### Error updating database.  Cause: com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Lock wait timeout exceeded; try restarting transaction
### The error may exist in com/oneonline/soa/activity/dao/WechatUserDAO.java (best guess)
### The error may involve com.oneonline.soa.activity.dao.WechatUserDAO.updateById-Inline
### The error occurred while setting parameters
### SQL: UPDATE one_wechat_user SET openid=?, unionid=?, origin=?, session_key=?, phone_number=?, pure_phone_number=?, country_code=?, created_at=?, updated_at=? WHERE id=?
### Cause: com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Lock wait timeout exceeded; try restarting transaction
; Lock wait timeout exceeded; try restarting transaction; nested exception is com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Lock wait timeout exceeded; try restarting transaction
        at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.doTranslate(SQLErrorCodeSQLExceptionTranslator.java:262)
        at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:72)
        at org.mybatis.spring.MyBatisExceptionTranslator.translateExceptionIfPossible(MyBatisExceptionTranslator.java:73)
        at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:446)
        at com.sun.proxy.$Proxy158.update(Unknown Source)
        at org.mybatis.spring.SqlSessionTemplate.update(SqlSessionTemplate.java:294)
        at com.baomidou.mybatisplus.core.override.MybatisMapperMethod.execute(MybatisMapperMethod.java:63)
        at com.baomidou.mybatisplus.core.override.MybatisMapperProxy.invoke(MybatisMapperProxy.java:62)
        at com.sun.proxy.$Proxy228.updateById(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor539.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:343)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
        at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:139)

验证过程

  • 1.该ddl一直处于执行中,执行如下语句:

    SELECT id, USER,db,SUBSTRING_INDEX(HOST,':',1),COMMAND,TIME,info,state
    FROM information_schema.processlist WHERE command !='Sleep';
    

    发现如下越来越多的block语句:

    UPDATE one_wechat_user SET openid=?, unionid=?, origin=?, session_key=?, phone_number=? WHERE id=?
    
  • 2.执行如下语句:

    show engine innodb status\G;
    

看到 TRANSACTIONS 有如下信息:

---TRANSACTION 421875086800592, not started
 0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 955083244, ACTIVE 48 sec starting index read
 mysql tables in use 1, locked 1
 LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s)
 MySQL thread id 5195068, OS thread handle 140351464961792, query id 18076351285 10.200.131.51 jronline updating
 UPDATE one_wechat_user  SET openid='om3P64pq-jvLI-_mHrh82k83OXLg',
 unionid='',
 origin=1,
 session_key='sSI9dQ14GZPKfiwsGgAAug==',
 phone_number='18608774809',
 pure_phone_number='18608774809',
 country_code='86',
 created_at='2020-06-02 16:23:18.0',
 updated_at='2020-06-02 16:23:18.0'  WHERE id=30684
 ------- TRX HAS BEEN WAITING 48 SEC FOR THIS LOCK TO BE GRANTED:
 RECORD LOCKS space id 521 page no 261 n bits 208 index PRIMARY of table `jronline_activity`.`one_wechat_user` trx id 955083244 lock_mode X locks rec but not gap waiting
 Record lock, heap no 136 PHYSICAL RECORD: n_fields 12; compact format; info bits 0
  0: len 4; hex 000077dc; asc   w ;;
  1: len 6; hex 000038b20927; asc   8  ';;
  2: len 7; hex 2d0000023739cf; asc -   79 ;;
  3: len 28; hex 6f6d3350363470712d6a764c492d5f6d48726838326b38334f584c67; asc om3P64pq-jvLI-_mHrh82k83OXLg;;
  4: len 0; hex ; asc ;;
  5: len 1; hex 01; asc  ;;
  6: len 24; hex 49436550514539522b475a5a3371315a3843725355513d3d; asc ICePQE9R+GZZ3q1Z8CrSUQ==;;
  7: len 11; hex 3138363038373734383039; asc 18608774809;;
  8: len 11; hex 3138363038373734383039; asc 18608774809;;
  9: len 2; hex 3836; asc 86;;
  10: len 4; hex 60b73ff6; asc ` ? ;;
  11: len 4; hex 60b73ff6; asc ` ? ;;
 ------------------
Copyright © www.sqlfans.cn 2023 All Right Reserved更新时间: 2022-01-14 17:30:13

results matching ""

    No results matching ""