CMU15-445 PROJECT#4 - CONCURRENCY CONTROL

Concurrency control theory: ACID, the correctness criteria for transactions

1. Atomicity: either all of a transaction's actions complete, or none of them do.

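Possible outcomes of executing a transaction:

All actions complete and the transaction commits;
The transaction aborts after executing only some of its actions.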

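Mechanisms for ensuring atomicity:

Logging: the DBMS logs every action so that it can undo all of them if the transaction aborts;
Shadow paging: the DBMS makes copies of pages and the transaction modifies the copies; the pages become visible to others only after the transaction commits.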
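
A minimal sketch of the logging idea above, assuming a hypothetical in-memory key/value store. `TinyStore`, `UndoRecord` and the method names are made up for illustration and are not BusTub APIs. Every write first records the old value, and an abort replays those records in reverse order.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical in-memory store with an undo log (illustration only).
class TinyStore {
 public:
  // Remember the old value (or that the key was absent) before overwriting.
  void Write(const std::string &key, int value) {
    auto it = data_.find(key);
    bool existed = it != data_.end();
    undo_log_.push_back(UndoRecord{key, existed, existed ? it->second : 0});
    data_[key] = value;
  }

  // Commit: the changes stay, so the undo records can be dropped.
  void Commit() { undo_log_.clear(); }

  // Abort: walk the undo log backwards and restore every old value.
  void Abort() {
    for (auto rit = undo_log_.rbegin(); rit != undo_log_.rend(); ++rit) {
      if (rit->existed) {
        data_[rit->key] = rit->old_value;
      } else {
        data_.erase(rit->key);
      }
    }
    undo_log_.clear();
  }

  bool Contains(const std::string &key) const { return data_.count(key) != 0; }

  int Get(const std::string &key) const {
    assert(Contains(key));  // precondition: the key must exist
    return data_.at(key);
  }

 private:
  struct UndoRecord {
    std::string key;
    bool existed;   // false if the write created the key
    int old_value;  // only meaningful when existed is true
  };

  std::unordered_map<std::string, int> data_;
  std::vector<UndoRecord> undo_log_;
};
```

Replaying in reverse matters when the same key is written more than once: the oldest record is applied last, so the value from before the transaction wins.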

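2. Consistency: if the transaction is consistent and the database starts out in a consistent state, the database is also in a consistent state when the transaction ends.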

The "world" represented by the database is logically correct, and all questions asked about the data are given logically correct answers. The database models the real world and obeys a set of integrity constraints. Transaction consistency is the application's responsibility; the DBMS cannot control this, so it is not considered further here.

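3. Isolation: each transaction executes as if it were isolated from the others.

Transactions run concurrently, so the DBMS has to interleave their operations.

Mechanism for ensuring isolation: a concurrency control protocol, which tells the DBMS how to choose a proper interleaving when multiple transactions are running. There are two categories of protocols:

Pessimistic: don't let problems arise in the first place.
Optimistic: assume conflicts are rare and deal with them after they happen.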

Formal properties of schedules

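Serial schedule: a schedule that does not interleave the actions of different transactions.
Equivalent schedules: for any database state, the effect of executing the first schedule is identical to the effect of executing the second schedule.
Serializable schedule: a schedule that is equivalent to some serial execution of the transactions.
Conflicting operations: two operations conflict if (a) they belong to different transactions and (b) they act on the same object and at least one of them is a write.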

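The kinds of conflicts are:

Read-Write Conflicts (R-W) --> unrepeatable reads
Write-Read Conflicts (W-R) --> dirty reads (reading uncommitted data)
Write-Write Conflicts (W-W) --> lost updates (overwriting uncommitted data)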
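
The definition above can be written down as a small classifier. This is only a sketch: `Operation`, `ConflictKind` and `Classify` are hypothetical names, and the first argument is assumed to be the operation that appears earlier in the schedule.

```cpp
enum class OpType { kRead, kWrite };

struct Operation {
  int txn_id;     // which transaction issued the operation
  int object_id;  // which object it touches
  OpType type;
};

enum class ConflictKind { kNone, kReadWrite, kWriteRead, kWriteWrite };

// `first` is assumed to come before `second` in the schedule.
ConflictKind Classify(const Operation &first, const Operation &second) {
  // Same transaction or different objects: never a conflict.
  if (first.txn_id == second.txn_id) return ConflictKind::kNone;
  if (first.object_id != second.object_id) return ConflictKind::kNone;
  // Two reads never conflict; at least one write is required.
  if (first.type == OpType::kRead && second.type == OpType::kRead) {
    return ConflictKind::kNone;
  }
  if (first.type == OpType::kRead) return ConflictKind::kReadWrite;
  if (second.type == OpType::kRead) return ConflictKind::kWriteRead;
  return ConflictKind::kWriteWrite;
}
```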

Levels of serializability:

Conflict serializability (what most DBMSs try to support): Schedule S is conflict serializable if you can transform S into a serial schedule by swapping consecutive non-conflicting operations of different transactions.
View serializability (no DBMS can do this)

Dependency graph (also known as a precedence graph):

One node per txn.
Edge from Ti to Tj if:
→ An operation Oi of Ti conflicts with an operation Oj of Tj and
→ Oi appears earlier in the schedule than Oj.
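
A schedule is conflict serializable exactly when this graph has no cycle, so the rules above translate fairly directly into code. The following is a sketch under assumptions: `Op`, `BuildPrecedenceEdges` and `IsConflictSerializable` are hypothetical names, transactions are numbered from 0, and acyclicity is checked with Kahn's topological sort (one possible choice, not something the lecture prescribes).

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <set>
#include <utility>
#include <vector>

struct Op {
  int txn;        // transaction index, 0 .. num_txns - 1
  char object;    // object name, e.g. 'A'
  bool is_write;  // true for a write, false for a read
};

// Edge Ti -> Tj for every conflicting pair where Ti's operation comes first.
std::set<std::pair<int, int>> BuildPrecedenceEdges(const std::vector<Op> &schedule) {
  std::set<std::pair<int, int>> edges;
  for (std::size_t i = 0; i < schedule.size(); ++i) {
    for (std::size_t j = i + 1; j < schedule.size(); ++j) {
      const Op &a = schedule[i];
      const Op &b = schedule[j];
      if (a.txn != b.txn && a.object == b.object && (a.is_write || b.is_write)) {
        edges.insert({a.txn, b.txn});
      }
    }
  }
  return edges;
}

// The graph is acyclic iff Kahn's algorithm manages to remove every node.
bool IsConflictSerializable(const std::vector<Op> &schedule, int num_txns) {
  for (const Op &op : schedule) {
    assert(op.txn >= 0 && op.txn < num_txns);
  }
  const auto edges = BuildPrecedenceEdges(schedule);
  std::vector<int> in_degree(num_txns, 0);
  for (const auto &edge : edges) ++in_degree[edge.second];

  std::queue<int> ready;
  for (int t = 0; t < num_txns; ++t) {
    if (in_degree[t] == 0) ready.push(t);
  }
  int removed = 0;
  while (!ready.empty()) {
    int t = ready.front();
    ready.pop();
    ++removed;
    for (const auto &edge : edges) {
      if (edge.first != t) continue;
      if (--in_degree[edge.second] == 0) ready.push(edge.second);
    }
  }
  return removed == num_txns;
}
```

For example, R1(A) W2(A) W1(A) produces edges in both directions between the two transactions, so it is rejected, while R1(A) W1(A) R2(A) W2(A) only produces T1 -> T2.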

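Lock types

Two-phase locking (2PL) (deadlocks and dirty reads are possible)

The protocol can decide whether a transaction may access an object in the database without knowing all of the transaction's queries ahead of time.

Phase 1, Growing: each transaction requests the locks it needs from the database; the lock manager grants or denies each request.
Phase 2, Shrinking: the transaction may only release locks it acquired earlier and may not acquire new ones.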
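
The two phases amount to a small state machine per transaction. A sketch, assuming a hypothetical `TwoPhaseTxn` class that only tracks which object ids are locked and ignores lock modes and other transactions:

```cpp
#include <set>

enum class Phase { kGrowing, kShrinking };

// Hypothetical per-transaction bookkeeping for the 2PL rule (illustration only).
class TwoPhaseTxn {
 public:
  // Acquiring is only legal in the growing phase; returns false on a violation.
  bool Acquire(int object_id) {
    if (phase_ == Phase::kShrinking) return false;
    held_.insert(object_id);
    return true;
  }

  // The first release moves the transaction into the shrinking phase.
  bool Release(int object_id) {
    if (held_.erase(object_id) == 0) return false;  // lock was never held
    phase_ = Phase::kShrinking;
    return true;
  }

  Phase phase() const { return phase_; }

 private:
  Phase phase_ = Phase::kGrowing;
  std::set<int> held_;
};
```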

Strict two-phase locking (only deadlocks are possible)

严格：A schedule is strict if a value written by a txn is not read or overwritten by other txns until that txn finishes.

Two-phase locking can lead to deadlocks.

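Deadlock detection
Build a waits-for graph. The system periodically checks the graph for cycles; if it finds one, it has to decide how to break it by choosing a "victim" transaction and rolling it back (the victim is either aborted or restarted).
How is the victim chosen? Possible criteria include age, the number of queries already processed, the number of items currently locked, and the number of transactions that would have to be rolled back along with it.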
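
A sketch of the detection step, assuming the waits-for graph is stored as a map from a waiting transaction to the transactions it waits for. `WaitsFor`, `FindCycleFrom` and `PickVictim` are hypothetical names, and picking the youngest transaction (largest id) on the cycle is just one of the criteria listed above.

```cpp
#include <algorithm>
#include <map>
#include <set>
#include <vector>

using TxnId = int;
using WaitsFor = std::map<TxnId, std::set<TxnId>>;  // waiter -> txns it waits for

// Depth-first search. `path` is the current DFS stack; when a node on the
// stack is reached again, `path` is trimmed to exactly the cycle members.
bool FindCycleFrom(const WaitsFor &graph, TxnId node, std::set<TxnId> &done,
                   std::vector<TxnId> &path) {
  auto pos = std::find(path.begin(), path.end(), node);
  if (pos != path.end()) {
    path.erase(path.begin(), pos);  // drop the part leading into the cycle
    return true;
  }
  if (done.count(node) != 0) return false;  // already explored, no cycle there
  path.push_back(node);
  auto it = graph.find(node);
  if (it != graph.end()) {
    for (TxnId next : it->second) {
      if (FindCycleFrom(graph, next, done, path)) return true;
    }
  }
  path.pop_back();
  done.insert(node);
  return false;
}

// Returns the youngest transaction on some cycle, or -1 if there is no cycle.
TxnId PickVictim(const WaitsFor &graph) {
  std::set<TxnId> done;
  for (const auto &entry : graph) {
    std::vector<TxnId> path;
    if (FindCycleFrom(graph, entry.first, done, path)) {
      return *std::max_element(path.begin(), path.end());
    }
  }
  return -1;
}
```

A real detector would abort the victim, remove its edges and run again until no cycle is left.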
Deadlock prevention
This approach ensures that lock waits only ever go in one direction, so no deadlock can form.

Wait-Die ("Old Waits for Young")
→ If requesting txn has higher priority than holding txn, then requesting txn waits for holding txn.
→ Otherwise requesting txn aborts.

Wound-Wait ("Young Waits for Old") [the one used in this project]
→ If requesting txn has higher priority than holding txn, then holding txn aborts and releases lock.
→ Otherwise requesting txn waits.

When a transaction is aborted and restarted, it keeps its original timestamp, which prevents starvation.
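
Both schemes boil down to one comparison of timestamps. A sketch, assuming that a smaller timestamp means an older transaction and therefore higher priority; `Action`, `WaitDie` and `WoundWait` are hypothetical names.

```cpp
#include <cassert>

enum class Action { kRequesterWaits, kRequesterAborts, kHolderAborts };

// Wait-Die: an older requester waits, a younger requester dies.
Action WaitDie(int requester_ts, int holder_ts) {
  assert(requester_ts != holder_ts);  // a txn never conflicts with itself
  return requester_ts < holder_ts ? Action::kRequesterWaits
                                  : Action::kRequesterAborts;
}

// Wound-Wait: an older requester wounds the holder, a younger one waits.
Action WoundWait(int requester_ts, int holder_ts) {
  assert(requester_ts != holder_ts);
  return requester_ts < holder_ts ? Action::kHolderAborts
                                  : Action::kRequesterWaits;
}
```

In both schemes a transaction only ever waits in one direction of age (older for younger in Wait-Die, younger for older in Wound-Wait), which is why no cycle of waits can form.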

Intention locks

An intention lock allows a higher-level node to be locked in shared or exclusive mode without having to check all descendent nodes.

Intention-Shared (IS)
→ Indicates explicit locking at lower level with shared locks.

Intention-Exclusive (IX)
→ Indicates explicit locking at lower level with exclusive locks.

Shared+Intention-Exclusive (SIX), i.e. S + IX
→ The subtree rooted by that node is locked explicitly in shared mode and explicit locking is being done at a lower level with exclusive-mode locks.
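
Whether two of these modes can be held on the same node by different transactions is usually expressed as a compatibility matrix. The table below is the standard one for hierarchical locking; `Mode` and `Compatible` are hypothetical names used only in this sketch.

```cpp
#include <cassert>

enum class Mode { kIS = 0, kIX = 1, kS = 2, kSIX = 3, kX = 4 };

// Returns true if `requested` can be granted while another txn holds `held`.
bool Compatible(Mode held, Mode requested) {
  // Rows: held mode, columns: requested mode, both in the order IS, IX, S, SIX, X.
  static constexpr bool kTable[5][5] = {
      /* IS  */ {true, true, true, true, false},
      /* IX  */ {true, true, false, false, false},
      /* S   */ {true, false, true, false, false},
      /* SIX */ {true, false, false, false, false},
      /* X   */ {false, false, false, false, false},
  };
  const int row = static_cast<int>(held);
  const int col = static_cast<int>(requested);
  assert(row >= 0 && row < 5 && col >= 0 && col < 5);
  return kTable[row][col];
}
```

The matrix is symmetric, so it does not matter which of the two transactions arrived first.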

Implementing the isolation levels:

SERIALIZABLE: Obtain all locks first; plus index locks, plus strict 2PL.
REPEATABLE READS: Same as above, but no index locks.
READ COMMITTED: Same as above, but S locks are released immediately.
READ UNCOMMITTED: Same as above but allows dirty reads (no S locks).
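
The four lines above can be read as a per-level locking policy. A sketch with hypothetical names (`LockPolicy`, `PolicyFor`); exclusive locks are assumed to be held until commit at every level and are therefore not part of the struct.

```cpp
enum class IsolationLevel {
  kReadUncommitted,
  kReadCommitted,
  kRepeatableRead,
  kSerializable
};

struct LockPolicy {
  bool take_shared_locks;         // acquire an S lock before reading
  bool hold_shared_until_commit;  // false: release the S lock right after the read
  bool take_index_locks;          // also lock index ranges
};

LockPolicy PolicyFor(IsolationLevel level) {
  switch (level) {
    case IsolationLevel::kSerializable:
      return {true, true, true};
    case IsolationLevel::kRepeatableRead:
      return {true, true, false};
    case IsolationLevel::kReadCommitted:
      return {true, false, false};
    case IsolationLevel::kReadUncommitted:
      return {false, false, false};
  }
  return {false, false, false};  // unreachable for valid enum values
}
```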

4. Durability: once a transaction has committed, its effects must persist.

Multi-version concurrency control

(to be filled in when I have time)

Logging

(to be filled in when I have time)

PROJECT#4 code implementation

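The main piece to implement is the lock manager, lock_manager, which manages row-level (tuple-level) locks. Once the lock manager is done, the next step is to make operations such as queries and inserts work correctly under concurrency.
A transaction's locks are managed through queues: each row has one queue, which records the transactions currently requesting a lock on that row.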

// All lock request queues, keyed by row ID
std::unordered_map<RID, LockRequestQueue> lock_table_;

// A single lock request
class LockRequest {
 public:
  LockRequest(txn_id_t txn_id, LockMode lock_mode)
      : txn_id_(txn_id), lock_mode_(lock_mode), granted_(false), aborted_(false) {}

  txn_id_t txn_id_;
  LockMode lock_mode_;
  bool granted_;  // true once the lock has been granted
  bool aborted_;  // flag telling a still-waiting request that it has been aborted

  void SetGrant(bool grant) { granted_ = grant; }
};

// The queue of lock requests for one row
class LockRequestQueue {
 public:
  LockRequestQueue() = default;
  LockRequestQueue(const LockRequestQueue &rhs) = delete;
  LockRequestQueue &operator=(const LockRequestQueue &rhs) = delete;

  std::list<LockRequest> request_queue_;
  // for notifying blocked transactions on this rid
  std::condition_variable cv_;

  std::mutex queue_lock_;

  // txn_id of an upgrading transaction (if any)
  txn_id_t upgrading_ = INVALID_TXN_ID;

  int exclusive_count_ = 0;
};

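1. Lock manager and deadlock handling

The algorithms used are the 2PL locking protocol and Wound-Wait deadlock prevention.

How a lock is acquired under 2PL (taking an X lock as the example):
1. Check whether the lock may be acquired at all, based on the isolation level (an S lock cannot be taken under READ UNCOMMITTED) and the transaction's current phase (it must be in the GROWING phase);
2. Then go through the lock request queue to see whether there is a lock conflict and whether Wound-Wait can resolve it. If there is a conflict that cannot be resolved, block until woken up with no conflict remaining (the transaction may be aborted while it is blocked, in which case it is rolled back). If there is no conflict, or the conflict can be resolved, grant the lock and return true.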

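How unlocking works under 2PL:
1. Check the isolation level;
2. Check whether the transaction actually has a request for this lock; if not, return true right away, otherwise go on to the next step;
3. Remove the request from the lock queue, wake up the other requests waiting on it through the condition variable, and return true.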
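
The blocking and waking described in the lock and unlock steps can be sketched with a mutex and a condition variable. This is a simplified, hypothetical `RowLockQueue` for a single row, not the project's `LockManager`: it leaves out the isolation-level and phase checks, lock upgrades and Wound-Wait, and grants strictly in FIFO order. Unlike the steps above, `Unlock` here reports whether anything was removed.

```cpp
#include <condition_variable>
#include <cstddef>
#include <iterator>
#include <list>
#include <mutex>

enum class LockMode { kShared, kExclusive };

struct Request {
  int txn_id;
  LockMode mode;
  bool granted;
};

// Hypothetical single-row lock queue (illustration only).
class RowLockQueue {
 public:
  // Enqueue the request and block until nothing ahead of it conflicts.
  void Lock(int txn_id, LockMode mode) {
    std::unique_lock<std::mutex> guard(latch_);
    queue_.push_back(Request{txn_id, mode, false});
    auto self = std::prev(queue_.end());  // list iterators stay valid on erase
    cv_.wait(guard, [&] { return CanGrant(self); });
    self->granted = true;
  }

  // Remove the txn's request(s) and wake up everyone waiting on this row.
  bool Unlock(int txn_id) {
    std::lock_guard<std::mutex> guard(latch_);
    bool removed = false;
    for (auto it = queue_.begin(); it != queue_.end();) {
      if (it->txn_id == txn_id) {
        it = queue_.erase(it);
        removed = true;
      } else {
        ++it;
      }
    }
    if (removed) cv_.notify_all();
    return removed;
  }

  std::size_t Size() {
    std::lock_guard<std::mutex> guard(latch_);
    return queue_.size();
  }

 private:
  // A request is grantable if no request ahead of it is in conflict with it.
  bool CanGrant(std::list<Request>::const_iterator self) const {
    for (auto it = queue_.cbegin(); it != self; ++it) {
      if (it->mode == LockMode::kExclusive || self->mode == LockMode::kExclusive) {
        return false;
      }
    }
    return true;
  }

  std::mutex latch_;
  std::condition_variable cv_;
  std::list<Request> queue_;
};
```

Using `notify_all` rather than `notify_one` keeps the sketch simple: every waiter re-checks its own predicate, so several shared requests can be granted after one exclusive lock is released.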

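Deadlock handling:
1. For a request r in lock request queue q, walk through all requests ahead of r. If a request t has a txn_id greater than r's (txn_id stands in for the timestamp because it increases monotonically, so t belongs to the younger transaction), t must be removed from the queue and its transaction rolled back. There are two cases:
2. If t has already been granted the lock, remove it from the queue directly and wake up the other blocked requests in the queue. The transaction's requests in other queues (for other resources) must be removed as well, again waking up the blocked requests in those queues if the removed request held the lock. Finally, set the transaction's state to ABORTED;
3. If t has not been granted the lock, its transaction is currently blocked. It is enough to set a flag telling that transaction it is to be aborted and then wake it up; the transaction then performs the lock removal from step 2 itself.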
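
A sketch of the wounding pass over one queue, following the Wound-Wait rule quoted earlier (an older requester wounds younger transactions in its way). It assumes a smaller txn_id means an older transaction, only wounds requests whose lock mode conflicts with the requester's, and uses hypothetical names (`Request`, `WoundYounger`). Cleaning up the victim's requests in other queues and waking waiters is left out.

```cpp
#include <list>
#include <vector>

struct Request {
  int txn_id;
  bool exclusive;  // true for an X request, false for an S request
  bool granted;
  bool aborted;
};

// Walks the requests ahead of `self` and wounds every younger, conflicting one.
// Returns the ids of the wounded transactions in queue order.
std::vector<int> WoundYounger(std::list<Request> &queue,
                              std::list<Request>::iterator self) {
  std::vector<int> wounded;
  for (auto it = queue.begin(); it != self;) {
    const bool conflicts = it->exclusive || self->exclusive;
    const bool younger = it->txn_id > self->txn_id;
    if (conflicts && younger) {
      wounded.push_back(it->txn_id);
      if (it->granted) {
        // Lock holder: take its request out of the queue right away.
        it = queue.erase(it);
        continue;
      }
      // Blocked waiter: only flag it; it removes itself once it is woken up.
      it->aborted = true;
    }
    ++it;
  }
  return wounded;
}
```

The caller would then mark the returned transactions as aborted and call `notify_all` on the queue's condition variable so that flagged waiters get a chance to notice.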

2. Problems encountered

Bug: erasing elements while iterating over a container such as a map. The correct way:

bool LockManager::RemoveLock(txn_id_t txn_id, const RID &rid, bool is_notify = true) {
  bool is_remove = false;
  auto &queue = lock_table_[rid];
  auto &request_queue = queue.request_queue_;
  for (auto it = request_queue.begin(); it != request_queue.end();) {
    if (it->txn_id_ == txn_id) {
      is_remove = true;
      bool was_granted = it->granted_;
      // KEY: erase() invalidates `it`, so continue from the iterator it returns
      it = request_queue.erase(it);
      if (was_granted && is_notify) {
        // a held lock was released, so wake up the waiters on this row
        queue.cv_.notify_all();
      }
    } else {
      ++it;
    }
  }
  return is_remove;
}
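
The same erase-while-iterating pattern in isolation, as a small self-contained sketch (`EraseEvens` and `EraseNegative` are made-up helper names). Since C++11, `erase` on both `std::list` and `std::unordered_map` returns the iterator following the removed element, so the loop only advances by hand when nothing was erased.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>

// Removes every even value; returns how many elements were erased.
std::size_t EraseEvens(std::list<int> &values) {
  std::size_t erased = 0;
  for (auto it = values.begin(); it != values.end();) {
    if (*it % 2 == 0) {
      it = values.erase(it);  // `it` now points at the next element
      ++erased;
    } else {
      ++it;
    }
  }
  return erased;
}

// Same pattern for a map: removes every entry with a negative value.
std::size_t EraseNegative(std::unordered_map<std::string, int> &table) {
  std::size_t erased = 0;
  for (auto it = table.begin(); it != table.end();) {
    if (it->second < 0) {
      it = table.erase(it);
      ++erased;
    } else {
      ++it;
    }
  }
  return erased;
}
```

Calling `erase(it)` and then `++it` on the stale iterator is the buggy version: the iterator is invalid after the erase, so the behaviour is undefined.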

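Concurrent programming in C++11: "C++11 并发指南" (C++11 Concurrency Guide) by Forhappy && Haippy

(notes to be added when I have time)

Some usage conventions for lambda expressions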