Proj THUDBFuzz Paper Reading: SANRAZOR: Reducing Redundant Sanitizer Checks in C/C++ Programs

Abstract

Introduces sanitizers; the goal is to remove redundant sanitizer checks
This paper: SANRAZOR
Method: collects dynamic code coverage and static data dependencies?
Experiment 1:
Dataset: SPEC benchmarks
Result:

Sanitizer overhead drops from 73.8% to 28.0–62.0% for AddressSanitizer, and from 160.1% to 36.6–124.4% for UndefinedBehaviorSanitizer (depending on the applied reduction scheme).
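Experiment 2:
Dataset: 10 commonly used programs
Result: evaluated on 38 CVEs
Experiment 3: combined with ASAP
Result: detection capability is somewhat weakened, but the runtime cost is only 7.0%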

1. Intro

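P1: Introduces sanitizers
P2: Uses concrete examples to illustrate sanitizer overhead
P3: Challenges:
Existing work: mostly static analysis, which requires heavyweight, specialized analyses
ASAP: uses a user-provided overhead budget, which may remove too many checks
P4: Basic workflow of SANRAZOR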

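A profiling phase (based on existing test suites) collects coverage statistics
Static data dependencies and the collected coverage statistics are used to remove redundant checks
SANRAZOR is unsound, but it remains effective at finding defects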

P5: SPEC CPU2006 benchmark

2. Preliminaries

Introduces sanitizers

ASan

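Consists of an instrumentation module and a runtime library
The instrumentation module allocates shadow memory regions for every address in use and instruments load and store operations
A memory address a is mapped to its corresponding shadow memory address sa
sa is checked to decide whether the access to a is safe
Each shadow memory region gets a poisoned zone (redzone); accessing this zone, which should never be accessed, raises an error
malloc creates a similar poisoned zone right after the valid allocated memory
free marks the deallocated memory as poisoned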
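
The mapping from a to sa and the role of the poisoned zones can be illustrated with a toy model. This is only a sketch: the shift of 3 and the offset 0x7FFF8000 are the commonly documented ASan defaults for x86-64 Linux and are assumed here, while `ToyAsan`, `REDZONE`, and the dict-based shadow memory are hypothetical simplifications (real ASan uses a flat shadow region and larger, more varied redzones).

```python
SHADOW_SCALE = 3             # 8 application bytes share one shadow byte
SHADOW_OFFSET = 0x7FFF8000   # assumed default offset (x86-64 Linux)
REDZONE = 16                 # hypothetical redzone size for this toy model


def shadow_address(addr: int) -> int:
    """Map an application address a to its shadow address sa."""
    return (addr >> SHADOW_SCALE) + SHADOW_OFFSET


class ToyAsan:
    def __init__(self):
        self.shadow = {}     # sa -> shadow byte; a missing entry means 0 (addressable)
        self.sizes = {}      # user pointer -> rounded allocation size
        self.next = 0x1000   # next free address of the toy heap

    def _set(self, start, size, value):
        for off in range(0, size, 8):
            self.shadow[shadow_address(start + off)] = value

    def malloc(self, size):
        # Layout: [left redzone][user memory, rounded up to 8][right redzone]
        user = self.next + REDZONE
        rounded = (size + 7) & ~7
        self._set(self.next, REDZONE, -1)        # poison left redzone
        self._set(user, rounded, 0)              # user memory is addressable
        if size % 8:                             # last granule is only partly valid
            self.shadow[shadow_address(user + rounded - 8)] = size % 8
        self._set(user + rounded, REDZONE, -1)   # poison right redzone
        self.next = user + rounded + REDZONE
        self.sizes[user] = rounded
        return user

    def free(self, user):
        # Freed memory becomes poisoned, so later accesses are reported.
        self._set(user, self.sizes.pop(user), -1)

    def access_ok(self, addr, size):
        """Check an access of `size` bytes that stays inside one 8-byte granule."""
        k = self.shadow.get(shadow_address(addr), 0)
        if k == 0:
            return True      # whole granule addressable
        if k < 0:
            return False     # poisoned (redzone or freed memory)
        return (addr & 7) + size <= k   # only the first k bytes are valid


asan = ToyAsan()
p = asan.malloc(10)
in_bounds = asan.access_ok(p + 9, 1)       # last valid byte
off_by_one = asan.access_ok(p + 10, 1)     # first byte past the object
```

Every `access_ok` call corresponds to one sanitizer check inserted before a load or store, which is why the number of checks, and thus the overhead, grows with the number of memory accesses.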

UBSan

Detects undefined behavior such as out-of-bounds access, division by zero, and invalid shifts

3. Problem Formulation

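A program p contains N checks; Ci is the i-th check, Ci.v is the argument of the i-th check, and Ci.P is the property to be checked
Goal: remove checks whose outcome does not change, i.e. redundant checks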

3.1 Sample Redundant Checks in bzip2

3.2 Redundant Sanitizer Checks

Ci and Cj are defined as identical checks if Ci dominates Cj in the control flow (or vice versa) and the semantics of Ci are similar to those of Cj

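Challenges

Deciding [[Ci.v]] == [[Cj.v]] directly is very hard, e.g. because of Rice's theorem
Recovering the dominator tree from the control flow and relying on it may lead to false positives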

This paper: approximately extract identical check pairs

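After a sufficient number of meaningful inputs have been executed, Ci and Cj show similar code coverage patterns (exactly equal, or one covered by the other)
Obtained via instrumentation plus profiling
[[Ci.P(Ci.v)]] ≈ [[Cj.P(Cj.v)]]
Approximated by checking data dependencies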
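
The dynamic half of this approximation can be pictured with a small sketch. It assumes that profiling yields two counters per check (how often the check ran and how often its condition was true). The names `Coverage`, `same_dynamic_pattern`, and `candidate_pairs` are hypothetical, and requiring exact equality while skipping never-executed checks is a simplification chosen here, not necessarily the paper's exact rule.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Coverage:
    executed: int   # how many times the check's comparison ran
    taken: int      # how many times its branch condition was true


def same_dynamic_pattern(a: Coverage, b: Coverage) -> bool:
    # Checks that never ran give no evidence, so they are never matched here.
    return a.executed > 0 and a == b


def candidate_pairs(profile):
    """Return pairs of check names whose coverage counters match exactly."""
    return [
        (x, y)
        for x, y in combinations(sorted(profile), 2)
        if same_dynamic_pattern(profile[x], profile[y])
    ]


# Hypothetical profile collected from the training inputs.
profile = {
    "C1": Coverage(executed=100, taken=0),
    "C2": Coverage(executed=100, taken=0),
    "C3": Coverage(executed=40, taken=0),
    "C4": Coverage(executed=0, taken=0),
    "C5": Coverage(executed=0, taken=0),
}
pairs = candidate_pairs(profile)
```

Matching counters only make a pair a candidate; the static data dependency comparison still has to agree before one of the two checks is removed.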

4. Design

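P1: Basic steps
P2: Application Scope
Mainly states that SANRAZOR works on any LLVM IR and removes redundant ASan and UBSan checks
P3: Application Scenario

Application scenarios, e.g. running with fewer checks in production
1. Uses data, namely the difference in overhead, to show the benefit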

4.1 Check Identification

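Basic pattern: a comparison instruction followed by a branch instruction
In addition, sanitizer checks are recognized by the functions they call, such as __asan_report_XXX and __ubsan_handle_XXX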
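
The pattern above can be sketched on a toy textual IR. This is a rough illustration, not how SANRAZOR is implemented: a real tool would work on LLVM IR through the LLVM APIs, whereas here a function is just a hypothetical dict from block label to instruction strings, and `find_sanitizer_checks` is an invented helper name.

```python
import re

SANITIZER_PREFIXES = ("__asan_report_", "__ubsan_handle_")
BR_RE = re.compile(r"br i1 (%[\w.]+), label %([\w.]+), label %([\w.]+)")
CALL_RE = re.compile(r"call .*@([\w.]+)\(")


def calls_sanitizer(block):
    """True if any instruction in the block calls a sanitizer report/handler."""
    for inst in block:
        m = CALL_RE.search(inst)
        if m and m.group(1).startswith(SANITIZER_PREFIXES):
            return True
    return False


def find_sanitizer_checks(blocks):
    """Find blocks ending in `icmp` + conditional `br` into a sanitizer block."""
    found = []
    for label, insts in blocks.items():
        if not insts:
            continue
        m = BR_RE.match(insts[-1].strip())
        if not m:
            continue                      # not a conditional branch
        cond, if_true, if_false = m.groups()
        if not any(i.strip().startswith(f"{cond} = icmp ") for i in insts):
            continue                      # condition is not produced by icmp here
        if any(calls_sanitizer(blocks.get(t, [])) for t in (if_true, if_false)):
            found.append((label, cond))
    return found


# Toy function: one ASan-style check and one ordinary user check.
blocks = {
    "entry": ["%cmp = icmp ne i8 %shadow, 0",
              "br i1 %cmp, label %report, label %ok"],
    "report": ["call void @__asan_report_load4(i64 %addr)", "unreachable"],
    "ok": ["%v = load i32, ptr %p",
           "%c = icmp sgt i32 %v, 0",
           "br i1 %c, label %then, label %done"],
    "then": ["call void @user_function(i32 %v)", "br label %done"],
    "done": ["ret void"],
}
checks = find_sanitizer_checks(blocks)
```

The branch in `ok` has the same compare-and-branch shape but leads to ordinary code, so it is treated as a user check rather than a sanitizer check.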

4.2 Dynamic Check Pattern Capturing

Uses the default test cases shipped with the software

4.3 Static Check Pattern Capturing

4.3.1 Extracting Static Features with Three Schemes

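Three levels of static analysis
L0: collect all leaf nodes of the dependency tree as they are
L1: drop all constants except those that appear as icmp operands
L2: drop all constants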
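
The difference between the three levels is easiest to see on a tiny dependency tree. The sketch below is an assumption-laden illustration: `Node` and `leaves` are hypothetical names, the tree is hand-built instead of being extracted from LLVM IR, and comparing leaf sets for equality is a simplified stand-in for the paper's matching step.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    is_const: bool = False
    operands: list = field(default_factory=list)


def leaves(node, level, parent_is_icmp=False):
    """Collect leaf names of a dependency tree under scheme L0, L1 or L2."""
    if not node.operands:                       # leaf: variable or constant
        if node.is_const:
            if level == 2:
                return set()                    # L2: no constants at all
            if level == 1 and not parent_is_icmp:
                return set()                    # L1: only icmp constants survive
        return {node.name}
    result = set()
    is_icmp = node.name.startswith("icmp")
    for op in node.operands:
        result |= leaves(op, level, is_icmp)
    return result


def const(value):
    return Node(str(value), is_const=True)


# Check A: icmp ult (add %i, 1), 100
check_a = Node("icmp ult", operands=[
    Node("add", operands=[Node("%i"), const(1)]), const(100)])
# Check B: icmp ult (add %i, 2), 100
check_b = Node("icmp ult", operands=[
    Node("add", operands=[Node("%i"), const(2)]), const(100)])
# Check C: icmp ult %i, 200
check_c = Node("icmp ult", operands=[Node("%i"), const(200)])

l0_match = leaves(check_a, 0) == leaves(check_b, 0)
l1_match = leaves(check_a, 1) == leaves(check_b, 1)
l2_match = leaves(check_a, 2) == leaves(check_c, 2)
```

In this example A and B differ under L0 (constants 1 and 2), match under L1 (both reduce to %i and 100), and all three checks match under L2 (only %i remains), which shows how each level trades precision for more removed checks.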

4.3.2 Security Consideration

4.3.3 Extension Using Static Analysis

P1: we envision that symbolic techniques, e.g., (underconstrained) symbolic execution [27] and constraint solving, can be used to prove the equivalence of sanitizer checks.

P2:
Existing research: uses symbolic execution to collect input-output relations and decide equivalence
Future work: symbolic execution

P3:
False positives might be eliminated through inter-procedural analysis?

4.4 Sanitizer Check Reduction

We also note that we did not observe any alerts yielded by sanitizer checks in our experiments.

6. Evaluation

6.1 Cost Study

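Dataset: SPEC CPU 2006 (only 11 of the 19 projects are used, because these are the ones that compile with clang 9 + ASan + UBSan)
Characteristics:

Industry standard, CPU-intensive
Consists of three workloads: training (used for profiling), test (too short, not used), and reference (used for evaluation)
Metrics: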
M0: the execution time reduction after eliminating redundant checks
M1: count the number of removed sanitizer checks
M2: the execution cost saved

6.2 Vulnerability Detectability Study

posted @ 2021-11-28 02:49 雪溯
