An Empirical Study of Branching Heuristics through the Lens of Global Learning Rate

Liang J.H., V.K. H.G., Poupart P., Czarnecki K., Ganesh V. (2017) An Empirical Study of Branching Heuristics Through the Lens of Global Learning Rate. In: Gaspers S., Walsh T. (eds) Theory and Applications of Satisfiability Testing – SAT 2017. SAT 2017. Lecture Notes in Computer Science, vol 10491. Springer, Cham. https://doi.org/10.1007/978-3-319-66263-3_8


 

 

Abstract

 

In this paper, we analyze a suite of 7 well-known branching heuristics proposed by the SAT community and show that the better heuristics tend to generate more learnt clauses per decision, a metric we define as the global learning rate (GLR).

Like our previous work on the LRB branching heuristic, we once again view these heuristics as techniques to solve the learning rate optimization problem.

 

First, we show that there is a strong positive correlation between GLR and solver efficiency for a variety of branching heuristics.

 

Second, we test our hypothesis further by developing a new branching heuristic that maximizes GLR greedily. We show empirically that this heuristic achieves very high GLR and, interestingly, very low literal block distance (LBD) over the learnt clauses.

In our experiments this greedy branching heuristic enables the solver to solve instances faster than VSIDS, when the branching time is taken out of the equation.

This experiment is a good proof of concept that a branching heuristic maximizing GLR will lead to good solver performance modulo the computational overhead.

 

 

Third, we propose that machine learning algorithms are a good way to cheaply approximate the greedy GLR maximization heuristic, as already witnessed by LRB.

 

In addition, we design a new branching heuristic, called SGDB, that uses a stochastic gradient descent online learning method to dynamically order branching variables in order to maximize GLR. We show experimentally that SGDB performs on par with the VSIDS branching heuristic.

   

 

1 Introduction

 

Searching through a large, potentially exponential, search space is a recurring problem in many fields of computer science.

Rather than reinventing the wheel and implementing complicated search algorithms from scratch, many researchers in fields as diverse as software engineering [7], hardware verification [9], and AI [16] have come to rely on SAT solvers as a general purpose tool to efficiently search through large spaces.

By reducing the problem of interest down to a Boolean formula, engineers and scientists can leverage off-the-shelf SAT solvers to solve their problems without needing expertise in SAT or developing special-purpose algorithms.

Modern conflict-driven clause-learning (CDCL) SAT solvers can solve a wide range of practical problems with surprising efficiency, thanks to decades of ongoing research by the SAT community.

Two notable milestones that are key to the success of SAT solvers are the Variable State Independent Decaying Sum (VSIDS) branching heuristic (and its variants) [23] and conflict analysis techniques [22].

 

The VSIDS branching heuristic has been the dominant branching heuristic since 2001, evidenced by its presence in most competitive solvers such as Glucose [4], Lingeling [5], and CryptoMiniSat [26].

   
 

One of the challenges in designing branching heuristics is that it is not clear what constitutes a good decision variable.

 

We proposed one solution to this issue in our LRB branching heuristic paper [19], which is to frame branching as an optimization problem. We defined a computable metric called learning rate and defined the objective as maximizing the learning rate.

Good decision variables are ones with high learning rate.

Since learning rate is expensive to compute a priori, we used a multi-armed bandit learning algorithm to estimate the learning rate on-the-fly as the basis for the LRB branching heuristic [19].
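The on-the-fly estimate behind LRB can be illustrated with an exponential recency-weighted average, a standard multi-armed bandit update. This is a minimal sketch of the idea only; the function names and the step size `alpha` below are illustrative assumptions, not LRB's actual implementation.

```python
# Sketch of an exponential recency-weighted average (ERWA), the kind of
# multi-armed bandit update LRB uses to estimate a variable's learning
# rate on-the-fly. `alpha` and all names here are illustrative.

def erwa_update(estimate: float, reward: float, alpha: float) -> float:
    """Move the running estimate a fraction `alpha` toward the new reward."""
    return (1.0 - alpha) * estimate + alpha * reward

# A variable's learning-rate estimate after three observed rewards:
q = 0.0
for reward in [1.0, 0.0, 1.0]:
    q = erwa_update(q, reward, alpha=0.4)
```

Because recent rewards dominate the average, the estimate tracks a variable's current, rather than historical, usefulness.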

   
 

In this paper, we deepen our previous work; our starting point remains the same, namely, that branching heuristics should be designed to solve the optimization problem of maximizing learning rate. In LRB, the learning rate metric is defined per variable.

In this paper, we define a new metric, called the global learning rate (GLR) to measure the solver’s overall propensity to generate conflicts, rather than the variable-specific metric we defined in the case of LRB. Our experiments demonstrate that GLR is an excellent objective to maximize.
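Read concretely, GLR is simply the ratio of learnt clauses (one per conflict) to decisions made by the solver. A minimal sketch, with the zero-decision guard being our own convention:

```python
def global_learning_rate(num_learnt_clauses: int, num_decisions: int) -> float:
    """GLR as defined in the text: learnt clauses generated per decision."""
    if num_decisions == 0:
        return 0.0  # our own convention for a solver that has not branched yet
    return num_learnt_clauses / num_decisions

# A run that learnt 1500 clauses over 2000 decisions:
glr = global_learning_rate(1500, 2000)  # 0.75
```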

1.1 Contributions


  1. A new objective for branching heuristic optimization: In our previous work with LRB, we defined a metric that measures learning rate per variable. In this paper, we define a metric called the global learning rate (GLR), that measures the number of learnt clauses generated by the solver per decision, which intuitively is a better metric to optimize since it measures the solver as a whole. We show that the objective of maximizing GLR is consistent with our knowledge of existing branching heuristics, that is, the faster branching heuristics tend to achieve higher GLR. We perform extensive experiments over 7 well-known branching heuristics to establish the correlation between high GLR and better solver performance (Sect. 3).

     

  2. A new branching heuristic to greedily maximize GLR: To further scientifically test the conjecture that GLR maximization is a good objective, we design a new branching heuristic that greedily maximizes GLR by always selecting decision variables that cause immediate conflicts. It is greedy in the sense that it optimizes for causing immediate conflicts, and it does not consider future conflicts as part of its scope. Although the computational overhead of this heuristic is very high, the variables it selects are “better” than VSIDS. More precisely, if we ignore the computation time to compute the branching variables, the greedy branching heuristic generally solves more instances faster than VSIDS. Another positive side-effect of the greedy branching heuristic is that relative to VSIDS, it has lower learnt clause literal block distance (LBD) [3], a sign that it is learning higher quality clauses. The combination of learning faster (due to higher GLR) and learning better (due to lower LBD) clauses explains the power of the greedy branching heuristic. Globally optimizing the GLR considering all possible future scenarios a solver can take is simply too prohibitive. Hence, we limited our experiments to the greedy approach. Although this greedy branching heuristic takes too long to select variables in practice, it gives us a gold standard of what we should aim for. We try to approximate it as closely as possible in our third contribution (Sect. 4).

     

  3. A new machine learning branching heuristic to maximize GLR: We design a second heuristic, called stochastic gradient descent branching (SGDB), using machine learning to approximate our gold standard, the greedy branching heuristic. SGDB trains an online logistic regression model by observing the conflict analysis procedure as the CDCL algorithm solves an instance. As conflicts are generated, SGDB will update the model to better fit its observations. Concurrently, SGDB also uses this model to rank variables based on their likelihood to generate conflicts if branched on. We show that in practice, SGDB is on par with the VSIDS branching heuristic over a large and diverse benchmark but still shy of LRB. However, more work is required to improve the learning in SGDB (Sect. 5).
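In the spirit of SGDB, an online logistic-regression model can be updated with one stochastic gradient step per observed conflict, and its scores used to rank candidate variables. The features, learning rate, and function names below are our own illustrative assumptions, not the paper's exact design.

```python
import math

def sgd_step(weights, features, label, lr=0.1):
    """One stochastic gradient descent step on the logistic loss."""
    z = sum(w * x for w, x in zip(weights, features))
    p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of a conflict
    grad = p - label                # derivative of the logistic loss w.r.t. z
    return [w - lr * grad * x for w, x in zip(weights, features)]

def score(weights, features):
    """Rank variables by this linear score (higher means the model predicts
    a conflict is more likely if the variable is branched on)."""
    return sum(w * x for w, x in zip(weights, features))

# Observing that branching on a variable with features [1.0, 1.0]
# produced a conflict (label 1) nudges its score upward:
w = sgd_step([0.0, 0.0], [1.0, 1.0], 1.0)
```

The update is constant time per conflict, which is what makes an online model like this a cheap approximation of the expensive greedy probing.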

   

 

2 Background

  Clause Learning: Clause learning produces a new clause after each conflict to prevent the same or similar conflicts from reoccurring [22]. This requires maintaining an implication graph where the nodes are assigned literals and edges are implications forced by Boolean constraint propagation (BCP). When a clause is falsified, the CDCL solver invokes conflict analysis to produce a learnt clause from the conflict. It does so by cutting the implication graph, typically at the first-UIP [22], into the reason side and the conflict side with the condition that the decision variables appear on the reason side and the falsified clause appears on the conflict side. A new learnt clause is constructed by negating the reason side literals incident to the cut. Literal block distance (LBD) is a popular metric for measuring the “quality” of a learnt clause [3]. The lower the LBD the better.
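LBD itself is cheap to compute: it is the number of distinct decision levels among a learnt clause's literals. A minimal sketch, where the clause and level representations are our own assumptions:

```python
def lbd(learnt_clause, decision_level):
    """Literal block distance: the count of distinct decision levels among
    the clause's literals (lower is better). Literals are signed ints and
    `decision_level` maps each variable to its assigned level."""
    return len({decision_level[abs(lit)] for lit in learnt_clause})

# A clause whose three literals sit on decision levels 3, 3, and 5:
levels = {1: 3, 2: 3, 7: 5}
```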
   
   
   
   
   

 

7 Related Work

 

The VSIDS branching heuristic, currently the most widely implemented branching heuristic in CDCL solvers, was introduced by the authors of the Chaff solver in 2001 [23] and later improved by the authors of the MiniSat solver in 2003 [11].

Carvalho and Marques-Silva introduced a variation of VSIDS in 2004 where the bump value is determined by the learnt clause length and backjump size [8], although their technique is not based on machine learning.

 

Lagoudakis and Littman introduced a new branching heuristic in 2001 that dynamically switches between 7 different branching heuristics, using reinforcement learning to guide the choice [17].

Liang et al. introduced two branching heuristics, CHB and LRB, in 2016 where a stateless reinforcement learning algorithm selects the branching variables themselves.

CHB does not view branching as an optimization problem, whereas LRB, GGB, and SGDB do. As stated earlier, LRB optimizes for learning rate, a metric defined with respect to variables. GGB and SGDB optimize for global learning rate, a metric defined with respect to the solver.
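The probing idea behind the greedy heuristic GGB can be sketched as follows; `propagate` and `fallback` are stand-ins (our own names) for the solver's trial BCP routine and an ordinary default heuristic such as VSIDS, and the real heuristic works inside a full CDCL solver.

```python
def greedy_pick(unassigned, propagate, fallback):
    """Probe each unassigned variable with trial unit propagation and
    branch on the first one that triggers an immediate conflict; if no
    probe conflicts, defer to an ordinary heuristic such as VSIDS."""
    for var in unassigned:
        for value in (True, False):
            if propagate(var, value):   # trial BCP, assumed undone afterwards
                return var              # branching here learns a clause now
    return fallback(unassigned)
```

Probing every candidate at every decision is expensive, which is why the text treats the greedy heuristic as a gold standard to approximate rather than a practical heuristic.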

   

 

8 Conclusion and Future Work

 

Finding the optimal branching sequence is nigh impossible, but we show that using the simple framework of optimizing GLR has merit.

Since the success of our LRB heuristic, the crux of the question has been whether solving the learning rate optimization problem is indeed a good way of designing branching heuristics.

A second question is whether machine learning algorithms are the way to go forward.

We answer both questions via a thorough analysis of 7 different notable branching heuristics, wherein we provide strong empirical evidence that better branching heuristics correlate with higher GLR.

Further, we show that higher GLR correlates with lower LBD, a popular measure of quality of learnt clauses.

Additionally, we designed a greedy branching heuristic to maximize GLR and showed that it outperformed VSIDS, one of the most competitive branching heuristics.

To answer the second question, we designed SGDB, which is competitive vis-à-vis VSIDS. With the success of LRB and SGDB, we are more confident than ever before in the wisdom of using machine learning techniques as a basis for branching heuristics in SAT solvers.

   

 

posted on 2020-11-05 11:20 by 海阔凭鱼跃越