BUAA Object-Oriented Design and Construction 2021: Unit 1 Assignment Summary

I. Program Characteristics

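The assignments in this unit implement a program that differentiates polynomials. It supports the four arithmetic operations, exponentiation, trigonometric functions, nested expressions, and detection of malformed input.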

1. Complexity analysis

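The complexity metrics are listed below. Apart from utility methods (parsing constants, generating the reverse Polish expression, building the expression tree) and utility classes (format checking, expression parsing), no method or class in the core architecture is particularly complex.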

Method CogC ev(G) iv(G) v(G)
derivative.Main.main(String[]) 1 1 2 2
derivative.atom.Constant.Constant(BigInteger) 0 1 1 1
derivative.atom.Constant.add(Constant) 0 1 1 1
derivative.atom.Constant.getValue() 0 1 1 1
derivative.atom.Constant.multiply(Constant) 0 1 1 1
derivative.atom.Constant.takeDerivative() 0 1 1 1
derivative.atom.Constant.toString() 0 1 1 1
derivative.atom.Variable.takeDerivative() 0 1 1 1
derivative.atom.Variable.toString() 0 1 1 1
derivative.compound.Add.Add(Compound, Compound) 0 1 1 1
derivative.compound.Add.takeDerivative() 0 1 1 1
derivative.compound.Add.toString() 3 4 3 4
derivative.compound.Cosine.Cosine(Compound) 0 1 1 1
derivative.compound.Cosine.takeDerivative() 0 1 1 1
derivative.compound.Cosine.toString() 2 2 1 4
derivative.compound.Multiply.Multiply(Compound, Compound) 0 1 1 1
derivative.compound.Multiply.takeDerivative() 0 1 1 1
derivative.compound.Multiply.toString() 5 5 5 6
derivative.compound.Negate.Negate(Compound) 0 1 1 1
derivative.compound.Negate.takeDerivative() 0 1 1 1
derivative.compound.Negate.toString() 4 4 3 6
derivative.compound.Power.Power(Compound, Constant) 0 1 1 1
derivative.compound.Power.takeDerivative() 0 1 1 1
derivative.compound.Power.toString() 2 3 3 3
derivative.compound.Sine.Sine(Compound) 0 1 1 1
derivative.compound.Sine.takeDerivative() 0 1 1 1
derivative.compound.Sine.toString() 2 2 1 4
derivative.utility.ExpressionParser.createTree(String) 9 1 5 12
derivative.utility.ExpressionParser.getRpnFrom(String) 8 1 4 7
derivative.utility.ExpressionParser.operator2rank(char) 1 11 11 11
derivative.utility.ExpressionParser.orderBetween(char, char) 0 1 1 1
derivative.utility.ExpressionParser.parse(String) 0 1 1 1
derivative.utility.ExpressionParser.preProcess(String) 0 1 1 1
derivative.utility.ExpressionParser.scanToEndOfInt(IterableString) 0 1 1 1
derivative.utility.FormatChecker.check(String) 2 3 1 3
derivative.utility.FormatChecker.checkConstant(IterableString) 10 4 6 7
derivative.utility.FormatChecker.checkExponent(IterableString) 2 3 1 3
derivative.utility.FormatChecker.checkExpression(IterableString) 4 1 3 5
derivative.utility.FormatChecker.checkExpressionFactor(IterableString) 2 3 1 3
derivative.utility.FormatChecker.checkFactor(IterableString) 4 1 5 5
derivative.utility.FormatChecker.checkPowerFunction(IterableString) 2 2 2 3
derivative.utility.FormatChecker.checkTerm(IterableString) 3 1 3 4
derivative.utility.FormatChecker.checkTrigFunction(IterableString) 5 4 3 6
derivative.utility.FormatChecker.checkVariable(IterableString) 2 1 2 2
derivative.utility.FormatChecker.checkWhiteSpace(IterableString) 4 3 3 4
derivative.utility.IterableString.IterableString(String) 0 1 1 1
derivative.utility.IterableString.current() 2 2 2 2
derivative.utility.IterableString.hasNext() 0 1 1 1
derivative.utility.IterableString.iterator() 0 1 1 1
derivative.utility.IterableString.next() 1 1 2 2
derivative.utility.IterableString.previous() 0 1 1 1
derivative.utility.IterableString.remaining() 0 1 1 1
derivative.utility.IterableString.skipCharsBy(int) 0 1 1 1
derivative.utility.IterableString.startsWith(String) 0 1 1 1
derivative.utility.IterableString.toString() 0 1 1 1
derivative.utility.WrongFormatException.WrongFormatException() 0 1 1 1

Class OCavg OCmax WMC
derivative.Atom n/a n/a 0
derivative.Compound n/a n/a 0
derivative.Main 1.00 1 1
derivative.atom.Constant 1.00 1 6
derivative.atom.Variable 1.00 1 2
derivative.compound.Add 2.00 4 6
derivative.compound.Cosine 1.33 2 4
derivative.compound.Multiply 2.33 5 7
derivative.compound.Negate 2.00 4 6
derivative.compound.Power 1.67 3 5
derivative.compound.Sine 1.33 2 4
derivative.utility.ExpressionParser 5.29 13 37
derivative.utility.FormatChecker 3.18 5 35
derivative.utility.IterableString 1.20 2 12
derivative.utility.Operator n/a n/a 0
derivative.utility.WrongFormatException 1.00 1 1

Package v(G)avg v(G)tot
derivative 2.00 2
derivative.atom 1.00 8
derivative.compound 2.17 39
derivative.utility 3.17 92

2. Code size analysis

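The code size metrics are listed below. Apart from the utility classes for format checking and expression parsing, only the toString methods of some classes in the core architecture are relatively long, and these are also where most of the bugs appeared. This is in fact a consequence of the architecture, as discussed later.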

Method CLOC JLOC LOC NCLOC RLOC
derivative.Compound.takeDerivative() 0 n/a n/a 2 33.33%
derivative.Compound.toString() 0 n/a n/a 2 33.33%
derivative.Derivable.takeDerivative() 0 n/a n/a 1 33.33%
derivative.Main.main(String[]) 0 0 11 11 84.62%
derivative.atom.Constant.Constant(BigInteger) 0 0 3 3 13.04%
derivative.atom.Constant.add(Constant) 0 0 3 3 13.04%
derivative.atom.Constant.getValue() 0 0 3 3 13.04%
derivative.atom.Constant.multiply(Constant) 0 0 3 3 13.04%
derivative.atom.Constant.takeDerivative() 0 0 4 4 17.39%
derivative.atom.Constant.toString() 0 0 4 4 17.39%
derivative.atom.Variable.takeDerivative() 0 0 4 4 40.00%
derivative.atom.Variable.toString() 0 0 4 4 40.00%
derivative.compound.Add.Add(Compound, Compound) 0 0 4 4 16.00%
derivative.compound.Add.takeDerivative() 0 0 4 4 16.00%
derivative.compound.Add.toString() 0 0 13 13 52.00%
derivative.compound.Cosine.Cosine(Compound) 0 0 3 3 17.65%
derivative.compound.Cosine.takeDerivative() 0 0 4 4 23.53%
derivative.compound.Cosine.toString() 0 0 7 7 41.18%
derivative.compound.Multiply.Multiply(Compound, Compound) 0 0 4 4 14.29%
derivative.compound.Multiply.takeDerivative() 0 0 5 5 17.86%
derivative.compound.Multiply.toString() 0 0 15 15 53.57%
derivative.compound.Negate.Negate(Compound) 0 0 3 3 12.50%
derivative.compound.Negate.takeDerivative() 0 0 4 4 16.67%
derivative.compound.Negate.toString() 0 0 14 14 58.33%
derivative.compound.Power.Power(Compound, Constant) 0 0 4 4 16.00%
derivative.compound.Power.takeDerivative() 0 0 7 7 28.00%
derivative.compound.Power.toString() 0 0 10 10 40.00%
derivative.compound.Sine.Sine(Compound) 0 0 3 3 17.65%
derivative.compound.Sine.takeDerivative() 0 0 4 4 23.53%
derivative.compound.Sine.toString() 0 0 7 7 41.18%
derivative.utility.ExpressionParser.createTree(String) 0 0 53 53 34.64%
derivative.utility.ExpressionParser.getRpnFrom(String) 11 0 35 33 22.88%
derivative.utility.ExpressionParser.operator2rank(char) 1 0 15 15 9.80%
derivative.utility.ExpressionParser.orderBetween(char, char) 1 0 3 3 1.96%
derivative.utility.ExpressionParser.parse(String) 0 0 4 4 2.61%
derivative.utility.ExpressionParser.preProcess(String) 0 0 19 19 12.42%
derivative.utility.ExpressionParser.scanToEndOfInt(IterableString) 0 0 6 6 3.92%
derivative.utility.FormatChecker.check(String) 0 0 10 10 5.32%
derivative.utility.FormatChecker.checkConstant(IterableString) 3 3 24 21 12.77%
derivative.utility.FormatChecker.checkExponent(IterableString) 4 3 13 10 6.91%
derivative.utility.FormatChecker.checkExpression(IterableString) 4 4 19 15 10.11%
derivative.utility.FormatChecker.checkExpressionFactor(IterableString) 0 0 9 9 4.79%
derivative.utility.FormatChecker.checkFactor(IterableString) 3 3 13 10 6.91%
derivative.utility.FormatChecker.checkPowerFunction(IterableString) 4 3 12 9 6.38%
derivative.utility.FormatChecker.checkTerm(IterableString) 3 3 18 15 9.57%
derivative.utility.FormatChecker.checkTrigFunction(IterableString) 4 4 24 20 12.77%
derivative.utility.FormatChecker.checkVariable(IterableString) 3 3 10 7 5.32%
derivative.utility.FormatChecker.checkWhiteSpace(IterableString) 0 0 8 8 4.26%
derivative.utility.IterableString.IterableString(String) 0 0 3 3 6.38%
derivative.utility.IterableString.current() 0 0 7 7 14.89%
derivative.utility.IterableString.hasNext() 0 0 4 4 8.51%
derivative.utility.IterableString.iterator() 0 0 4 4 8.51%
derivative.utility.IterableString.next() 0 0 8 8 17.02%
derivative.utility.IterableString.previous() 0 0 3 3 6.38%
derivative.utility.IterableString.remaining() 0 0 3 3 6.38%
derivative.utility.IterableString.skipCharsBy(int) 0 0 4 4 8.51%
derivative.utility.IterableString.startsWith(String) 0 0 3 3 6.38%
derivative.utility.IterableString.toString() 0 0 4 4 8.51%
derivative.utility.WrongFormatException.WrongFormatException() 0 0 3 3 60.00%

Class CLOC JLOC LOC
derivative.Atom 0 0 2
derivative.Compound 0 0 6
derivative.Main 0 0 13
derivative.atom.Constant 0 0 23
derivative.atom.Variable 0 0 10
derivative.compound.Add 0 0 25
derivative.compound.Cosine 0 0 17
derivative.compound.Multiply 0 0 28
derivative.compound.Negate 0 0 24
derivative.compound.Power 0 0 25
derivative.compound.Sine 0 0 17
derivative.utility.ExpressionParser 28 3 153
derivative.utility.FormatChecker 53 51 188
derivative.utility.IterableString 0 0 47
derivative.utility.Operator 0 0 3
derivative.utility.WrongFormatException 0 0 5

Interface CLOC JLOC LOC NCLOC
derivative.Derivable 0 0 3 4

Package CLOC CLOC(rec) JLOC JLOC(rec) LOC LOC(rec) LOCp LOCp(rec) LOCt LOCt(rec) NCLOC NCLOCp NCLOCp(rec) NCLOCt
total n/a 27 n/a 54 n/a 637 n/a 637 n/a 0 n/a n/a 634 n/a
derivative 0 27 0 54 31 637 31 637 0 0 31 31 634 0
derivative.atom 0 0 0 0 39 39 39 39 0 0 39 39 39 0
derivative.compound 0 0 0 0 151 151 151 151 0 0 151 151 151 0
derivative.utility 0 27 54 54 416 416 416 416 0 0 413 413 413 0

3. Architecture and dependency analysis

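This is the architecture of assignment 3, which is essentially the same as that of assignment 2. It largely follows the principle of high cohesion and low coupling, and there is essentially no "long-arm jurisdiction", that is, no class reaching deep into the internals of distant classes.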

[Figure: class dependency diagram. Nodes: Throwable, Negate, Main, Iterable, Atom, Multiply, FormatChecker, Enum, Power, Compound, Add, Iterator, ExpressionParser, Exception, Serializable, Operator, Constable, Derivable, Constant, WrongFormatException, Variable, Comparable, IterableString, Cosine, Sine. Edges denote dependencies.]

II. Refactoring Experience and Reflections

1. Controlling complexity, method one: abstraction

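In assignment 1, I tried a complex regular expression with a simple design. My understanding of a regular expression is that it is a complex rule described in abstract symbols meant for humans to read. The simple but poorly extensible design was: a factor class storing sign, coefficient, and exponent; and an expression class storing all factors in a TreeMap (a red-black tree that keeps its keys ordered automatically). When two adjacent items are joined by multiplication, they are combined immediately; when they are joined by addition, they are merged according to their exponents. But once a variable factor is no longer a power function, this design has no way to cope!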
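The TreeMap idea above can be illustrated with a minimal sketch. This is not the submitted code: the class name PolynomialSketch and its methods are hypothetical, and it assumes each term is stored as a mapping from exponent to coefficient.

```java
import java.math.BigInteger;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: a polynomial in x stored as exponent -> coefficient.
// The TreeMap keeps the terms sorted by exponent automatically.
class PolynomialSketch {
    private final TreeMap<BigInteger, BigInteger> terms = new TreeMap<>();

    // Adds coefficient * x**exponent, merging with any term of the same exponent.
    PolynomialSketch addTerm(BigInteger coefficient, BigInteger exponent) {
        terms.merge(exponent, coefficient, BigInteger::add);
        if (terms.get(exponent).signum() == 0) {
            terms.remove(exponent); // drop terms that cancel out
        }
        return this;
    }

    // Power rule applied term by term: d/dx (c * x**n) = (c * n) * x**(n - 1).
    PolynomialSketch takeDerivative() {
        PolynomialSketch result = new PolynomialSketch();
        for (Map.Entry<BigInteger, BigInteger> term : terms.entrySet()) {
            BigInteger exponent = term.getKey();
            if (exponent.signum() != 0) {
                result.addTerm(term.getValue().multiply(exponent),
                        exponent.subtract(BigInteger.ONE));
            }
        }
        return result;
    }

    @Override
    public String toString() {
        if (terms.isEmpty()) {
            return "0";
        }
        StringBuilder out = new StringBuilder();
        for (Map.Entry<BigInteger, BigInteger> term : terms.entrySet()) {
            if (out.length() > 0 && term.getValue().signum() > 0) {
                out.append('+');
            }
            out.append(term.getValue()).append("*x**").append(term.getKey());
        }
        return out.toString();
    }
}
```

The limitation described above is visible here: the map key can only be an exponent of x, so a factor such as sin(x) has nowhere to go.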

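In assignment 2, I came to my senses and rebuilt everything from scratch. My attempt at one huge regular expression failed, because Java regular expressions support neither recursion nor manipulation of the regex stack (I tried writing it, and it was hopeless), and parsing by hand with regex plus a stack was still complicated.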

static final String FACTOR_REGEX =
    "(?<mulSign>\\*)?" +
    "(?:(?<coe>[+-]?\\d+)|" +
    "(?:(?<fSign>[+-]?)(?:x|(?<trig>sin|cos)\\(x\\))(?:\\*\\*(?<pow>[+-]?\\d+))?)|" +
    "(?<subExpr>(?<subExprSign>[+-]?)\\((?:[^()]|(?:\\(.*\\)))*\\)))";
static final Pattern FACTOR_PATTERN = Pattern.compile(FACTOR_REGEX);

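So I went back to a data-structure solution: use two stacks to parse the string and generate a reverse Polish expression (this step could actually be skipped by building the expression tree directly), then build the expression tree. Because Java cannot use pointers explicitly the way C/C++ can, I wrote an IterableString class that implements an iterator and simulates string pointer operations such as *p++: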

import java.util.Iterator;

public class IterableString implements Iterator<Character>, Iterable<Character> {
    private final String value;
    private int cursor = 0;

    public IterableString(String value) { this.value = value; }

    @Override
    public boolean hasNext() { return cursor < value.length(); }

    // Returns the current character and advances the cursor, like *p++ in C.
    @Override
    public Character next() {
        if (!hasNext()) {
            // Reading past the end means the input is malformed.
            System.out.println("WRONG FORMAT!");
            System.exit(0);
        }
        return value.charAt(cursor++);
    }

    @Override
    public Iterator<Character> iterator() { return this; }
}
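For readers unfamiliar with the two-stack approach, here is a minimal sketch of it, not the submitted ExpressionParser. It assumes input that has already passed the format check, only binary operators, and '^' standing in for "**" after preprocessing. The names RpnSketch, rank, toRpn, and buildTree are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class RpnSketch {
    // Hypothetical precedence table: higher rank binds tighter.
    static int rank(char op) {
        switch (op) {
            case '+': case '-': return 1;
            case '*': return 2;
            case '^': return 3;
            default: throw new IllegalArgumentException("unknown operator: " + op);
        }
    }

    // First stack: operators wait here until their operands have been emitted.
    static List<String> toRpn(String input) {
        List<String> output = new ArrayList<>();
        Deque<Character> operators = new ArrayDeque<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isDigit(c)) {
                int start = i;
                while (i < input.length() && Character.isDigit(input.charAt(i))) {
                    i++;
                }
                output.add(input.substring(start, i));
                continue; // i already points past the number
            }
            if (c == 'x') {
                output.add("x");
            } else if (c == '(') {
                operators.push(c);
            } else if (c == ')') {
                while (operators.peek() != '(') {
                    output.add(String.valueOf(operators.pop()));
                }
                operators.pop(); // discard the matching '('
            } else {
                // Pop operators of higher rank, or of equal rank unless c is the
                // right-associative '^'.
                while (!operators.isEmpty() && operators.peek() != '('
                        && (rank(operators.peek()) > rank(c)
                        || (rank(operators.peek()) == rank(c) && c != '^'))) {
                    output.add(String.valueOf(operators.pop()));
                }
                operators.push(c);
            }
            i++;
        }
        while (!operators.isEmpty()) {
            output.add(String.valueOf(operators.pop()));
        }
        return output;
    }

    // Second stack: operands are combined bottom-up. A fully parenthesized
    // string stands in for the expression tree here.
    static String buildTree(List<String> rpn) {
        Deque<String> operands = new ArrayDeque<>();
        for (String token : rpn) {
            if (token.length() == 1 && "+-*^".contains(token)) {
                String right = operands.pop();
                String left = operands.pop();
                operands.push("(" + left + token + right + ")");
            } else {
                operands.push(token);
            }
        }
        return operands.pop();
    }
}
```

Calling RpnSketch.toRpn("2*x^3+(x+1)*4") yields the tokens 2 x 3 ^ * x 1 + 4 * +, and buildTree turns them into ((2*(x^3))+((x+1)*4)). In a real parser the second step would construct node objects instead of strings.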

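My deepest takeaway is that the core of object orientation is "abstraction": classes and relationships must be abstracted from the problem definition. That is how the architecture in the figure above took shape. In a language that supports operator overloading, the code could be more abstract still. For example, once * and + are defined, the derivative function in the multiplication class could be written as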

// C++-style sketch; assumes operator* and operator+ are overloaded for expression nodes.
Add takeDerivative() {
    return leftValue.takeDerivative() * rightValue + leftValue * rightValue.takeDerivative();
}
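In Java, which has no operator overloading, the same product rule has to be spelled out with constructors. The following is a simplified sketch rather than the submitted code: the class names follow the metrics tables, but the bodies, the single Compound base class, and the use of long instead of BigInteger are assumptions made for brevity.

```java
// Simplified stand-in for the common base type of all expression nodes.
abstract class Compound {
    abstract Compound takeDerivative();
}

class Constant extends Compound {
    private final long value;

    Constant(long value) { this.value = value; }

    @Override
    Compound takeDerivative() { return new Constant(0); }

    @Override
    public String toString() { return Long.toString(value); }
}

class Variable extends Compound {
    @Override
    Compound takeDerivative() { return new Constant(1); }

    @Override
    public String toString() { return "x"; }
}

class Add extends Compound {
    private final Compound left;
    private final Compound right;

    Add(Compound left, Compound right) {
        this.left = left;
        this.right = right;
    }

    // Sum rule: (f + g)' = f' + g'
    @Override
    Compound takeDerivative() {
        return new Add(left.takeDerivative(), right.takeDerivative());
    }

    @Override
    public String toString() { return "(" + left + "+" + right + ")"; }
}

class Multiply extends Compound {
    private final Compound left;
    private final Compound right;

    Multiply(Compound left, Compound right) {
        this.left = left;
        this.right = right;
    }

    // Product rule: (f * g)' = f' * g + f * g'
    @Override
    Compound takeDerivative() {
        return new Add(
                new Multiply(left.takeDerivative(), right),
                new Multiply(left, right.takeDerivative()));
    }

    @Override
    public String toString() { return "(" + left + "*" + right + ")"; }
}
```

With no simplification at all, new Multiply(new Variable(), new Variable()).takeDerivative() prints as ((1*x)+(x*1)), which shows why some simplification is needed somewhere in the design.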

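The flaw in my architecture is that, because it uses no containers, simplification is inconvenient. I only did the necessary simplification inside the toString methods, which not only may have increased the logical complexity but also made it easy to produce output that violates the required format.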

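In assignment 3, I aimed to finish smoothly with the least effort. Malformed inputs are hard to enumerate, but the problem statement already gives a formal description of the correct form, and the limits on running time and memory are fairly loose. So I decided to keep the original architecture unchanged and only add a utility class that first checks the format by recursive descent. If the format is wrong, it throws an exception or terminates; if the format is correct, the original string is passed unchanged to the state-machine parser. Recursive descent is like a state machine written as functions, and it is not hard.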
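To show what "a state machine written as functions" means, here is a minimal recursive-descent checker. It is a sketch under a deliberately reduced grammar, not the assignment's full grammar and not the submitted FormatChecker: expression := term (('+' | '-') term)*, term := factor ('*' factor)*, factor := integer | 'x' | '(' expression ')'. The class name FormatCheckSketch is hypothetical.

```java
class FormatCheckSketch {
    private final String text;
    private int pos = 0;

    private FormatCheckSketch(String text) {
        this.text = text;
    }

    // Returns true only if the whole string matches the reduced grammar.
    static boolean isValid(String text) {
        FormatCheckSketch checker = new FormatCheckSketch(text);
        try {
            checker.expression();
            return checker.pos == text.length();
        } catch (IllegalStateException e) {
            return false;
        }
    }

    // Looks at the current character without consuming it; '\0' marks the end,
    // so no caller can index past the string.
    private char peek() {
        return pos < text.length() ? text.charAt(pos) : '\0';
    }

    // expression := term (('+' | '-') term)*
    private void expression() {
        term();
        while (peek() == '+' || peek() == '-') {
            pos++;
            term();
        }
    }

    // term := factor ('*' factor)*
    private void term() {
        factor();
        while (peek() == '*') {
            pos++;
            factor();
        }
    }

    // factor := integer | 'x' | '(' expression ')'
    private void factor() {
        if (peek() == 'x') {
            pos++;
        } else if (Character.isDigit(peek())) {
            while (Character.isDigit(peek())) {
                pos++;
            }
        } else if (peek() == '(') {
            pos++;
            expression();
            if (peek() != ')') {
                throw new IllegalStateException("expected ')' at " + pos);
            }
            pos++;
        } else {
            throw new IllegalStateException("unexpected input at " + pos);
        }
    }
}
```

Each grammar rule becomes one method, and the call stack plays the role of the state. FormatCheckSketch.isValid("2*(x+3)-x") accepts, while "2*(x+3" and "x+*2" are rejected.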

2. Overall impressions

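A large share of the time in this unit went into thinking about and experimenting with ways to parse strings, which is procedural work. Regular expressions? A state machine? Or learn recursive descent? Still, my object-oriented design and construction skills did improve. Corners cut early have to be paid back sooner or later: flaws in the initial design seriously hurt code reuse and extensibility. Fortunately, the code I wrote for my C data structures course was in decent style, so during the refactoring I could translate it directly. A better architecture, in turn, lets you cut bigger corners. For example, when the limits on running time and memory are loose, a new requirement can be met just by adding one more layer, with no refactoring needed.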

III. Problems Found in Mutual Testing

1. Bugs in my own program

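In assignment 1, repeated signs such as + and - would make the regular expression more complex, so the program replaced them before parsing the expression. But because I misread the problem statement and did not realize that -+- is legal, the program had a bug. In this situation, neither hand-written nor automatically generated test cases could have found the bug, since it came from misunderstanding the requirements. Moreover, the automatic tester generates data randomly, so although the weights of the different combinations can be tuned, it still lacks focus. The automatic tester is written in Python and takes the result from the sympy library as the reference.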

import subprocess
from random import randint

import sympy

NEG_HUGE_NUM = -31415926535897932384626433
HUGE_NUM = 31415926535897932384626433
JAR_PATH = "derivative.jar"
x = sympy.symbols("x")
start_symbols = ("", "+", "-")
symbols = ("+", "-", "*")


def gen_factor(option, num_tup):
    """Return one random factor: a huge constant, a power of x, x itself, or a small constant."""
    if option == 0:
        return str(num_tup[randint(0, 1)])
    elif option == 1:
        return "x**" + str(num_tup[randint(0, 1)])
    elif option == 2:
        return "x"
    else:
        return str(randint(-1, 1))


cnt = 0
while True:
    cnt += 1
    num = randint(NEG_HUGE_NUM, HUGE_NUM)
    num_tup = (-num, num)
    # Build a random expression of 21 factors joined by +, -, or *.
    original_expr = start_symbols[randint(0, 2)]
    for _ in range(20):
        original_expr += gen_factor(randint(0, 5), num_tup) + symbols[randint(0, 2)]
    original_expr += gen_factor(randint(0, 5), num_tup)
    print(f"ORIGINAL: {original_expr}")
    # Reference answer computed by sympy.
    master_expr = sympy.diff(sympy.sympify(original_expr), x)
    print(f"MASTER: {master_expr}")
    # Pass the command as a list so it also runs without a shell on POSIX systems.
    query_expr = subprocess.run(
        ["java", "-jar", JAR_PATH],
        input=original_expr, stdout=subprocess.PIPE, encoding="utf-8").stdout.strip()
    print(f"QUERY: {query_expr}")
    # Compare symbolically rather than as strings.
    if master_expr.equals(sympy.sympify(query_expr)):
        print(f"Test case {cnt} passed.\n\n")
    else:
        print(f"Your output on test case {cnt} is wrong.\n\n")
        input("Press Enter to continue...")

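In assignment 2, the precedences of negation and exponentiation were swapped in the operator precedence table, and the output logic was not thought through, so multiplication produced strings such as *-x that were judged to be malformed. In addition, in many places the return values of function calls were not saved in temporary variables, which caused repeated computation, and a few test cases timed out.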

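In assignment 3, a call to the next method while handling the format did not check the termination condition, which caused an index out of bounds, and multiplication produced strings such as *-sin and *-cos that were judged to be malformed. Because the logic I had put in the toString methods was still incomplete, the attackers' data was evidently carefully constructed. Of course, this also shows that the code was readable enough for attackers to find the holes easily. Most importantly, it confirms the earlier point that long methods are prone to bugs, and it reminds me to design more convenient architectures in future work to keep complexity under control.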
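One defensive way to avoid output such as *-x or *-sin is to parenthesize any operand that starts with a sign before joining operands with *. The sketch below is an illustration of that idea, not the fix that was actually submitted. It assumes that a parenthesized expression is an acceptable factor under the assignment's grammar, and the names ProductFormatSketch, wrapIfSigned, and formatProduct are hypothetical.

```java
class ProductFormatSketch {
    // Wraps an operand in parentheses when it starts with a sign, so that the
    // joined product never contains "*-" or "*+".
    static String wrapIfSigned(String operand) {
        boolean signed = operand.startsWith("-") || operand.startsWith("+");
        return signed ? "(" + operand + ")" : operand;
    }

    // Joins two already formatted operands into a product.
    static String formatProduct(String left, String right) {
        return wrapIfSigned(left) + "*" + wrapIfSigned(right);
    }
}
```

For example, formatProduct("x", "-sin(x)") gives x*(-sin(x)). The price is a longer output, so a real implementation would probably try to pull the sign to the front of the term instead.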

2. Bugs and problems in other people's programs

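I usually start by guessing the author's level from the coding style. Elegant code makes me most interested in learning from it, while messy code makes me most interested in attacking it. I usually test the simplest inputs first, and in assignment 2 I found that one classmate did no simplification at all: as soon as there were many parentheses, the output exceeded ten thousand characters. If the simple inputs pass, I then test with automatically generated data.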

3. Controlling complexity, method two: a consistent coding style

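I found that many classmates' code has style problems, especially odd names. For example: abbreviations, or names and comments in pinyin, so the reader cannot see the meaning at a glance; class names that are verbs, which feels awkward; even meaningless names (I can understand the mood behind them, but they are meaningless all the same); a MainClass that holds too much, when those static methods could easily be moved into a separate factory class; and so on. I think these bad habits mostly date back to the first-year introduction to C programming. C programs are hard to read to begin with, yet the instructors did not stress style much at the time, and the habits have been carried along ever since. In fact, large projects all have good coding style. Git, for example, is written in C, but its style is certainly very different from that of many classmates.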

[Image: excerpt from the Git source code]

Would it feel the same if it were written like this?

int main(int argc, const char **argv) {
    init_clk_trace();
    stdfds_sanit();
    default_sigpipe();
    exe_dir(argv[0]);
    gettext_setup();
    repo_init();
    start_attr();
    init_trace();
    cmd_start_trace(argv);
    proc_info_collect_trace(0);
    int res = cmd(argc, argv);
    exit_cmd_trace(res);
    return res;
}

Programs must be written for people to read, and only incidentally for machines to execute.
——Harold Abelson

If you don’t know what a thing should be called, you cannot know what it is.
If you don’t know what it is, you cannot sit down and write the code.
——Sam Gardiner

Let us encourage one another.

Note: this post merges the five required parts into three.

posted @ 2021-03-26 20:34  人生就像一盘棋