Pair Programming Project

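Software Engineering	Network Engineering 1934
Where the assignment requirements are	Pair Project
Goal of this assignment	Learn pair programming and write a program collaboratively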

3119005389 麦俊宇
3119005385 刘宇

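PSP2.1	Personal Software Process Stages	Estimated time (min)	Actual time (min)
Planning	Planning
Estimate	Estimate how long the task will take	30	30
Development	Development
Analysis	Requirements analysis (including learning new technologies)	120	120
Design Spec	Write the design document	30	20+?
Design Review	Design review	60	60
Coding Standard	Coding standard (set suitable conventions for this development)	5	5
Design	Detailed design	20	15
Coding	Coding	60	80
Code Review	Code review	30	60
Test	Testing (self-testing, fixing code, committing changes)	60	90
Reporting	Reporting
Test report	Test report	60	60
Size Measurement	Measure the workload	20	10
Postmortem & Process Improvement	Postmortem and process improvement plan	10	30
Total	Total	505	580+?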

GitHub repository

Design approach

The call relationships between the modules, written as an adjacency list:

main -> generateExercise -> checkExercise

generateExercise -> generateFunction

generateFunction -> generateAnswer

checkExercise -> generateAnswer
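
The call relationships above can also be written down as a plain Python adjacency list. This is only an illustrative sketch: the `CALLS` dictionary and the `reachable` helper are hypothetical names that do not appear in the project, and the first line is read as "main calls both generateExercise and checkExercise".

```python
# Hypothetical illustration: the module call graph as an adjacency list.
CALLS = {
    "main": ["generateExercise", "checkExercise"],
    "generateExercise": ["generateFunction"],
    "generateFunction": ["generateAnswer"],
    "checkExercise": ["generateAnswer"],
    "generateAnswer": [],
}


def reachable(start):
    """Return every module reachable from `start`, in depth-first order."""
    seen = []
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.append(node)
        # Push callees in reverse so they are visited in the listed order.
        stack.extend(reversed(CALLS[node]))
    return seen


print(reachable("checkExercise"))
```

Walking the graph from `main` reaches all five modules, which is a quick sanity check that nothing is orphaned.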

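1. Main program

  The entry point. It parses the arguments and dispatches to the matching module: the -a and -e options trigger grading, the -n and -r options trigger exercise generation, and anything else is reported as an error.

2. Generating exercises

generateExercise()
  Takes n and r from the main program and calls the expression-generation module n times. When generation is done it writes the exercises and the answers to separate files for review. It does no arithmetic itself.

3. Generating expressions

generateFunction()
  Avoids duplicates at generation time by constraining the random choices so that the same expression cannot come out twice. After generating, it checks that the answer is not negative and that, when a division sign is present, the result is a proper fraction (numerator strictly smaller than denominator). An expression that fails is regenerated;
  one that passes is returned to the calling function.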
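
The post-check described above could look roughly like the sketch below. It rests on assumptions: `passes_post_check` is a hypothetical helper, the value is assumed to be available as an exact `Fraction`, and the proper-fraction rule is read literally as applying to the final result. The real program may apply its rules differently, for example to intermediate results.

```python
from fractions import Fraction


def passes_post_check(value, has_division):
    """Hypothetical post-check for one generated expression.

    value: exact result of the expression as a Fraction.
    has_division: True if the expression contains a division sign.
    """
    if value < 0:
        return False                      # negative answers are rejected
    if has_division:
        # Literal reading of the rule: the result must be a proper
        # fraction, i.e. not an integer and smaller than 1.
        return value.denominator > 1 and value < 1
    return True


print(passes_post_check(Fraction(3, 4), True))
print(passes_post_check(Fraction(-1, 2), False))
```

An expression that fails the check would simply be discarded and generated again.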

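4. Computing answers

generateAnswer()
  Computes the answer for a given expression. The expression first has to be rewritten into a form the interpreter can understand and evaluate; callers guarantee that the expression passed in is valid.

5. Checking answers

checkExercise()
  Reads the exercises from the exercise file Exercises.txt and computes their correct answers, compares them with the answers in the answer file answerfile.txt, records the numbers of the exercises whose answers match, and finally writes the comparison to the result file Grade.txt:
  Correct: number of matching answers (exercise number, ...)
  Wrong: number of non-matching answers (exercise number, ...)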
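
As a small illustration of the Grade.txt layout just described, the two summary lines can be built from lists of exercise numbers. `format_grade` is a hypothetical helper, and the exact spacing and punctuation are assumptions rather than the program's actual output.

```python
def format_grade(correct, wrong):
    """Build the two Grade.txt lines from lists of exercise numbers."""
    def line(label, ids):
        numbers = ", ".join(str(i) for i in ids)
        return f"{label}: {len(ids)} ({numbers})"

    return line("Correct", correct) + "\n" + line("Wrong", wrong) + "\n"


print(format_grade([1, 3, 4], [2]), end="")
```

Joining the numbers by hand avoids the trailing comma that `str()` prints for a one-element tuple such as `(2,)`.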

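Code walkthrough

Main program

Handles the arguments, checks that they are valid, and deals with the exceptions raised.
It is mostly argument handling, so here is just a brief look: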

    try:                    # reading arguments
        opts, args = getopt.getopt(argv, "hn:r:e:a:", ["num=", "rad=", "eFile=", "aFile="])
    except getopt.GetoptError:
        print('Arguments ERROR')
        sys.exit(2)
    n, r = 50, -1
    flage, flaga = 0, 0
    for opt, arg in opts:   # process arguments
        if opt == '-h':
            print('Myapp.exe -r <radius> [-n <number=10>] to generate n Exercise which between 0 and r. Or')
            print('Myapp.exe -e <exerciseFile> -a <answerFile> to check if answer right')
        if opt in ('-n', '--num'):
            try:
                n = int(arg)
                if n > 1e5:
                    print('n cannot be larger than 1e5. Set n to 1e5')
                    n = int(1e5)
            except ValueError:
                print('n must be a number')
                sys.exit(2)
        if opt in ('-r', '--rad'):
            try:
                r = int(arg)
                if r > 1e6:
                    print('r cannot be larger than 1e6. Set r to 1e6')
                    r = int(1e6)
            except ValueError:
                print('r must be a number')
                sys.exit(2)
        if opt in ('-e', '--eFile'):
            try:
                efile = open(arg, mode='r')
                flage = 1
            except (OSError, FileNotFoundError):    # except needs a tuple, not a list
                print('Exercise file not found. Please double check the path and your spelling.')
                sys.exit(2)
        if opt in ('-a', '--aFile'):
            try:
                afile = open(arg, mode='r')
                flaga = 1
            except (OSError, FileNotFoundError):    # except needs a tuple, not a list
                print('Answer file not found. Please double check the path and your spelling.')
                sys.exit(2)

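This reads the arguments, handles exceptions, and prints error messages. Besides the n, r, e and a options required by the assignment, there is an extra h option that briefly explains how to use the program.

Exercise generation

This is where most of the random-generation work happens. To avoid duplicates, my program uses a pre-check and a post-check. The pre-check stops expressions that are duplicates under operand swapping from being generated at all, so there is no need to finish generating and only then test for duplicates. That greatly improves efficiency, because such duplicates make up the vast majority of invalid expressions. Writing the random generation also took a fair amount of parameter tuning to keep the exercises varied. The random routine here produces five parameters and hands them to the next function, which builds the expression; the parameters are designed to do as much as possible to prevent duplicate expressions.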

    random.seed()       # initial random
    ans = []
    again = False
    while len(ans) < n:
        been = (0, 1,)
        div = 1
        for i in range(1, r + 1):
            for j in range(0, r - i + 1):
                if i == 1 and j == 0 and random.random() < 0.8:
                    continue
                for k in (2, 3, 4):
                    if again and k == 2:
                        continue
                    if i + (j * (k - 1)) <= r:
                        if random.random() < 0.6:   # call generateFunction to generate integer function
                            ans += generateFunction(i, k, j, div, r)
                            if len(ans) >= 0.9 * n:
                                break
                    if len(ans) >= 0.9 * n:
                        break
                if len(ans) >= 0.9 * n:
                    break
            if len(ans) >= 0.9 * n:
                break

This part generates only integer expressions that meet the requirements and the given arguments.
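
The pre-check idea described above (never producing an expression and its operand-swapped twin) can be shown on a much smaller scale. The sketch below is not the five-parameter scheme used in the program; `unique_sums` and its `keep` parameter are hypothetical, and it only covers two-operand additions.

```python
import random


def unique_sums(r, keep=0.6, rng=None):
    """Yield 'a + b' strings with a >= b, so no commuted twin can appear."""
    rng = rng or random.Random()
    for a in range(r):
        for b in range(a + 1):          # b <= a: each unordered pair is visited once
            if rng.random() < keep:     # randomly skip pairs to vary the output
                yield f"{a} + {b}"


for expr in unique_sums(4, keep=1.0):
    print(expr)
```

Because every unordered pair is visited at most once, no comparison against previously generated expressions is needed.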

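Expression generation

The expression is built from the given parameters. The five parameters roughly fix its structure; what remains is filling in the numbers, operators and parentheses. Numbers may need the mixed-fraction form, so there is a function that normalizes fractions. Finally the expression is passed on to compute its answer for the post-check (is anything below 0, is there a division by 0, is the result a non-fraction when a division sign is present), and only an expression that passes is returned.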

import fractions


def fractionsNormalize(s, r):       # formalize fractions
    x = fractions.Fraction(s)
    x = x.limit_denominator(r)      # closest fraction with denominator <= r
    if x.denominator == 1:
        return str(x.numerator)     # integer
    elif x > 1:
        # divmod stays in exact integer arithmetic; int(a / b) would go through float
        t, b = divmod(x.numerator, x.denominator)
        return str(t) + '\'' + str(b) + '/' + str(x.denominator)    # fractions which > 1
    else:
        return str(x.numerator) + '/' + str(x.denominator)          # fractions which < 1

It normalizes a value into one of three forms: integer, mixed fraction, or proper fraction.
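
To make the three output forms concrete, here is a small self-contained sketch. `to_mixed` is a hypothetical stand-in written for this example (it assumes non-negative input), and the first lines show why `limit_denominator` matters when the input is a float.

```python
from fractions import Fraction

raw = Fraction(1 / 3)                 # exact value of the binary float, not 1/3
clean = raw.limit_denominator(10)     # closest fraction with denominator <= 10
print(raw == Fraction(1, 3))          # False
print(clean)                          # 1/3


def to_mixed(value):
    """Format a non-negative value as an integer, proper or mixed fraction."""
    x = Fraction(value)
    whole, rest = divmod(x.numerator, x.denominator)
    if rest == 0:
        return str(whole)                         # integer
    if whole == 0:
        return f"{rest}/{x.denominator}"          # proper fraction
    return f"{whole}'{rest}/{x.denominator}"      # mixed fraction


print(to_mixed(Fraction(9, 4)))       # 2'1/4
print(to_mixed(Fraction(3, 4)))       # 3/4
print(to_mixed(6))                    # 6
```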

        for i in range(0, num):
            if i == a:      # insert left parentheses before number
                dec[i] += '('
            if div == 1:
                dec[i] += str(arr[(-i - 1)])
            else:
                dec[i] += fractionsNormalize(arr[i] / (div + random.choice(range(-div + 2, div))), div)
            if i == b:      # insert right parentheses after number
                dec[i] += ')'
        char = [random.choice(calcChar), ]
        for i in range(1, num - 1):
            char += [random.choice(calcChar), ]
            
        func = str(dec[0])
        for i in range(1, num):
            func += ' ' + char[i - 1] + ' ' + dec[i]

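Pick the numbers, operators and parentheses and fill them into the string.

Computing the answer

Translate the expression into a valid string that the interpreter can read, then evaluate it with eval().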

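import decimal
import re


def repl(matched):
    # Wrap a fraction in parentheses; the ' of a mixed number becomes +.
    res = '(' + matched.group() + ')'
    res = res.replace('\'', '+')
    return str(res)


def generateAnswer(func):
    # Mixed numbers first (2'1/4 -> (2+1/4)), then the remaining plain fractions.
    func = str(re.sub(r"[1-9][0-9]*'[1-9][0-9]*/[1-9][0-9]*", repl, func))
    func = str(re.sub(r"[1-9][0-9]*/[1-9][0-9]*", repl, func))   # use Regular Expression to translate the function
    func = func.replace('÷', '/')
    func = func.replace('×', '*')
    ans = decimal.Decimal(eval(func))   # eval() assumes the expression is trusted and valid
    return ans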

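Watch the operator precedence, especially for mixed fractions (an implicit addition) and proper fractions (an implicit division). For example, \(5 × 2'1/4\) and \(1 ÷ 6/8\) are evaluated with the wrong precedence unless parentheses are added.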
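
The two examples can be checked directly. In the sketch below `naive` and `grouped` are hypothetical helpers written for this illustration: the first only swaps symbols, the second also wraps every fraction in parentheses. `eval()` is applied only to fixed, trusted strings here.

```python
import re


def naive(expr):
    """Swap symbols only; fractions are left unparenthesized."""
    return expr.replace("'", "+").replace("×", "*").replace("÷", "/")


def grouped(expr):
    """Parenthesize mixed and plain fractions before swapping symbols."""
    expr = re.sub(
        r"\d+'\d+/\d+|\d+/\d+",
        lambda m: "(" + m.group().replace("'", "+") + ")",
        expr,
    )
    return expr.replace("×", "*").replace("÷", "/")


for text in ("5 × 2'1/4", "1 ÷ 6/8"):
    print(text, "| naive:", eval(naive(text)), "| grouped:", eval(grouped(text)))
```

Without grouping, \(5 × 2'1/4\) is read as \(5 × 2 + 1/4 = 10.25\) instead of \(11.25\), and \(1 ÷ 6/8\) is read as \((1 ÷ 6) ÷ 8\) instead of \(4/3\).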

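Grading exercises

After both files are passed in, each line is evaluated and the two sides of the equals sign are compared. Because the values are Decimal objects built from eval() results, an error tolerance is needed; it is set to \(10^{-6}\).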

def checkExercise(efile, afile):
    correct = ()
    wrong = ()
    exc = efile.readline()
    ans = afile.readline()
    pid = 1
    while exc != "" and ans != "":
        a = generateAnswer(exc[len(str(pid)) + 1:-3])
        b = generateAnswer(ans[len(str(pid)) + 1:])
        if abs(a - b) < 0.000001:
            correct += (pid,)
        else:
            wrong += (pid,)
            # print(a, b)
        exc = efile.readline()
        ans = afile.readline()
        pid = pid + 1
    # Write the summary; the with-block closes Grade.txt even if a write fails.
    with open("Grade.txt", 'w') as ofile:
        ofile.write("Correct: " + str(len(correct)) + str(correct) + "\n")
        ofile.write("Wrong: " + str(len(wrong)) + str(wrong) + "\n")
    return
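
As an aside, the tolerance could be avoided altogether by comparing exact rationals. The sketch below is an alternative, not what the program does: `parse_answer` is a hypothetical helper that turns an answer string such as `2'1/4`, `330/4` or `5` into a `Fraction`.

```python
from fractions import Fraction


def parse_answer(text):
    """Parse an integer, a fraction, or a mixed number such as 2'1/4."""
    text = text.strip()
    whole, sep, frac = text.partition("'")
    if sep:                               # mixed number: whole part plus fraction
        return int(whole) + Fraction(frac)
    return Fraction(text)                 # plain integer or fraction


print(parse_answer("2'1/4") == Fraction(9, 4))
print(parse_answer("330/4") == Fraction(165, 2))
```

With exact values on both sides, `==` is enough and no error bound has to be chosen.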

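Highlights

Pre-check and post-check: a big time saver, since each expression only has to be validated on its own, with no cross-comparison. Cross-comparison would cost at least \(O(n^2)\) time to check, whereas now it takes only \(O(1)\) per expression.
Fraction handling: a single function normalizes proper fractions, mixed fractions and integers, so that expressions meet the requirements, can still be read by the interpreter, and lose as little precision as possible.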

Performance analysis

   959 function calls (943 primitive calls) in 0.000 seconds

   Ordered by: call count

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      233    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
      147    0.000    0.000    0.000    0.000 {built-in method builtins.isinstance}
      138    0.000    0.000    0.000    0.000 sre_parse.py:164(__getitem__)
       68    0.000    0.000    0.000    0.000 sre_parse.py:233(__next)
       64    0.000    0.000    0.000    0.000 sre_parse.py:172(append)
       63    0.000    0.000    0.000    0.000 sre_parse.py:254(get)
       63    0.000    0.000    0.000    0.000 {built-in method builtins.ord}
    52/46    0.000    0.000    0.000    0.000 {built-in method builtins.len}
       18    0.000    0.000    0.000    0.000 {built-in method builtins.min}
       13    0.000    0.000    0.000    0.000 sre_parse.py:160(__len__)
        9    0.000    0.000    0.000    0.000 enum.py:659(name)
        9    0.000    0.000    0.000    0.000 types.py:171(__get__)
      6/1    0.000    0.000    0.000    0.000 sre_compile.py:71(_compile)
        6    0.000    0.000    0.000    0.000 sre_parse.py:111(__init__)
      6/1    0.000    0.000    0.000    0.000 sre_parse.py:174(getwidth)
        5    0.000    0.000    0.000    0.000 sre_parse.py:249(match)
        5    0.000    0.000    0.000    0.000 sre_parse.py:493(_parse)
        5    0.000    0.000    0.000    0.000 {method 'find' of 'bytearray' objects}
        5    0.000    0.000    0.000    0.000 {built-in method builtins.max}
        2    0.000    0.000    0.000    0.000 enum.py:283(__call__)
        2    0.000    0.000    0.000    0.000 enum.py:562(__new__)
        2    0.000    0.000    0.000    0.000 sre_compile.py:453(_get_iscased)
        2    0.000    0.000    0.000    0.000 sre_compile.py:595(isstring)
        2    0.000    0.000    0.000    0.000 sre_parse.py:81(groups)
        2    0.000    0.000    0.000    0.000 {method 'items' of 'dict' objects}
        1    0.000    0.000    0.000    0.000 re.py:250(compile)
        1    0.000    0.000    0.000    0.000 re.py:289(_compile)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 sre_compile.py:249(_compile_charset)
        1    0.000    0.000    0.000    0.000 enum.py:790(_missing_)
        1    0.000    0.000    0.000    0.000 enum.py:797(_create_pseudo_member_)
        1    0.000    0.000    0.000    0.000 enum.py:833(__and__)
        1    0.000    0.000    0.000    0.000 sre_compile.py:276(_optimize_charset)
        1    0.000    0.000    0.000    0.000 enum.py:886(<listcomp>)
        1    0.000    0.000    0.000    0.000 enum.py:869(_decompose)
        1    0.000    0.000    0.000    0.000 sre_compile.py:413(<listcomp>)
        1    0.000    0.000    0.000    0.000 sre_compile.py:411(_mk_bitmap)
        1    0.000    0.000    0.000    0.000 sre_compile.py:461(_get_literal_prefix)
        1    0.000    0.000    0.000    0.000 sre_compile.py:492(_get_charset_prefix)
        1    0.000    0.000    0.000    0.000 sre_compile.py:536(_compile_info)
        1    0.000    0.000    0.000    0.000 sre_compile.py:598(_code)
        1    0.000    0.000    0.000    0.000 sre_compile.py:759(compile)
        1    0.000    0.000    0.000    0.000 sre_parse.py:76(__init__)
        1    0.000    0.000    0.000    0.000 sre_parse.py:224(__init__)
        1    0.000    0.000    0.000    0.000 sre_parse.py:286(tell)
        1    0.000    0.000    0.000    0.000 sre_parse.py:435(_parse_sub)
        1    0.000    0.000    0.000    0.000 sre_parse.py:921(fix_flags)
        1    0.000    0.000    0.000    0.000 sre_parse.py:937(parse)
        1    0.000    0.000    0.000    0.000 {built-in method __new__ of type object at 0x00007FF85872B810}
        1    0.000    0.000    0.000    0.000 {built-in method _sre.compile}
        1    0.000    0.000    0.000    0.000 {method 'translate' of 'bytearray' objects}
        1    0.000    0.000    0.000    0.000 {method 'get' of 'dict' objects}
        1    0.000    0.000    0.000    0.000 {method 'setdefault' of 'dict' objects}
        1    0.000    0.000    0.000    0.000 {method 'extend' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'sort' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

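cProfile ran too fast: it reports 0.000 s for generating 10000 exercises (I suspect it did not run at all). From this listing the most frequent calls are list operations, which makes sense since inserting answers happens many times; the call that is both frequent and slow should be eval().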
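
The listing contains only regex-compilation frames (`re.py:250(compile)`, `sre_parse.py`, `<string>:1(<module>)`), which suggests the profiled statement was a `re.compile` call rather than the generator. A minimal sketch of profiling a specific function with the standard `cProfile` and `pstats` modules follows; `work` is a hypothetical stand-in for the real call, such as the exercise generator.

```python
import cProfile
import io
import pstats


def work(n):
    """Stand-in workload (hypothetical); replace with the real call."""
    total = 0
    for _ in range(n):
        total += eval("1 + 2 * 3")
    return total


profiler = cProfile.Profile()
profiler.enable()
result = work(1000)
profiler.disable()

# Collect the report in a string, sorted by cumulative time.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

Sorting by cumulative time puts the outermost call at the top, so it is easy to confirm that the intended function was actually measured.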

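So the profile above is basically useless, and I had to write a global timer myself to make up for it.
Here is an analysis based on the times from that global timer: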

\(n = 1e4\ r = 1e6\)
generateTime:
Time elapsed: 0.46s +-0.01s
checkTime:
Time elapsed: 0.35s +-0.01s

\(n = 1e4\ r = 1\)
generateTime:
Time elapsed: 1.03s +-0.10s
checkTime:
Time elapsed: 0.45s +-0.02s

\(n = 1e5\ r = 1e6\)
generateTime:
Time elapsed: 4.54s +- 0.20s
checkTime:
Time elapsed: 14.66s +- 0.50s

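The first run uses normal parameters. The data size is only \(10^4\), but the heavy string processing and Python's own weaknesses make it run slowly (relative to \(O(n)\));
the second, with \(r=1\), produces nothing but fractions, and r has to be enlarged (otherwise there are not enough exercises), so it takes a bit longer (a bit, meaning double);
in the third, generation really does take ten times as long as the normal run, so the \(O(n)\) time complexity looks right, but grading is very slow, presumably because the file is too large (even pyc warned about it).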
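
One other possible contributor to the slow grading, offered as a guess rather than a measurement of the real program: checkExercise collects exercise numbers with `correct += (pid,)`, and concatenating tuples copies the whole tuple each time, so the total cost grows quadratically with n. The sketch below compares that pattern with `list.append`; the function names are hypothetical.

```python
import timeit


def collect_with_tuple(n):
    ids = ()
    for pid in range(1, n + 1):
        ids += (pid,)          # builds a brand-new tuple on every iteration
    return ids


def collect_with_list(n):
    ids = []
    for pid in range(1, n + 1):
        ids.append(pid)        # amortized constant time per append
    return tuple(ids)


n = 10000
print("tuple:", timeit.timeit(lambda: collect_with_tuple(n), number=1))
print("list: ", timeit.timeit(lambda: collect_with_list(n), number=1))
```

Both functions return the same tuple; only the way it is built differs.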

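Exception handling

Exceptions arising from the data during random generation are handled inside generateExercise; the code is shown above.
Most of the exception handling lives in main:
Checking the arguments n and r: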

        if opt in ('-n', '--num'):
            try:
                n = int(arg)
                if n > 1e5:
                    print('n cannot be larger than 1e5. Set n to 1e5')
                    n = int(1e5)
            except ValueError:
                print('n must be a number')
                sys.exit(2)
        if opt in ('-r', '--rad'):
            try:
                r = int(arg)
                if r > 1e6:
                    print('r cannot be larger than 1e6. Set r to 1e6')
                    r = int(1e6)
            except ValueError:
                print('r must be a number')
                sys.exit(2)


    if r > 0:
        generateExercise(n, r)
        print('Generating', n, 'exercise...')
    else:
        print('Require -r <radius> argument. Input Myapp.exe -h for further help.')
        sys.exit(2)

Exceptions for a wrong file name:

        if opt in ('-e', '--eFile'):
            try:
                efile = open(arg, mode='r')
                flage = 1
            except (OSError, FileNotFoundError):
                print('Exercise file not found. Please double check the path and your spelling.')
                sys.exit(2)
        if opt in ('-a', '--aFile'):
            try:
                afile = open(arg, mode='r')
                flaga = 1
            except (OSError, FileNotFoundError):
                print('Answer file not found. Please double check the path and your spelling.')
                sys.exit(2)

Test runs

Unit tests

Every function (except grading) is unit-tested; see GitHub for the code.

..............
----------------------------------------------------------------------
Ran 14 tests in 0.004s

OK

Testing the grading function

My partner and I separately made up two sets of data, both on GitHub; here is his:

//Ex.txt
1. 1'1/2 + 1/9 ÷ 55 = 
2. 1 + 1 - 1 = 
3. 2 +2 ÷ 9999 = 
4. 695 × 0 -0 = 
5. 0 + (7/10 + 2 - 1) = 
6. 3/10 ÷ ( 1'69/100 + 31/100) = 
7. 5 + 5 - 5 = 
8. 123 + 568 / 96 = 
9. 321 + 123 * 65 = 
10. 33 * 55 / 22 = 
11. 90 / 55 + 66 = 
12. 45 + 85 - 99 = 
13. 99 + 13 / 123 = 
14. 5 + 3 + 2 = 
15. 1 + 2 + 6 = 
16. 12 * 45 / 45 = 
17. 56 / 45 + 123 = 
18. 66 / 258 + 123 = 
19. 5 + 6 + 1'5/1 - 9 = 
20. 1 + 1 = 

//Ans.txt
1. 0
2. 5
3. 20000/9999
4. 0
5. 17/10
6. 3
7. 5
8. 1547/12
9. 8316
10. 330/4
11. 744/11
12. 65
13. 12190/123
14. 10
15. 9
16. 12
17. 5591/45
18. 5
19. 8
20. 2
//Myapp.exe -e Ex.txt -a Ans.txt
//Grade.txt
Correct: 15(3, 4, 5, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 19, 20)
Wrong: 5(1, 2, 6, 12, 18)

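Work them out yourself if you are interested.

Summary

Partner's summary: pair programming requires thorough communication within the team, and both people have to agree on what exactly the assignment asks for.
When coding, you cannot just bury your head in your own part; you need to talk often and, ideally, agree on the formats and the parameters needed,
so that the modules each person writes fit together and run well.

My summary: LOL, I'm a bit afraid the paragraph above will get flagged by the plagiarism checker. Most of the coding this time was done by me alone; my partner had too much free time and wrote a simplified version as well, which I put on another branch. A small rant: my partner keeps missing the point. He spent an afternoon on the design document and an evening on the summary, and I still had to overhaul the design document. Making test data went the same way: for testing answer computation he kept handing me exercises that were dead simple and monotonous, or else ones that take a minute to work out by hand (without the answers, and I waited a long time when I asked for them); for testing exercise generation he instead gave me very complicated exercises, and the data alone took half an hour, even though the answer-computation module had already been verified. Still, my partner is pretty good: the comments and the highlights are his work, since I consider this code all common sense and would have left it without a single comment (although while I was typing this 10000-character blog post he was typing a 1600-character report for an elective), and the data he made is decent too, even though I sent it back a few times for wrong formatting. Enough, I feel like I'm airing someone else's dirty laundry. ~~After all, you only settle things internally when you can't find anyone else~~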

posted @ 2021-10-25 00:32 juseice