利用BLEU进行机器翻译检测(Python-NLTK-BLEU评分方法)

双语评估替换分数(简称BLEU)是一种对生成语句进行评估的指标。完美匹配的得分为1.0,而完全不匹配则得分为0.0。这种评分标准是为了评估自动机器翻译系统的预测结果而开发的,具备了以下一些优点:

  1. 计算速度快,计算成本低。
  2. 容易理解。
  3. 与具体语言无关。
  4. 已被广泛采用。

BLEU评分是由Kishore Papineni等人在他们2002年的论文BLEU a Method for Automatic Evaluation of Machine Translation中提出的。BLEU计算的原理是计算待评价译文和一个或多个参考译文间的距离。距离是文本间n元相似度的平均,n=1,2,3(更高的值似乎无关紧要)。也就是说,如果待选译文和参考译文的2元(连续词对)或3元相似度较高,那么该译文的得分就较高。

我们是翻译众包业务,对于我们的应用场景,如何得知译员是否有参考机器翻译引擎就成了一个比较重要的问题。我提出的基本思路是:

  1. 在多个翻译网站上翻译原文,得到一组机器翻译评测集,以下的例子中就是一段原文通过百度、有道翻译之后,组织了一个机器翻译评测集
  2. 将译员翻译出来的译文,作为待评测数据,计算其与机器翻译评测集的BLEU值(使用NLTK中提供的BLEU评分方法)
  3. 值越高,表明匹配度越高,则译员参考机器翻译或者直接拷贝机器翻译的可能性就越高,此时需要项目经理介入。

以下是示例:

  1、原文

新译星将代表四达时代集团在展览会上闪亮登场,届时我们将从新译星所开展的业务、具备的优势、成功案例等多个维度进行介绍,让您更加全面的了解新译星。我们拥有稳定的全职国际化团队,能够确保守时、高效的完成翻译和配音,并通过至臻完善的质量控制和项目管理体系进行全方位把控,提供翻译、配音、字幕制作、后期制作、播出以及收视率调查等一条龙服务。

  2、人工翻译

New Transtar will present itself at the Exhibition on behalf of StarTimes, and we will give a comprehensive introduction of ourselves, including the current services we offer, the advantages we hold, and the projects we have completed, to help you understand us more. New Transtar boasts of an international team of professionals and is capable of providing fast and quality-guaranteed services including translating, dubbing, subtitle making, post-production, broadcasting and collecting of viewership ratings, thanks to our strict, streamlined and developed quality control and project management system.

  3、百度翻译

The new translator will stand on the exhibition on behalf of the four times group at the exhibition. We will introduce the new star's business, the advantages and the successful cases, so that you can understand the new translator more comprehensively. We have a stable full-time international team that ensures punctual, efficient translation and dubbing, and provides a full range of control through the perfect quality control and project management system, providing a one-stop service for translation, dubbing, subtitle production, post production, broadcasting, and ratings surveys.

  4、有道翻译

The new translator star will represent sida times group in the exhibition, when we will introduce the new translator star's business, advantages, successful cases and other dimensions, so that you can have a more comprehensive understanding of the new translator star. We have a stable full-time international team, which can ensure timely and efficient translation and dubbing. Through perfect quality control and project management system, we provide translation, dubbing, subtitle production, post-production, broadcasting and rating survey.

  5、用百度翻译和有道翻译组织机器翻译评测集

[['The', 'new', 'translator', 'will', 'stand', 'on', 'the', 'exhibition', 'on', 'behalf', 'of', 'the', 'four', 'times', 'group', 'at', 'the', 'exhibition', 'We', 'will', 'introduce', 'the', 'new', 'star`s', 'business', 'the', 'advantages', 'and', 'the', 'successful', 'cases', 'so', 'that', 'you', 'can', 'understand', 'the', 'new', 'translator', 'more', 'comprehensively', 'We', 'have', 'a', 'stable', 'full-time', 'international', 'team', 'that', 'ensures', 'punctual', 'efficient', 'translation', 'and', 'dubbing', 'and', 'provides', 'a', 'full', 'range', 'of', 'control', 'through', 'the', 'perfect', 'quality', 'control', 'and', 'project', 'management', 'system', 'providing', 'a', 'one-stop', 'service', 'for', 'translation', 'dubbing', 'subtitle', 'production', 'post', 'production', 'broadcasting', 'and', 'ratings', 'surveys'],['The', 'new', 'translator', 'star', 'will', 'represent', 'sida', 'times', 'group', 'in', 'the', 'exhibition', 'when', 'we', 'will', 'introduce', 'the', 'new', 'translator', 'star`s', 'business', 'advantages', 'successful', 'cases', 'and', 'other', 'dimensions', 'so', 'that', 'you', 'can', 'have', 'a', 'more', 'comprehensive', 'understanding', 'of', 'the', 'new', 'translator', 'star', 'We', 'have', 'a', 'stable', 'full-time', 'international', 'team', 'which', 'can', 'ensure', 'timely', 'and', 'efficient', 'translation', 'and', 'dubbing', 'Through', 'perfect', 'quality', 'control', 'and', 'project', 'management', 'system', 'we', 'provide', 'translation', 'dubbing', 'subtitle', 'production', 'post-production', 'broadcasting', 'and', 'rating', 'survey']]

  6、用人工翻译组织待检测数据

['New', 'Transtar', 'will', 'present', 'itself', 'at', 'the', 'Exhibition', 'on', 'behalf', 'of', 'StarTimes', 'and', 'we', 'will', 'give', 'a', 'comprehensive', 'introduction', 'of', 'ourselves', 'including', 'the', 'current', 'services', 'we', 'offer', 'the', 'advantages', 'we', 'hold', 'and', 'the', 'projects', 'we', 'have', 'completed', 'to', 'help', 'you', 'understand', 'us', 'more', 'New', 'Transtar', 'boasts', 'of', 'an', 'international', 'team', 'of', 'professionals', 'and', 'is', 'capable', 'of', 'providing', 'fast', 'and', 'quality-guaranteed', 'services', 'including', 'translating', 'dubbing', 'subtitle', 'making', 'post-production', 'broadcasting', 'and', 'collecting', 'of', 'viewership', 'ratings', 'thanks', 'to', 'our', 'strict', 'streamlined', 'and', 'developed', 'quality', 'control', 'and', 'project', 'management', 'system']

  7、首先测试人工翻译产出的译文与机器翻译评测集之间的BLEU值,得到结果为0.119115465241,如下

[root@host-10-0-251-156 ~]# python
Python 2.7.5 (default, Apr 11 2018, 07:36:10)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-28)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from nltk.translate.bleu_score import sentence_bleu
>>>
>>> reference=[['The', 'new', 'translator', 'will', 'stand', 'on', 'the', 'exhibition', 'on', 'behalf', 'of', 'the', 'four', 'times', 'group', 'at', 'the', 'exhibition', 'We', 'will', 'introduce', 'the', 'new', 'star`s', 'business', 'the', 'advantages', 'and', 'the', 'successful', 'cases', 'so', 'that', 'you', 'can', 'understand', 'the', 'new', 'translator', 'more', 'comprehensively', 'We', 'have', 'a', 'stable', 'full-time', 'international', 'team', 'that', 'ensures', 'punctual', 'efficient', 'translation', 'and', 'dubbing', 'and', 'provides', 'a', 'full', 'range', 'of', 'control', 'through', 'the', 'perfect', 'quality', 'control', 'and', 'project', 'management', 'system', 'providing', 'a', 'one-stop', 'service', 'for', 'translation', 'dubbing', 'subtitle', 'production', 'post', 'production', 'broadcasting', 'and', 'ratings', 'surveys'],['The', 'new', 'translator', 'star', 'will', 'represent', 'sida', 'times', 'group', 'in', 'the', 'exhibition', 'when', 'we', 'will', 'introduce', 'the', 'new', 'translator', 'star`s', 'business', 'advantages', 'successful', 'cases', 'and', 'other', 'dimensions', 'so', 'that', 'you', 'can', 'have', 'a', 'more', 'comprehensive', 'understanding', 'of', 'the', 'new', 'translator', 'star', 'We', 'have', 'a', 'stable', 'full-time', 'international', 'team', 'which', 'can', 'ensure', 'timely', 'and', 'efficient', 'translation', 'and', 'dubbing', 'Through', 'perfect', 'quality', 'control', 'and', 'project', 'management', 'system', 'we', 'provide', 'translation', 'dubbing', 'subtitle', 'production', 'post-production', 'broadcasting', 'and', 'rating', 'survey']]
>>>
>>> candidate=['New', 'Transtar', 'will', 'present', 'itself', 'at', 'the', 'Exhibition', 'on', 'behalf', 'of', 'StarTimes', 'and', 'we', 'will', 'give', 'a', 'comprehensive', 'introduction', 'of', 'ourselves', 'including', 'the', 'current', 'services', 'we', 'offer', 'the', 'advantages', 'we', 'hold', 'and', 'the', 'projects', 'we', 'have', 'completed', 'to', 'help', 'you', 'understand', 'us', 'more', 'New', 'Transtar', 'boasts', 'of', 'an', 'international', 'team', 'of', 'professionals', 'and', 'is', 'capable', 'of', 'providing', 'fast', 'and', 'quality-guaranteed', 'services', 'including', 'translating', 'dubbing', 'subtitle', 'making', 'post-production', 'broadcasting', 'and', 'collecting', 'of', 'viewership', 'ratings', 'thanks', 'to', 'our', 'strict', 'streamlined', 'and', 'developed', 'quality', 'control', 'and', 'project', 'management', 'system']
>>>
>>> score = sentence_bleu(reference, candidate)
>>> print score
0.119115465241
>>>

  8、其次我们稍微改动以下百度翻译出来的译文,并测试其与机器翻译评测集之间的BLEU值,得到结果0.875629670466,如下:

    8.1稍微改动之后的百度翻译

New Transtar will stand on the exhibition on behalf of the four times group at the exhibition. We will introduce the new star's business, the advantages and the successful cases, so that you can understand the new translator more comprehensively. We have a stable full-time international team that ensures punctual, efficient translation and dubbing, and provides a full range of control through the perfect quality control and project management system, providing a one-stop service for translation, dubbing, subtitle production, streamlined and developed quality control and project management system.

    8.2用改动之后的百度翻译作为待评测数据

['New', 'Transtar', 'will', 'stand', 'on', 'the', 'exhibition', 'on', 'behalf', 'of', 'the', 'four', 'times', 'group', 'at', 'the', 'exhibition', 'We', 'will', 'introduce', 'the', 'new', 'star`s', 'business', 'the', 'advantages', 'and', 'the', 'successful', 'cases', 'so', 'that', 'you', 'can', 'understand', 'the', 'new', 'translator', 'more', 'comprehensively', 'We', 'have', 'a', 'stable', 'full-time', 'international', 'team', 'that', 'ensures', 'punctual', 'efficient', 'translation', 'and', 'dubbing', 'and', 'provides', 'a', 'full', 'range', 'of', 'control', 'through', 'the', 'perfect', 'quality', 'control', 'and', 'project', 'management', 'system', 'providing', 'a', 'one-stop', 'service', 'for', 'translation', 'dubbing', 'subtitle', 'production', 'streamlined', 'and', 'developed', 'quality', 'control', 'and', 'project', 'management', 'system']

    8.3BLEU计算

>>> candidate_baidu=['New', 'Transtar', 'will', 'stand', 'on', 'the', 'exhibition', 'on', 'behalf', 'of', 'the', 'four', 'times', 'group', 'at', 'the', 'exhibition', 'We', 'will', 'introduce', 'the', 'new', 'star`s', 'business', 'the', 'advantages', 'and', 'the', 'successful', 'cases', 'so', 'that', 'you', 'can', 'understand', 'the', 'new', 'translator', 'more', 'comprehensively', 'We', 'have', 'a', 'stable', 'full-time', 'international', 'team', 'that', 'ensures', 'punctual', 'efficient', 'translation', 'and', 'dubbing', 'and', 'provides', 'a', 'full', 'range', 'of', 'control', 'through', 'the', 'perfect', 'quality', 'control', 'and', 'project', 'management', 'system', 'providing', 'a', 'one-stop', 'service', 'for', 'translation', 'dubbing', 'subtitle', 'production', 'streamlined', 'and', 'developed', 'quality', 'control', 'and', 'project', 'management', 'system']
>>> score_baidu = sentence_bleu(reference, candidate_baidu)
>>> print score_baidu
0.875629670466
>>>

  9、由上面示例可看到,当待评测译文非常接近(也就是说该译员参考了机器翻译或直接进行的拷贝)机器翻译评测集中的数据时,BLEU值会升高。不过至于高到什么程度才需要项目经理介入,这就需要在实际项目中不断的摸索了。

 

posted @ 2018-08-03 11:32  振宇要低调  阅读(11288)  评论(0编辑  收藏  举报