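Machine Translation Evaluation Metric: BLEU
BLEU (BiLingual Evaluation Understudy) is an evaluation metric for machine translation. Depending on the maximum n-gram order used, it is reported as \(\text{BLEU}_{1}\), \(\text{BLEU}_{2}\), \(\text{BLEU}_{3}\), or \(\text{BLEU}_{4}\).
1.1 The BLEU Formula
- Step 1: Compute the n-gram precision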
\[p_{n} = \frac{\sum_{C \in \{\text{Candidates}\}} \sum_{\text{n-gram} \in C} \text{Count}_{\text{clip}}(\text{n-gram})}{\sum_{C^\prime \in \{\text{Candidates}\}} \sum_{\text{n-gram}^\prime \in C^\prime} \text{Count}(\text{n-gram}^\prime)}
\]
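Here \(\text{Count}_{\text{clip}}(\text{n-gram})\) is the clipped count of an \(\text{n-gram}\): the number of times it occurs in the candidate translation, capped at the number of times it occurs in the reference translation (with several references, the largest count in any single reference).
Numerator: the number of n-grams in the model's translation that also appear in the reference translation, after clipping
Denominator: the total number of n-grams in the model's translation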
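To make the clipping step concrete, here is a minimal sketch for a single candidate sentence. The function name `clipped_counts` and the example sentences are illustrative assumptions, not part of the original post. It shows why clipping matters: a degenerate candidate that repeats one word cannot collect credit for every repetition.

```python
from collections import Counter


def clipped_counts(candidate, references, n=1):
    """Return (clipped matches, total candidate n-grams) for one sentence.

    `clipped_counts` is a hypothetical helper written for this illustration.
    """
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n])
                       for i in range(len(tokens) - n + 1))

    cand = ngrams(candidate)
    # For each n-gram, the largest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref).items():
            max_ref[gram] = max(max_ref[gram], count)
    # Cap every candidate count at the reference maximum.
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand.items())
    return clipped, sum(cand.values())


candidate = "the the the the the the the".split()
references = ["the cat is on the mat".split(),
              "there is a cat on the mat".split()]
print(clipped_counts(candidate, references))  # (2, 7)
```

Without clipping, all seven occurrences of "the" would count as matches and the unigram precision would be 7/7. With clipping it drops to 2/7, because "the" appears at most twice in any one reference.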
- Step 2: Compute the BP (Brevity Penalty)
\[\text{BP}=\begin{cases}1 & \text{if } c>r\\ e^{(1-r/c)} & \text{if } c\leq r\end{cases}
\]
Here \(c\) is the length of the candidate translation and \(r\) is the effective reference length.
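The penalty can be sketched in a few lines. The helper names below are hypothetical. Picking the reference whose length is closest to the candidate's, with ties going to the shorter one, is one common convention for the "effective" length when several references exist. It is assumed here rather than taken from the original post.

```python
import math


def brevity_penalty(c, r):
    """BP = 1 if c > r, else exp(1 - r / c); an empty candidate scores 0."""
    if c == 0:
        return 0.0
    return 1.0 if c > r else math.exp(1 - r / c)


def closest_ref_length(candidate_len, reference_lens):
    """Assumed convention: closest reference length, ties to the shorter."""
    return min(reference_lens,
               key=lambda ref_len: (abs(ref_len - candidate_len), ref_len))


print(brevity_penalty(5, 4))            # 1.0 (candidate longer than reference)
print(round(brevity_penalty(3, 4), 4))  # 0.7165 (candidate too short)
print(closest_ref_length(5, [4, 6]))    # 4
```

A candidate longer than the reference is not penalized by BP. Its extra words already lower the precision terms.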
- Step 3: Compute BLEU
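\[\text{BLEU}_N=\text{BP}\cdot\exp\left(\sum_{n=1}^N w_n\log p_n\right)
\]
Here \(w_n\) are positive weights that sum to 1; the usual choice is uniform weights \(w_n = 1/N\).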
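As a quick hand check of the three steps, take the reference "this is a test" (\(r = 4\)) and the candidate "this is a test too" (\(c = 5\)). The sentences are only an illustration, and uniform weights \(w_n = 1/4\) are assumed. The candidate contains 5 unigrams, 4 bigrams, 3 trigrams and 2 four-grams, of which 4, 3, 2 and 1 respectively also occur in the reference:

\[p_1 = \frac{4}{5},\quad p_2 = \frac{3}{4},\quad p_3 = \frac{2}{3},\quad p_4 = \frac{1}{2},\qquad \text{BP} = 1 \ \text{ since } c > r
\]
\[\text{BLEU}_4 = 1\cdot\exp\left(\frac{1}{4}\left(\log\frac{4}{5}+\log\frac{3}{4}+\log\frac{2}{3}+\log\frac{1}{2}\right)\right) = \left(\frac{1}{5}\right)^{1/4} \approx 0.669
\]

With uniform weights, the exponential term is simply the geometric mean of the four precisions.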
1.2 BLEU Implementation
import math
from collections import Counter


def cal_precision(reference, candidate, n):
    """Clipped n-gram precision of a candidate against a single reference."""
    candidate_ngrams = [tuple(candidate[i:i + n])
                        for i in range(len(candidate) - n + 1)]
    reference_ngrams = [tuple(reference[i:i + n])
                        for i in range(len(reference) - n + 1)]
    if not candidate_ngrams:
        # The candidate is shorter than n tokens, so it has no n-grams.
        return 0.0
    candidate_ngram_counts = Counter(candidate_ngrams)
    reference_ngram_counts = Counter(reference_ngrams)
    # Clip each candidate n-gram count by its count in the reference
    overlap_ngrams = sum(
        min(count, reference_ngram_counts[ngram])
        for ngram, count in candidate_ngram_counts.items()
    )
    return overlap_ngrams / len(candidate_ngrams)


def cal_bleu(reference, candidate, max_n=4):
    if len(candidate) == 0:
        return 0.0
    precisions = [cal_precision(reference, candidate, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        # log(0) is undefined; without smoothing the score is 0.
        return 0.0
    brevity_penalty = (1.0 if len(candidate) > len(reference)
                       else math.exp(1 - len(reference) / len(candidate)))
    # Uniform weights w_n = 1 / max_n: the geometric mean of the precisions
    term = math.exp(sum(math.log(p) for p in precisions) / max_n)
    return brevity_penalty * term


if __name__ == "__main__":
    # Define the reference translation and the predicted translation
    reference = ['this', 'is', 'a', 'test']
    candidate = ['this', 'is', 'a', 'test', 'too']
    # Compute the BLEU score
    bleu_score = cal_bleu(reference, candidate, max_n=4)
    print(f'BLEU Score: {bleu_score}')
