Explanation: the Lucene class that describes scoring details

A document's detailed scoring information is usually obtained by calling IndexSearcher.explain(query, docId), which in turn calls weight.explain(reader, doc).

The contents of an Explanation look like this:

4.803122 = (MATCH) fieldWeight(keywords:奶粉 in 457), product of:
  2.0 = tf(termFreq(keywords:奶粉)=4)
  4.803122 = idf(docFreq=414, maxDocs=18609)
  0.5 = fieldNorm(field=keywords, doc=457)


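Line 1 is the total score: the document with docId=457 scores 4.803122, which is the product of the three values below it (2.0 * 4.803122 * 0.5).
Line 2 is the term frequency: the term "奶粉" (milk powder) occurs 4 times in the keywords field of document 457, and 2.0 is the square root of 4.
Line 3 is the inverse document frequency: 414 documents contain the term "奶粉", out of 18609 documents in total, and 4.803122 comes from
ln(18609/(414 + 1)) + 1.0 = ln(18609) - ln(415) + 1 = 9.831400 - 6.028278 + 1 = 4.803122
Line 4 is the field length normalization factor: fieldNorm = fieldboost / sqrt(fieldlength), where fieldlength is the number of tokens (terms produced by the analyzer) in the keywords field.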
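
The arithmetic behind the four lines can be checked with a few lines of plain Java. This is only a sketch of the formulas quoted above, not Lucene code: the class and method names are made up for illustration, it uses double where Lucene uses float, and fieldBoost = 1.0 with fieldLength = 4 are assumed values chosen because they reproduce the fieldNorm of 0.5 shown in the output (the real field length is not given in the output).

```java
// Hypothetical helper class; it only re-computes the formulas described in the text.
final class ExplainArithmetic {

    // tf: square root of the number of times the term occurs in the field
    static double tf(int termFreq) {
        return Math.sqrt(termFreq);
    }

    // idf: ln(maxDocs / (docFreq + 1)) + 1
    static double idf(int docFreq, int maxDocs) {
        return Math.log((double) maxDocs / (docFreq + 1)) + 1.0;
    }

    // fieldNorm: fieldBoost / sqrt(fieldLength), fieldLength = number of tokens in the field
    static double fieldNorm(double fieldBoost, int fieldLength) {
        return fieldBoost / Math.sqrt(fieldLength);
    }

    // fieldWeight: product of the three factors
    static double fieldWeight(int termFreq, int docFreq, int maxDocs,
                              double fieldBoost, int fieldLength) {
        return tf(termFreq) * idf(docFreq, maxDocs) * fieldNorm(fieldBoost, fieldLength);
    }

    public static void main(String[] args) {
        // termFreq, docFreq and maxDocs come from the explain output above;
        // fieldBoost = 1.0 and fieldLength = 4 are assumptions (see lead-in).
        System.out.printf("tf          = %.6f%n", tf(4));
        System.out.printf("idf         = %.6f%n", idf(414, 18609));
        System.out.printf("fieldNorm   = %.6f%n", fieldNorm(1.0, 4));
        System.out.printf("fieldWeight = %.6f%n", fieldWeight(4, 414, 18609, 1.0, 4));
    }
}
```

Because tf * fieldNorm is 2.0 * 0.5 = 1 here, the total score happens to equal the idf value, which is why 4.803122 appears twice in the output.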

TermWeight, one implementation of Weight, implements weight.explain(reader, doc) as follows:

public Explanation explain(IndexReader reader, int doc)
    throws IOException {
  ComplexExplanation result = new ComplexExplanation();
  result.setDescription(
      "weight(" + getQuery() + " in " + doc + "), product of:");

  Explanation idfExpl =
      new Explanation(idf, "idf(docFreq=" + reader.docFreq(term) + ")");

  // explain query weight
  Explanation queryExpl = new Explanation();
  queryExpl.setDescription(
      "queryWeight(" + getQuery() + "), product of:");

  Explanation boostExpl = new Explanation(getBoost(), "boost");
  if (getBoost() != 1.0f)
    queryExpl.addDetail(boostExpl);
  queryExpl.addDetail(idfExpl);

  Explanation queryNormExpl = new Explanation(queryNorm, "queryNorm");
  queryExpl.addDetail(queryNormExpl);

  queryExpl.setValue(boostExpl.getValue()
      * idfExpl.getValue() * queryNormExpl.getValue());
  result.addDetail(queryExpl);

  // explain field weight
  String field = term.field();
  ComplexExplanation fieldExpl = new ComplexExplanation();
  fieldExpl.setDescription(
      "fieldWeight(" + term + " in " + doc + "), product of:");

  Explanation tfExpl = scorer(reader).explain(doc);
  fieldExpl.addDetail(tfExpl);
  fieldExpl.addDetail(idfExpl);

  Explanation fieldNormExpl = new Explanation();
  byte[] fieldNorms = reader.norms(field);
  float fieldNorm =
      fieldNorms != null ? Similarity.decodeNorm(fieldNorms[doc]) : 0.0f;
  fieldNormExpl.setValue(fieldNorm);
  fieldNormExpl.setDescription(
      "fieldNorm(field=" + field + ", doc=" + doc + ")");
  fieldExpl.addDetail(fieldNormExpl);

  fieldExpl.setMatch(Boolean.valueOf(tfExpl.isMatch()));
  fieldExpl.setValue(tfExpl.getValue()
      * idfExpl.getValue() * fieldNormExpl.getValue());
  result.addDetail(fieldExpl);
  result.setMatch(fieldExpl.getMatch());

  // combine them
  result.setValue(queryExpl.getValue() * fieldExpl.getValue());
  // when the query weight is exactly 1, only the field weight is returned
  if (queryExpl.getValue() == 1.0f)
    return fieldExpl;
  return result;
}
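
The shape of the tree that explain builds may be easier to see in a stripped-down model. The sketch below is not Lucene's Explanation class: MiniExplanation and termScore are hypothetical names, and the descriptions are shortened. It mirrors only the composition logic of the method above (a queryWeight node, a fieldWeight node, and the shortcut that returns the fieldWeight node alone when the query weight equals 1.0f). The input values are powers of two so that the float products are exact.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Lucene's Explanation: a value, a description, child details.
final class MiniExplanation {
    final float value;
    final String description;
    final List<MiniExplanation> details = new ArrayList<>();

    MiniExplanation(float value, String description) {
        this.value = value;
        this.description = description;
    }

    MiniExplanation addDetail(MiniExplanation detail) {
        details.add(detail);
        return this;
    }

    // Renders this node and its children, indenting each level by two spaces.
    String render(int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) sb.append("  ");
        sb.append(value).append(" = ").append(description).append('\n');
        for (MiniExplanation d : details) sb.append(d.render(depth + 1));
        return sb.toString();
    }

    // Mirrors the composition in explain(): queryWeight * fieldWeight,
    // collapsing to fieldWeight alone when queryWeight is exactly 1.
    static MiniExplanation termScore(float boost, float idf, float queryNorm,
                                     float tf, float fieldNorm) {
        MiniExplanation query =
            new MiniExplanation(boost * idf * queryNorm, "queryWeight, product of:");
        if (boost != 1.0f) query.addDetail(new MiniExplanation(boost, "boost"));
        query.addDetail(new MiniExplanation(idf, "idf"));
        query.addDetail(new MiniExplanation(queryNorm, "queryNorm"));

        MiniExplanation field =
            new MiniExplanation(tf * idf * fieldNorm, "fieldWeight, product of:");
        field.addDetail(new MiniExplanation(tf, "tf"));
        field.addDetail(new MiniExplanation(idf, "idf"));
        field.addDetail(new MiniExplanation(fieldNorm, "fieldNorm"));

        if (query.value == 1.0f) return field;
        return new MiniExplanation(query.value * field.value, "weight, product of:")
            .addDetail(query)
            .addDetail(field);
    }

    public static void main(String[] args) {
        // Query weight 1.0 * 4.0 * 0.25 = 1: only the fieldWeight subtree is returned.
        System.out.print(termScore(1.0f, 4.0f, 0.25f, 2.0f, 0.5f).render(0));
        // A boost of 2 makes the query weight 2: the full weight tree is returned.
        System.out.print(termScore(2.0f, 4.0f, 0.25f, 2.0f, 0.5f).render(0));
    }
}
```

The first call shows why an explanation can consist of a fieldWeight node only, as in the sample output earlier: the query-side factor contributed exactly 1 and was dropped.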

References:

http://www.cnblogs.com/lvpei/articles/1732475.html

http://archive.cnblogs.com/a/1906353/

http://lucene.apache.org/java/2_9_0/api/core/org/apache/lucene/search/Weight.html

posted @ 2011-04-08 15:11  xiao晓