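Mahout Recommendation 3: Evaluating Precision and Recall

Generating recommendations does not strictly require estimating preference values. For many scenarios, a list of recommendations ordered from best to worst is enough, with no need to include the estimated preference values.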

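Precision: the proportion of the top results that are relevant.

Recall: the proportion of all relevant results that appear in the top results.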
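The two definitions above can be illustrated with a small standalone sketch. It does not use Mahout; the class name `PrecisionRecallSketch`, its helper methods, and the item IDs are hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PrecisionRecallSketch {

    // Number of recommended items that are also in the relevant set.
    static int countHits(List<Long> topN, Set<Long> relevant) {
        int hits = 0;
        for (Long itemId : topN) {
            if (relevant.contains(itemId)) {
                hits++;
            }
        }
        return hits;
    }

    // Precision: share of the top-N recommendations that are relevant.
    static double precisionAtN(List<Long> topN, Set<Long> relevant) {
        if (topN.isEmpty()) {
            return 0.0;
        }
        return (double) countHits(topN, relevant) / topN.size();
    }

    // Recall: share of all relevant items that made it into the top N.
    static double recallAtN(List<Long> topN, Set<Long> relevant) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        return (double) countHits(topN, relevant) / relevant.size();
    }

    public static void main(String[] args) {
        // Hypothetical data: 2 recommendations, 3 relevant items, 1 overlap (101).
        List<Long> topN = Arrays.asList(101L, 104L);
        Set<Long> relevant = new HashSet<Long>(Arrays.asList(101L, 102L, 103L));

        System.out.println("Precision: " + precisionAtN(topN, relevant)); // 1 hit / 2 recommended
        System.out.println("Recall: " + recallAtN(topN, relevant));       // 1 hit / 3 relevant
    }
}
```

With this made-up data, one of the two recommended items is relevant, so precision is 1/2, and one of the three relevant items was recommended, so recall is 1/3.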

Evaluating the recommender from the previous example:

package mahout;

import java.io.File;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.IRStatistics;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderIRStatsEvaluator;
import org.apache.mahout.cf.taste.impl.eval.GenericRecommenderIRStatsEvaluator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;
import org.apache.mahout.common.RandomUtils;

public class IRStatsEvaluator {
	public static void main(String[] args) throws Exception {
		RandomUtils.useTestSeed();
		DataModel dataModel = new FileDataModel(new File("data/intro.csv"));
		RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();
		// Builder that creates the recommender; same implementation as in the previous example
		RecommenderBuilder builder = new RecommenderBuilder() {
			@Override
			public Recommender buildRecommender(DataModel model) throws TasteException {
				// User similarity; several implementations are available
				UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
				// User neighborhood: the 2 nearest users
				UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model);
				// The recommender itself
				return new GenericUserBasedRecommender(model, neighborhood, similarity);
			}
		};
		// Evaluate precision and recall when recommending 2 results
		IRStatistics statistics = evaluator.evaluate(builder, null, dataModel, null, 2, GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1.0);
		
		System.out.println("Precision: " + statistics.getPrecision());
		System.out.println("Recall: " + statistics.getRecall());
	}
}

Output:

14/08/04 09:46:21 INFO file.FileDataModel: Creating FileDataModel for file data\intro.csv
14/08/04 09:46:21 INFO file.FileDataModel: Reading file info...
14/08/04 09:46:21 INFO file.FileDataModel: Read lines: 21
14/08/04 09:46:21 INFO model.GenericDataModel: Processed 5 users
14/08/04 09:46:21 INFO model.GenericDataModel: Processed 5 users
14/08/04 09:46:21 INFO model.GenericDataModel: Processed 5 users
14/08/04 09:46:21 INFO eval.GenericRecommenderIRStatsEvaluator: Evaluated with user 2 in 31ms
14/08/04 09:46:21 INFO eval.GenericRecommenderIRStatsEvaluator: Precision/recall/fall-out/nDCG/reach: 1.0 / 1.0 / 0.0 / 1.0 / 1.0
14/08/04 09:46:21 INFO model.GenericDataModel: Processed 5 users
14/08/04 09:46:21 INFO eval.GenericRecommenderIRStatsEvaluator: Evaluated with user 4 in 0ms
14/08/04 09:46:21 INFO eval.GenericRecommenderIRStatsEvaluator: Precision/recall/fall-out/nDCG/reach: 0.75 / 1.0 / 0.08333333333333333 / 1.0 / 1.0
Precision: 0.75
Recall: 1.0
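
The log above shows two users being evaluated (user 2 and user 4), and the final precision of 0.75 matches the last logged figure. A plausible reading, assuming the logged figures are running averages over the users evaluated so far, is that user 2 scored 1.0 and user 4 scored 0.5 on precision. The per-user value of 0.5 is inferred from the averages, not printed in the log, and the class name `RunningAverageSketch` below is hypothetical.

```java
public class RunningAverageSketch {

    // Plain arithmetic mean of per-user scores.
    static double average(double... values) {
        if (values.length == 0) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    public static void main(String[] args) {
        // Assumption: 1.0 is taken from the first log line; 0.5 is inferred, not logged.
        double[] perUserPrecision = {1.0, 0.5};
        // Recall is logged as 1.0 after both users, so each user is assumed to score 1.0.
        double[] perUserRecall = {1.0, 1.0};

        System.out.println("Average precision: " + average(perUserPrecision)); // (1.0 + 0.5) / 2
        System.out.println("Average recall: " + average(perUserRecall));       // (1.0 + 1.0) / 2
    }
}
```

Under that assumption, the averages come out to 0.75 and 1.0, the same values the program prints.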

For a description of the data file, see "Mahout in Action".

posted @ 2014-08-04 09:49  jseven