The Wisdom of the Few

Zheng Yun @ 玩聚SR, 2009-11-05

I. Cold start

Commenting on a recent paper, "The Wisdom of the Few: A Collaborative Filtering Approach Based on Expert Opinions from the Web" (PDF), Greg Linden wrote:

“

What they do say is that using a very small pool of experts works surprisingly well.

In particular, I think it suggests a good alternative to content-based methods for bootstrapping a recommender system.

If you can create a high quality pool of experts, even a fairly small one, you may have good results starting with that while you work to gather ratings from the broader community.

”

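That is, if you pick a high-quality pool of experts, whether a team you assemble yourself or a group of experts you select, then even a fairly small pool gives your recommender system a very good start. In this situation, the wisdom of the few can solve the recommender system's cold-start problem. This is also why 玩聚SR started from an Experts Pool and worked well as an information filter from the very beginning.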
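As a rough illustration of bootstrapping from experts alone, here is a minimal sketch that ranks items by their average expert rating before any community ratings exist. The function name `bootstrap_ranking`, the `min_experts` cutoff, and the sample data are all assumptions made for this example; neither the paper nor Greg Linden's post prescribes this particular scheme.

```python
from collections import defaultdict


def bootstrap_ranking(expert_ratings, min_experts=2, top_n=3):
    """Rank items by mean expert rating (hypothetical helper, not from the paper).

    expert_ratings maps an expert id to a dict of {item id: rating}.
    Items rated by fewer than min_experts experts are ignored.
    """
    scores_by_item = defaultdict(list)
    for ratings in expert_ratings.values():
        for item, score in ratings.items():
            scores_by_item[item].append(score)

    averages = {
        item: sum(scores) / len(scores)
        for item, scores in scores_by_item.items()
        if len(scores) >= min_experts
    }
    # Highest mean first; ties are broken by item id so the order is stable.
    return sorted(averages, key=lambda item: (-averages[item], item))[:top_n]


# Made-up ratings from a three-person expert pool.
expert_pool = {
    "expert_1": {"a": 5, "b": 3, "c": 4},
    "expert_2": {"a": 4, "b": 2},
    "expert_3": {"b": 4, "c": 5, "d": 5},
}

print(bootstrap_ranking(expert_pool))
```

Requiring a minimum number of expert votes is one simple way to keep a single enthusiastic rating from dominating the list; the right cutoff would depend on the size of the pool.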

II. Summary of the paper:

To make it easier to follow, here is a loose paraphrase of the paper:

Nearest-neighbor collaborative filtering is a very effective recommendation method, but it consistently runs into the following problems:

data sparsity and noise; the cold-start problem; and scalability.

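The authors therefore propose a new method, a variant of traditional collaborative filtering:

Instead of running the nearest-neighbor algorithm over all user-rating data, it uses a set of expert neighbors as the reference sample and computes the similarity between those experts and the target user.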

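This method at least avoids serious scalability problems, because it shrinks the reference set that each user is compared against. The original nearest-neighbor method can be roughly understood as pairwise comparison between users, which is inevitably slow. And when new users are everywhere (a sudden influx of drive-by visitors in particular makes the data hopelessly noisy), there are few ratings available for computing similarity at all.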
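The idea of comparing a target user only against experts can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Pearson-style similarity, the `min_sim` threshold, the mean-centered weighted average, and the sample ratings are all choices made for this example.

```python
from math import sqrt


def mean(ratings):
    """Average of all ratings given by one rater."""
    return sum(ratings.values()) / len(ratings)


def similarity(user, expert):
    """Pearson-style correlation over co-rated items (an assumed choice)."""
    common = set(user) & set(expert)
    if len(common) < 2:
        return 0.0
    mu_u, mu_e = mean(user), mean(expert)
    num = sum((user[i] - mu_u) * (expert[i] - mu_e) for i in common)
    den_u = sqrt(sum((user[i] - mu_u) ** 2 for i in common))
    den_e = sqrt(sum((expert[i] - mu_e) ** 2 for i in common))
    if den_u == 0 or den_e == 0:
        return 0.0
    return num / (den_u * den_e)


def predict(user, experts, item, min_sim=0.1):
    """Predict the user's rating of item from sufficiently similar experts.

    Returns None when no similar expert has rated the item.
    """
    num = den = 0.0
    # The loop runs over the expert pool only, never over the whole user base.
    for expert in experts.values():
        if item not in expert:
            continue
        s = similarity(user, expert)
        if s < min_sim:
            continue
        num += s * (expert[item] - mean(expert))
        den += s
    if den == 0:
        return None
    return mean(user) + num / den


# Made-up data: two critics and one ordinary user.
experts = {
    "critic_a": {"m1": 5, "m2": 1, "m3": 4, "m4": 5},
    "critic_b": {"m1": 1, "m2": 5, "m3": 2, "m4": 1},
}
user = {"m1": 4, "m2": 2, "m3": 4}

print(predict(user, experts, "m4"))
```

Because each prediction touches only the expert pool, the cost per user grows with the number of experts rather than with the size of the community, which is the scalability point made above.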

The authors define an expert as follows:

We define an expert as an individual that we can trust to have produced thoughtful, consistent and reliable evaluations (ratings) of items in a given domain.

Of the questions the authors explore, we are most interested in the following two angles:

(a) study how preferences of a large population can be predicted by using a very small set of users;

That is, how much a small group of users can actually tell us about the preferences of a very large user base.

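If these angles prove workable, then you do not actually need all the data from a massive user community; locking onto an Experts Pool is enough to make recommendations for users.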

Appendix:

Greg Linden's original post, hosted on BlogSpot (which is blocked), reads as follows:

Wednesday, November 04, 2009

Using only experts for recommendations

A recent paper from SIGIR, "The Wisdom of the Few: A Collaborative Filtering Approach Based on Expert Opinions from the Web" (PDF), has a very useful exploration into the effectiveness of recommendations using only a small pool of trusted experts.
The results suggest that using a small pool of a couple hundred experts, possibly your own experts or experts selected and mined from the web, has quite a bit of value, especially in cases where big data from a large community is unavailable.
A brief excerpt from the paper:

Recommending items to users based on expert opinions .... addresses some of the shortcomings of traditional CF: data sparsity, scalability, noise in user feedback, privacy, and the cold-start problem .... [Our] method's performance is comparable to traditional CF algorithms, even when using an extremely small expert set .... [of] 169 experts.
Our approach requires obtaining a set of ... experts ... [We] crawled the Rotten Tomatoes web site -- which aggregates the opinions of movie critics from various media sources -- to obtain expert ratings of the movies in the Netflix data set.

The authors certainly do not claim that using a small pool of experts is better than traditional collaborative filtering.
What they do say is that using a very small pool of experts works surprisingly well. In particular, I think it suggests a good alternative to content-based methods for bootstrapping a recommender system. If you can create a high quality pool of experts, even a fairly small one, you may have good results starting with that while you work to gather ratings from the broader community.

posted @ 2009-11-05 18:17 老兵笔记
