RunningMapReduceExampleTFIDF - hadoop-clusternet - This document describes how to run the TF-IDF MapReduce example against ASCII books. - This project is for those who want to experiment with Hadoop as a skunkworks project in a small cluster (1-10 nodes) - Google Project Hosting

// Inverse document frequency: the log of the quotient between the number of docs in the
// corpus and the number of docs in which the term appears. If the term appears in no
// document, 1 is added to the denominator to avoid a division by zero.
double idf = Math.log10((double) numberOfDocumentsInCorpus /
        (double) ((numberOfDocumentsInCorpusWhereKeyAppears == 0 ? 1 : 0) +
                numberOfDocumentsInCorpusWhereKeyAppears));
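
The fragment above depends on variables defined elsewhere in the example job. As a minimal standalone sketch of the same calculation, the class below wraps it in a method that can be run outside Hadoop. The class name IdfSketch, the method name idf, and the sample counts are assumptions made for illustration; they are not part of the original project.

```java
// Hypothetical standalone wrapper around the IDF expression shown above.
// IdfSketch and idf(...) are illustrative names, not taken from the original job.
class IdfSketch {

    // Returns log10(corpusSize / docsWithTerm). When docsWithTerm is 0, the
    // denominator is treated as 1 so that no division by zero occurs.
    static double idf(int numberOfDocumentsInCorpus,
                      int numberOfDocumentsInCorpusWhereKeyAppears) {
        int denominator = (numberOfDocumentsInCorpusWhereKeyAppears == 0 ? 1 : 0)
                + numberOfDocumentsInCorpusWhereKeyAppears;
        return Math.log10((double) numberOfDocumentsInCorpus / (double) denominator);
    }

    public static void main(String[] args) {
        // Sample counts chosen for illustration only.
        System.out.println(idf(10, 1));   // term in 1 of 10 docs
        System.out.println(idf(10, 10));  // term in every doc
        System.out.println(idf(100, 0));  // term in no doc: denominator becomes 1
    }
}
```

A term that appears in every document gets an IDF of 0, while a term that appears in no document is scored as if it appeared in exactly one.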

posted on 2012-09-23 08:58  lexus