Stanford: Mining Massive Data Sets

CS246

Mining Massive Data Sets

Winter 2012

http://www.stanford.edu/class/cs246/

The course will discuss data mining and machine learning algorithms for analyzing very large amounts of data. The emphasis will be on MapReduce as a tool for creating parallel algorithms that can process very large amounts of data.
Topics include: frequent itemsets and association rules, near neighbor search in high-dimensional data, locality-sensitive hashing (LSH), dimensionality reduction, recommendation systems, clustering, link analysis, large-scale supervised machine learning, data streams, mining the Web for structured data, relation extraction, and Web advertising.

CS246 is the first part of a two-part sequence, CS246--CS341. CS246 will discuss methods and algorithms for mining massive data sets, while CS341: Project in Mining Massive Data Sets will be a project-focused advanced class with unlimited access to a large MapReduce cluster.

CS341

Project in Mining Massive Data Sets

Spring 2011

CS341 (Project in Mining Massive Data Sets) is a project-focused advanced class with access to a large MapReduce cluster. This course is the second part of the two-part sequence CS246/CS341, which replaces CS345A: Data Mining. CS246 discusses methods and algorithms for mining massive data sets.

In this class, we will develop large-scale data mining techniques and research projects. Students will have access to an Amazon EC2 computing cluster, which means we will be able to run massive MapReduce jobs. Because it is challenging to work on algorithms for large-scale data mining, we will be able to work with only a small number of students, and enrollment will be limited.

This is a purely project-based course. We expect students to already be familiar, to some extent, with data mining methods. There will be lectures on some advanced data mining algorithms at the beginning of the quarter. We also expect to have a good number of industrial guest lecturers discussing big data case studies.

CS345A, Winter 2009: Data Mining

http://i.stanford.edu/~ullman/mining/2009/index.html#info

posted on 2012-10-11 10:26 xiaoshier
