Kaggle's Public Datasets platform hosts plenty of machine-learning-ready datasets to use for fun or practice. Here is a short list of some of our favorites that we've already had the chance to review. They're all (mostly) cleaned and ready for analysis!
Binary Classification
- Indian Liver Patient Records
- Synthetic Financial Data for Fraud Detection
- Business and Industry Reports
- Can You Predict Product Backorders?
- Exoplanet Hunting in Deep Space
- Adult Census Income
Multiclass Classification
Regression
NLP
- The Enron Email Dataset
- Ubuntu Dialogue Corpus
- Old Newspapers: A cleaned subset of HC Corpora newspapers
- Speech Accent Archive
- Blog Authorship Corpus
Time Series Analysis
Image Processing
Mapping and Prediction
- Seattle Police Department 911 Incident Response
- Baltimore 911 Calls
- Crimes in Chicago
- Philadelphia Crime Data
- London Crime
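
Because the datasets above are described as only (mostly) cleaned, a quick look at missing values can be worthwhile before any modeling. The following is a minimal sketch, not part of the original list: it assumes a dataset has been downloaded as a CSV file, and the helper name `missing_fraction_by_column` and the file name `dataset.csv` are hypothetical. It uses only the Python standard library.

```python
import csv
import io


def missing_fraction_by_column(csv_file):
    """Return {column: fraction of rows whose cell is empty} for an open CSV file object."""
    reader = csv.DictReader(csv_file)
    columns = reader.fieldnames or []
    missing = {name: 0 for name in columns}
    total_rows = 0
    for row in reader:
        total_rows += 1
        for name in columns:
            value = row.get(name)
            # Treat absent or whitespace-only cells as missing.
            if value is None or value.strip() == "":
                missing[name] += 1
    if total_rows == 0:
        return {name: 0.0 for name in columns}
    return {name: count / total_rows for name, count in missing.items()}


if __name__ == "__main__":
    # Tiny inline sample standing in for a downloaded file (hypothetical data).
    sample = io.StringIO("age,income,label\n39,50000,0\n,62000,1\n45,,1\n31,,0\n")
    for column, fraction in missing_fraction_by_column(sample).items():
        print(f"{column}: {fraction:.0%} missing")
```

For a real download, the same function could be called with `open("dataset.csv", newline="")` in place of the inline sample. Columns with a high missing fraction are the ones to inspect first.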
