[GPU] Machine Learning on C++
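1. What is MPI?
A first look: setting up an MPI cluster environment
2. Revisiting Spark
Link: https://www.zhihu.com/question/48743915/answer/115738668
In the summary of his PhD thesis, Matei Zaharia says something to the effect that it is quite normal for an algorithm implemented purely with MPI to run five to six times faster than on Spark. But Spark is a general data-flow processing framework: throughout the life cycle of the data, you can process it with the concrete implementations built on top of Spark, and ML is only one part of that. This is one of Spark's biggest selling points.
So if you compare this Prophet platform with Spark on ML efficiency, of course yours will be faster. Plenty of specialized ML platforms are faster than Spark too, so I won't list them here.
The reason is that Spark is based on MapReduce, and this kind of programming model is not well suited to ML, especially models with a large number of parameters, such as LDA.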
3. Microsoft Distributed Machine Learning Toolkit (DMTK)
DMTK includes the following projects:
- DMTK framework(Multiverso): The parameter server framework for distributed machine learning.
- LightLDA: Scalable, fast and lightweight system for large-scale topic modeling.
- LightGBM: LightGBM is a fast, distributed, high performance gradient boosting (GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
- Distributed word embedding: Distributed algorithm for word embedding implemented on multiverso.
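The parameter server idea behind a framework like Multiverso can be illustrated with a toy, single-process sketch: workers pull the parameters they need, compute updates locally, and push sparse deltas back to a shared table. This is only an illustration of the pull/push pattern under simplifying assumptions (no networking, no sharding, no concurrency). The class `ToyParameterServer` and its methods are hypothetical names and are not the Multiverso API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical in-process stand-in for a parameter server.
// Not the Multiverso API; it only shows the pull/push pattern.
class ToyParameterServer {
public:
    explicit ToyParameterServer(std::size_t size) : params_(size, 0.0) {}

    // Pull: a worker fetches the current values of the keys it needs.
    std::vector<double> pull(const std::vector<std::size_t>& keys) const {
        std::vector<double> out;
        out.reserve(keys.size());
        for (std::size_t k : keys) {
            assert(k < params_.size());
            out.push_back(params_[k]);
        }
        return out;
    }

    // Push: a worker sends sparse deltas, which the server adds in place.
    void push(const std::vector<std::size_t>& keys,
              const std::vector<double>& deltas) {
        assert(keys.size() == deltas.size());
        for (std::size_t i = 0; i < keys.size(); ++i) {
            assert(keys[i] < params_.size());
            params_[keys[i]] += deltas[i];
        }
    }

private:
    std::vector<double> params_;  // the shared model parameters
};
```

Because each worker only touches the keys it needs, a model with a very large parameter table (LDA is the example given earlier) does not have to be shipped in full on every iteration, which is the usual argument for this design over a MapReduce-style model.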
4. Enter the GPU
(1) Which of the following algorithms does OpenCV's OpenCL implementation cover?
class cv::ml::LogisticRegression
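To make concrete what a class like `cv::ml::LogisticRegression` fits, here is a minimal standard-library sketch of binary logistic regression trained by batch gradient descent. It does not use OpenCV and is not OpenCV's implementation. The names `LogisticModel`, `trainLogistic`, `predictProbability` and `predictLabel` are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type for this sketch (not an OpenCV class).
struct LogisticModel {
    std::vector<double> weights;  // one weight per feature
    double bias = 0.0;
};

// Logistic (sigmoid) function.
inline double sigmoid(double z) { return 1.0 / (1.0 + std::exp(-z)); }

// Estimated probability that sample x belongs to class 1.
double predictProbability(const LogisticModel& m, const std::vector<double>& x) {
    assert(x.size() == m.weights.size());
    double z = m.bias;
    for (std::size_t j = 0; j < x.size(); ++j) z += m.weights[j] * x[j];
    return sigmoid(z);
}

// Hard 0/1 decision at a probability threshold of 0.5.
int predictLabel(const LogisticModel& m, const std::vector<double>& x) {
    return predictProbability(m, x) >= 0.5 ? 1 : 0;
}

// Batch gradient descent on the mean cross-entropy loss.
// X holds one sample per row; y holds labels in {0, 1}.
LogisticModel trainLogistic(const std::vector<std::vector<double>>& X,
                            const std::vector<int>& y,
                            double learningRate, int iterations) {
    assert(!X.empty() && X.size() == y.size());
    const std::size_t n = X.size();
    const std::size_t d = X[0].size();
    LogisticModel m;
    m.weights.assign(d, 0.0);
    for (int it = 0; it < iterations; ++it) {
        std::vector<double> gradW(d, 0.0);
        double gradB = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            // Gradient of the loss w.r.t. the logit is (prediction - label).
            const double err = predictProbability(m, X[i]) - y[i];
            for (std::size_t j = 0; j < d; ++j) gradW[j] += err * X[i][j];
            gradB += err;
        }
        for (std::size_t j = 0; j < d; ++j) m.weights[j] -= learningRate * gradW[j] / n;
        m.bias -= learningRate * gradB / n;
    }
    return m;
}
```

For instance, training on the one-dimensional samples 0, 1, 2, 3 with labels 0, 0, 1, 1 (learning rate 0.1, 5000 iterations) should give a model that assigns the sample 0 to class 0 and the sample 3 to class 1.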
In a nutshell
Ref: How to use NVIDIA GPUs for Machine Learning with the new Data Science PC from Maingear
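It looks like people have only just started to notice this, or perhaps DNNs are simply enough.
Goto: [CUDA] Install H2O.ai, which has GPU implementations of some algorithms.
- GLM: Lasso, Ridge Regression, Logistic Regression, Elastic Net Regularization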
- KMeans
- Gradient Boosting Machine (GBM) via XGBoost
- Singular Value Decomposition(SVD) + Truncated Singular Value Decomposition
- Principal Components Analysis(PCA)
5. ML in OpenCV
Classification with OpenCV3 C++ (1/2)
Classification with OpenCV3 C++ (2/2)
Code for my blog post: "Classification with OpenCV 3 C++"
/* implement */
