
http://metaoptimize.com/qa/questions/10640/why-anything-other-than-deep-learning

Deep learning has been shown time and again to outperform everything, as shown by Yann LeCun or Andrew Ng (except when online logistic regression is OK for large datasets). So why are people, on this forum for example, discussing anything else? Why are outclassed things like SVMs, topic models, and CRFs discussed?

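1. Deep learning has a limited range of applicability: it suits large datasets and deep training and is good at extracting high-level features, but its overhead is too high for small datasets.

2. A lot of data is needed to train the parameters well.

3. Parameters, the number of layers, and so on have to be tuned by hand, and there is currently no off-the-shelf tool that can be used directly, so in engineering practice the results are not necessarily better than those of conventional methods.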

 

Your question comes off as a bit sarcastic, but I will answer it assuming it isn't.

As a deep learning researcher, I will be the first to admit that deep learning is a poorly defined term. There are many reasonable definitions for it, some more expansive and some more restrictive. For example, the most expansive definition I might use would include any learning algorithm that learns a distributed representation of its input and isn't just doing template matching. This doesn't require a neural net (unless you also use a very expansive definition of neural net that includes decision trees!). Sidestepping the definitional problem of exactly what constitutes deep learning, let me try and address your question.

  1. Deep learning is not appropriate for all problems. Small datasets might be better served by a fully Bayesian approach with a careful encoding of prior beliefs. Perhaps there are certain types of bandit problems one would be hard-pressed to use a deep architecture on or learn features for. Even in classification, classifying graphs (where graphs are the actual instances) poses challenges for deep learning and many classification techniques throughout machine learning. Sometimes particular graphical models can be engineered for a problem. Sometimes other models are more computationally appropriate. Sometimes we can directly parameterize a discrete probability distribution and learn its parameters. Some of these situations are waiting for an enterprising researcher to apply the deep learning philosophy, but for others it simply seems inappropriate.

  2. All the things you mentioned can be part of deep models. A CRF can be deep or even be the top layer of a deep model. There are deep topic models (I am thinking of the replicated softmax with additional layers added). Learning the kernel in an SVM can sometimes result in a deep model. All of machine learning is related and we as researchers can generalize across models. For example, SVMs taught us a very important lesson (often articulated by Yann LeCun) about training neural nets, namely that we shouldn't be afraid of using models with many parameters and we should just control overfitting some other way and use more powerful regularization. Also, sometimes training a neural net with the hinge loss is useful. Machine learning is a giant interconnected web of ideas and as the field continues, people keep finding new relationships between seemingly disparate concepts and models.
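To make the hinge-loss point in item 2 concrete, here is a minimal sketch of hinge loss used as a training objective. It assumes the simplest possible case, a single linear unit trained by stochastic subgradient descent with an L2 penalty as the regularizer. The function names, the toy data, and the hyper-parameter values are all made up for illustration and do not come from any particular paper or library.

```python
import random


def hinge_loss(score, label):
    """Hinge loss for a label in {-1, +1}: max(0, 1 - label * score)."""
    return max(0.0, 1.0 - label * score)


def train_linear_hinge(data, epochs=200, lr=0.1, l2=0.01, seed=0):
    """Fit weights w and bias b by stochastic subgradient descent
    on hinge loss plus an L2 penalty on w (illustrative settings)."""
    rng = random.Random(seed)
    dim = len(data[0][0])
    w = [0.0] * dim
    b = 0.0
    data = list(data)  # copy so the caller's list is not shuffled
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            # The hinge term contributes a subgradient of -y * x only
            # when the margin y * score is below 1; otherwise it is 0.
            active = y * score < 1.0
            for i in range(dim):
                grad = l2 * w[i] - (y * x[i] if active else 0.0)
                w[i] -= lr * grad
            if active:
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the sign of the linear score."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Made-up, linearly separable toy data: (features, label).
toy = [([2.0, 1.0], 1), ([1.5, 2.0], 1), ([3.0, 2.5], 1),
       ([-1.0, -1.5], -1), ([-2.0, -0.5], -1), ([-1.5, -2.5], -1)]

w, b = train_linear_hinge(toy)
predictions = [predict(w, b, x) for x, _ in toy]
```

In a multi-layer net the same loss would sit on top of the last layer's output, with the subgradient propagated backward through the layers below.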

To summarize, deep learning isn't the answer for every problem and all these other techniques have things to teach people, even people who are only interested in deep learning.

Although I agree with the general sentiment of rm9, I think it might be possible to make a relatively "out of the box" deep neural network system, especially with recent advances in hyper-parameter optimization. Just like with SVMs, the best results will come from SVM experts, but reasonable results should be possible automatically with deep neural nets as long as we give non-experts sufficiently sophisticated software packages and they have access to enough computation.
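As a rough illustration of what automatic hyper-parameter tuning can look like, here is a minimal sketch of random search, one of the simplest such procedures. It is not necessarily the kind of method I have in mind above. The `validation_error` function is a hypothetical stand-in for "train a net with these settings and measure validation error", and the search ranges are arbitrary assumptions.

```python
import math
import random


def validation_error(learning_rate, num_hidden):
    """Hypothetical stand-in for training a net and returning its
    validation error. This made-up function has its minimum at
    learning_rate=0.01 and num_hidden=128."""
    return ((math.log10(learning_rate) + 2.0) ** 2
            + (math.log2(num_hidden) - 7.0) ** 2)


def random_search(objective, num_trials=50, seed=0):
    """Sample hyper-parameters at random and keep the best setting seen."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(num_trials):
        params = {
            # Learning rate sampled log-uniformly from [1e-5, 1].
            "learning_rate": 10 ** rng.uniform(-5, 0),
            # Hidden layer size sampled from powers of two, 16 to 1024.
            "num_hidden": 2 ** rng.randint(4, 10),
        }
        score = objective(**params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = random_search(validation_error)
```

With a real model each call to the objective is a full training run, which is why access to enough computation matters for this kind of automation.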

 

From a practical point of view, deep learning needs a lot of data. That's not always available. Also, it takes a lot of time to train a large network. That also might not be suitable for all applications.

But I think the major thing you have to remember is that scientific papers aren't equal to a working industrial system.

What you see in the paper is the "best result" for the method and you don't see all the failed attempts that came before it. Deep Learning is not an out-of-the-box solution. It requires a lot of optimization and tuning both in the parameters and in the input data of the system.

And if you review enough deep learning papers, you see that they don't outperform everything. In some papers they aren't the best, and again, if a method performed very badly, it wouldn't be published anyway.

Lastly, there are way more problems to solve than the ones you see in the deep learning papers. Deep learning is the current trend, so it seems to be everywhere, but it's not. There are many areas where it is not applicable (at least not yet).

posted on 2013-08-05 18:07  huashiyiqike