Hands-On Machine Learning with Scikit-Learn and TensorFlow, Chapter 1: The Machine Learning Landscape
1.How would you define Machine Learning?
Machine Learning is the science (and art) of programming computers so they can learn from data.
A slightly more general definition: Machine Learning is the field of study that gives computers the ability to learn without being explicitly programmed.
2.Can you name four types of problems where it shines?
(1)Problems for which existing solutions require a lot of hand-tuning or long lists of rules: one Machine Learning algorithm can often simplify code and perform better;
(2)Complex problems for which there is no good solution at all using a traditional approach: the best Machine Learning techniques can find a solution;
(3)Fluctuating environments: a Machine Learning system can adapt to new data;
(4) Getting insights about complex problems and large amounts of data.
3.What is a labeled training set?
A labeled training set is a training set that contains the desired solution (the label) for each instance. In supervised learning, this is the training data you feed to the algorithm.
4.What are the two most common supervised tasks?
Classification and regression (predicting a target numeric value).
5.Can you name four common unsupervised tasks?
(1) Visualization algorithms;
(2) Dimensionality reduction;
(3)Anomaly detection;
(4)Association rule learning.
6.What type of Machine Learning algorithm would you use to allow a robot to walk in various unknown terrains?
Reinforcement Learning.
7.What type of algorithm would you use to segment your customers into multiple groups?
Clustering algorithm.
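As a rough illustration of what a clustering algorithm does, here is a minimal k-means sketch in plain Python. The spending figures are hypothetical, and the evenly spaced initialization is a simplification chosen to keep the result deterministic (real implementations usually start from random or k-means++ centroids).

```python
def kmeans_1d(values, k, n_iter=20):
    """Tiny k-means for 1-D data; returns (centroids, labels)."""
    ordered = sorted(values)
    n = len(ordered)
    # Simplified deterministic init: pick k evenly spaced points from the sorted data.
    centroids = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    labels = [0] * n
    for _ in range(n_iter):
        # Assignment step: each value joins the cluster of its nearest centroid.
        labels = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centroids[j] = sum(members) / len(members)
    return centroids, labels


# Hypothetical yearly spend per customer (no labels are given to the algorithm).
spend = [12, 15, 14, 110, 120, 115, 980, 1010, 995]
centroids, labels = kmeans_1d(spend, k=3)
print(centroids)
print(labels)
```

The algorithm is never told which customer belongs to which group; it discovers the three segments from the data alone.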
8.Would you frame the problem of spam detection as a supervised learning problem or an unsupervised learning problem?
Supervised learning: the algorithm is fed many example emails along with their labels (spam or not spam).
9.What is an online learning system?
An online learning system learns incrementally from data that arrives as a continuous flow, so it can adapt rapidly and autonomously to changing data.
10.What is out-of-core learning?
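Out-of-core learning uses online learning algorithms to train systems on huge datasets that cannot fit in one machine's main memory: the algorithm loads part of the data, runs a training step on it, and repeats until it has run on all of the data.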
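The idea of learning chunk by chunk can be sketched in plain Python. This is only an illustration under stated assumptions: stream_chunks is a hypothetical stand-in for reading a huge file piece by piece, partial_fit is a hand-written stochastic gradient descent step (not a library call), and the data is a synthetic noise-free line y = 3x + 1.

```python
def stream_chunks(n_chunks, chunk_size):
    """Yield small chunks of (x, y) pairs, one at a time (hypothetical data source)."""
    for c in range(n_chunks):
        start = c * chunk_size
        xs = [((start + i) % 100) / 100 for i in range(chunk_size)]
        yield [(x, 3.0 * x + 1.0) for x in xs]  # synthetic data: y = 3x + 1


def partial_fit(w, b, chunk, lr=0.1):
    """Update the linear model y = w*x + b using only the current chunk."""
    for x, y in chunk:
        err = (w * x + b) - y
        w -= lr * err * x  # gradient of 0.5 * err^2 with respect to w
        b -= lr * err      # gradient of 0.5 * err^2 with respect to b
    return w, b


w, b = 0.0, 0.0
# Only one chunk is held in memory at any time.
for chunk in stream_chunks(n_chunks=200, chunk_size=50):
    w, b = partial_fit(w, b, chunk)

print(round(w, 3), round(b, 3))
```

The full dataset is never materialized: each chunk is used for a training step and then discarded.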
11.What type of learning algorithm relies on a similarity measure to make predictions?
Instance-based learning.
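A k-nearest-neighbors classifier is one common example of instance-based learning. The sketch below assumes Euclidean distance as the similarity measure and uses hypothetical two-feature examples labeled "ham" or "spam"; the function names are illustrative only.

```python
from collections import Counter


def euclidean(a, b):
    """Distance between two feature tuples (smaller means more similar)."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5


def knn_predict(train, query, k=3):
    """Predict the majority label among the k stored instances closest to query."""
    neighbors = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical training instances: (features, label). The "model" is just this list.
train = [
    ((1.0, 1.0), "ham"), ((1.2, 0.8), "ham"), ((0.9, 1.1), "ham"),
    ((5.0, 5.2), "spam"), ((5.1, 4.9), "spam"), ((4.8, 5.0), "spam"),
]

print(knn_predict(train, (1.1, 0.9)))
print(knn_predict(train, (5.0, 5.0)))
```

There is no training step: the system memorizes the examples and generalizes to new cases by comparing them with the stored ones.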
12.What is the difference between a model parameter and a learning algorithm's hyperparameter?
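A model parameter belongs to the model and is learned from the training data. A hyperparameter is a parameter of the learning algorithm, not of the model: it is set before training and stays constant during training.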
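To make the distinction concrete, here is a small sketch using one-feature ridge regression without an intercept. The data and the function name fit_ridge_slope are hypothetical. The slope w is a model parameter (computed from the data), while alpha is a hyperparameter (chosen by the user before training).

```python
def fit_ridge_slope(xs, ys, alpha):
    """Fit y = w * x by minimizing sum((y - w*x)^2) + alpha * w^2.

    The closed-form minimizer is w = sum(x*y) / (sum(x*x) + alpha).
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + alpha)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

# Same data, same algorithm, different hyperparameter values.
w_plain = fit_ridge_slope(xs, ys, alpha=0.0)    # no regularization
w_strong = fit_ridge_slope(xs, ys, alpha=30.0)  # strong regularization shrinks w
print(w_plain, w_strong)
```

Changing alpha changes which value of w the algorithm ends up with, but alpha itself is never adjusted by the training step.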
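13.What do model-based learning algorithms search for? What is the most common strategy they use to succeed? How do they make predictions?
They search for the values of the model parameters that let the model generalize well to new instances. The most common strategy is to minimize a cost function that measures how badly the model predicts the training data (plus a penalty for model complexity if the model is regularized). To make predictions, they feed the new instance's features into the model's prediction function, using the parameter values found during training.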
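As a rough illustration of model-based learning, the sketch below fits a straight line to a few hypothetical points by minimizing the mean squared error, then uses the fitted parameters to predict a new value. The function names and the data are assumptions made for this example; the closed-form least-squares solution stands in for whatever optimizer a real library would use.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) minimizing the mean squared error of y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(theta, x):
    """Prediction function of the model, using the learned parameters."""
    intercept, slope = theta
    return intercept + slope * x


def mse(theta, xs, ys):
    """Cost function: how badly the model with parameters theta fits the data."""
    return sum((predict(theta, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

theta = fit_line(xs, ys)          # training: search for the best parameters
print(theta, mse(theta, xs, ys))  # learned parameters and their training cost
print(predict(theta, 6.0))        # inference on a new instance
```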
14.Can you name four of the main challenges in Machine Learning?
(1)Insufficient Quantity of Training Data;
(2)Nonrepresentative Training Data;
(3)Poor-Quality Data;
(4)Irrelevant Features.
15.If your model performs great on the training data but generalizes poorly to new instances, what is happening? Can you name three possible solutions?
The model is overfitting the training data.
Solutions:
(1)Simplify the model by selecting one with fewer parameters, by reducing the number of attributes in the training data, or by constraining (regularizing) the model;
(2)Gather more training data;
(3)Reduce the noise in the training data (e.g., fix data errors and remove outliers).
16.What is a test set and why would you want to use it?
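A test set is a part of the data held out from training. It is used to estimate the generalization error the model will make on new instances before the model is put into production.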
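Holding out a test set can be sketched as follows. This is a minimal example under stated assumptions: the 80/20 split ratio, the seed, the synthetic data, and the trivial "predict the training mean" baseline model are all illustrative choices, not something prescribed by the exercise.

```python
import random


def train_test_split(data, test_ratio=0.2, seed=42):
    """Shuffle a copy of the data and hold out a fraction of it as the test set."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def mse(pairs, predict):
    """Mean squared error of a prediction function on (x, y) pairs."""
    return sum((predict(x) - y) ** 2 for x, y in pairs) / len(pairs)


# Synthetic (x, y) data for illustration only.
data = [(float(x), 2.0 * x + 1.0) for x in range(10)]
train_set, test_set = train_test_split(data)

# Trivial baseline model fitted on the training set only.
mean_y = sum(y for _, y in train_set) / len(train_set)


def baseline(x):
    return mean_y


# The error on the held-out test set estimates the error on new instances.
test_error = mse(test_set, baseline)
print(len(train_set), len(test_set), test_error)
```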
17.What is the purpose of a validation set?
A validation set is used to compare models and tune hyperparameters: you select the model and hyperparameters that perform best on the validation set, which keeps the test set untouched for a final estimate of the generalization error.
18.What can go wrong if you tune hyperparameters using the test set?
The problem is that you measured the generalization error multiple times on
the test set, and you adapted the model and hyperparameters to produce the
best model for that set. This means that the model is unlikely to perform as
well on new data.
19.What is cross-validation and why would you prefer it to a validation set?
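Cross-validation splits the training set into complementary subsets; each model is trained on a different combination of these subsets and validated on the remaining part.
It is preferred because it avoids "wasting" too much training data in a separate validation set. Once the model type and hyperparameters have been selected, a final model is trained on the full training set and its generalization error is measured on the test set.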
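A k-fold version of cross-validation can be sketched in plain Python. The helper names, the tiny dataset, and the "predict the training mean" model are hypothetical and only serve to show how every instance is used for both training and validation; the folds are taken in order without shuffling to keep the example deterministic.

```python
def k_fold_indices(n, k):
    """Yield (train_indices, val_indices) for k consecutive folds over n instances."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = [i for i in range(n) if i < start or i >= start + size]
        yield train_idx, val_idx
        start += size


def cross_val_mse(ys, k):
    """Validation MSE per fold for a model that predicts the training mean."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(ys), k):
        mean_y = sum(ys[i] for i in train_idx) / len(train_idx)
        fold_mse = sum((ys[i] - mean_y) ** 2 for i in val_idx) / len(val_idx)
        scores.append(fold_mse)
    return scores


ys = [3.0, 5.0, 4.0, 6.0, 5.0, 7.0]  # hypothetical target values
scores = cross_val_mse(ys, k=3)
print(scores, sum(scores) / len(scores))
```

Averaging the per-fold scores gives a performance estimate without permanently setting aside part of the training data.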