Lessons by Andrew Ng
Definition of machine learning
Tom Mitchell
“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”
Arthur Samuel
“the field of study that gives computers the ability to learn without being explicitly programmed”
Supervised Learning
- given a data set
- already know what our correct output should look like
Regression Problems
map input variables to some continuous function
e.g., fitting a linear regression model to predict a continuous value such as a house's price
Classification Problems
map input variables into discrete categories
e.g., predicting which discrete category a data point belongs to, such as whether a tumor is malignant or benign
Unsupervised Learning
- little or no idea what our results should look like
- cluster the data based on relationships among the variables in the data
- no feedback based on the prediction results
i.e., clustering data whose correct groupings are not known in advance (see the sketch below)
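A minimal clustering sketch, assuming scikit-learn and a tiny hypothetical set of unlabeled 2-D points (neither is part of the original notes):

```python
# Unsupervised learning: k-means groups the points using only the inputs,
# with no labels y and no feedback on the "correct" answer.
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1.0, 1.1], [0.9, 1.0], [5.0, 5.2], [5.1, 4.9]])  # unlabeled data
labels = KMeans(n_clusters=2, n_init=10).fit_predict(X)
print(labels)  # cluster assignments discovered from the data alone, e.g. [0 0 1 1]
```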
Model Representation
$x^{(i)}$ – “input” variables
$y^{(i)}$ – “output” or target variables
$(x^{(i)}, y^{(i)})$ – a training example; the collection of $m$ such examples forms the training set
X – space of input values
Y – space of output values
$h(x)$ – hypothesis, a function from X to Y
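For univariate linear regression the hypothesis takes the form $h_\theta(x) = \theta_0 + \theta_1 x$; a minimal sketch of this notation in code, with a hypothetical training set:

```python
# Hypothesis h_theta: maps an input x (from X) to a predicted output (in Y).
def h(theta0, theta1, x):
    return theta0 + theta1 * x

# Training set: a list of training examples (x_i, y_i) (hypothetical values).
training_set = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]
prediction = h(0.0, 2.0, training_set[0][0])  # predicted y for the first x
```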
Cost Function
also called the "Squared error function" or "Mean squared error"
$J(\theta_0, \theta_1) = \dfrac {1}{2m} \displaystyle \sum_{i=1}^m \left ( h_\theta (x^{(i)}) - y^{(i)} \right)^2$
The mean is halved $\left(\frac{1}{2}\right)$ as a convenience for computing gradient descent: differentiating the squared term produces a factor of $2$ that cancels the $\frac{1}{2}$.
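As a quick check of that cancellation (a standard calculation, not spelled out in the original notes), differentiating $J$ with respect to either parameter gives

$\dfrac{\partial}{\partial \theta_j} J(\theta_0, \theta_1) = \dfrac{1}{2m} \displaystyle \sum_{i=1}^m 2\left( h_\theta (x^{(i)}) - y^{(i)} \right) \dfrac{\partial h_\theta (x^{(i)})}{\partial \theta_j} = \dfrac{1}{m} \displaystyle \sum_{i=1}^m \left( h_\theta (x^{(i)}) - y^{(i)} \right) \dfrac{\partial h_\theta (x^{(i)})}{\partial \theta_j}$

where the $2$ from the square cancels the $\frac{1}{2}$.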
The goal is to choose $\theta_0, \theta_1$ so that $J(\theta_0, \theta_1)$ is as small as possible.
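A minimal sketch of computing this cost, assuming the univariate hypothesis $h_\theta(x) = \theta_0 + \theta_1 x$ and the same kind of hypothetical training set as above:

```python
# Squared error cost J(theta0, theta1), summed over m examples and halved.
def compute_cost(theta0, theta1, training_set):
    m = len(training_set)
    squared_errors = sum((theta0 + theta1 * x - y) ** 2 for x, y in training_set)
    return squared_errors / (2 * m)

training_set = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]  # hypothetical (x, y) pairs
print(compute_cost(0.0, 2.0, training_set))  # lower values indicate a better fit
```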