Lessons by Andrew Ng

Definition of machine learning

Tom Mitchell

“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”

Arthur Samuel

“the field of study that gives computers the ability to learn without being explicitly programmed”

Supervised Learning

  • given a data set
  • already know what our correct output should look like

Regression Problems

map input variables to some continuous function

e.g., linear regression: fit a model to predict continuous values

Classification Problems

map input variables into discrete categories

e.g., classify discrete data points and predict their categories

Unsupervised Learning

  • little or no idea what our results should look like
  • cluster the data based on relationships among the variables in the data
  • no feedback based on the prediction results

cluster the data into groups when the outcomes are unknown
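A minimal clustering sketch, assuming scikit-learn is available and using k-means as one representative algorithm (the data and parameter choices are illustrative, not from the course):

```python
import numpy as np
from sklearn.cluster import KMeans

# Two loose groups of 2-D points; no labels are provided.
X = np.array([
    [1.0, 1.1], [0.9, 1.0], [1.2, 0.8],   # group near (1, 1)
    [8.0, 8.2], [7.9, 8.1], [8.3, 7.8],   # group near (8, 8)
])

# Ask k-means to find 2 clusters purely from the structure of the data.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(km.labels_)           # cluster assignment for each point
print(km.cluster_centers_)  # learned cluster centers
```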

Model Representation

$x^{(i)}$ – “input” variables

$y^{(i)}$ – “output” or target variables

$(x^{(i)}, y^{(i)})$ – a training example; the list of $m$ training examples forms the training set

X – space of input values

Y – space of output values

$h(x)$ – hypothesis, a function that maps from X to Y
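The cost function below is written for univariate linear regression, where the hypothesis is a straight line parameterized by $\theta_0$ and $\theta_1$:

$h_\theta(x) = \theta_0 + \theta_1 x$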

Cost Function

also called the “squared error function” or “mean squared error”

$J(\theta_0, \theta_1) = \dfrac{1}{2m} \displaystyle \sum_{i=1}^m \left( h_\theta\left(x^{(i)}\right) - y^{(i)} \right)^2$

The mean is halved $\left(\frac{1}{2}\right)$ as a convenience for the computation of gradient descent, since the derivative of the squared term cancels out the $\frac{1}{2}$.

We want the value of $J(\theta_0, \theta_1)$ to be as small as possible.
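A minimal Python sketch of the cost computation, assuming the univariate hypothesis $h_\theta(x) = \theta_0 + \theta_1 x$ and NumPy arrays for the training set (the data and names below are illustrative):

```python
import numpy as np

def hypothesis(x, theta0, theta1):
    """Univariate linear hypothesis h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x

def cost(x, y, theta0, theta1):
    """Squared error cost J(theta0, theta1) over m training examples."""
    m = len(x)
    errors = hypothesis(x, theta0, theta1) - y
    return np.sum(errors ** 2) / (2 * m)

# Toy training set: y is roughly 2x, so theta0 = 0, theta1 = 2 should give a small cost.
x_train = np.array([1.0, 2.0, 3.0, 4.0])
y_train = np.array([2.1, 3.9, 6.2, 8.0])

print(cost(x_train, y_train, 0.0, 2.0))  # small value, near the minimum
print(cost(x_train, y_train, 0.0, 0.0))  # larger value, far from the minimum
```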