Lessons by Andrew Ng

Definition of machine learning

Tom Mitchell

“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”

Arthur Samuel

“the field of study that gives computers the ability to learn without being explicitly programmed”

Supervised Learning

  • given a data set
  • already know what our correct output should look like

Regression Problems

map input variables to some continuous function

e.g., linear regression: fit a model to predict continuous values

Classification Problems

map input variables into discrete categories

e.g., classify discrete data points and predict their categories

Unsupervised Learning

  • little or no idea what our results should look like
  • cluster the data based on relationships among the variables in the data
  • no feedback based on the prediction results

cluster the data into groups when the outcomes are unknown
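A minimal clustering sketch, assuming scikit-learn is available and using k-means as one representative algorithm (the data and parameter choices are illustrative, not from the course):

```python
import numpy as np
from sklearn.cluster import KMeans

# Two loose groups of 2-D points; no labels are provided.
X = np.array([
    [1.0, 1.1], [0.9, 1.0], [1.2, 0.8],   # group near (1, 1)
    [8.0, 8.2], [7.9, 8.1], [8.3, 7.8],   # group near (8, 8)
])

# Ask k-means to find 2 clusters purely from the structure of the data.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(km.labels_)           # cluster assignment for each point
print(km.cluster_centers_)  # learned cluster centers
```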

Model Representation

$x^{(i)}$ – “input” variables

$y^{(i)}$ – “output” or target variables

$(x^{(i)}, y^{(i)})$ – a training example; the list of $m$ training examples forms the training set

X – space of input values

Y – space of output values

$h(x)$ – hypothesis, a function that maps from X to Y
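The cost function below is written for univariate linear regression, where the hypothesis is a straight line parameterized by $\theta_0$ and $\theta_1$:

$h_\theta(x) = \theta_0 + \theta_1 x$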

Cost Function

also called the “squared error function” or “mean squared error”

$J(\theta_0, \theta_1) = \dfrac{1}{2m} \displaystyle \sum_{i=1}^m \left( h_\theta\left(x^{(i)}\right) - y^{(i)} \right)^2$

The mean is halved $\left(\frac{1}{2}\right)$ as a convenience for the computation of gradient descent, since the derivative of the squared term cancels out the $\frac{1}{2}$.

We want the value of $J(\theta_0, \theta_1)$ to be as small as possible.
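A minimal Python sketch of the cost computation, assuming the univariate hypothesis $h_\theta(x) = \theta_0 + \theta_1 x$ and NumPy arrays for the training set (the data and names below are illustrative):

```python
import numpy as np

def hypothesis(x, theta0, theta1):
    """Univariate linear hypothesis h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x

def cost(x, y, theta0, theta1):
    """Squared error cost J(theta0, theta1) over m training examples."""
    m = len(x)
    errors = hypothesis(x, theta0, theta1) - y
    return np.sum(errors ** 2) / (2 * m)

# Toy training set: y is roughly 2x, so theta0 = 0, theta1 = 2 should give a small cost.
x_train = np.array([1.0, 2.0, 3.0, 4.0])
y_train = np.array([2.1, 3.9, 6.2, 8.0])

print(cost(x_train, y_train, 0.0, 2.0))  # small value, near the minimum
print(cost(x_train, y_train, 0.0, 0.0))  # larger value, far from the minimum
```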