Machine Learning Summary Section Two – zdf's blog

Machine Learning Summary Section Two

Mar 19, 2016

By Zhao Dongfang

In Machine learning

Linear Regression with Multiple Variables

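Hypothesis: $h_{\theta}(x)=\theta_{0} + \theta_{1}x_{1} + \theta_{2}x_{2} + \dots + \theta_{n}x_{n}$
Defining $x_{0} = 1$ gives $h_{\theta}(x)=\theta_{0}x_{0} + \theta_{1}x_{1} + \theta_{2}x_{2} + \dots + \theta_{n}x_{n}$
In vector form, $h_{\theta}(x)=\theta^{T}x$, where $\theta$ and $x$ are both $(n+1)$-dimensional vectors.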
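
The hypothesis is just a dot product once $x_{0} = 1$ is prepended to the features. A minimal sketch in plain Python; the function name and the sample numbers are made up for illustration:

```python
def hypothesis(theta, features):
    """Return h_theta(x) = theta^T x, prepending the bias feature x_0 = 1."""
    x = [1.0] + list(features)  # x_0 = 1
    if len(theta) != len(x):
        raise ValueError("theta needs exactly one more entry than features")
    return sum(t * xi for t, xi in zip(theta, x))


# Hypothetical parameters and input, chosen only for illustration.
theta = [1.0, 2.0, 3.0]
prediction = hypothesis(theta, [4.0, 5.0])  # 1*1 + 2*4 + 3*5
print(prediction)
```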

Feature Scaling

Make sure the features are on a similar scale, so that gradient descent converges faster.
Get every feature into approximately the range $-1\leq x_{i} \leq +1$.
Mean normalization: $ x_{i} := \frac{x_{i} - \mu_{i}}{s_{i}} $, where $ \mu_{i} $ is the average of feature $i$ over the training set and $ s_{i} $ is either its range ($ max - min $) or its standard deviation.
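
Mean normalization can be sketched for a single feature column as follows. The helper name, the `use_std` flag and the sample values are assumptions for illustration; the population standard deviation is used when `use_std` is set.

```python
def mean_normalize(column, use_std=False):
    """Scale one feature column to (x - mu) / s and return (scaled, mu, s)."""
    m = len(column)
    mu = sum(column) / m
    if use_std:
        # Population standard deviation of the column.
        s = (sum((v - mu) ** 2 for v in column) / m) ** 0.5
    else:
        s = max(column) - min(column)  # range: max - min
    if s == 0:
        raise ValueError("a constant feature cannot be scaled")
    return [(v - mu) / s for v in column], mu, s


# Hypothetical feature values, e.g. house sizes.
sizes = [1000.0, 2000.0, 3000.0]
scaled, mu, s = mean_normalize(sizes)
print(scaled, mu, s)
```

The same $ \mu_{i} $ and $ s_{i} $ computed on the training set should be reused when scaling new inputs at prediction time.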

Learning Rate

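Make sure gradient descent is working correctly: $ J(\theta) $ should decrease after every iteration.
If the curve of $ J(\theta) $ against the number of iterations rises or oscillates, lower the learning rate. The learning rate should be moderate: if it is too small, gradient descent converges too slowly.
To choose $ \alpha $, try values spaced roughly 3x apart: … 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, 3 …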
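
The check on $ J(\theta) $ can be automated by recording the cost after every update. The sketch below assumes the usual squared-error cost $ J(\theta) = \frac{1}{2m}\sum (h_\theta(x^{(i)}) - y^{(i)})^2 $, which this section does not restate; the function names and the toy data are made up for illustration.

```python
def predict(theta, row):
    """h_theta(x) = theta^T x for one row that already includes x_0 = 1."""
    return sum(t * xi for t, xi in zip(theta, row))


def cost(theta, X, y):
    """Squared-error cost J(theta) = 1/(2m) * sum((h - y)^2) (assumed form)."""
    m = len(y)
    return sum((predict(theta, row) - yi) ** 2 for row, yi in zip(X, y)) / (2 * m)


def gradient_descent(X, y, alpha, iterations):
    """Batch gradient descent from theta = 0; returns theta and the J history."""
    m, n = len(y), len(X[0])
    theta = [0.0] * n
    history = [cost(theta, X, y)]
    for _ in range(iterations):
        errors = [predict(theta, row) - yi for row, yi in zip(X, y)]
        gradient = [sum(e * row[j] for e, row in zip(errors, X)) / m
                    for j in range(n)]
        # All theta_j are updated simultaneously from the same gradient.
        theta = [t - alpha * g for t, g in zip(theta, gradient)]
        history.append(cost(theta, X, y))
    return theta, history


def is_decreasing(history):
    """True if J(theta) never went up from one iteration to the next."""
    return all(later <= earlier for earlier, later in zip(history, history[1:]))


# Hypothetical toy data generated from y = 1 + 2x; the first column is x_0 = 1.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [1.0, 3.0, 5.0]
for alpha in (0.01, 0.1, 3.0):
    _, history = gradient_descent(X, y, alpha, 20)
    verdict = "J decreases" if is_decreasing(history) else "J goes up, lower alpha"
    print(alpha, verdict)
```

On this toy data the two smaller rates keep the cost falling, while the largest one makes it grow on the very first step.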

Features and Polynomial Regression

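Linear regression on the raw features cannot fit every data set, so we need other models to fit the data, for example polynomial terms of a feature such as $x^{2}$ and $x^{3}$ used as additional features.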
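
One common way to fit curved data while reusing the linear regression machinery is to treat powers of a feature as new features, so the model stays linear in $\theta$. A small sketch; the helper name and the input value are assumptions for illustration:

```python
def polynomial_features(x, degree):
    """Map a scalar x to the feature list [x, x^2, ..., x^degree]."""
    if degree < 1:
        raise ValueError("degree must be at least 1")
    return [x ** d for d in range(1, degree + 1)]


# Hypothetical input value; the cubic model is then
# h(x) = theta_0 + theta_1*x + theta_2*x^2 + theta_3*x^3.
features = polynomial_features(3.0, 3)
print(features)
```

Because the powers of $x$ differ in magnitude very quickly, feature scaling matters even more when features are built this way.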

Normal Equation

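For some linear regression problems, we can solve for $\theta$ directly with the normal equation, without iterating. The cost is $O(n^{3})$, where $n$ is the number of features.
The training set features form the matrix $X$ (with $x_{0} = 1$ in every row), and the training set results form the vector $y$.
The normal equation gives $\theta = (X^{T}X)^{-1}X^{T}y$
We want $X\theta = y$, but $X$ is generally not square. Multiplying both sides by $X^{T}$ produces the square matrix $X^{T}X$, and only a square matrix can have an inverse.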
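
As a rough sketch, the normal equation can be solved in plain Python by forming $X^{T}X$ and $X^{T}y$ and running Gaussian elimination on the system $(X^{T}X)\theta = X^{T}y$, instead of computing the inverse explicitly. The function name, the singularity threshold and the toy data are assumptions for illustration.

```python
def normal_equation(X, y):
    """Solve (X^T X) theta = X^T y by Gaussian elimination with partial pivoting."""
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y], one row per parameter.
    A = [[sum(row[i] * row[j] for row in X) for j in range(n)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(n)]
    # Forward elimination.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:  # illustrative threshold
            raise ValueError("X^T X is singular (non-invertible)")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution.
    theta = [0.0] * n
    for i in reversed(range(n)):
        known = sum(A[i][j] * theta[j] for j in range(i + 1, n))
        theta[i] = (A[i][n] - known) / A[i][i]
    return theta


# Hypothetical toy data generated from y = 1 + 2x; the first column is x_0 = 1.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [1.0, 3.0, 5.0]
theta = normal_equation(X, y)
print(theta)  # close to [1.0, 2.0]
```

The elimination step is where the cubic cost in the number of features comes from.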

Octave or Matlab

Basic operations
Moving data around
Computing on data
Plotting data
Control statements and functions
Vectorization

Logistic Regression

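Hypothesis of the logistic regression model: $h_{\theta}(x) = g(\theta^{T}x)$
Sigmoid function: $ g(z) = \frac{1}{1 + e^{-z}} $
Cost function of logistic regression: $J(\theta) = \frac{1}{m} \sum_{i = 1}^{m}Cost(h_\theta(x^{(i)}),y^{(i)})$
In simplified form: $Cost(h_\theta(x),y) = -y \log(h_\theta(x)) - (1-y) \log(1 - h_\theta(x))$
fminunc: an Octave/MATLAB function for unconstrained minimization, which can be used to minimize $J(\theta)$
One-vs-all: train one classifier $h_\theta^{(i)}(x)$ per class $i$ and take the class with $\max_{i} h_\theta^{(i)}(x)$ as the classification result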
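
The sigmoid hypothesis, the cost function and the one-vs-all decision rule can be sketched together in plain Python. The function names, the sample data and the per-class parameter vectors are made up for illustration, and the minimization step itself (fminunc or gradient descent) is left out.

```python
import math


def sigmoid(z):
    """g(z) = 1 / (1 + e^{-z}), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x); x already includes x_0 = 1."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))


def logistic_cost(theta, X, y):
    """J(theta) = (1/m) * sum(-y*log(h) - (1-y)*log(1-h)).

    There is no guard for h being exactly 0 or 1, where log would fail.
    """
    m = len(y)
    total = 0.0
    for row, yi in zip(X, y):
        h = hypothesis(theta, row)
        total += -yi * math.log(h) - (1 - yi) * math.log(1 - h)
    return total / m


def predict_one_vs_all(thetas, x):
    """Return the label whose classifier gives the largest h_theta(x)."""
    scores = {label: hypothesis(theta, x) for label, theta in thetas.items()}
    return max(scores, key=scores.get)


# Hypothetical binary data; with theta = 0 every h is 0.5, so J = log(2).
X = [[1.0, 0.5], [1.0, 2.0]]
y = [0, 1]
baseline = logistic_cost([0.0, 0.0], X, y)

# Hypothetical, already-trained parameters for three classes.
thetas = {"a": [-1.0, 2.0, 0.0], "b": [-1.0, 0.0, 2.0], "c": [1.0, -2.0, -2.0]}
label = predict_one_vs_all(thetas, [1.0, 0.2, 1.5])
print(baseline, label)
```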

Regularization

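Overfitting corresponds to high variance.
Ways to address overfitting:
- Reduce the number of features (manually select which features to keep, or use a model selection algorithm)
- Regularization
Regularization can be applied to both linear regression and logistic regression; its strength is set by the regularization parameter $\lambda$.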
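
The notes do not spell out the regularized cost, so the sketch below assumes the commonly used form for linear regression, $J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}(h_\theta(x^{(i)}) - y^{(i)})^2 + \lambda\sum_{j=1}^{n}\theta_j^2\right]$, in which $\theta_0$ is left unpenalized by convention. The function name and the sample numbers are made up for illustration.

```python
def regularized_cost(theta, X, y, lam):
    """Squared-error cost plus an L2 penalty on theta_1..theta_n (assumed form)."""
    m = len(y)
    fit = sum((sum(t * xi for t, xi in zip(theta, row)) - yi) ** 2
              for row, yi in zip(X, y))
    penalty = lam * sum(t ** 2 for t in theta[1:])  # theta_0 is not penalized
    return (fit + penalty) / (2 * m)


# Hypothetical data that theta = [0, 1] fits exactly; the first column is x_0 = 1.
X = [[1.0, 1.0], [1.0, 2.0]]
y = [1.0, 2.0]
print(regularized_cost([0.0, 1.0], X, y, 0.0))  # fit term only
print(regularized_cost([0.0, 1.0], X, y, 4.0))  # same fit, plus the penalty
```

A larger $\lambda$ pushes the parameters toward zero, which counteracts overfitting, while a value that is too large can lead to underfitting.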