Principles of Sparse PCA
Note that there are many different formulations of the sparse PCA problem. The one implemented here is based on Mrl09. The optimization problem solved is a PCA problem with an $\ell_1$ penalty on the components:
$$\begin{split}(W_{n \times k}^*, H_{k \times m}^*) = \underset{W_{n \times k},\, H_{k \times m}}{\operatorname{arg\,min\,}} & \frac{1}{2}
\|X_{n \times m} - W_{n \times k} H_{k \times m}\|_{\text{Fro}}^2 + \alpha \|H_{k \times m}\|_1 \\
\text{subject to } & \|W_i\|_2 = 1 \text{ for all } 0 \leq i < k, \quad k \leq m\end{split}$$
From this formulation, the $\ell_1$ penalty on $H$ drives the transformed components $H$ toward sparsity, and since $k \leq m$, sparse PCA can also perform dimensionality reduction. (The book 机器学习算法 mistakenly claims that sparse PCA cannot reduce dimensionality.) Relation to PCA: with the penalty term removed, the problem reduces to ordinary PCA.
Note that the matrix Frobenius norm is not the same as the matrix 2-norm; see the separate blog post on norms for details.
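As a concrete illustration of the $W$/$H$ factorization above, here is a minimal sketch using scikit-learn's SparsePCA (the sizes $n = 100$, $m = 20$, $k = 5$ and the random data are made up for the example; fit_transform returns $W$ and components_ holds $H$):

```python
import numpy as np
from sklearn.decomposition import SparsePCA

rng = np.random.RandomState(0)
X = rng.randn(100, 20)                 # X: n=100 samples, m=20 features (synthetic)

spca = SparsePCA(n_components=5, alpha=1.0, random_state=0)
W = spca.fit_transform(X)              # W: (100, 5), the reduced-dimension codes
H = spca.components_                   # H: (5, 20), the sparse components

print(W.shape, H.shape)                # (100, 5) (5, 20)
print(np.mean(H == 0))                 # fraction of exactly-zero entries in H
```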
The sparsity-inducing $\ell_1$ norm also prevents learning components from noise when few training samples are available. The degree of penalization (and thus sparsity) can be adjusted through the hyperparameter `alpha`. Small values lead to a gently regularized factorization, while larger values shrink many coefficients to zero.
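As a rough sketch of this effect (the alpha values and random data here are arbitrary; the exact zero fractions depend on the data):

```python
import numpy as np
from sklearn.decomposition import SparsePCA

rng = np.random.RandomState(0)
X = rng.randn(200, 30)

# Small alpha -> gentle regularization; large alpha -> many coefficients exactly zero.
for alpha in (0.1, 5.0):
    H = SparsePCA(n_components=5, alpha=alpha, random_state=0).fit(X).components_
    print(f"alpha={alpha}: {np.mean(H == 0):.0%} of the coefficients are zero")
```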
After the sparse PCA transformation (a quick numerical check follows the list):
- the columns of the transformed matrix $W$ are linearly independent;
- the dimensionality is reduced ($k \leq m$);
- the components $H$ are sparse.
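An illustrative check of these three properties on synthetic data (not a proof; the data and sizes are made up):

```python
import numpy as np
from sklearn.decomposition import SparsePCA

rng = np.random.RandomState(0)
X = rng.randn(100, 20)

spca = SparsePCA(n_components=5, alpha=1.0, random_state=0)
W = spca.fit_transform(X)

print(np.linalg.matrix_rank(W) == W.shape[1])   # W has full column rank
print(W.shape[1] < X.shape[1])                  # k=5 < m=20: dimension reduced
print(np.mean(spca.components_ == 0) > 0)       # H contains exactly-zero entries
```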
These notes are based on the scikit-learn documentation.