Suppose we are now given the individual machines, whether obtained from boosting or bagging, and ask for the best way to combine them, rather than using averaging (for bagging) or the weighted median (for boosting). Breiman (1996a) suggests the following stacking technique. Suppose that once again pattern $i$ has an observed value $y_i$ on the training set, and that machine $k$ has a predicted value $y_i^{(k)}$ for pattern $i$ on the training set, where there are $K$ machines in total. Then find the $\gamma_k$ that minimize:
$$W = \sum_{i=1}^{N_1} \left(y_i-\sum_{k=1}^{K}\gamma_k y_i^{(k)}\right)^2 \quad \text{subject to } \gamma_k \ge 0$$
This is a constrained quadratic optimization problem, for which a standard quadratic programming package is recommended. Apart from the constant term $\sum_{i=1}^{N_1} y_i^2$, which does not depend on $\gamma$, the above is equivalent to minimizing (with respect to $\gamma$, subject to the same constraint $\gamma_k \ge 0$):
$$W = C^{t} \gamma + \frac{1}{2} \gamma^{t} H \gamma$$
where $C$ is a vector and $H$ is the Hessian matrix, with elements:
$$c_k = -2 \sum_{i=1}^{N_1} y_i y_i^{(k)} \quad k=1,\ldots,K$$
$$h_{jk} = 2\sum_{i=1}^{N_1} y_i^{(j)} y_i^{(k)} \quad j,k=1,\ldots,K \text{ and } h_{jk}=h_{kj}$$
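To make the quadratic form concrete, here is a minimal sketch in plain Python that builds $C$ and $H$ from the definitions above and solves for non-negative $\gamma$. The function names (`build_qp`, `solve_nonneg_qp`, `squared_error`) and the toy data are hypothetical, and the cyclic coordinate descent solver is only a stand-in chosen to keep the example dependency-free; it is not the quadratic programming package the text recommends.

```python
def build_qp(y, preds):
    """Return (C, H) for W = C^t gamma + 0.5 * gamma^t H gamma.

    y: list of N_1 observed values.
    preds: list of K lists; preds[k][i] is machine k's prediction for pattern i.
    """
    K = len(preds)
    C = [-2.0 * sum(yi * p for yi, p in zip(y, preds[k])) for k in range(K)]
    H = [[2.0 * sum(a * b for a, b in zip(preds[j], preds[k])) for k in range(K)]
         for j in range(K)]
    return C, H


def solve_nonneg_qp(C, H, sweeps=20000, tol=1e-10):
    """Minimise C^t g + 0.5 * g^t H g subject to g >= 0 (cyclic coordinate descent)."""
    K = len(C)
    g = [0.0] * K
    for _ in range(sweeps):
        biggest_move = 0.0
        for k in range(K):
            if H[k][k] <= 0.0:
                continue  # machine k predicts all zeros; leave its weight at 0
            grad_k = C[k] + sum(H[k][j] * g[j] for j in range(K))
            new = max(0.0, g[k] - grad_k / H[k][k])  # exact 1-D minimiser, clipped at 0
            biggest_move = max(biggest_move, abs(new - g[k]))
            g[k] = new
        if biggest_move < tol:
            break
    return g


def squared_error(y, preds, g):
    """The original objective: sum_i (y_i - sum_k g_k * y_i^(k))^2."""
    return sum((yi - sum(g[k] * preds[k][i] for k in range(len(g)))) ** 2
               for i, yi in enumerate(y))


# Toy data (illustrative only): two reasonable machines and one that predicts -y.
y = [1.0, 2.0, 3.0, 4.0]
preds = [
    [1.1, 1.9, 3.2, 3.8],
    [0.8, 2.3, 2.9, 4.1],
    [-1.0, -2.0, -3.0, -4.0],
]

C, H = build_qp(y, preds)
gamma = solve_nonneg_qp(C, H)
sse_stacked = squared_error(y, preds, gamma)
sse_average = squared_error(y, preds, [0.5, 0.5, 0.0])  # plain average of the two good machines

print("gamma:", gamma)
print("stacked SSE:", sse_stacked, "averaged SSE:", sse_average)
```

In this toy setup the third machine would receive a negative weight without the constraint; with $\gamma_k \ge 0$ enforced, its weight is driven to zero. Because plain averaging of the first two machines is itself a feasible choice of $\gamma$, the stacked training error cannot exceed the averaged one.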
Tables V, VI, and VII show the results of stacking with the best loss functions (for boosting, taken from Tables I-III). Note that if $\gamma_k=0$, machine $k$ is not needed. The last column of these tables indicates the number of trees in the stacked or unstacked implementation. The results are mixed. On Friedman #1, both bagging and boosting improve over their unstacked results at a 5% significance level, but boosting is still better than bagging. For Friedman #2, stacking makes both boosting and bagging worse, while for Friedman #3, stacking makes bagging better but boosting is still better than stacking.
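The remark that a machine with $\gamma_k=0$ is not needed can be illustrated with a short sketch. The helper `prune_and_predict`, the stand-in machines (plain functions rather than trained trees), and the weights are all hypothetical and are not taken from the tables; the sketch also assumes that the stacked tree count is simply the number of machines with non-zero weight.

```python
def prune_and_predict(gamma, machines, x):
    """Drop machines whose weight is zero and return (kept indices, stacked prediction at x)."""
    kept = [k for k, g in enumerate(gamma) if g > 0.0]
    prediction = sum(gamma[k] * machines[k](x) for k in kept)
    return kept, prediction


# Hypothetical stand-ins for trained trees: simple functions of a scalar input.
machines = [lambda x: 2.0 * x, lambda x: x + 1.0, lambda x: -x]
gamma = [0.75, 0.25, 0.0]  # illustrative stacking weights

kept, y_hat = prune_and_predict(gamma, machines, 2.0)
print("machines kept:", kept)            # indices with non-zero weight
print("number of machines:", len(kept))  # size of the stacked ensemble
print("stacked prediction:", y_hat)
```

Under that assumption, a stacked ensemble can end up smaller than its unstacked counterpart, since every zero-weight machine drops out of the weighted sum.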