Paper Title
Deriving Differential Target Propagation from Iterating Approximate Inverses
Paper Authors
Paper Abstract
We show that a particular form of target propagation, one that relies on learned inverses of each layer and is differential, i.e., where the target is a small perturbation of the forward-propagated activation, gives rise to an update rule corresponding to an approximate Gauss-Newton gradient-based optimization, without requiring the manipulation or inversion of large matrices. Interestingly, this is more biologically plausible than back-propagation, yet may turn out to implicitly provide a stronger optimization procedure. Extending difference target propagation, we consider several iterative computations based on local auto-encoders at each layer, in order to achieve more precise inversions and hence more accurate target propagation, and we show that these iterative procedures converge exponentially fast provided the auto-encoding function minus the identity function has a Lipschitz constant smaller than one, i.e., the auto-encoder roughly succeeds at inverting the layer. We also propose a way to normalize the changes at each layer to take into account the relative influence of each layer on the output, so that larger weight changes are made in more influential layers, as happens in ordinary back-propagation with gradient descent.
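To make the contraction claim concrete, below is a minimal numerical sketch (not from the paper): a toy layer f(x) = tanh(Wx) is paired with an imperfect analytic inverse g that stands in for a learned decoder, and the approximate inverse is refined by the fixed-point update x ← x + g(y) − g(f(x)). The toy layer, the noise model for g, and this particular update rule are illustrative assumptions; the sketch only demonstrates the stated behavior, namely exponential convergence of the inversion error when the reconstruction map g∘f minus the identity is a contraction.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8

# Toy layer: f(x) = tanh(W x), with W kept well-conditioned so an inverse exists.
W = np.eye(n) + rng.normal(size=(n, n)) / np.sqrt(n)

def f(x):
    return np.tanh(W @ x)

# Imperfect inverse g: a noisy copy of the analytic inverse, standing in for a
# learned decoder that only approximately inverts the layer (an assumption).
W_inv_noisy = np.linalg.inv(W) + 0.02 * rng.normal(size=(n, n))

def g(y):
    return W_inv_noisy @ np.arctanh(np.clip(y, -1 + 1e-7, 1 - 1e-7))

# Target activation to invert: y = f(x_true) for some hidden x_true.
x_true = 0.1 * rng.normal(size=n)
y = f(x_true)

# Fixed-point refinement of the approximate inverse:
#   x <- x + g(y) - g(f(x))
# If x -> g(f(x)) - x has Lipschitz constant < 1 (the contraction condition in
# the abstract), the error ||f(x) - y|| shrinks exponentially; at the fixed
# point g(f(x*)) = g(y), and since g is injective here, f(x*) = y.
x = g(y)  # one-shot approximate inverse as the starting point
for t in range(15):
    err = np.linalg.norm(f(x) - y)
    print(f"iteration {t:2d}: ||f(x) - y|| = {err:.3e}")
    x = x + g(y) - g(f(x))
```

Each iteration contracts the inversion error by roughly the Lipschitz constant of g∘f minus the identity, which is the exponential convergence rate asserted in the abstract.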