Paper Title
Second-Order Sensitivity Analysis for Bilevel Optimization
Paper Authors
Paper Abstract
In this work we derive a second-order approach to bilevel optimization, a type of mathematical programming in which the solution to a parameterized optimization problem (the "lower" problem) is itself to be optimized (in the "upper" problem) as a function of the parameters. Many existing approaches to bilevel optimization employ first-order sensitivity analysis, based on the implicit function theorem (IFT), to derive a gradient of the lower problem solution with respect to its parameters; this IFT gradient is then used in a first-order optimization method for the upper problem. This paper extends this sensitivity analysis to provide second-order derivative information of the lower problem (which we call the IFT Hessian), enabling the use of faster-converging second-order optimization methods at the upper level. Our analysis shows that (i) much of the computation already used to produce the IFT gradient can be reused for the IFT Hessian, (ii) error bounds derived for the IFT gradient readily apply to the IFT Hessian, and (iii) computing IFT Hessians can significantly reduce overall computation by extracting more information from each lower-level solve. We corroborate our findings and demonstrate the broad range of applications of our method by applying it to problem instances of least squares hyperparameter auto-tuning, multi-class SVM auto-tuning, and inverse optimal control.
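To make the IFT idea concrete, here is a minimal scalar sketch (a toy example I constructed, not the paper's implementation). The lower problem is a quadratic g(x, t) = 0.5·A·x² − t·x, so the IFT gives dx*/dt = −g_xx⁻¹·g_xt, the IFT Hessian of x*(t) vanishes (g is quadratic), and the upper level can take a Newton step using both quantities. Note how g_xx, already factored for the gradient, is reused for the Hessian, as the abstract points out.

```python
A = 2.0  # curvature of the toy lower problem (assumed value)

def lower_solve(t):
    # x*(t) = argmin_x 0.5*A*x^2 - t*x  =>  stationarity A*x - t = 0
    return t / A

def ift_gradient(t):
    # Implicit function theorem: dx*/dt = -g_xx^{-1} * g_xt,
    # with g_xx = A and g_xt = -1 for this lower problem.
    g_xx, g_xt = A, -1.0
    return -g_xt / g_xx

def ift_hessian(t):
    # The lower objective is quadratic in x, so all third derivatives
    # of g vanish and the IFT Hessian d^2 x*/dt^2 is exactly zero.
    # g_xx from the gradient computation is reused here in general.
    return 0.0

def upper_newton_step(t, target=1.0):
    # Upper objective F(t) = 0.5*(x*(t) - target)^2.
    # Chain rule: dF/dt = (x* - target) * dx*/dt,
    # d2F/dt2 = (dx*/dt)^2 + (x* - target) * d^2 x*/dt^2.
    x = lower_solve(t)
    dx = ift_gradient(t)
    d2x = ift_hessian(t)
    dF = (x - target) * dx
    d2F = dx * dx + (x - target) * d2x
    return t - dF / d2F  # second-order (Newton) update at the upper level

t = upper_newton_step(0.0)
print(t, lower_solve(t))  # F is quadratic in t, so one Newton step suffices
```

Because both levels are quadratic here, the single Newton step lands exactly on the optimum (x*(t) = 1); a first-order upper method would instead need many gradient steps, which is the efficiency argument the abstract makes.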