论文标题
在未知偏好下进行交通路线的在线学习
Online Learning for Traffic Routing under Unknown Preferences
论文作者
论文摘要
在运输网络中,用户通常以分散和自私的方式选择路线,以最大程度地减少其个人旅行成本,实际上,这通常会导致社会的总体成果效率低下。结果,人们对设计公路收费方案的兴趣越来越大,以应对这些效率损失,并将用户转向系统高效的交通模式。但是,公路收费计划的功效通常依赖于访问有关用户旅行属性的完整信息,例如其起源目的地(O-D)旅行信息及其时间价值,这在实践中可能无法使用。 在这种实际考虑的过程中,我们提出了一种在线学习方法,以在流量网络中设置通行费,以推动具有不同时间值的异质用户朝着系统效率高效的流量模式。特别是,我们开发了一种简单而有效的算法,该算法仅根据网络道路上观察到的总体流量来调整每个时间段的通行费而不依赖用户的任何其他旅行属性,从而保留用户隐私。在绘制o-d对和用户时间值的环境中。在每个时期,我们都表明我们的方法获得了预期的遗憾和道路容量违反$ o(\ sqrt {t})$,其中$ t $是更新通行费的时期数量。我们的遗憾保证是与离线甲骨文有关的,该甲骨文具有有关用户旅行属性的完整信息。我们进一步建立了一个$ω(\ sqrt {t})$下限的任何算法的遗憾,该算法确定我们的算法是最佳的常数。最后,我们证明了方法相对于现实世界运输网络上的几个基准的出色性能,从而突出了其实际适用性。
In transportation networks, users typically choose routes in a decentralized and self-interested manner to minimize their individual travel costs, which, in practice, often results in inefficient overall outcomes for society. As a result, there has been a growing interest in designing road tolling schemes to cope with these efficiency losses and steer users toward a system-efficient traffic pattern. However, the efficacy of road tolling schemes often relies on having access to complete information on users' trip attributes, such as their origin-destination (O-D) travel information and their values of time, which may not be available in practice. Motivated by this practical consideration, we propose an online learning approach to set tolls in a traffic network to drive heterogeneous users with different values of time toward a system-efficient traffic pattern. In particular, we develop a simple yet effective algorithm that adjusts tolls at each time period solely based on the observed aggregate flows on the roads of the network without relying on any additional trip attributes of users, thereby preserving user privacy. In the setting where the O-D pairs and values of time of users are drawn i.i.d. at each period, we show that our approach obtains an expected regret and road capacity violation of $O(\sqrt{T})$, where $T$ is the number of periods over which tolls are updated. Our regret guarantee is relative to an offline oracle that has complete information on users' trip attributes. We further establish a $Ω(\sqrt{T})$ lower bound on the regret of any algorithm, which establishes that our algorithm is optimal up to constants. Finally, we demonstrate the superior performance of our approach relative to several benchmarks on a real-world transportation network, thereby highlighting its practical applicability.