Paper Title
Semantic-Based VPS for Smartphone Localization in Challenging Urban Environments
Paper Authors
Paper Abstract
Accurate smartphone-based outdoor localization systems in deep urban canyons are increasingly needed for various IoT applications such as augmented reality and intelligent transportation. The feature-based visual positioning system (VPS) recently developed by Google detects edges in smartphone images and matches them against pre-surveyed edges in its map database. As smart cities develop, building information modeling (BIM) is becoming widely available, providing an opportunity for a new semantic-based VPS. This article proposes a novel 3D city model and semantic-based VPS for accurate and robust pose estimation in urban canyons, where the global navigation satellite system (GNSS) tends to fail. In the offline stage, a material-segmented city model is used to generate segmented images. In the online stage, an image providing textural information about the surrounding environment is taken with the smartphone camera. The approach uses computer vision algorithms to rectify the smartphone image and segment it into the different types of material identified in it. A semantic-based VPS method is then proposed to match the segmented generated images with the segmented smartphone image. Each generated image holds a pose consisting of latitude, longitude, altitude, yaw, pitch, and roll. The candidate with the maximum likelihood is regarded as the precise pose of the user. The positioning results achieve 2.0 m accuracy in a typical high-rise street environment, 5.5 m in a foliage-dense environment, and 15.7 m in an alleyway, a 45% improvement over the current state-of-the-art method. The yaw estimate achieves 2.3° accuracy, an eightfold improvement over the smartphone IMU.
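The matching step described in the abstract can be illustrated with a short Python sketch. This is not the authors' implementation: the names (Pose, label_agreement, match_pose), the label layout, and the scoring rule are all hypothetical, and the likelihood is approximated here simply as the fraction of pixels whose material labels agree between a generated image and the segmented smartphone image.

# Minimal sketch of the semantic pose-matching step, under assumptions:
# each candidate pairs a pose with a per-pixel material-label image, and
# likelihood is approximated by per-pixel label agreement with the query.
from dataclasses import dataclass
from typing import List, Tuple
import numpy as np

@dataclass
class Pose:
    lat: float    # latitude (deg)
    lon: float    # longitude (deg)
    alt: float    # altitude (m)
    yaw: float    # orientation angles (deg)
    pitch: float
    roll: float

def label_agreement(generated: np.ndarray, query: np.ndarray) -> float:
    """Fraction of pixels whose material labels agree (likelihood proxy)."""
    assert generated.shape == query.shape
    return float(np.mean(generated == query))

def match_pose(candidates: List[Tuple[Pose, np.ndarray]],
               query: np.ndarray) -> Pose:
    """Return the pose of the generated image that best matches the query."""
    scores = [label_agreement(image, query) for _, image in candidates]
    return candidates[int(np.argmax(scores))][0]

if __name__ == "__main__":
    # Hypothetical labels: 0 = sky, 1 = glass, 2 = concrete.
    query = np.array([[0, 1], [2, 2]])
    cand_a = (Pose(22.30, 114.18, 5.0, 90.0, 0.0, 0.0),
              np.array([[0, 1], [2, 1]]))   # 3/4 pixels agree
    cand_b = (Pose(22.30, 114.18, 5.0, 92.0, 0.0, 0.0),
              np.array([[0, 1], [2, 2]]))   # perfect agreement
    print(match_pose([cand_a, cand_b], query))  # prints cand_b's pose

In the offline stage described above, such candidates would be rendered from the material-segmented 3D city model over a grid of poses, and the online segmented smartphone image would serve as the query.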