[Localization Paper Reading Series] Multi-Head Attention Neural Network for Smartphone Invariant Indoor Localization

醉酒柴柴

Last modified: 2023-07-25 18:24:13

First published: 2023-07-25 16:37:19

Article link: https://blog.csdn.net/weixin_46050242/article/details/131901952

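Contents

0. Paper overview
1. Abstract
- 1.1 Sentence-by-sentence translation
- 1.2 Summary
2. INTRODUCTION
3 RELATED WORK
- 3.1 Sentence-by-sentence translation
3. ANALYSIS OF HETEROGENEOUS FINGERPRINTS
- 3.1 Sentence-by-sentence translation
  - Paragraph 1 (fingerprint collection)
  - Paragraph 2 (observations from the fingerprint plot)
4. THE ATTENTION MECHANISM
- 4.1 Sentence-by-sentence translation
5. ANVIL FRAMEWORK
- 5.1 Sentence-by-sentence translation
6. EXPERIMENTS
- 6.1 Sentence-by-sentence translation
7. CONCLUSION
- 7.1 Sentence-by-sentence translation
  - Paragraph 1 (the proposed method improves performance)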

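0. Paper overview

0.1 Paper information

Conference: 2022 IEEE 12th International Conference on Indoor Positioning and Indoor Navigation (IPIN)
Source link
Title: Multi-Head Attention Neural Network for Smartphone Invariant Indoor Localization

0.2 Overview (incomplete)

0.2.1 What the paper studies

[Figure: image not preserved]
In the offline phase, the collected fingerprint database is pre-processed and used to train a multi-head attention neural network. The aim is to reduce the loss of accuracy that WiFi fingerprint localization suffers when different devices are used.

0.2.2 Model structure

[Figure: image not preserved]

0.2.3 Evaluation

What I took away from the paper:
1. The method is worth borrowing: a new framework for WiFi fingerprinting.
2. The figures are worth borrowing (Figure 7).
Rating: ⭐⭐⭐
Worth re-reading: × (no open-source code)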

1.Abstract

1.1 逐句翻译

Smartphones together with RSSI fingerprinting serve as an efficient approach for delivering a low-cost and high-accuracy indoor localization solution.
智能手机结合RSSI指纹识别技术是提供低成本、高精度室内定位解决方案的有效方法。

However, a few critical challenges have prevented the wide-spread proliferation of this technology in the public domain.
然而，一些关键的挑战阻碍了这项技术在公共领域的广泛扩散。

One such critical challenge is device heterogeneity, i.e., the variation in the RSSI signal characteristics captured across different smartphone devices.
其中一个关键的挑战是设备的异质性，即在不同的智能手机设备上捕获的RSSI信号特征的变化。

In the real-world, the smartphones or IoT devices used to capture RSSI fingerprints typically vary across users of an indoor localization service.
在现实世界中，用于捕获RSSI指纹的智能手机或物联网设备通常因室内定位服务的用户而异。

Conventional indoor localization solutions may not be able to cope with device-induced variations which can degrade their localization accuracy.
传统的室内定位解决方案可能无法应对设备引起的变化，这会降低其定位精度。

We propose a multihead attention neural network-based indoor localization framework that is resilient to device heterogeneity.
我们提出了一个基于多头注意力神经网络的室内定位框架，该框架对设备异质性具有弹性。

An in-depth analysis of our proposed framework across a variety of indoor environments demonstrates up to 35% accuracy improvement compared to state-of-the-art indoor localization techniques.
对我们提出的框架在各种室内环境中的深入分析表明，与最先进的室内定位技术相比，准确率最高可提高35%。

1.2 Summary

The paper designs an indoor localization framework that keeps RSSI fingerprint localization accurate across different smartphones.

2.INTRODUCTION

2.1 逐句翻译

第一段（举例一些室外基于位置的营销引入室内定位有很大潜力）

The proliferation of GPS (Global Positioning Systems) technology transformed the way we commute and engage with our surroundings.
GPS(全球定位系统)技术的普及改变了我们通勤和与周围环境互动的方式。

Now, GPS technology is a standard baked into every smartphone available on the market.
现在，GPS技术已经成为市场上所有智能手机的标准配置。

This identical technology is now powering several geo-location-based businesses and other interactive platforms such as Uber, Pokémon-Go and location-based marketing. In the present times, indoor localization technology holds the innate potential to revolutionize navigation for indoor locales that are GPS deprived, e.g., malls, buildings, and tunnels. Several firms such as IndoorAtlas, Target (Shopkick), and Zebra have already begun to provide services that aid customers in locating products within a store [1].
这种相同的技术现在正在推动一些基于地理位置的业务和其他互动平台，如Uber、Pokémon-Go和基于位置的营销。目前，室内定位技术具有革命性的潜力，可以为没有GPS的室内场所(如商场、建筑物和隧道)带来导航革命。IndoorAtlas、Target (Shopkick)和Zebra等几家公司已经开始提供帮助客户在商店内定位产品的服务[1]。

第二段（介绍目前出现了很多室内定位的方法，其中wifi价格低普及性高）

While GPS is a de-facto standard for outdoor localization, there are no similar global standards for indoor environments.
虽然GPS是户外定位的事实上的标准，但对于室内环境还没有类似的全球标准。

As a result, a wide range of approaches have been created that employ a variety of sensors and radio frequencies.
因此，使用各种传感器和无线电频率的各种方法已经被创造出来。

A few widely used radio signals for indoor localization include Bluetooth, RFID, UWB (Ultra-Wide Band), and WiFi [2]-[4].
目前室内定位广泛使用的无线电信号有蓝牙、RFID、UWB (Ultra-Wide Band)和WiFi[2]-[4]

Due to its low setup costs and high ubiquity, WiFi-based indoor localization has received the greatest academic attention [3].
基于wifi的室内定位由于其设置成本低、普及性高，受到了学术界的极大关注[3]。

This is further fueled by the fact that most people today own smartphones with WiFi radios.
如今，大多数人都拥有带有WiFi无线模块的智能手机，这一事实进一步推动了这一趋势。

第三段（介绍wifi定位存在的问题，指纹技术可以很好的解决这些问题）

Despite the evident benefits of WiFi based indoor localization, certain key challenges remain unresolved. Weak wall penetration, multipath fading, and shadowing effects all diminish the efficacy of such indoor localization platforms.
尽管基于WiFi的室内定位具有明显的优势，但某些关键挑战仍未解决。墙体穿透能力弱、多径衰落、阴影效应等都会降低这种室内定位平台的有效性。

Due to these issues, establishing a precise mathematical relationship between the Received Signal Strength Indicator (RSSI) and the relative distances from WiFi Access Points (APs) is challenging.
由于这些问题，在接收信号强度指标(RSSI)和到WiFi接入点(AP)的相对距离之间建立精确的数学关系是具有挑战性的。

These issues serve as the motivation behind fingerprinting-based techniques that are independent of the RSSI-distance relationship, and therefore are resilient to aforementioned drawbacks of WiFi-based indoor localization.
这些问题是基于指纹的技术背后的动机，这种技术独立于 RSSI 与距离之间的关系，因此能够克服上述基于 WiFi 的室内定位技术的缺点。

第四段（介绍指纹定位的两个阶段，离线阶段与在线阶段）

Traditionally, fingerprinting-based indoor localization comprises of two phases.
传统上，基于指纹的室内定位包括两个阶段。

In the first phase (offline or training phase), the provider of the localization service captures the RSSI values for visible APs at various indoor locations of interest.
在第一阶段(离线或训练阶段)，定位服务的提供者在各个感兴趣的室内位置采集可见AP的RSSI值。

This results in a database of RSSI fingerprint vectors and associated locations or Reference Points (RPs). This database may further be used to train models (e.g., machine learning-based) for location estimation.
这将产生由RSSI指纹向量及其关联位置或参考点(RP)组成的数据库。该数据库可以进一步用于训练模型(例如，基于机器学习的模型)以进行位置估计。

In the second phase (online or testing phase), a user unaware of their own location captures an RSSI fingerprint on a device such as a smartphone.
在第二阶段(在线或测试阶段)，不知道自己位置的用户在智能手机等设备上捕获RSSI指纹。

This fingerprint is then sent to the trained model to determine the user’s location.
然后，这个指纹被发送到训练过的模型，以确定用户的位置。

This location can be overlayed on a map and visualized on the smartphone’s display.
这个位置可以叠加在地图上，并在智能手机的显示屏上显示出来。
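
The two phases described above can be sketched in a few lines of Python. This is only a minimal nearest-neighbour illustration of the offline/online split, not the model the paper proposes. The fingerprint values, the RP names and the `locate` helper are all made up for the example.

```python
import math

# Offline phase: a hypothetical database of (RSSI fingerprint in dBm, reference point).
# Each fingerprint holds one RSSI value per known AP, always in the same AP order.
database = [
    ([-45.0, -70.0, -100.0], "RP1"),
    ([-60.0, -55.0, -80.0], "RP2"),
    ([-90.0, -65.0, -50.0], "RP3"),
]

def locate(fingerprint, db):
    """Online phase: return the RP whose stored fingerprint is closest (Euclidean)."""
    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(db, key=lambda row: distance(row[0], fingerprint))[1]

# A user scans WiFi at an unknown location and queries the database.
print(locate([-58.0, -57.0, -82.0], database))
```

A learned model replaces `locate` in practice, but the data flow (database built offline, single fingerprint queried online) stays the same.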

第五段（提出不同手机捕捉到的信号强度不同的问题，会降低定位精度）

In the domain of fingerprinting-based indoor localization, a vast majority of work utilizes the same smartphone for (offline) data collection and (online) location prediction, e.g., [5]-[7].
在基于指纹的室内定位领域，绝大多数工作使用相同的智能手机进行(离线)数据收集和(在线)位置预测，例如[5]-[7]。

This approach assumes that in a real-world setting, the localization devices operated by users would have identical signal characteristics.
这种方法假设在现实环境中，用户操作的定位设备具有相同的信号特征。

Such a premise is largely debunked by the variety of brands and models that make up the modern smartphone industry.
这种假设在很大程度上被构成现代智能手机行业的各种品牌和型号所推翻。

In practice, the user base of smartphones is composed of heterogenous device components and characteristics such as WiFi chipset, antenna shape, OS version, etc. yielding variations in antenna gain.
在实践中，智能手机的用户群是由异构的设备组件和特征组成的，如WiFi芯片组、天线形状、操作系统版本等，从而导致天线增益的变化。

Recent works demonstrate that the observed RSSI values for a given locale captured across diverse smartphones vary significantly [8].
最近的研究表明，在不同的智能手机上捕捉到的给定区域的RSSI值差异很大[8]。

This degrades the localization accuracy achieved through conventional fingerprinting and motivates smartphone heterogeneity or device invariant fingerprinting techniques.
这降低了通过传统指纹识别实现的定位精度，并激发了智能手机异构或设备不变指纹识别技术。

第六段（设计了指纹识别框架，实现设备不变性，以及本文贡献）

In this paper, we present a robust and computationally lightweight WiFi RSSI-based fingerprinting framework (ANVIL) that aims to achieve device invariance such that it experiences minimal accuracy loss across heterogeneous smartphones.
在本文中，我们提出了一种鲁棒且计算轻量级的基于WiFi RSSI的指纹识别框架(ANVIL)，旨在实现设备不变性，从而使其在异构智能手机之间的精度损失最小。

The main contributions of our work are:
我们工作的主要贡献是:

• we conduct an analysis to highlight the impact of device heterogeneity on RSSI fingerprints;
•我们进行了分析，以突出设备异质性对RSSI指纹的影响;

• towards the goal of promoting generalized device invariance, we identify and adapt data augmentation methodologies for training deep machine learning models;
•为了促进广义设备不变性的目标，我们确定并适应用于训练深度机器学习模型的数据增强方法;

• we introduce a novel multi-head attention neural network for device invariant indoor localization;
•我们引入了一种新颖的多头注意力神经网络，用于设备不变性室内定位;

• by collecting fingerprints with multiple heterogeneous devices across buildings, we create benchmarks to test the localization accuracy of ANVIL against the state-of-the-art;
•通过在建筑物中收集多个异构设备的指纹，我们创建基准来测试ANVIL针对最先进的定位准确性;

• we prototype our calibration-free device invariant indoor localization framework, deploy it on smartphones, and evaluate it under real-world settings.
•我们对无需校准的设备不变室内定位框架进行了原型设计，将其部署在智能手机上，并在现实环境下对其进行评估。

3 RELATED WORK

3.1 逐句翻译

第一段（神经网络应用于设备以提高定位精度，智能手机能量限制已被解决，面临异构问题）

A considerable amount of work has been dedicated to addressing the challenges associated with WiFi fingerprinting
based indoor localization.
大量的工作致力于解决基于WiFi指纹的室内定位相关的挑战。

The recent growth in the computation capabilities of smartphones has enabled the proliferation of localization algorithms and frameworks with high computational and memory demands.
最近智能手机计算能力的增长使得具有高计算和内存需求的定位算法和框架得以普及。

For instance, FeedForward Deep Neural Networks (FF-DNN) [9] and also increasingly complex Convolutional Neural Networks (CNN) [6] combined with ensemble methods are being deployed on embedded devices to enhance indoor localization accuracy [7].
例如，前馈深度神经网络(FF-DNN)[9]和越来越复杂的卷积神经网络(CNN)[6]结合集成方法被部署在嵌入式设备上，以提高室内定位精度[7]。

One cause for concern with the proliferation of such techniques are the acute energy limitations on smartphones.
人们对这种技术的扩散感到担忧的一个原因是智能手机的严重能量限制。

The work in [5] addresses this challenge in a limited manner.
[5]中的工作以有限的方式解决了这一挑战。

However, most prior works in the domain of fingerprinting-based indoor localization, including [5], are plagued by the same major drawback, i.e., the lack of an ability to adapt to device heterogeneity across the offline and online phases.
然而，在基于指纹识别的室内定位领域，包括[5]在内的大多数先前的工作都受到同样的主要缺点的困扰，即缺乏适应设备离线和在线阶段异质性的能力。

This drawback, as discussed in section VI, leads to unpredictable degradation in localization accuracy in the online phase
正如第六节所讨论的，这个缺点会导致在线阶段定位精度的不可预测的下降

第二段（克服异构问题的两类技术：校准，不校准）

In the offline phase, the localization service provider likely captures fingerprints using a device that is different from the IoT devices or smartphones employed in the online phase by users.
在离线阶段，定位服务提供商可能会使用与用户在线阶段使用的物联网设备或智能手机不同的设备来捕获指纹。

Some common software and hardware differences that introduce device heterogeneity include WiFi antennas,
smartphone design materials, hardware drivers, and the OS [4].
一些常见的导致设备异构的软硬件差异包括WiFi天线、智能手机设计材料、硬件驱动程序和操作系统[4]。

Techniques to overcome such issues fall into two major categories: calibration-based and calibration-free methods.
克服这些问题的技术主要分为两大类:基于校准的方法和不需要校准的方法。

第三段（离线校准：使用自动编码器可以校准指纹但会有巨大开销）

An indoor localization framework can be calibrated in either the offline or online phase.
室内定位框架可以在离线或在线阶段进行校准。

The work in [10] employs offline phase calibrated fingerprinting-based indoor localization.
[10]中的工作采用了在离线阶段校准的基于指纹的室内定位方法。

In this approach, the fingerprint collection process employs a diverse set of devices.
在这种方法中，指纹采集过程采用了一系列不同的设备。

Later, an autoencoder is specifically trained and calibrated using fingerprints from different devices.
随后，使用不同设备的指纹对自动编码器进行专门训练和校准。

The role of the autoencoder is to create an encoded latent representation of the input fingerprint from one device such that decoded output fingerprint belongs to another device.
自动编码器的作用是创建来自一设备的输入指纹的编码潜在表示，使得解码后的输出指纹属于另一设备。

In this manner, the latent representation created is expected to be device invariant.
以这种方式，所创建的潜在表示预计是设备不变性的。

While this approach is promising, it comes with the significant overhead of utilizing multiple devices in the offline phase to capture fingerprints.
虽然这种方法很有前途，但它带来了在离线阶段使用多个设备来捕获指纹的巨大开销。

Additionally, there are no guarantees that the set of devices employed to capture fingerprints in the offline phase capture ample heterogeneity that may be experienced by the localization framework in the online phase.
此外，不能保证离线阶段用于捕获指纹的设备集能够充分覆盖定位框架在在线阶段可能遇到的异质性。

第四段（在线校准：人工校准，很麻烦）

The online phase calibration approach involves acquiring RSSI values and location data manually for each new device in the online phase [11].
在线阶段校准方法需要在在线阶段为每个新设备手动获取RSSI值和位置数据[11]。

Such an approach is however not very practical as it can be cumbersome for users.
然而，这种方法不是很实用，因为它对用户来说很麻烦。

With this approach, once a user arrives at an indoor locale, they have to move to a known location and capture an RSSI fingerprint.
使用这种方法，一旦用户到达室内场所，他们必须移动到已知位置并捕获RSSI指纹。

The RSSI information is collected, and then manually calibrated through transformations such as weighted-least square optimizations and time-space sampling [12].
采集RSSI信息，然后通过加权最小二乘优化、时空采样等变换进行人工校准[12]。

Crowdsourcing techniques may also be used in conjunction to strengthen these approaches.
众包技术也可以用来加强这些方法。

Unfortunately, such systems experience considerable accuracy degradation [13]
不幸的是，这样的系统经历了相当大的精度下降[13]

第五段（不校准：1.数据转换为标准化格式（丢失指纹特征）2.训练深度学习模型（本文使用的方法））

In calibration-free fingerprinting, the fingerprint data is frequently converted into a standardized format that is transferable across mobile devices. In an effort to achieve this standardization, previous techniques such as Hyperbolic Location fingerprint (HLF) [14] and Signal Strength Difference (SSD) [15] employ ratios and differences between individual AP RSSI values respectively.
在免校准指纹识别中，指纹数据经常被转换成可在移动设备间传输的标准化格式。为了实现这种标准化，双曲位置指纹（HLF）[14] 和信号强度差（SSD）[15] 等先前的技术分别采用了单个 AP RSSI 值之间的比率和差值。

But these approaches suffer from low accuracy due to loss of critical distinguishing input fingerprint features in the transformation process.
但是这些方法由于在转换过程中丢失了关键的识别输入指纹特征，导致准确率较低。
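
The trade-off described above can be illustrated with a small Python sketch. It uses a simplified form of Signal Strength Difference (differences relative to the first AP), which is an assumption made for this example and not necessarily the exact formulation of [15]. The readings are made up, and the ratio-based HLF variant is not shown.

```python
def ssd(fingerprint):
    """Signal-strength differences (dB) of every AP relative to the first AP."""
    reference = fingerprint[0]
    return [rssi - reference for rssi in fingerprint[1:]]

# Hypothetical readings (dBm) for three APs at one location.
device_a = [-60.0, -72.0, -85.0]
# A second device that reads every AP 6 dB stronger at the same location.
device_b = [rssi + 6.0 for rssi in device_a]
# A different location whose readings happen to be uniformly 20 dB stronger.
other_location = [-40.0, -52.0, -65.0]

# The constant device offset cancels, so both devices give the same features.
print(ssd(device_a) == ssd(device_b))
# The absolute signal level is also discarded, so two locations can collide.
print(ssd(device_a) == ssd(other_location))
```

The second comparison shows, in miniature, the loss of distinguishing features in the transformation: the same step that removes the device offset also removes the absolute signal level.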

An alternative to the standardization of fingerprints is presented in [16].
文献[16]提出了指纹标准化的替代方法。

The work proposed AdTrain, which employs adversarial training to improve the robustness of a deep-learning model against device heterogeneity.
这项工作提出了AdTrain，它采用对抗训练来提高深度学习模型对设备异质性的鲁棒性。

AdTrain introduces noise in the fingerprint and associated location label before training the deep-learning model.
AdTrain在训练深度学习模型之前，在指纹和相关的位置标签中引入噪声。

Based on our experiments, such an approach does improve the deep-learning model’s robustness to device heterogeneity in a limited manner.
根据我们的实验，这种方法确实以有限的方式提高了深度学习模型对设备异质性的鲁棒性。

The approach used in our framework in this paper (ANVIL) is orthogonal to the one proposed in AdTrain and ANVIL can be extended to include AdTrain.
本文框架中使用的方法(ANVIL)与AdTrain中提出的方法正交，并且ANVIL可以扩展到包括AdTrain。

The work in [17] employs Stacked Auto-Encoders (SAE) for improving resilience to device heterogeneity.
[17]中的工作采用堆叠自编码器(SAE)来提高对设备异构的弹性。

The authors expect that the lower dimensional encodings created using the SAE are more resilient to RSSI variations across devices.
作者期望使用SAE创建的低维编码对设备之间的RSSI变化更具弹性。

However, based on our experimental evaluations (section VI.B), we found that such an approach is unable to deliver high-quality localization and also does not converge easily.
然而，根据我们的实验评估(第VI.B节)，我们发现这种方法无法提供高质量的定位，也不容易收敛。

In contrast, ANVIL employs a multi-head attention neural network to achieve improvements in device invariance for indoor localization.
相比之下，ANVIL采用多头注意力神经网络来改善室内定位的设备不变性。

第六段（一些磁指纹和CSI指纹的缺点）

There are a limited set of previous works that have employed attention layers for magnetic [18], [19] and Channel
State Information (CSI) fingerprint-based [21] indoor localization.
之前有一组有限的工作使用了基于磁[18]，[19]和通道状态信息(CSI)指纹[21]的室内定位的注意层。

However, these works do not consider device heterogeneity.
然而，这些工作没有考虑到设备的异质性。

Additionally, it is well known that magnetic fingerprints are unstable over time and heavily impacted by minor changes in the indoor environment.
此外，众所周知，随着时间的推移，磁指纹是不稳定的，并且受到室内环境微小变化的严重影响。

On the other hand, CSI enabled hardware is not available in off-the-shelf smartphones available today [20].
另一方面，支持CSI的硬件在今天可用的现成智能手机中是不可用的[20]。

In contrast, ANVIL employs relatively stable WiFi RSSI that can be captured through heterogeneous off-the-shelf smartphones of various vendors to deliver stable indoor localization.
相比之下，ANVIL采用相对稳定的WiFi RSSI，可以通过各种供应商的异构现货智能手机捕获，以提供稳定的室内定位。

第七段（本文使用注意力来改善设备不变性）

Our proposed framework in this paper, ANVIL, is a multihead attention neural network based and calibration-free indoor localization framework that can be easily deployed to off-the-shelf smartphones, while relying on the ubiquity of WiFi signals to deliver device invariant performance.
我们在本文中提出的框架ANVIL是一个基于多头注意力神经网络的、无需校准的室内定位框架，可以很容易地部署到现成的智能手机上，同时依靠无处不在的WiFi信号来提供设备不变的性能。

To the best of our knowledge, no previous works have attempted to use attention in the context of improving device invariance.
据我们所知，没有以前的作品试图在改善设备不变性的背景下使用注意力。

We performed extensive evaluation of ANVIL against state-of-the-art prior works and evaluated it over a variety of indoor environments and smartphones in real-world settings.
我们将ANVIL与最先进的已有工作进行了广泛的对比评估，并在现实环境中的多种室内环境和智能手机上对其进行了评估。

3.ANALYSIS OF HETEROGENEOUS FINGERPRINTS 异质指纹分析

3.1 逐句翻译

第一段（指纹采集）

To understand the cause of degradation in localization performance due to device heterogeneity, we evaluate the RSSI fingerprints captured by two distinct smartphones.
为了了解由于设备异质性导致定位性能下降的原因，我们评估了两种不同智能手机捕获的RSSI指纹。

For this experiment, 10 fingerprints are captured at a single location with two smartphones: LG V20 (LG) and Oneplus 3 (OP3).
本次实验用LG V20 (LG)和一加3 (OP3)两款智能手机在同一地点采集了10个指纹。

The RSSI values captured are in the range of –100dBm to 0dBm, where –100dBm indicates no received signal and 0dBm is the highest signal strength.
捕获的RSSI值范围为-100dBm ~ 0dBm，其中-100dBm表示没有接收到信号，0dBm为最高信号强度。

The RSSI values for the two devices are presented in figure 1. The solid lines represent the mean values, whereas the shaded regions represent the range of the observed RSSI values. From figure 1, we can make the following key observations:
这两个设备的RSSI值如图1所示。实线表示平均值，阴影区域表示观测到的RSSI值的范围。从图1中，我们可以得出以下主要观察结果:

第二段（介绍指纹采集图中观察到的结果）

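Figure 1 (image not preserved). RSSI values of the WiFi APs observed simultaneously by two different smartphones at a given location. Solid lines show the mean values; shaded regions show the range of RSSI values over 10 fingerprint readings. The different WiFi APs (identified by MAC ID) are represented as unique integers from 0 to 50.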

• There is considerable similarity in the shape of the RSSI fingerprints across the two smartphones. Therefore, indoor localization frameworks that focus on pattern-matching based approaches may be able to deliver higher quality localization performance.

两款智能手机的 RSSI 指纹形状非常相似。因此，基于模式匹配方法的室内定位框架或许能提供更高质量的定位性能。

• The RSSI values for the LG device exhibit an upward shift (higher signal reception) by an almost constant amount. This constant shift of RSSI value is similar to increasing the brightness of an image.
•LG设备的RSSI值表现出几乎恒定的向上移位(更高的信号接收)。这种RSSI值的恒定移动类似于增加图像的亮度。

• We also observe a contrastive effect for certain parts of the fingerprints. By contrastive effect, we mean the difference between the RSSI value changes. In figure 1, we observe that while the RSSI fingerprints across the two devices rise and fall together, the specific amount that they rise and fall by are not the same. A clear example of this observation can be seen across the RSSI values of APs 10 and 11.
•我们还观察到指纹某些部分的对比效应。对比效应是指RSSI值变化之间的差异。在图1中，我们观察到，虽然两个设备上的RSSI指纹同时上升和下降，但它们上升和下降的具体数量并不相同。可以在ap 10和ap 11的RSSI值中看到这一观察结果的一个清晰示例。

• Some APs that are visible when using the LG device are never visible (RSSI = ‒100dBm) to the OP3 device. A good example of this is AP 36, where the RSSI value for the LG device is ‒80dBm and the RSSI value for the OP3 device always remains ‒100dBm. From the perspective of deep-learning models, this is similar to the random dropout of input AP RSSI.
•使用LG设备时可见的一些ap对OP3设备永远不可见(RSSI = -100dBm)。一个很好的例子是AP 36，其中LG设备的RSSI值为-80dBm，而OP3设备的RSSI值始终保持-100dBm。从深度学习模型的角度来看，这类似于输入AP RSSI的随机dropout。

• Lastly, we also observe that the RSSI values from the LG device vary much more than the OP3 device, for a majority of the APs. This indicates that different smartphones may experience different amounts of variation in RSSI values, and it may also be hard to quantify the range of RSSI variation for a given device beforehand. Therefore, our proposed approach must be resilient to such unpredictable variations in the raw RSSI values observed from an AP.
•最后，我们还观察到，对于大多数ap, LG设备的RSSI值比OP3设备变化更大。这表明不同的智能手机可能会经历不同的RSSI值变化量，并且也很难事先量化给定设备的RSSI变化范围。因此，我们提出的方法必须能够适应从AP观察到的原始RSSI值的这种不可预测的变化。

The evaluation of observed fingerprint variations across the two smartphones presented in this section (as well as our
analysis with other smartphones that show a similar trend) guides the design of our device invariant ANVIL framework.
本节介绍的两款智能手机中观察到的指纹变化的评估(以及我们对显示类似趋势的其他智能手机的分析)指导了我们的设备不变ANVIL框架的设计。

We specifically base our data augmentation strategy in ANVIL on the observations made here. The multi-head attention-based deep-learning approach (discussed in the next section) was chosen and adapted in ANVIL to promote generalized pattern matching across heterogeneous devices.
我们在ANVIL中的数据增强策略特别基于这里所做的观察。ANVIL选择并采用了基于多头注意力的深度学习方法(下一节将讨论)，以促进跨异构设备的广义模式匹配。

4.THE ATTENTION MECHANISM 注意机制

4.1 逐句翻译

第一段（注意力机制的数学表达）

The attention mechanism in the domain of deep learning is a relatively new approach initially employed for natural language processing (NLP) [22] and more recently also for image classification [23].
深度学习领域的注意机制是一种相对较新的方法，最初用于自然语言处理(NLP)[22]，最近也用于图像分类[23]。

The concept of attention is derived from the idea of human cognitive attention and our ability to selectively focus on sub-components of information while ignoring less relevant components of the same.
注意的概念来源于人类认知注意的概念，以及我们有选择地关注信息的子成分而忽略信息中不太相关的成分的能力。

For the purpose of deep learning, the attention mechanism is modeled as the retrieval of attention information from a database containing key and value pairs. Given a query Q and a set of key-value pairs (K,V), attention can be computed as:
就深度学习而言，注意力机制被模拟为从包含键值对的数据库中检索注意力信息。给定一个查询 Q 和一组键值对（K，V），注意力可按下式计算：
$$\mathrm{Attention}(Q, K, V) = \mathrm{Similarity}(Q, K)\,V$$

第二段（本文选择缩放点积）

In the above expression, a variety of similarity functions may be employed such as dot-product, scaled dot product, additive dot product, etc. [18].
在上述表达式中，可以使用多种相似函数，如点积、缩放点积、加性点积等[18]。

Among the variants, the dot-product attention is the fastest to compute and is the most space-efficient, making it our choice for this work.
在这些变体中，点积注意力是计算速度最快的，也是最节省空间的，因此我们选择了它。

The process of computing scaled-dot-product attention (figure 2; right) for a given set of queries, keys, and values (Q, K, V) can be captured in an equation as:
计算缩放点积注意力的过程(图2，右)，对于一组给定的查询、键和值 (Q, K, V)，可以用如下等式表示：
$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{D_K}}\right)V$$
where $D_K$ is the dimensionality of the key vector.
其中 $D_K$ 是键向量的维数。

The role of the scaling factor ($\sqrt{D_K}$) is to counteract the effects of very large magnitudes being fed to the Softmax function, leading to regions that produce extremely small gradients.
缩放因子($\sqrt{D_K}$)的作用是抵消非常大的数值被送入Softmax函数所带来的影响，这种输入会使Softmax落入梯度极小的区域。

In practice, scaled dot-products are also known to outperform other approaches of computing attention.
在实践中，尺度点积也比其他计算注意力的方法表现得更好
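
Scaled dot-product attention, i.e. softmax(QKᵀ/√D_K)·V, can be sketched in plain Python as below. This is a minimal illustration of the standard operation and not the paper's code. Matrices are lists of rows, and the function and variable names are chosen for this example only.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(Q, K, V):
    """Q: n_q x d_k, K: n_k x d_k, V: n_k x d_v, each given as a list of rows."""
    d_k = len(K[0])
    output = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Weighted sum of the value rows.
        output.append([sum(w * v[j] for w, v in zip(weights, V))
                       for j in range(len(V[0]))])
    return output

Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0]]
print(scaled_dot_product_attention(Q, K, V))
```

Because the query matches the first key more closely, the first value row receives the larger weight in the output.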

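Figure 2 (image not preserved). Procedural representation of scaled dot-product attention (left) and multi-head attention (right). Each attention layer takes three inputs: queries (Q), keys (K), and values (V).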

第三段（介绍多头注意力）

While attention serves as a low-overhead approach for capturing a weighted relationship between queries and values, given a set of key-value pairs, it lacks any learnable parameters.
虽然对于捕获查询和值之间的加权关系，注意力是一种低开销的方法，但给定一组键值对，它缺乏任何可学习的参数。

This limits its ability to identify and quantify the hidden relationships between the different pairings of （Q,K,V）
这限制了它识别和量化（Q,K,V）的不同配对之间隐藏关系的能力。

The work in [22] extends the idea of a singular attention computation with multiple distinct versions (multiple heads) of linearly projected queries, keys, and values.
[22]中的工作扩展了单一注意力计算的思想，使用线性投影查询、键和值的多个不同版本(多个头)。

Each of these learnable linear projections are of dimension dk, dq, and dv. This form of attention is more commonly known as multi-headed attention and can be formally captured by the following equation:
这些可学习的线性投影的维数分别是 dk、dq 和 dv。这种形式的注意力通常被称为多头注意力，可以用下面的公式来正式描述:
$$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(H_1, \ldots, H_n)\,W^{O}, \qquad H_i = \mathrm{Attention}(QW_i^{Q}, KW_i^{K}, VW_i^{V})$$
where $W_i^{Q}$, $W_i^{K}$, and $W_i^{V}$ are model parameters or weights associated with the linear projections with dimensions head size × d, and $W^{O}$ is the output projection applied to the concatenated heads, as in [22]. Head size (HS) is a hyperparameter of the multihead attention layer and d is the length of vector being projected (query, key or value). The process of computing scaled-dot-product attention and multi-headed attention is depicted in figure 2. The computation associated with each head can be performed in parallel, as shown.
其中 $W_i^{Q}$、$W_i^{K}$、$W_i^{V}$ 是与线性投影相关的模型参数或权值，维数为头大小(head size) × d，$W^{O}$ 是作用于拼接后各头输出的输出投影(同[22])。头大小(HS)是多头注意力层的超参数，d是被投影向量(查询、键或值)的长度。计算缩放点积注意力和多头注意力的过程如图2所示。与每个头相关的计算可以并行执行，如图所示。
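
Multi-head attention in the standard form of [22] (project Q, K and V once per head, run scaled dot-product attention on each projection, concatenate the heads, then apply an output projection) can be sketched as follows. This is an illustrative sketch under stated assumptions: the projection weights are drawn at random here, whereas a real model learns them. The helper names, shapes and the row-vector convention (weights stored as d × head size) are choices made for this example.

```python
import math
import random

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention on lists of rows."""
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, V)) for j in range(len(V[0]))])
    return out

def random_matrix(rows, cols, rng):
    """Stand-in for learned weights (assumption: random values, for illustration only)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def multi_head_attention(Q, K, V, num_heads, head_size, rng):
    d_q, d_k, d_v = len(Q[0]), len(K[0]), len(V[0])
    heads = []
    for _ in range(num_heads):
        W_q = random_matrix(d_q, head_size, rng)
        W_k = random_matrix(d_k, head_size, rng)
        W_v = random_matrix(d_v, head_size, rng)
        # Each head attends over its own linear projection of Q, K and V.
        heads.append(attention(matmul(Q, W_q), matmul(K, W_k), matmul(V, W_v)))
    # Concatenate the heads feature-wise, one row per query.
    concat = [sum((h[i] for h in heads), []) for i in range(len(Q))]
    W_o = random_matrix(num_heads * head_size, d_v, rng)
    return matmul(concat, W_o)

rng = random.Random(0)
Q = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
K = [[0.1, 0.2, 0.3], [0.3, 0.2, 0.1], [0.5, 0.5, 0.5]]
V = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
result = multi_head_attention(Q, K, V, num_heads=2, head_size=4, rng=rng)
print(len(result), len(result[0]))
```

The heads do not depend on each other, which is why the per-head loop could run in parallel.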

5.ANVIL FRAMEWORK ANVIL框架

5.1 逐句翻译

A. Overview 综述

第一段（框架-离线阶段）

[Figure 3: high-level representation of the ANVIL framework; image not preserved]
Figure 3 presents a high-level representation of our proposed ANVIL framework. We begin in the offline phase that is annotated by red arrows.
图3给出了我们提出的ANVIL框架的高级表示。我们从用红色箭头标注的离线阶段开始。

Here we capture RSSI fingerprints for various RPs (see section VI for details) across the floorplan of the building.
在这里，我们在建筑的平面图上为各种rp捕获RSSI指纹(详细信息请参见第六节)。

Each row within the RSSI database consists of the RSSI values for every AP visible across the floorplan and its associated RP.
RSSI数据库中的每一行都由整个平面图上可见的每个AP及其关联RP的RSSI值组成。

The RSSI values indicated in a given row were captured using a single WiFi scan.
在给定行中显示的RSSI值是使用单个WiFi扫描捕获的。

These fingerprints are then pre-processed into queries (Q), keys (K) and values (V) to train a multi-headed attention neural network model, as shown in figure 3.
然后将这些指纹预处理为查询(Q)、键(K)和值(V)，以训练多头注意力神经网络模型，如图3所示。

The specific details of fingerprint preprocessing, model design and training are covered later in this section.
指纹预处理、模型设计和训练的具体细节将在本节后面介绍。

Once the model has been trained, it is deployed on a smartphone, with the fingerprint key-value pairings. This concludes the offline stage.
一旦模型经过训练，它就会被部署到智能手机上，并与指纹的键值配对。离线阶段到此结束。

第二段（框架-在线阶段）

In the online phase (green arrows), the user captures an RSSI fingerprint vector at an RP that is unknown.
在在线阶段(绿色箭头)，用户在未知RP处捕获RSSI指纹向量。

For any WiFi AP that was visible in the offline phase and is not observed in this online phase, its RSSI value is assumed to be –100dBm, ensuring consistent RSSI vector lengths across the phases.
对于任何在离线阶段可见而在在线阶段未观察到的WiFi AP，假设其RSSI值为-100dBm，以确保各阶段RSSI向量长度一致。
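
The padding rule above can be sketched as below. The AP identifiers and the `to_vector` helper are hypothetical. Ignoring APs that were never seen offline is an assumption made here to keep the vector length fixed; the text does not state how such APs are handled.

```python
NO_SIGNAL_DBM = -100.0

def to_vector(scan, known_aps):
    """Build a fixed-length RSSI vector from one WiFi scan.

    scan: dict mapping AP identifier -> RSSI in dBm (only APs seen in this scan).
    known_aps: ordered list of AP identifiers recorded in the offline phase.
    """
    # APs known from the offline phase but missing from this scan get -100 dBm.
    return [scan.get(ap, NO_SIGNAL_DBM) for ap in known_aps]

known_aps = ["aa:01", "aa:02", "aa:03"]
online_scan = {"aa:01": -48.0, "zz:99": -70.0}  # "zz:99" was never seen offline
print(to_vector(online_scan, known_aps))
```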

This fingerprint is pre-processed (see Section V.B) to form the fingerprint query and sent to the deployed multi-head attention neural network model on the smartphone.
该指纹经过预处理(参见章节V.B)，形成指纹查询，并发送到智能手机上部署的多头注意力神经网络模型。

As shown in figure 3, the model when fed the online phase fingerprint query, along with the keys and values from the offline phase, produces the location of the user in the online phase. This location is then shown to the user on the smartphone’s display.
如图3所示，当提供在线阶段指纹查询以及来自离线阶段的键和值时，该模型将生成用户在在线阶段的位置。然后这个位置就会显示在智能手机的显示屏上。

The following subsections elaborate on the major components of the ANVIL framework depicted in figure 3.
下面的小节详细介绍图3中描述的ANVIL框架的主要组件

B. RSSI Fingerprint Preprocessing RSSI指纹预处理

第一段（介绍预处理）

The RSSI for various WiFi APs along with their corresponding reference points are captured within a database as shown in figure 3.
各种WiFi ap的RSSI及其相应的参考点在数据库中被捕获，如图3所示。

As mentioned earlier, the RSSI values vary in the range of –100dBm to 0dBm, where –100 indicates no signal and 0 indicates a full (strongest) signal.
如前所述，RSSI值的取值范围为-100dBm ~ 0dBm，其中-100表示无信号，0表示满(最强)信号。

The RSSI values captured in this dataset are normalized to a range of 0 to 1, where 0 represents the weak or null signal, and 1 represents the strongest signal.
在此数据集中捕获的RSSI值被归一化为0到1的范围，其中0表示弱信号或空信号，1表示最强信号。

This new dataset is the basis of all training data required by our multi-head attention model.
这个新数据集是我们多头注意模型所需的所有训练数据的基础。

As discussed in Section IV, the multi-head attention model requires three main inputs: the Queries and the Key-Value pair dataset.
如第四节所述，多头注意力模型需要三个主要输入:查询和键值对数据集。

For this work, the RSSI fingerprint vectors captured in the training phase (without RP information) are used as both queries and keys in the offline phase. The associated RPs are one-hot encoded and are used as values.
对于这项工作，在训练阶段捕获的RSSI指纹向量(没有RP信息)在离线阶段同时被用作查询和键。关联的RP经过独热(one-hot)编码后用作值。
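
The preprocessing described above can be sketched as follows. The linear mapping from [-100, 0] dBm to [0, 1] is an assumption: the text gives only the target range and its endpoints. The database contents and helper names are made up for this example.

```python
def normalize(fingerprint):
    """Map RSSI from [-100, 0] dBm to [0, 1]; 0 means no signal, 1 the strongest."""
    return [(rssi + 100.0) / 100.0 for rssi in fingerprint]

def one_hot(index, num_classes):
    """One-hot encode a reference-point index."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]

# Hypothetical offline database: (fingerprint in dBm, reference-point index).
database = [
    ([-45.0, -70.0, -100.0], 0),
    ([-60.0, -55.0, -80.0], 1),
    ([-90.0, -65.0, -50.0], 2),
]
num_rps = 3

keys = [normalize(fp) for fp, _ in database]
queries = keys  # offline phase: the same fingerprints serve as queries and keys
values = [one_hot(rp, num_rps) for _, rp in database]

print(keys[0], values[0])
```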

C. Fingerprint Augmentation Stack (FASt) Layer 指纹增强堆栈(FASt)层

第一段（指纹会因为外部因素变化）

A major challenge to maintaining localization stability for fingerprinting-based indoor localization is the variation in RSSI fingerprints across heterogeneous devices, as discussed in Section III.
如第三节所述，维持基于指纹的室内定位稳定性的主要挑战是不同设备间RSSI指纹的差异。

However, in the online phase, it would be impossible to foretell what combination of effects the fingerprint captured using a smartphone.
然而，在在线阶段，不可能预测使用智能手机捕获指纹的效果组合。

The received RSSI fingerprints can vary depending on external factors like building layout, walls, metallic object (reflecting the RSSI fingerprint) making the prediction of the location ambiguous.
接收到的RSSI指纹可能会因建筑物布局、墙壁、金属物体(反射RSSI指纹)等外部因素而变化，从而使位置预测不明确。

第二段（提出FASt克服异构困难）

To overcome this challenge, we propose a streamlined Fingerprint Augmentation Stack (FASt) implemented as a layer that contains the required subcomponents for the augmentation of RSSI fingerprints that would promote the resilience to device heterogeneity.
为了克服这一挑战，我们提出了一种简化的指纹增强堆栈(FASt)，该层包含了增强RSSI指纹所需的子组件，从而提高了对设备异构的弹性。

The major advantage of FASt is that it can be seamlessly integrated into any deep-learning based model and promote the rapid prototyping of device invariant models for the purpose of fingerprinting-based indoor localization.
FASt的主要优点是它可以无缝集成到任何基于深度学习的模型中，并促进设备不变模型的快速原型化，以实现基于指纹的室内定位。

A procedural representation of the FASt layer is shown in figure 4.
FASt层的过程表示如图4所示。
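Figure 4 (image not preserved). The individual components of the Fingerprint Augmentation Stack (FASt). Each subcomponent augments a specific heterogeneity effect.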

第三段（介绍FASt层的设计 1）

The design of the FASt layer is based on the three observations from our analysis in Section III, i.e., AP dropout, contrast, and brightness.
FASt层的设计基于我们在第三节中分析的三个观测值，即AP dropout、对比度和亮度。

While these are well known data augmentation techniques in the domain of computer vision [24], they do not directly translate to the domain of RSSI fingerprinting-based pattern matching.
虽然这些都是计算机视觉领域众所周知的数据增强技术[24]，但它们并不能直接应用于基于 RSSI 指纹的模式匹配领域。

In the computer vision domain, all three effects are applied to the inputs indiscriminately, as it is most likely that the input does not contain cropped portions relative to rest of the image (blacked/whited out regions of image).
在计算机视觉领域，所有三种效果都不加区分地应用于输入，因为很可能输入不包含相对于图像的其余部分(图像的黑/白区域)的裁剪部分。

This is especially true in the domain of image-based pattern matching, where the input image may not have cropped out components, with the exception of image inpainting [25].
在基于图像的模式匹配领域尤其如此，输入图像可能没有被裁剪掉的部分，图像修复(image inpainting)除外[25]。

The utilization of image augmentation before training deep-learning models is a well-known approach for improving generalizability.
在训练深度学习模型之前使用图像增强是一种众所周知的提高泛化能力的方法。

第四段（介绍FASt层的设计 2）

In the case of RSSI-fingerprints, the fact that a certain set of APs are not visible in some areas of the floorplan as compared to the others is a critical and unique attribute.
在rssi指纹的情况下，与其他ap相比，某些ap组在平面图的某些区域不可见这一事实是一个关键且独特的属性。

A deep-learning framework might utilize this to better correlate an RSSI pattern to a specific location on the floorplan.
深度学习框架可以利用这一点更好地将RSSI模式与平面图上的特定位置关联起来。

Based on these beliefs and our analysis in Section III, we adapt the general random dropout, brightness, and contrast layers to only act on WiFi APs that are visible to the smartphone, i.e., RSSI ≠ −100 dBm.
基于这些判断和我们在第三节中的分析，我们将通用的随机dropout、亮度和对比度层调整为仅作用于智能手机可见的WiFi AP，即RSSI ≠ −100 dBm。

It is important to note that we deliberately, do not add random noise to the FASt layer, as it was not specially adapted for the purpose of RSSI fingerprinting-based indoor localization.
值得注意的是，我们故意不向FASt层添加随机噪声，因为它不是专门用于基于RSSI指纹的室内定位的。

However, random noise is an important aspect of the model design as discussed in the next subsection
然而，随机噪声是模型设计的一个重要方面，将在下一小节中讨论
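
A rough sketch of the idea behind the FASt layer is given below: brightness, contrast and dropout are applied only to APs that are visible in the fingerprint. The exact formulas are not given in the text, so everything here is an assumption made for illustration: fingerprints are taken to be normalized to [0, 1] with 0 meaning "not visible", brightness is a constant shift, contrast is a scaling around the mean of the visible APs, and a single `rate` parameter controls both the magnitude of these changes and the dropout probability.

```python
import random

def fast_augment(fingerprint, rate=0.10, rng=None):
    """Augment a normalized RSSI fingerprint, touching visible APs only."""
    if rng is None:
        rng = random.Random()
    out = list(fingerprint)
    visible = [i for i, v in enumerate(out) if v > 0.0]
    if not visible:
        return out
    shift = rng.uniform(-rate, rate)         # random brightness: constant offset
    factor = 1.0 + rng.uniform(-rate, rate)  # random contrast: spread around the mean
    mean = sum(out[i] for i in visible) / len(visible)
    for i in visible:
        value = (out[i] - mean) * factor + mean + shift
        out[i] = min(1.0, max(0.0, value))   # keep values inside [0, 1]
    for i in visible:
        if rng.random() < rate:              # random AP dropout
            out[i] = 0.0
    return out

fingerprint = [0.0, 0.5, 0.0, 0.8, 0.3]
print(fast_augment(fingerprint, rng=random.Random(42)))
```

APs that were not visible in the input stay at 0, so the "which APs are missing" pattern of the location is preserved while the visible values are perturbed.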

D. Multi-Head Attention Model 多头注意模型

第一段（介绍多头注意模型与前文提出的方法结合）

The concept of attention and especially multi-head has gained considerable popularity in various domains of pattern matching [22].
注意的概念，特别是多头的概念在模式匹配的各个领域中已经得到了相当大的普及[22]。

Its simplicity in mathematical operations enables a computationally efficient process for deep learning.
它在数学运算上的简单性使深度学习的计算过程变得高效。

It is also well known that conventional feed-forward neural network layers (dense layers) require lower computational capabilities on the target deployment platform when compared to convolutional approaches.
众所周知，与卷积方法相比，传统的前馈神经网络层(密集层)在目标部署平台上需要更低的计算能力

Additionally, recent work in the domain of pattern matching is slowly considering attention as an alternative to convolutional layers [26].
此外，最近在模式匹配领域的工作正在慢慢考虑将注意力作为卷积层的替代方法[26]。

Based on these observations, we combine our domain specific fingerprint augmentation layer stack (FASt), Multi-Head Attention, and feed-forward neural networks to create a deep-learning classification model as depicted in figure 5.
基于这些观察结果，我们将特定领域的指纹增强层堆栈(FASt)、多头注意和前馈神经网络结合起来，创建了一个深度学习分类模型，如图5所示。
Figure 5: Overview of the multi-head attention neural network model in the ANVIL framework.

Paragraph 2 (model description)

The model shown in figure 5 consists of three inputs: fingerprint query, keys, and one-hot encoded reference point (RP) labels as values.
图5所示的模型由三个输入组成:指纹查询、密钥和作为值的单热编码参考点(RP)标签。

The key-value inputs remain fixed in the offline and online phases of the model development process.
键值输入在模型开发过程的离线和在线阶段保持固定。

The fingerprint keys and RP values are the fingerprints captured at specific RPs using the smartphone from the offline phase.
指纹密钥和RP值是从离线阶段使用智能手机在特定RP处捕获的指纹。

In the offline phase, we use the same fingerprints as queries to train the model, whereas, in the online phase, fingerprints captured by the end-user are fed as queries to the model which in turn produces the location of the user.
在离线阶段，我们使用相同的指纹作为查询来训练模型，然而，在在线阶段，最终用户捕获的指纹作为查询馈送给模型，进而产生用户的位置。

Paragraph 3 (model description)

For the sake of simplicity, all hyperparameters in the model were chosen such that the model generalizes well across all the floorplans discussed in the experiments section.
为了简单起见，选择模型中的所有超参数，使模型可以很好地泛化实验部分中讨论的所有平面图

We employ the value of 0.10 for the dropout, random brightness, and contrast functions of the FASt layer.
我们对FASt层的dropout、随机亮度和对比度函数采用0.10的值。

Therefore, for each RSSI fingerprint Query fed to the model, random dropout, random contrast, and random brightness (increase or decrease) are applied with a 10% probability.
因此，对于提供给模型的每个RSSI指纹查询，以10%的概率应用随机dropout、随机对比度和随机亮度(增加或减少)。

Similarly, the Gaussian noise layer was set to a standard deviation of 0.12. No augmentation is applied to the fingerprint keys.
同样，将高斯噪声层设置为标准差为0.12。没有对指纹键进行增强。

Based on our experiments (not presented for brevity), the overall performance of the model is better when Gaussian noise is applied to the whole fingerprint, instead of the masked approach (RSSI ≠ -100 dBm) used for the FASt layer.
根据我们的实验(为简洁起见，没有给出)，当高斯噪声应用于整个指纹时，模型的整体性能优于用于FASt层的掩膜方法(RSSI ≠ -100 dBm)。

One possible explanation is that the noise layer acts as regularization on the visible AP dropout layer which only acts on visible APs.
一种可能的解释是，噪声层对仅作用于可见AP的dropout层起到了正则化作用。
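To make the "whole fingerprint versus masked" distinction concrete, here is a small sketch of both noise variants. A standard deviation of 0.12 only makes sense on a normalized scale, so the sketch assumes RSSI is mapped from [-100, 0] dBm to [0, 1]; that normalization, and the helper names `normalize` and `add_gaussian_noise`, are assumptions and are not stated in the text above.

```python
import random


def normalize(fingerprint_dbm):
    """Map RSSI from [-100, 0] dBm to [0, 1]; 0.0 then means 'AP not visible'."""
    return [(rssi + 100.0) / 100.0 for rssi in fingerprint_dbm]


def add_gaussian_noise(fingerprint, std=0.12, masked=False, rng=None):
    """Add zero-mean Gaussian noise to a normalized fingerprint.

    masked=False perturbs every AP (the variant the text reports as better);
    masked=True leaves non-visible APs (value 0.0) untouched.
    """
    if rng is None:
        rng = random.Random()
    return [
        value if (masked and value == 0.0) else value + rng.gauss(0.0, std)
        for value in fingerprint
    ]


scaled = normalize([-45.0, -100.0, -70.0])
noisy_whole = add_gaussian_noise(scaled, masked=False, rng=random.Random(1))
noisy_masked = add_gaussian_noise(scaled, masked=True, rng=random.Random(1))
```

In the whole-fingerprint variant the non-visible AP also receives a small random value, while the masked variant keeps it at exactly 0.0.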

Following the augmentation of queries, they are then fed to the multi-head attention layer which has a total of 5 heads (NH) and a head size of 50 (HS).
查询经过增强后，被馈送到多头注意力层，该层共有5个头(NH)，头大小为50 (HS)。

For simplicity of model design, the head size is used as the size of all linear projections within the multi-head attention layer.
为简化模型设计，头部大小作为多头注意层内所有线性投影的大小。

The output from the multi-head attention layer is fed to a stack of feed-forward fully connected or dense neural network layers with an interleaved dropout of 0.10.
多头注意力层的输出被馈送到一组前馈全连接(密集)神经网络层，层间穿插比率为0.10的dropout。

We employ the ReLU activation function across all layers, except for the output layer which uses Softmax.
除了使用Softmax的输出层外，我们在所有层上都使用了ReLu激活函数。

The length of the output layer is set to the number of unique RPs for the given floorplan.
输出层的长度设置为给定平面图的唯一rp的数量

With this setup, our multi-head attention model design has approximately 111K trainable parameters.
通过这种设置，我们的多头注意力模型设计有大约111K个可训练参数。
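The data flow described in this subsection (fingerprint queries attending over fixed fingerprint keys, with one-hot RP labels as values, 5 heads of size 50, then a softmax over RPs) can be sketched as a forward pass in plain Python. This is only a shape-level illustration under several assumptions: the weights are random rather than trained, the dense stack with ReLU and dropout is collapsed into a single softmax layer, the output projection that many multi-head attention implementations add is omitted, and all function names are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Small random weights standing in for trained parameters."""
    scale = 1.0 / math.sqrt(rows)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def multi_head_attention(queries, keys, values, num_heads=5, head_size=50, rng=None):
    """Scaled dot-product attention of each query over (keys, values).

    queries: B x n_aps, keys: N x n_aps, values: N x n_rps (one-hot labels).
    Returns a B x (num_heads * head_size) matrix of concatenated heads.
    """
    if rng is None:
        rng = random.Random(0)
    n_aps, n_rps = len(keys[0]), len(values[0])
    outputs = [[] for _ in queries]
    for _ in range(num_heads):
        # Every linear projection uses the head size, as the text describes.
        q = matmul(queries, random_matrix(n_aps, head_size, rng))
        k = matmul(keys, random_matrix(n_aps, head_size, rng))
        v = matmul(values, random_matrix(n_rps, head_size, rng))
        k_t = [list(col) for col in zip(*k)]
        scores = matmul(q, k_t)
        weights = [softmax([s / math.sqrt(head_size) for s in row]) for row in scores]
        head = matmul(weights, v)
        for out_row, head_row in zip(outputs, head):
            out_row.extend(head_row)
    return outputs


def classify(attended, n_rps, rng):
    """Single softmax layer mapping attention output to RP probabilities."""
    weights = random_matrix(len(attended[0]), n_rps, rng)
    return [softmax(row) for row in matmul(attended, weights)]


rng = random.Random(0)
# Toy data: 6 offline fingerprints over 8 APs, labelled with 3 RPs.
keys = [[rng.random() for _ in range(8)] for _ in range(6)]
values = [[1.0 if i % 3 == j else 0.0 for j in range(3)] for i in range(6)]
queries = keys[:2]  # offline phase: the same fingerprints serve as queries

attended = multi_head_attention(queries, keys, values, rng=rng)
probabilities = classify(attended, n_rps=3, rng=rng)
```

Each query ends up with one probability per RP, which matches the statement that the output layer length equals the number of unique RPs on the floorplan. In the online phase, only `queries` would change; `keys` and `values` stay fixed.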

6. EXPERIMENTS 实验

6.1 Sentence-by-sentence translation

A. Experimental Setup 实验设置

1) Fingerprint Benchmark Path Suite 指纹基准路径套件

Paragraph 1 (indoor paths, part 1)

We evaluate the device invariance of ANVIL against four fingerprinting-based indoor localization frameworks from prior work across a benchmark suite containing five indoor paths in different buildings around the Colorado State University campus.
在科罗拉多州立大学校园内不同建筑中的五条室内路径的基准套件中，我们评估了 ANVIL 与之前工作中的四种基于指纹的室内定位框架的设备不变性。

Figure 6 depicts the floorplans, with each fingerprinted location or reference point (RP) denoted by a yellow dot.
图6描绘了平面图，每个指纹位置或参考点(RP)用一个黄点表示。

We chose a granularity of 1-meter for our experiments (distance between each yellow dot) which we believe is sufficient for the purpose of localizing humans.
我们的实验选择了1米的粒度(每个黄点之间的距离)，我们认为这对于定位人类的目的已经足够了。
Figure 6: Indoor floorplans in the benchmark suite. Reference points are marked with yellow boxes.

Paragraph 2 (experimental setup)

Each path (ranging 60 to 80 meters) was selected due to its salient features that may impact indoor localization.
每条路径(60 - 80米)的选择都是基于其可能影响室内定位的显著特征。

The Classroom floorplan is part of one of the oldest buildings on campus that is constructed from wood and concrete.
教室的平面图是校园里最古老的建筑之一，由木材和混凝土建造而成。

This path is surrounded by a combination of labs that hold heavy metallic equipment as well as large classrooms with open areas.
这条路径被放置重金属设备的实验室和带有开放区域的大教室所包围。

A total of 81 unique WiFi APs were visible on this path.
在这条路径上总共可以看到81个独特的WiFi ap。

The Auditorium and Library floorplans are part of relatively new buildings on campus that have a mix of metal and wooden structures with open study areas and bookshelves.
礼堂和图书馆的平面图是校园中相对较新的建筑的一部分，这些建筑混合了金属和木结构，带有开放的学习区和书架。

We observed 130 and 300 unique APs on the Auditorium and Library floorplans, respectively.
我们在礼堂和图书馆的平面图上分别观察到130和300个独特的ap。

The experiments performed do not consider factors such as the physical placement of the APs, the channel each AP is set to (assumed to be automatic), and the frequency at which an AP is broadcasting (2.4 or 5 GHz).
所进行的实验没有考虑AP的物理位置、每个AP设置的频道(假设是自动的)以及AP广播的频率(2.4 GHz或5 GHz)等因素。

This is a deliberate choice made to highlight the advantages of using fingerprinting-based indoor localization.
这是一个经过深思熟虑的选择，以突出使用基于指纹的室内定位的优势。

Paragraph 3 (experimental setup, part 3)

The Office path is on the second floor of an Engineering building that is surrounded by small offices and covered by 180 APs overall.
办公室通道位于一栋工程大楼的二楼，周围环绕着小办公室，总共有180个ap。

The Labs path is in the Engineering basement and is surrounded by labs consisting of a sizable amount of electronic and mechanical equipment with about 120 visible APs.
实验室通道位于工程部地下室，周围是由大量电子和机械设备组成的实验室，约有 120 个可见 AP。

Large quantities of metal and electronics on the Office and Labs paths lead to noisy WiFi fingerprints that hinder indoor localization efforts.
办公室和实验室路径上的大量金属和电子设备会导致嘈杂的WiFi指纹，从而阻碍室内定位工作。

The fingerprints at both locations were captured in usual working hours and so the human occupancy of these indoor locales is not artificially manipulated in any way.
这两个地点的指纹都是在正常工作时间采集的，因此这些室内场所的人员占用不会以任何方式被人为操纵。

A total of 8 fingerprints per RP are used for training ANVIL in the offline phase and 2 fingerprints per RP are used in the online phase.
在离线阶段，每个RP总共使用8个指纹用于训练ANVIL，在在线阶段，每个RP使用2个指纹。

We employ six smartphones from distinct vendors to capture the WiFi fingerprints across the five floorplans shown in figure 6. The specifications of these smartphones are listed under table I.
我们使用了来自不同供应商的六款智能手机来捕捉图6所示的五个平面图上的WiFi指纹。这些智能手机的规格列在表1下。
Table I: Details of the smartphones used in the experiments.

2) Comparison with Prior Work 与前期工作的比较

Paragraph 1 (the four prior techniques implemented in the paper)

Four state-of-the-art frameworks were implemented to establish the efficacy of our proposed ANVIL framework.
实施了四项最先进的技术，以确定我们建议的ANVIL框架的有效性。

The first work, LearnLoc [5], builds on the K-Nearest-Neighbor (KNN) algorithm, using a lightweight Euclidean distance-based metric to match fingerprints.
第一项工作(LearnLoc[5])建立在k -最近邻(KNN)的基础上，使用轻量级的基于欧几里得距离的度量来匹配指纹。

Interestingly, the authors of [27] performed an error analysis to understand the merits of distance calculations.
有趣的是，[27]的作者进行了误差分析，以了解距离计算的优点。

The primary focus was to create a comparison of Euclidean versus Manhattan distance-based metrics to match fingerprints.
主要的重点是创建欧几里得和曼哈顿距离为基础的指标来匹配指纹的比较。

The work shows evidence that the Euclidean distance-based metric matches fingerprints more accurately.
这项工作显示了基于欧几里得距离的度量来匹配指纹更准确的证据。

These works are incognizant of device heterogeneity, and thus LearnLoc [5] serves as one of our motivations.
这些作品没有意识到设备的异质性，因此，LearnLoc[5]是我们的动机之一。
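For readers unfamiliar with KNN fingerprint matching, the following is a generic sketch of the idea with both distance metrics mentioned above. It is not the LearnLoc implementation, whose details are not given here; the function names, the value of `k`, and the toy fingerprints are all made up for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean (L2) distance between two fingerprints."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (L1) distance between two fingerprints."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_locate(query, database, k=3, distance=euclidean):
    """Return the majority RP label among the k closest stored fingerprints.

    database is a list of (fingerprint, rp_label) pairs from the offline phase.
    """
    nearest = sorted(database, key=lambda entry: distance(query, entry[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical offline database: two fingerprints per RP over three APs (dBm).
database = [
    ([-40.0, -70.0, -90.0], "RP1"),
    ([-42.0, -68.0, -91.0], "RP1"),
    ([-80.0, -50.0, -60.0], "RP2"),
    ([-78.0, -52.0, -61.0], "RP2"),
]
query = [-41.0, -69.0, -90.0]

rp_euclidean = knn_locate(query, database, distance=euclidean)
rp_manhattan = knn_locate(query, database, distance=manhattan)
```

Both metrics compare absolute RSSI values, which is why a matcher of this kind has no built-in protection when the online device reports systematically different signal strengths than the offline device.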

The second work, AdTrain [17], is a deep learning-based approach that achieves device invariance through the addition of noise at the input and output labels.
第二项工作，AdTrain[17]，是一种基于深度学习的方法，通过在输入和输出标签处添加噪声来实现设备不变性。

This creates an adversarial training scenario where the feed-forward neural network-based model attempts to converge in the presence of high-noise (adversity).
这创造了一个对抗性的训练场景，其中基于前馈神经网络的模型试图在高噪声(逆境)的存在下收敛。

The third work, SHERPA [28], is similar to the KNN-based approach in [2], [5] but is enhanced to withstand variations across devices in the offline and online phases of fingerprinting-based indoor localization.
第三项工作，SHERPA[28]，类似于[2]，[5]中基于knn的方法，但在基于指纹的室内定位的离线和在线阶段进行了增强，以承受不同设备的变化。

SHERPA achieves variation resilience by employing Pearson’s correlation as a distance metric to match the overall pattern of the fingerprints instead of the Euclidean distance-based approach in [5].
SHERPA通过使用Pearson’s correlation作为距离度量来匹配指纹的整体模式，而不是在[5]中使用基于欧几里德距离的方法来实现变异弹性。
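Why a correlation-based metric helps with device heterogeneity can be seen in a small worked example. The sketch below is not SHERPA itself, only a comparison of the two metrics on made-up data: a hypothetical online device reports every AP 15 dB weaker than the offline device did at the same location.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length fingerprints."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return cov / norm if norm else 0.0


def euclidean(a, b):
    """Euclidean distance between two fingerprints."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical offline fingerprints (dBm) for two reference points.
database = {
    "RP1": [-40.0, -70.0, -90.0, -55.0],
    "RP2": [-58.0, -80.0, -98.0, -75.0],
}

# Same place as RP1, but measured by a device that reads 15 dB lower.
query = [rssi - 15.0 for rssi in database["RP1"]]

by_pearson = max(database, key=lambda rp: pearson(query, database[rp]))
by_euclidean = min(database, key=lambda rp: euclidean(query, database[rp]))
```

Here the Euclidean match lands on RP2 because the shifted readings are numerically closer to it, while the Pearson match still picks RP1 because a constant offset leaves the shape of the fingerprint unchanged.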

Lastly, the fourth work employs Stacked Auto-Encoders (SAEs) [16] designed to sustain stable localization accuracy in the presence of device heterogeneity across the offline and online phases.
最后，第四项工作采用堆叠自编码器(sae)[16]，旨在在离线和在线阶段存在设备异构的情况下保持稳定的定位精度。

The authors of SAE propose using an ensemble of stacked autoencoders (denoising autoencoders) to overcome the variation in RSSI fingerprinting across different devices.
SAE的作者建议使用堆叠自编码器的集成(去噪自编码器)来克服不同设备之间RSSI指纹识别的差异。

A Gaussian process classifier with a radial basis kernel is then employed to produce the final location of the user.
然后使用具有径向基核的高斯过程分类器来产生用户的最终位置。

B. Experimental Results 实验结果

1) Accuracy Comparison on Benchmark Suite 基准套件的精度比较

Paragraph 1 (figure of localization accuracy)

A color-coded tabular compilation of the mean localization accuracy in meters across all floorplans in the benchmark suite using every combination of offline (training) and online (testing) devices is presented in figure 7.
使用离线(训练)和在线(测试)设备的每种组合，基准套件中所有平面图的平均定位精度(以米为单位)的颜色编码的表格汇编如图7所示。
Figure 7: Mean localization error of prior work compared with the proposed work (ANVIL) across all smartphone combinations in the benchmark suite.

Each rowgroup in figure 7 captures the mean indoor localization error in meters for a single floorplan.
图7中的每个行组捕获单个平面图的平均室内定位误差(以米为单位)。

The device abbreviations on the vertical axes indicate the smartphones used in the offline phase, whereas the devices listed on the horizontal axes indicate smartphones employed by a user in the online phase.
纵轴上的设备缩写表示离线阶段使用的智能手机，而横轴上列出的设备表示用户在在线阶段使用的智能手机。

The observed localization errors are color coded from green (lower/better) to red (higher/worse) per floorplan.
观察到的定位误差从绿色(较低/较好)到红色(较高/较差)标记为每个平面图。

Paragraph 2 (differences in localization accuracy)

From figure 7, we immediately observe that the general performance of all five localization frameworks varies across various floorplans.
从图7中，我们立即观察到所有五种本地化框架的总体性能在不同的平面图中有所不同。

The possible explanation for such an observation could be the differences in path length, overall visibility of WiFi APs, path shape, and other environmental factors.
对这种观察结果的可能解释可能是路径长度、WiFi ap的整体可见性、路径形状和其他环境因素的差异。

The variance in results across different floorplans also highlights the importance of considering a diverse set of environments when evaluating fingerprinting-based indoor localization frameworks.
在评估基于指纹的室内定位框架时，不同平面图的结果差异也强调了考虑不同环境的重要性。

When making observations about the performance of the different localization frameworks we note that across the whole benchmark suite, LearnLoc (with the exception of SAE, discussed later) delivers the least stability in the presence of device heterogeneity.
在观察不同本地化框架的性能时，我们注意到，在整个基准测试套件中，LearnLoc (SAE除外，稍后讨论)在设备异构的情况下提供的稳定性最低。

The simple introduction of Pearson correlation-based pattern-matching metric through SHERPA greatly improves the overall localization accuracy.
通过SHERPA简单地引入基于Pearson相关性的模式匹配度量，极大地提高了整体定位精度。

However, there are some offline-online combinations of smartphones where SHERPA is not very effective.
然而，在一些智能手机的离线-在线组合中，SHERPA并不是很有效。

For example, in the case when the LG-BLU (offline-online) is used in the Auditorium floorplan, SHERPA is not very effective.
例如，在礼堂平面图中使用LG-BLU(离线-在线)的情况下，SHERPA不是很有效。

Based on our empirical analysis, the BLU device exhibits high noise across fingerprints of the same location.
根据我们的实证分析，BLU设备在同一位置的指纹之间表现出高噪声。

Traditional machine learning models, unlike their deep-learning counterparts, are unable to withstand such noisy fingerprints.
与深度学习模型不同，传统的机器学习模型无法承受如此嘈杂的指纹。

In contrast, AdTrain and ANVIL are relatively resilient to such noisy combinations of devices which can be attributed to the benefits from adversarial training and FASt layer for fingerprint augmentation, respectively.
相比之下，AdTrain和ANVIL对这种嘈杂的设备组合具有相对的弹性，这可以分别归因于对抗性训练和用于指纹增强的FASt层的好处。

Paragraph 3 (analysis and comparison of the different models)

From figure 7, we also note that SAE produces stable localization performance across device combinations i.e., different offline-online smartphone combinations produce similar localization error.
从图7中，我们还注意到SAE在不同的设备组合中产生稳定的定位性能，即不同的离线-在线智能手机组合产生相似的定位错误。

However, the localization errors by themselves are extremely large.
然而，定位本身的误差是非常大的。

Due to such poor performance of SAE, we no longer consider it in the subsequent experiments in this paper.
由于SAE的性能如此之差，我们在本文后续的实验中不再考虑它。

AdTrain shows the best resilience to device heterogeneity second only to our ANVIL framework.
AdTrain显示了对设备异质性的最佳恢复能力，仅次于我们的ANVIL框架。

We suspect that the simplistic input and label noise induction plays a critical role in the success of AdTrain.
我们怀疑简单的输入和标签噪声诱导在AdTrain的成功中起着关键作用。

An interesting observation is that the approaches proposed by the authors in AdTrain appear to be orthogonal to the ones proposed by us for the ANVIL framework.
一个有趣的观察是，作者在AdTrain中提出的方法似乎与我们为ANVIL框架提出的方法是正交的。

While ANVIL outperforms AdTrain, its performance could be further improved by combining the approaches of both frameworks.
虽然ANVIL优于AdTrain，但通过结合这两个框架的方法，它的性能可以进一步提高。

Even though figure 7 presents a detailed view of the pair-wise evaluation of various offline-online devices, it is difficult to derive generalized conclusions for the performance of individual frameworks.
尽管图7给出了各种离线-在线设备成对评估的详细视图，但很难得出单个框架性能的一般结论。

We therefore capture the average localization errors and associated standard deviations of all offline-online device pairs in figure 8.
因此，我们在图8中捕获了所有离线-在线设备对的平均定位误差和相关的标准偏差。
Figure 8: Localization performance of ANVIL compared with prior work.
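The step from the pair-wise view in figure 7 to the summary in figure 8 amounts to flattening a device-pair error matrix into a mean and a standard deviation. A minimal sketch follows; the device names and error values are invented, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, pstdev

# Hypothetical mean localization errors in meters, indexed as
# errors[offline_device][online_device]. The numbers are made up.
errors = {
    "A": {"A": 1.0, "B": 2.0, "C": 3.0},
    "B": {"A": 2.0, "B": 1.0, "C": 2.0},
    "C": {"A": 3.0, "B": 2.0, "C": 2.0},
}


def summarize(error_matrix):
    """Collapse a device-pair error matrix into (mean, standard deviation)."""
    flat = [err for row in error_matrix.values() for err in row.values()]
    return mean(flat), pstdev(flat)


average_error, error_spread = summarize(errors)
```

A low `average_error` corresponds to accuracy, and a low `error_spread` corresponds to what the paper calls stability: similar error no matter which offline-online device pair is used.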

From figure 8, we note that as compared to LearnLoc, both AdTrain and SHERPA produce better results, with the exception of AdTrain in the Classroom and Labs floorplans.
从图8中，我们注意到，与LearnLoc相比，AdTrain和SHERPA都产生了更好的结果，除了AdTrain在教室和实验室平面图中

In contrast, ANVIL consistently produces better results across all floorplans and is more stable (lower standard deviation).
相比之下，ANVIL在所有平面图上都能产生更好的结果，并且更稳定(更低的标准差)。

ANVIL is able to perform up to ~30% better as compared to SHERPA and up to ~35% better than AdTrain on the Labs and the Office paths.
在实验室和办公室路径上，ANVIL的性能最多比SHERPA好约30%，最多比AdTrain好约35%。

Paragraph 4 (summary: the proposed framework performs better)

In general, based on our experimental analysis in this subsection, we conclude that our proposed ANVIL framework outperforms all previous state-of-the-art frameworks considered in this paper, and delivers superior localization accuracy (low error) and stability (similar error across smartphone pairs via low standard deviation) across a diverse set of smartphones and benchmark floorplans.
总的来说，根据我们在本小节中的实验分析，我们得出结论，我们提出的ANVIL框架优于本文中考虑的所有先前最先进的框架，并在各种智能手机和基准平面图中提供卓越的定位精度(低误差)和稳定性(通过低标准偏差在智能手机对中产生类似误差)

2) Generalizability of FASt Layer FASt层的通用性

Paragraph 1 (ablation study: FASt)

Within the ANVIL framework, we proposed multi-head attention and the FASt layer as two major contributions that play a significant role in strengthening its device invariance.
在ANVIL框架中，我们提出了多头关注和FASt层作为增强其设备不变性的两个主要贡献

To evaluate the importance of each of these contributions individually and highlight the generalizability of the FASt layer, we apply it to previous works that employ feed-forward FF-DNNs [9] and CNNs (CNNLOC) [6] for fingerprinting-based indoor localization.
为了单独评估这些贡献的重要性并突出FASt层的泛化性，我们将其应用于先前使用前馈ff - dnn[9]和cnn (CNNLOC)[6]进行基于指纹的室内定位的工作。

Our analysis considers two versions of these frameworks: one without the FASt layer (FF-DNN, CNNLOC) and one with the FASt layer (FF-DNN+FASt, CNNLOC+FASt).
我们的分析考虑了这些框架的两个版本:一个没有FASt层(FF-DNN, CNNLOC)，一个有FASt层(FF-DNN+FASt, CNNLOC+FASt)。

We also evaluate two versions of ANVIL, one with FASt (ANVIL) and the other without (ANVIL+NoFASt).
我们还评估了两个版本的ANVIL，一个带有FASt (ANVIL)，另一个没有(ANVIL+NoFASt)。

For the sake of brevity, results are averaged across all offline and online phase devices. The results for this analysis are presented in figure 9.
为简洁起见，结果在所有离线和在线阶段的设备上取平均。此分析的结果如图9所示。
Figure 9: Localization error of ANVIL and previous heterogeneity-incognizant frameworks with and without the FASt layer.

Paragraph 2 (analysis of Figure 9)

From figure 9, we note that CNNLOC, which is heterogeneity incognizant, produces the highest localization error.
从图9中，我们注意到CNNLOC是异构不可识别的，它产生的定位误差最高。

It is possible that CNNLOC tends to overfit the pattern present in the offline phase device.
CNNLOC可能倾向于过拟合离线阶段设备中存在的模式。

The CNNLOC+FASt variant shows improvement but is still worse than most other frameworks across all paths.
CNNLOC+FASt变体显示出改进，但仍然比所有路径上的大多数其他框架差。

Surprisingly, FF-DNN delivers stronger invariance to device heterogeneity.
令人惊讶的是，FF-DNN对设备异质性提供了更强的不变性。

This can be attributed to the fact that the FF-DNN based approach is unable to overfit to the input.
这可以归因于基于FF-DNN的方法无法对输入进行过拟合。

The FF-DNN+FASt approach offers some limited benefits on Auditorium, Library and the Office paths.
FF-DNN+FASt 方法在礼堂、图书馆和办公室路径上的优势有限。

Across most of the paths evaluated in figure 9, the ANVIL and ANVIL+NoFASt produce the best results.
在图9中评估的大多数路径中，ANVIL和ANVIL+NoFASt产生了最好的结果。

The FASt layer when applied to ANVIL and also other frameworks leads to improvement in localization accuracy with the exception of the Classroom path.
当将FASt层应用于ANVIL和其他框架时，除了教室路径之外，还可以提高定位精度。

Notably, the Classroom path generally exhibits the least levels of degradation in localization quality as seen in figures 7, 8, and 9.
值得注意的是，如图7、8和9所示，教室路径通常表现出本地化质量的最低退化水平。

Given that most frameworks achieve ~1.5 meters of average accuracy on the Classroom path, there is very limited room for improvement on that path.
考虑到大多数框架在教室路径上的平均精度达到1.5米，在这条路径上的改进空间非常有限

Paragraph 3 (summary of the role of the two proposed components)

In summary, based on the observations from figure 9, our evaluations strongly highlight the role of 1) the domain adapted fingerprint augmentation (FASt), and 2) multi-head attention, as proposed in ANVIL, for the successful realization of device invariant indoor localization.
综上所述，基于图9的观察结果，我们的评估强烈强调了1)域适应指纹增强(FASt)和2)ANVIL中提出的多头注意力对于成功实现设备不变室内定位的作用。

We believe that these two components can be broadly applicable to many other deep learning based indoor localization frameworks, to improve localization accuracy across heterogeneous devices.
我们相信这两个组件可以广泛应用于许多其他基于深度学习的室内定位框架，以提高跨异构设备的定位精度。

7. CONCLUSION 结论

7.1 Sentence-by-sentence translation

Paragraph 1 (the proposed method improves performance)

In this paper, we presented a novel framework called ANVIL that combines smart fingerprint augmentation as a layer together with multi-head attention in a neural network, for device invariant indoor localization.
在本文中，我们提出了一种新的框架ANVIL，该框架将智能指纹增强作为一个层与神经网络中的多头注意相结合，用于设备不变室内定位。

We evaluated WiFi RSSI fingerprints from smartphones of different vendors and made key empirical observations that informed our fingerprint augmentation strategy.
我们评估了来自不同厂商的智能手机的WiFi RSSI指纹，并进行了关键的实证观察，为我们的指纹增强策略提供了信息。

Our proposed framework was evaluated against several contemporary indoor localization frameworks using six different smartphones across five diverse indoor environments.
我们提出的框架与几个当代室内定位框架进行了评估，这些框架使用了五种不同室内环境中的六种不同智能手机。

Through the evaluations, we deduced that ANVIL delivers considerable resiliency to device heterogeneity and provides up to 35% better performance compared to the previous works.
通过评估，我们推断ANVIL对设备异质性具有相当大的弹性，与以前的工作相比，性能最多提高35%。

Our ongoing work is focusing on exploring incorporating secure indoor localization techniques [29], efficient model compression [30], [31], deep learning model engineering [32], and long-term fingerprint aging resilience [33] into ANVIL.
我们正在进行的工作重点是探索将安全的室内定位技术[29]、高效的模型压缩[30]、[31]、深度学习模型工程[32]和长期指纹老化弹性[33]纳入ANVIL。
