Skype&Type: Keyboard Eavesdropping in Voice-over-IP

Voice-over-IP (VoIP) software is widespread and pervasive. However, one drawback is often overlooked: along with voice, VoIP transmits other sounds, such as keystroke sounds, which can reveal what someone is typing on a keyboard. In general, circumventing cryptography-based data protection requires compromising one of the end hosts to capture plaintext before it is encrypted. A more convenient way to capture plaintext before encryption, without compromising a system, is to eavesdrop on the unintentional physical emanations that devices leak during regular operation, including electromagnetic, visual, tactile, and acoustic emanations. I/O peripherals (e.g., keyboards, mice, touch-screens, and printers) are convenient targets for physical eavesdropping attacks because they directly leak information about the unencrypted input or output text. Exploiting keyboard acoustic emanations has already been proven effective for reconstructing typed input: an attacker learns what a victim is typing by analyzing the sound produced by the keystrokes. Keystrokes are recorded either directly, using microphones, or by exploiting various sensors (e.g., accelerometers). Once collected through eavesdropping, the audio stream is typically analyzed using techniques such as supervised and unsupervised machine learning or triangulation to fully or partially reconstruct the victim's input. Until recently, all proposed keyboard acoustic eavesdropping attacks required a compromised (i.e., adversary-controlled) microphone near the victim's keyboard, which in turn requires physical access. This strongly limits the applicability of such attacks and reduces their real-world feasibility. A recent proposal, the Skype&Type attack (S&T for short), relaxes the physical-proximity requirement by exploiting VoIP applications, which moves the adversary to a remote setting.
The attack is premised on the observation that people involved in VoIP calls often engage in secondary activities, many of which involve using the keyboard (e.g., entering a password). VoIP software automatically acquires all acoustic emanations, including those of the keyboard, and faithfully transmits them to all other parties on the call. This gives one or more potentially malicious parties the opportunity to determine what the user typed from the keystroke sounds. Such an adversary is realistic because two parties engaged in a VoIP call do not always trust each other, e.g., lawyers on opposite sides of a legal case or negotiators representing different parties. Additionally, the pervasiveness of VoIP software gives an attacker a huge attack surface, so previous approaches struggle to provide adequate defense. Microsoft Skype alone, one very popular VoIP application, has about 300 million monthly active users. VoIP audio conveys enough information to reconstruct the victim's input, that is, the keystrokes typed on the remote keyboard. These facts have, to some extent, made eavesdropping on keyboard input an active and popular area of research. In this paper, the authors present and assess a new keyboard acoustic eavesdropping attack involving VoIP, called Skype&Type (S&T). Unlike previous attacks, which assume a stronger adversary model, S&T is practical and feasible in many real-world settings: it requires neither physical proximity to the victim (in person or with a recording device) nor precise profiling of the victim's typing style and keyboard. Moreover, S&T can work with a very small number of leaked keystrokes, which is what a VoIP call is likely to provide.
The experiments show that S&T attains a top-5 accuracy of 91.7% in guessing a random key pressed by the victim, and that S&T is effective with many different recording devices (such as laptop microphones, headset microphones, and smartphones located near the target keyboard) and with diverse typing styles and speeds. In particular, S&T achieves a higher attack success rate when the victim is typing in a known language. The contributions of this paper can be summarized as follows: (1) The authors demonstrate the S&T attack, based on remote keyboard acoustic eavesdropping over VoIP software, with the goal of recovering text typed by the user during a VoIP call with the attacker, including random text such as randomly generated passwords or PINs. (2) The S&T attack is highly accurate with minimal profiling of the victim's typing style and keyboard, and remains quite accurate even if no profiling is available to the adversary; it is therefore feasible and applicable in real-world settings under realistic assumptions. (3) Extensive experiments show that S&T works well with different common and inexpensive recording devices across a wide variety of typing styles and speeds, and is also robust to VoIP-related issues, such as limited bandwidth that degrades call quality, as well as human speech overlapping the keystroke sounds. (4) Based on insights from the design and evaluation phases of this work, the authors propose a countermeasure to S&T and to similar attacks that exploit spectral properties of keystroke sounds. The countermeasure is transparent to the user and does not severely impact voice quality during the call, yet it disrupts the spectral features, making data previously collected by an adversary useless. Compared to the preliminary version, the novel contributions of this work are a greatly extended experimental evaluation, improvements to the performance of S&T, and the proposed countermeasure.
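To make the "top-5 accuracy" figure concrete, the following minimal sketch shows how such a metric is commonly computed: a guess counts as a hit when the true key appears among the attacker's k most likely candidates. The function name `top_k_accuracy` and the sample data are hypothetical, chosen only for illustration; they are not taken from the paper or its evaluation.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_guesses: Sequence[Sequence[str]],
    true_keys: Sequence[str],
    k: int = 5,
) -> float:
    """Fraction of keystrokes whose true key is among the first k guesses.

    ranked_guesses[i] lists candidate keys for keystroke i, most likely first.
    (Hypothetical helper for illustration; not the authors' code.)
    """
    if len(ranked_guesses) != len(true_keys):
        raise ValueError("ranked_guesses and true_keys must have equal length")
    if not true_keys:
        raise ValueError("at least one keystroke is required")
    hits = sum(
        1
        for guesses, key in zip(ranked_guesses, true_keys)
        if key in guesses[:k]
    )
    return hits / len(true_keys)


# Made-up example: three keystrokes, each with a ranked candidate list.
sample_guesses = [
    ["a", "s", "q", "w", "z"],       # true key "a" is ranked first
    ["v", "n", "g", "h", "b"],       # true key "b" is ranked fifth
    ["x", "d", "s", "f", "e", "c"],  # true key "c" is ranked sixth
]
sample_truth = ["a", "b", "c"]

if __name__ == "__main__":
    print("top-1:", top_k_accuracy(sample_guesses, sample_truth, k=1))
    print("top-5:", top_k_accuracy(sample_guesses, sample_truth, k=5))
```

In this made-up data, two of the three true keys fall within the first five guesses, while only one is ranked first. This illustrates why a top-5 figure is always at least as high as the corresponding top-1 figure.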
