HoloLens Study Notes and Translation Log, Part 5: Voice

Brief notes taken while reading Microsoft's official documentation:

https://developer.microsoft.com/zh-cn/windows/holographic/documentation

These are personal study notes; corrections are welcome.


The fourth of the six basic HoloLens concepts

Voice: https://developer.microsoft.com/zh-cn/windows/holographic/voice_input


1.Voice is one of the three key forms of input on HoloLens. It allows you to directly command a hologram without having to use gestures. You simply gaze at a hologram and speak your command. Voice input can be a natural way to communicate your intent. Voice is especially good at traversing complex interfaces because it lets users cut through nested menus with one command.

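Summary: voice is one of the three main input methods on HoloLens. You can control a hologram without gestures by gazing at it and speaking a command. Voice expresses intent naturally, and it is especially useful in complex interfaces because a single command can skip past nested menus.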

2.Voice input is powered by the same engine that supports speech in all other Universal Windows Apps.

Summary: voice input runs on the same speech engine that every other Universal Windows App uses.


A.The "Select" Command

1.Even without specifically adding voice support to your app, your users can activate your holograms simply by saying "select". This behaves the same as a press and release with your hand or a clicker.

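Summary: even if an app adds no voice support of its own, users can activate a hologram just by saying "select", which works like a tap gesture or a clicker press.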


B.Hey Cortana

1.You can also say "Hey Cortana" to bring up Cortana at any time. You don't have to wait for her to appear to continue asking her your question or giving her an instruction - for example, try saying "Hey Cortana what's the weather?" as a single sentence. For more information about Cortana and what you can do, simply ask her! Say "Hey Cortana what can I say?" and she'll pull up a list of working and suggested commands. If you're already in the Cortana app you can also click the ? icon on the sidebar to pull up this same menu.

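Summary: you can say "Hey Cortana" at any time to bring up Cortana, and you do not need to wait for her to appear before continuing; the wake phrase and the request can be one sentence, such as "Hey Cortana what's the weather?". Saying "Hey Cortana what can I say?" lists working and suggested commands, and the ? icon on the Cortana app's sidebar opens the same list.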

HoloLens-specific commands (try these yourself; I have not translated them)

  • What can I say?
  • Go home | Go to Start - instead of bloom to get to Start Menu
  • Launch <app>
  • Move <app> here
  • Take a picture
  • Start recording
  • Stop recording
  • Increase the brightness
  • Decrease the brightness
  • Increase the volume
  • Decrease the volume
  • Mute | Unmute
  • Shut down the device
  • Restart the device
  • Go to sleep
  • What time is it?
  • How much battery do I have left?
  • Call <contact> (requires HoloSkype)
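
The list above uses two notations: "|" separates alternative phrasings, and "<app>" or "<contact>" marks a slot the speaker fills in. As a reading aid only, here is a minimal Python sketch that interprets that notation with regular expressions. It is not how Cortana recognizes speech; the names TEMPLATES, compile_template and match_command, and the app names in the usage lines, are hypothetical.

```python
import re

# A few command templates copied from the list above.
# "|" separates alternative phrasings; "<name>" is a slot the speaker fills in.
TEMPLATES = [
    "Go home | Go to Start",
    "Launch <app>",
    "Move <app> here",
    "Mute | Unmute",
    "Call <contact>",
]


def compile_template(template):
    """Turn one template line into a list of (phrase, compiled regex) pairs."""
    compiled = []
    for phrase in (p.strip() for p in template.split("|")):
        # Split into literal text and "<slot>" pieces, keeping the slots.
        parts = re.split(r"(<\w+>)", phrase)
        pattern = "".join(
            f"(?P<{part[1:-1]}>.+)" if part.startswith("<") else re.escape(part)
            for part in parts
        )
        compiled.append((phrase, re.compile(pattern, re.IGNORECASE)))
    return compiled


def match_command(utterance, templates=TEMPLATES):
    """Return (matched phrase, slot values) for an utterance, or None."""
    text = utterance.strip().rstrip("?.!")
    for template in templates:
        for phrase, regex in compile_template(template):
            found = regex.fullmatch(text)
            if found:
                return phrase, found.groupdict()
    return None


if __name__ == "__main__":
    print(match_command("launch Photos"))
    print(match_command("Move Edge here"))
    print(match_command("sing a song"))
```

Matching is case-insensitive and requires the whole utterance to fit one phrasing, so "Mute" and "Unmute" stay distinct commands.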


C.See it, Say it

1.HoloLens has a "see it, say it" model for voice input, where labels on buttons tell users what voice commands they can say as well. For example, when looking at a 2D app, a user can say the "Adjust" command which they see in the App bar to adjust the position of the app in the world.

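Summary: HoloLens follows a "see it, say it" model, in which a button's label doubles as its voice command. For example, a user looking at a 2D app sees "Adjust" in the app bar and can say "Adjust" to reposition the app in the world.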


2.When apps follow this rule, users can easily understand what to say to control the system. To reinforce this, while gazing at a button, you will see a "voice dwell" tooltip that comes up after a second if the button is voice-enabled and displays the command to speak to "press" it.

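Summary: when apps follow this rule, users can easily tell what to say to control the system. As reinforcement, gazing at a voice-enabled button for about a second brings up a "voice dwell" tooltip showing the command that presses it.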
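
To make the "see it, say it" rule concrete, here is a small Python sketch of a button whose visible label is also its voice command, with a dwell tooltip and the built-in "select" fallback from section A. Everything in it is an assumption made for illustration: the Button class, the DWELL_SECONDS value (the documentation only says "after a second"), and the tooltip wording are not part of any HoloLens API.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Assumption: the documentation only says the tooltip appears "after a second".
DWELL_SECONDS = 1.0


@dataclass
class Button:
    label: str                    # text shown on the button
    on_press: Callable[[], str]   # action run when the button is "pressed"
    voice_enabled: bool = True    # whether the label works as a voice command

    def tooltip(self, gaze_seconds: float) -> Optional[str]:
        """Return the voice dwell tooltip once the gaze has lasted long enough."""
        if self.voice_enabled and gaze_seconds >= DWELL_SECONDS:
            return f'Say "{self.label}"'  # hypothetical tooltip wording
        return None

    def hear(self, utterance: str) -> Optional[str]:
        """Press the gazed-at button if the phrase is "select" or its label."""
        spoken = utterance.strip().lower()
        if spoken == "select":
            # "select" works even when the app added no voice support.
            return self.on_press()
        if self.voice_enabled and spoken == self.label.lower():
            return self.on_press()
        return None


if __name__ == "__main__":
    adjust = Button("Adjust", lambda: "adjusting position")
    print(adjust.tooltip(0.4))   # gaze too short, no tooltip yet
    print(adjust.tooltip(1.2))
    print(adjust.hear("adjust"))
```

Because the command is derived from the label, there is no separate list of voice phrases to keep in sync with what the user sees.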


D.Voice commands for fast Hologram Manipulation

There are also a number of voice commands you can say while gazing at a hologram to quickly perform manipulation tasks. These voice commands work on 2D apps as well as 3D objects you have placed in the world.

Summary: a few voice commands act directly on the hologram you are gazing at, and they work on both 2D apps and 3D objects placed in the world.

Hologram Manipulation Commands

  • Face me
  • Bigger | Enhance
  • Smaller

E.Dictation

1.Rather than typing with air-taps, voice dictation can be more efficient to enter text into an app. This can greatly accelerate input with less effort for the user.

Summary: dictation is usually more efficient than typing with air taps, so the user enters text faster and with less effort.

2.Any time the holographic keyboard is active, you can switch to dictation mode instead of typing. Select the microphone on the side of the text input box to get started.

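Summary: whenever the holographic keyboard is active, you can select the microphone button beside the text input box to switch from typing to dictation.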


F.Communication

1.For applications that want to take advantage of the customized audio input processing options provided by HoloLens, it is important to understand the various audio stream categories your app can consume. Windows 10 supports several different stream categories and HoloLens makes use of three of these to enable custom processing to optimize the microphone audio quality tailored for speech, communication and other which can be used for ambient environment audio capture (i.e. "camcorder") scenarios.

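Summary: apps that want to use HoloLens's customized audio input processing need to know which audio stream categories they can consume. Windows 10 supports several categories, and HoloLens uses three of them to optimize microphone audio for speech, for communication, and for ambient capture ("camcorder" scenarios).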


  • The AudioCategory_Communications stream category is customized for call quality and narration scenarios and provides the client with a 16 kHz, 24-bit mono audio stream of the user's voice.
    Summary: use this category for calls and narration.

  • The AudioCategory_Speech stream category is customized for the HoloLens (Windows) speech engine and provides it with a 16 kHz, 24-bit mono stream of the user's voice. This category can be used by 3rd party speech engines if needed.
    Summary: this is the stream the built-in speech engine consumes; third-party speech engines can use it too.

  • The AudioCategory_Other stream category is customized for ambient environment audio recording and provides the client with a 48 kHz, 24-bit stereo audio stream.
    Summary: use this category to record ambient sound.
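
To compare the three formats, the arithmetic below works out the raw data rate of each stream from the figures in the list. It is a rough sketch that assumes uncompressed PCM with 24-bit samples packed into 3 bytes; the documentation does not say how samples are laid out in memory, and the StreamFormat class and FORMATS table are hypothetical names used only here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamFormat:
    sample_rate_hz: int
    bits_per_sample: int
    channels: int

    @property
    def bytes_per_second(self) -> int:
        # Assumption: uncompressed PCM, samples packed with no padding.
        return self.sample_rate_hz * (self.bits_per_sample // 8) * self.channels


# Figures taken from the three bullet points above.
FORMATS = {
    "AudioCategory_Communications": StreamFormat(16_000, 24, 1),
    "AudioCategory_Speech": StreamFormat(16_000, 24, 1),
    "AudioCategory_Other": StreamFormat(48_000, 24, 2),
}

if __name__ == "__main__":
    for name, fmt in FORMATS.items():
        print(f"{name}: {fmt.bytes_per_second} bytes/s")
```

Under these assumptions the two voice streams carry 16,000 × 3 × 1 = 48,000 bytes per second, while the ambient stream carries 48,000 × 3 × 2 = 288,000 bytes per second, six times as much.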

2.All this audio processing is hardware accelerated which means the features drain a lot less power than if the same processing was done on the HoloLens CPU. Avoid running other audio input processing on the CPU to maximize system battery life and take advantage of the built in, offloaded audio input processing.

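Summary: all of this audio processing is hardware accelerated, so it uses far less power than the same processing would on the HoloLens CPU. To maximize battery life, avoid running additional audio input processing on the CPU and rely on the built-in, offloaded processing instead.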


G.Troubleshooting

1.If you're having any issues using "select" and "Hey Cortana", try moving to a quieter space, turning away from the source of noise, or speaking louder. At this time, all speech recognition on HoloLens is tuned and optimized specifically to native speakers of United States English.

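Summary: if "select" or "Hey Cortana" is not recognized, try a quieter space, turn away from the noise source, or speak louder. Speech recognition on HoloLens is currently tuned specifically for native speakers of United States English.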

