Tutorial: Detecting When A User Blows Into The Mic

http://mobileorchard.com/tutorial-detecting-when-a-user-blows-into-the-mic/

 

 

 

 

If, a couple of years back, you’d told me that people would expect to be able to shake their phone or blow into the mic to make something happen I would have laughed. And here we are.

Detecting a shake gesture is straightforward, all the more so in 3.0 with the introduction of motion events.

Detecting when a user blows into the microphone is a bit more difficult. In this tutorial we’ll create a simple simple single-view app that writes a log message to the console when a user blows into the mic.

Source/Github

The code for this tutorial is available on GitHub. You can either clone the repository or download this zip.

Overview

The job of detecting when a user blows into the microphone is separable into two parts: (1) taking input from the microphone and (2) listening for a blowing sound.

We’ll use the new-in-3.0 AVAudioRecorder class to grab the mic input. Choosing AVAudioRecorder lets us use Objective-C without — as other options require — dropping down to C.

The noise/sound of someone blowing into the mic is made up of low-frequency sounds. We’ll use a low pass filter to reduce the high frequency sounds coming in on the mic; when the level of the filtered signal spikes we’ll know someone’s blowing into the mic.

Creating The Project

Launch Xcode and create a new View-Based iPhone application called MicBlow:

  1. Create a new project using File > New Project… from Xcode’s menu
  2. Select View-based Application from the iPhone OS > Application section, click Choose…
  3. Name the project as MicBlow and click Save

Adding The AVFoundation Framework

In order to use the SDK’s AVAudioRecorder class, we’ll need to add the AVFoundation framework to the project:

  1. Expand the Targets branch in the Groups & Files panel of the project
  2. Control-click or right-click the MicBlow item
  3. Choose Add > Existing Frameworks…
  4. Click the + button at the bottom left beneath Linked Libraries
  5. Choose AVFoundation.framework and click Add
  6. AVFoundation.framework will now be listed under Linked Libraries. Close the window

Next, we’ll import the AVFoundation headers in our view controller’s interface file and set up an AVAudioRecorder instance variable:

  1. Expand the MicBlow project branch in the Groups & Files panel of the project
  2. Expand the Classes folder
  3. Edit MicBlowViewController.h by selecting it
  4. Update the file. Changes are bold:

#import <UIKit/UIKit.h>
#import <AVFoundation/AVFoundation.h>
#import <CoreAudio/CoreAudioTypes.h>

@interface MicBlowViewController : UIViewController {
	AVAudioRecorder *recorder;
}

@end

 

To save a step later, we also imported the CoreAudioTypes headers; we’ll need some of its constants when we set up the AVAudioRecorder.

Taking Input From The Mic

We’ll set everything up and start listening to the mic in ViewDidLoad:

  1. Uncomment the boilerplate ViewDidLoad method
  2. Update it as follows. Changes are bold:

- (void)viewDidLoad {
	[super viewDidLoad];

  	NSURL *url = [NSURL fileURLWithPath:@"/dev/null"];

  	NSDictionary *settings = [NSDictionary dictionaryWithObjectsAndKeys:
  	  	[NSNumber numberWithFloat: 44100.0],                 AVSampleRateKey,
  	  	[NSNumber numberWithInt: kAudioFormatAppleLossless], AVFormatIDKey,
  	  	[NSNumber numberWithInt: 1],                         AVNumberOfChannelsKey,
   	  	[NSNumber numberWithInt: AVAudioQualityMax],         AVEncoderAudioQualityKey,
  	  nil];

  	NSError *error;

  	recorder = [[AVAudioRecorder alloc] initWithURL:url settings:settings error:&error];

  	if (recorder) {
  		[recorder prepareToRecord];
  		recorder.meteringEnabled = YES;
  		[recorder record];
  	} else
  		NSLog([error description]);

}

 

The primary function of AVAudioRecorder is, as the name implies, to record audio. As a secondary function it provides audio-level information. So, here we discard the audio input by dumping it to the /dev/null bit bucket — while I can’t find any documentation to support it, the consensus seems to be that /dev/null will perform the same as on any Unix — and explicitly turn on audio metering.

Note: if you’re adapting the code for your own use, be sure to send the prepareToRecord (or, record) message before setting the meteringEnabled property or the audio level metering won’t work.

Remember to release the recorder in dealloc. Changes are bold:

- (void)dealloc {
  	[recorder release];
  	[super dealloc];
}

 

Sampling The Audio Level

We’ll use a timer to check the audio levels approximately 30 times a second. Add an NSTimer instance variable and its callback method to it in MicBlowViewController.h. Changes are bold:

#import <UIKit/UIKit.h>
#import <AVFoundation/AVFoundation.h>
#import <CoreAudio/CoreAudioTypes.h>

@interface MicBlowViewController : UIViewController {
	AVAudioRecorder *recorder;
	NSTimer *levelTimer;
}

- (void)levelTimerCallback:(NSTimer *)timer;

@end

 

Update the .m file’s ViewDidLoad to enable the timer. Changes are bold:

- (void)viewDidLoad {
	[super viewDidLoad];

  	NSURL *url = [NSURL fileURLWithPath:@"/dev/null"];

  	NSDictionary *settings = [NSDictionary dictionaryWithObjectsAndKeys:
  	  	[NSNumber numberWithFloat: 44100.0],                 AVSampleRateKey,
  	  	[NSNumber numberWithInt: kAudioFormatAppleLossless], AVFormatIDKey,
  	  	[NSNumber numberWithInt: 1],                         AVNumberOfChannelsKey,
   	  	[NSNumber numberWithInt: AVAudioQualityMax],         AVEncoderAudioQualityKey,
  	  nil];

  	NSError *error;

  	recorder = [[AVAudioRecorder alloc] initWithURL:url settings:settings error:&error];

  	if (recorder) {
  		[recorder prepareToRecord];
  		recorder.meteringEnabled = YES;
  		[recorder record];
		levelTimer = [NSTimer scheduledTimerWithTimeInterval: 0.03 target: self selector: @selector(levelTimerCallback:) userInfo: nil repeats: YES];
  	} else
  		NSLog([error description]);

}

 

For now, we’ll just sample the audio input level directly/with no filtering. Add the implementation of levelTimerCallback: to the .m file:

- (void)levelTimerCallback:(NSTimer *)timer {
	[recorder updateMeters];
	NSLog(@"Average input: %f Peak input: %f", [recorder averagePowerForChannel:0], [recorder peakPowerForChannel:0]);
}

 

Sending the updateMeters message refreshes the average and peak power meters. The meter use a logarithmic scale, with -160 being complete quiet and zero being maximum input.

Don’t forget to release the timer in dealloc. Changes are bold:

- (void)dealloc {
	[levelTimer release];
	[recorder release];
  	[super dealloc];
}

 

Listening For A Blowing Sound

As mentioned in the overview, we’ll be using a low pass filter to diminish high frequencies sounds’ contribution to the level. The algorithm creates a running set of results incorporating past sample input; we’ll need an instance variable to hold the results. Update the .h file. Changes are bold:

#import <UIKit/UIKit.h>
#import <AVFoundation/AVFoundation.h>
#import <CoreAudio/CoreAudioTypes.h>

@interface MicBlowViewController : UIViewController {
	AVAudioRecorder *recorder;
	NSTimer *levelTimer;
	double lowPassResults;
}

 

Implement the algorithm by replacing the levelTimerCallback: method with:

- (void)levelTimerCallback:(NSTimer *)timer {
	[recorder updateMeters];

	const double ALPHA = 0.05;
	double peakPowerForChannel = pow(10, (0.05 * [recorder peakPowerForChannel:0]));
	lowPassResults = ALPHA * peakPowerForChannel + (1.0 - ALPHA) * lowPassResults;	

	NSLog(@"Average input: %f Peak input: %f Low pass results: %f", [recorder averagePowerForChannel:0], [recorder peakPowerForChannel:0], lowPassResults);
}

 

Each time the timer’s callback method is triggered the lowPassResults level variable is recalculated. As a convenience, it’s converted to a 0-1 scale, where zero is complete quiet and one is full volume.

We’ll recognize someone as having blown into the mic when the low pass filtered level crosses a threshold. Choosing the threshold number is somewhat of an art. Set it too low and it’s easily triggered; set it too high and the person has to breath into the mic at gale force and at length. For my app’s need, 0.95 works. We’ll replace the log line with a simple conditional:

- (void)listenForBlow:(NSTimer *)timer {
	[recorder updateMeters];

	const double ALPHA = 0.05;
	double peakPowerForChannel = pow(10, (0.05 * [recorder peakPowerForChannel:0]));
	lowPassResults = ALPHA * peakPowerForChannel + (1.0 - ALPHA) * lowPassResults;

	if (lowPassResults > 0.95)
		NSLog(@"Mic blow detected");
}

 

Voila! You can detect when someone blows into the mic.

Caveats and Acknowledgements

This approach works well in most situations, but not universally: I’m writing this article in-flight. The roar of the engines constantly triggers the algorithm. Similarly, a noisy room will often have enough low-frequency sound to trigger the algorithm.

The algorithm was extracted/adapted from this Stack Overflow post. The post used the SCListener library for its audio level detection. SCListener pre-dates AVAudioRecorder; it was created to hide the details of dropping down to C to get audio input. With AVAudioRecorder this is no longer so tough.

Finally, this does work on the simulator. You just need to locate the built in mic on your Mac. To my surprise, the mic is located in the tiny hole to the left of the camera on my first generation Macbook.

1、资源项目源码均已通过严格测试验证,保证能够正常运行; 2、项目问题、技术讨论,可以给博主私信或留言,博主看到后会第一时间与您进行沟通; 3、本项目比较适合计算机领域相关的毕业设计课题、课程作业等使用,尤其对于人工智能、计算机科学与技术等相关专业,更为适合; 4、下载使用后,可先查看README.md或论文文件(如有),本项目仅用作交流学习参考,请切勿用于商业用途。 5、资源来自互联网采集,如有侵权,私聊博主删除。 6、可私信博主看论文后选择购买源代码。 1、资源项目源码均已通过严格测试验证,保证能够正常运行; 2、项目问题、技术讨论,可以给博主私信或留言,博主看到后会第一时间与您进行沟通; 3、本项目比较适合计算机领域相关的毕业设计课题、课程作业等使用,尤其对于人工智能、计算机科学与技术等相关专业,更为适合; 4、下载使用后,可先查看README.md或论文文件(如有),本项目仅用作交流学习参考,请切勿用于商业用途。 5、资源来自互联网采集,如有侵权,私聊博主删除。 6、可私信博主看论文后选择购买源代码。 1、资源项目源码均已通过严格测试验证,保证能够正常运行; 2、项目问题、技术讨论,可以给博主私信或留言,博主看到后会第一时间与您进行沟通; 3、本项目比较适合计算机领域相关的毕业设计课题、课程作业等使用,尤其对于人工智能、计算机科学与技术等相关专业,更为适合; 4、下载使用后,可先查看README.md或论文文件(如有),本项目仅用作交流学习参考,请切勿用于商业用途。 5、资源来自互联网采集,如有侵权,私聊博主删除。 6、可私信博主看论文后选择购买源代码。
应用背景为变电站电力巡检,基于YOLO v4算法模型对常见电力巡检目标进行检测,并充分利用Ascend310提供的DVPP等硬件支持能力来完成流媒体的传输、处理等任务,并对系统性能做出一定的优化。.zip深度学习是机器学习的一个子领域,它基于人工神经网络的研究,特别是利用多层次的神经网络来进行学习和模式识别。深度学习模型能够学习数据的高层次特征,这些特征对于图像和语音识别、自然语言处理、医学图像分析等应用至关重要。以下是深度学习的一些关键概念和组成部分: 1. **神经网络(Neural Networks)**:深度学习的基础是人工神经网络,它是由多个层组成的网络结构,包括输入层、隐藏层和输出层。每个层由多个神经元组成,神经元之间通过权重连接。 2. **前馈神经网络(Feedforward Neural Networks)**:这是最常见的神经网络类型,信息从输入层流向隐藏层,最终到达输出层。 3. **卷积神经网络(Convolutional Neural Networks, CNNs)**:这种网络特别适合处理具有网格结构的数据,如图像。它们使用卷积层来提取图像的特征。 4. **循环神经网络(Recurrent Neural Networks, RNNs)**:这种网络能够处理序列数据,如时间序列或自然语言,因为它们具有记忆功能,能够捕捉数据中的时间依赖性。 5. **长短期记忆网络(Long Short-Term Memory, LSTM)**:LSTM 是一种特殊的 RNN,它能够学习长期依赖关系,非常适合复杂的序列预测任务。 6. **生成对抗网络(Generative Adversarial Networks, GANs)**:由两个网络组成,一个生成器和一个判别器,它们相互竞争,生成器生成数据,判别器评估数据的真实性。 7. **深度学习框架**:如 TensorFlow、Keras、PyTorch 等,这些框架提供了构建、训练和部署深度学习模型的工具和库。 8. **激活函数(Activation Functions)**:如 ReLU、Sigmoid、Tanh 等,它们在神经网络中用于添加非线性,使得网络能够学习复杂的函数。 9. **损失函数(Loss Functions)**:用于评估模型的预测与真实值之间的差异,常见的损失函数包括均方误差(MSE)、交叉熵(Cross-Entropy)等。 10. **优化算法(Optimization Algorithms)**:如梯度下降(Gradient Descent)、随机梯度下降(SGD)、Adam 等,用于更新网络权重,以最小化损失函数。 11. **正则化(Regularization)**:技术如 Dropout、L1/L2 正则化等,用于防止模型过拟合。 12. **迁移学习(Transfer Learning)**:利用在一个任务上训练好的模型来提高另一个相关任务的性能。 深度学习在许多领域都取得了显著的成就,但它也面临着一些挑战,如对大量数据的依赖、模型的解释性差、计算资源消耗大等。研究人员正在不断探索新的方法来解决这些问题。
深度学习是机器学习的一个子领域,它基于人工神经网络的研究,特别是利用多层次的神经网络来进行学习和模式识别。深度学习模型能够学习数据的高层次特征,这些特征对于图像和语音识别、自然语言处理、医学图像分析等应用至关重要。以下是深度学习的一些关键概念和组成部分: 1. **神经网络(Neural Networks)**:深度学习的基础是人工神经网络,它是由多个层组成的网络结构,包括输入层、隐藏层和输出层。每个层由多个神经元组成,神经元之间通过权重连接。 2. **前馈神经网络(Feedforward Neural Networks)**:这是最常见的神经网络类型,信息从输入层流向隐藏层,最终到达输出层。 3. **卷积神经网络(Convolutional Neural Networks, CNNs)**:这种网络特别适合处理具有网格结构的数据,如图像。它们使用卷积层来提取图像的特征。 4. **循环神经网络(Recurrent Neural Networks, RNNs)**:这种网络能够处理序列数据,如时间序列或自然语言,因为它们具有记忆功能,能够捕捉数据中的时间依赖性。 5. **长短期记忆网络(Long Short-Term Memory, LSTM)**:LSTM 是一种特殊的 RNN,它能够学习长期依赖关系,非常适合复杂的序列预测任务。 6. **生成对抗网络(Generative Adversarial Networks, GANs)**:由两个网络组成,一个生成器和一个判别器,它们相互竞争,生成器生成数据,判别器评估数据的真实性。 7. **深度学习框架**:如 TensorFlow、Keras、PyTorch 等,这些框架提供了构建、训练和部署深度学习模型的工具和库。 8. **激活函数(Activation Functions)**:如 ReLU、Sigmoid、Tanh 等,它们在神经网络中用于添加非线性,使得网络能够学习复杂的函数。 9. **损失函数(Loss Functions)**:用于评估模型的预测与真实值之间的差异,常见的损失函数包括均方误差(MSE)、交叉熵(Cross-Entropy)等。 10. **优化算法(Optimization Algorithms)**:如梯度下降(Gradient Descent)、随机梯度下降(SGD)、Adam 等,用于更新网络权重,以最小化损失函数。 11. **正则化(Regularization)**:技术如 Dropout、L1/L2 正则化等,用于防止模型过拟合。 12. **迁移学习(Transfer Learning)**:利用在一个任务上训练好的模型来提高另一个相关任务的性能。 深度学习在许多领域都取得了显著的成就,但它也面临着一些挑战,如对大量数据的依赖、模型的解释性差、计算资源消耗大等。研究人员正在不断探索新的方法来解决这些问题。
深度学习是机器学习的一个子领域,它基于人工神经网络的研究,特别是利用多层次的神经网络来进行学习和模式识别。深度学习模型能够学习数据的高层次特征,这些特征对于图像和语音识别、自然语言处理、医学图像分析等应用至关重要。以下是深度学习的一些关键概念和组成部分: 1. **神经网络(Neural Networks)**:深度学习的基础是人工神经网络,它是由多个层组成的网络结构,包括输入层、隐藏层和输出层。每个层由多个神经元组成,神经元之间通过权重连接。 2. **前馈神经网络(Feedforward Neural Networks)**:这是最常见的神经网络类型,信息从输入层流向隐藏层,最终到达输出层。 3. **卷积神经网络(Convolutional Neural Networks, CNNs)**:这种网络特别适合处理具有网格结构的数据,如图像。它们使用卷积层来提取图像的特征。 4. **循环神经网络(Recurrent Neural Networks, RNNs)**:这种网络能够处理序列数据,如时间序列或自然语言,因为它们具有记忆功能,能够捕捉数据中的时间依赖性。 5. **长短期记忆网络(Long Short-Term Memory, LSTM)**:LSTM 是一种特殊的 RNN,它能够学习长期依赖关系,非常适合复杂的序列预测任务。 6. **生成对抗网络(Generative Adversarial Networks, GANs)**:由两个网络组成,一个生成器和一个判别器,它们相互竞争,生成器生成数据,判别器评估数据的真实性。 7. **深度学习框架**:如 TensorFlow、Keras、PyTorch 等,这些框架提供了构建、训练和部署深度学习模型的工具和库。 8. **激活函数(Activation Functions)**:如 ReLU、Sigmoid、Tanh 等,它们在神经网络中用于添加非线性,使得网络能够学习复杂的函数。 9. **损失函数(Loss Functions)**:用于评估模型的预测与真实值之间的差异,常见的损失函数包括均方误差(MSE)、交叉熵(Cross-Entropy)等。 10. **优化算法(Optimization Algorithms)**:如梯度下降(Gradient Descent)、随机梯度下降(SGD)、Adam 等,用于更新网络权重,以最小化损失函数。 11. **正则化(Regularization)**:技术如 Dropout、L1/L2 正则化等,用于防止模型过拟合。 12. **迁移学习(Transfer Learning)**:利用在一个任务上训练好的模型来提高另一个相关任务的性能。 深度学习在许多领域都取得了显著的成就,但它也面临着一些挑战,如对大量数据的依赖、模型的解释性差、计算资源消耗大等。研究人员正在不断探索新的方法来解决这些问题。
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值