OpenEars includes offline speech processing and more.
http://www.politepix.com/openears/
- Welcome to OpenEars: free speech recognition and speech synthesis for the iPhone
- Introduction
- Installation
- Basic concepts
- FliteController Class Reference
- LanguageModelGenerator Class Reference
- OpenEarsEventsObserver Class Reference
- OpenEarsLogging Class Reference
- PocketsphinxController Class Reference
- <OpenEarsEventsObserverDelegate> Protocol Reference
Introduction
OpenEars is a shared-source iOS framework for iPhone voice recognition and speech synthesis (TTS). It lets you easily implement round-trip English-language speech recognition and text-to-speech on the iPhone and iPad using the open source CMU Pocketsphinx, CMU Flite, and CMUCLMTK libraries, and it is free to use in an iPhone or iPad app. It is the most popular offline framework for speech recognition and speech synthesis on iOS and has been featured in development books such as O'Reilly's Basic Sensors in iOS by Alasdair Allan and the Cocos2d for iPhone 1 Game Development Cookbook by Nathan Burba.
Highly-accurate large-vocabulary recognition (that is, trying to recognize any word the user speaks out of many thousands of known words) is not yet a reality for local in-app processing on the iPhone given the hardware limitations of the platform; even Siri does its large-vocabulary recognition on the server side. However, Pocketsphinx (the open source voice recognition engine that OpenEars uses) is capable of local recognition on the iPhone of vocabularies with hundreds of words depending on the environment and other factors, and performs very well with command-and-control language models. The best part is that it uses no network connectivity because all processing occurs locally on the device.
The current version of OpenEars is 1.2.4. Download OpenEars 1.2.4 or read its changelog.
Features of OpenEars
OpenEars can:
- Listen continuously for speech on a background thread, while suspending or resuming speech processing on demand, all while using less than 4% CPU on average on an iPhone 4 (decoding speech, text-to-speech, updating the UI and other intermittent functions use more CPU),
- Use any of 9 voices for speech, including male and female voices with a range of speed/quality levels, and switch between them on the fly,
- Change the pitch, speed and variance of any text-to-speech voice,
- Know whether headphones are plugged in and continue voice recognition during text-to-speech only when they are plugged in,
- Support bluetooth audio devices (experimental),
- Dispatch information to any part of your app about the results of speech recognition and speech, or changes in the state of the audio session (such as an incoming phone call or headphones being plugged in),
- Deliver level metering for both speech input and speech output so you can design visual feedback for both states.
- Support JSGF grammars,
- Dynamically generate new ARPA language models in-app based on input from an NSArray of NSStrings,
- Switch between ARPA language models or JSGF grammars on the fly,
- Get n-best lists with scoring,
- Test existing recordings,
- Be easily interacted with via standard and simple Objective-C methods,
- Control all audio functions with text-to-speech and speech recognition in memory instead of writing audio files to disk and then reading them,
- Drive speech recognition with a low-latency Audio Unit driver for highest responsiveness,
- Be installed in a Cocoa-standard fashion using an easy-peasy already-compiled framework.
- In addition to its various new features and faster recognition/text-to-speech responsiveness, OpenEars now has improved recognition accuracy.
- OpenEars is free to use in an iPhone or iPad app.
Warning
Before using OpenEars, please note that it has to use a different, less accurate audio driver on the Simulator, so it is always necessary to evaluate accuracy on a real device. Please don't submit support requests for accuracy issues with the Simulator.
Warning
Because Apple removed armv6 architecture compiling from Xcode 4.5, and upcoming devices using the armv7s architecture can only be supported with Xcode 4.5, there was no option but to end support for armv6 devices after OpenEars 1.2. That means the current version of OpenEars only supports armv7 and armv7s devices (iPhone 3GS and later). If your app supports older devices like the first-generation iPhone or the iPhone 3G, you can continue to download the legacy edition of OpenEars 1.2 here, but that edition will not be updated further – all versions of OpenEars starting with 1.2.1 support only armv7 and armv7s, not armv6. If you have previously been supporting older devices and you want to submit an app update removing that support, you must set your minimum deployment target to iOS 4.3 or later, or your app will be rejected by Apple. The framework is 100% compatible with LLVM-using versions of Xcode that precede version 4.5, but your app must be set not to compile the armv6 architecture in order to use it.
Installation
To use OpenEars:
- Download the distribution and unpack it.
- Create your own app, and add the iOS frameworks AudioToolbox and AVFoundation to it.
- Inside your downloaded distribution there is a folder called "Frameworks". Drag the "Frameworks" folder into your app project in Xcode.
OK, now that you've finished laying the groundwork, you have to...wait, that's everything. You're ready to start using OpenEars. Give the sample app a spin to try out the features (the sample app uses ARC so you'll need a recent Xcode version) and then visit the Politepix interactive tutorial generator for a customized tutorial showing you exactly what code to add to your app for all of the different functionality of OpenEars.
If the steps on this page didn't work for you, you can get free support at the forums, read the FAQ, brush up on the documentation, or open a private email support incident at the Politepix shop. If you'd like to read the documentation, simply read onward.
Basic concepts
There are a few basic concepts to understand about voice recognition and OpenEars that will make it easiest to create an app.
- Local or offline speech recognition versus server-based or online speech recognition: most speech recognition on the iPhone is done by streaming the speech audio to servers. OpenEars works by doing the recognition inside the iPhone without using the network. This saves bandwidth and results in faster response, but since a server is much more powerful than a phone it means that we have to work with much smaller vocabularies to get accurate recognition.
- Language Models. The language model is the vocabulary that you want OpenEars to understand, in a format that its speech recognition engine can understand. The smaller and better-adapted to your users' real usage cases the language model is, the better the accuracy. An ideal language model for PocketsphinxController has fewer than 200 words.
- The parts of OpenEars. OpenEars has a simple, flexible and very powerful architecture. PocketsphinxController recognizes speech using a language model that was dynamically created by LanguageModelGenerator. FliteController creates synthesized speech (TTS). And OpenEarsEventsObserver dispatches messages about every feature of OpenEars (what speech was understood by the engine, whether synthesized speech is in progress, if there was an audio interruption) to any part of your app.
Detailed Description
The class that controls speech synthesis (TTS) in OpenEars.
Usage examples
Preparing to use the class:
To use FliteController, you need at least one Flite voice added to your project. When you added the "Frameworks" folder of OpenEars to your app, you already imported a voice called Slt, so these instructions will use the Slt voice. You can get eight more free voices in OpenEarsExtras, available at https://bitbucket.org/Politepix/openearsextras
Add the following lines to your header (the .h file).

Under the imports at the very top:

#import <Slt/Slt.h>
#import <OpenEars/FliteController.h>

In the middle part where instance variables go:

FliteController *fliteController;
Slt *slt;

In the bottom part where class properties go:

@property (strong, nonatomic) FliteController *fliteController;
@property (strong, nonatomic) Slt *slt;
Add the following to your implementation (the .m file).

Under the @implementation keyword at the top:

@synthesize fliteController;
@synthesize slt;

Among the other methods of the class, add these lazy accessor methods for confident memory management of the objects:

- (FliteController *)fliteController {
	if (fliteController == nil) {
		fliteController = [[FliteController alloc] init];
	}
	return fliteController;
}

- (Slt *)slt {
	if (slt == nil) {
		slt = [[Slt alloc] init];
	}
	return slt;
}
In the method where you want to call speech (to test this out, add it to your viewDidLoad method), add the following method call:
[self.fliteController say:@"A short statement" withVoice:self.slt];
Warning
There can only be one FliteController instance in your app at any given moment.
Method Documentation
- (void) say:(NSString *)statement withVoice:(FliteVoice *)voiceToUse
This takes an NSString which is the word or phrase you want to say, and the FliteVoice to use to say it.
There are a total of nine FliteVoices available for use with OpenEars. The Slt voice is the most popular one and it ships with OpenEars. The other eight voices can be downloaded as part of the OpenEarsExtras package available at http://bitbucket.org/Politepix/openearsextras. To use them, just drag the desired downloaded voice's framework into your app, import its header at the top of your calling class (e.g. #import <Slt/Slt.h> or #import <Rms/Rms.h>), instantiate it as you would any other object, and then pass the instantiated voice to this method.
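As a sketch of how a second voice might be used, assuming you have added the Rms framework from OpenEarsExtras and set up a hypothetical rms property with a lazy accessor like the Slt example above:

```objc
#import <Rms/Rms.h>

// Speak two statements with different voices, switching on the fly.
[self.fliteController say:@"A short statement" withVoice:self.slt];
[self.fliteController say:@"Another short statement" withVoice:self.rms]; // self.rms is a hypothetical Rms * property set up like self.slt
```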
- (Float32) fliteOutputLevel
A read-only attribute that tells you the volume level of synthesized speech in progress. This is a UI hook. Don't read it on the main thread, or it will block.
Property Documentation
duration_stretch changes the speed of the voice. It is on a scale of 0.0-2.0, where 1.0 is the default.

target_mean changes the pitch of the voice. It is on a scale of 0.0-2.0, where 1.0 is the default.

target_stddev changes the variance of the voice. It is on a scale of 0.0-2.0, where 1.0 is the default.

Set userCanInterruptSpeech to TRUE in order to let new incoming human speech cut off synthesized speech in progress.
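A minimal sketch of adjusting these properties before speaking (the values here are arbitrary examples):

```objc
// Make the voice a bit slower, higher-pitched, and more varied before speaking.
self.fliteController.duration_stretch = 1.2; // slower than the 1.0 default
self.fliteController.target_mean = 1.3;      // higher pitch than the 1.0 default
self.fliteController.target_stddev = 1.1;    // slightly more variance than the 1.0 default
[self.fliteController say:@"A short statement" withVoice:self.slt];
```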
Detailed Description
The class that generates the vocabulary the PocketsphinxController is able to understand.
Usage examples
Add the following to your implementation (the .m file).

Under the imports at the very top:

#import <OpenEars/LanguageModelGenerator.h>

Wherever you need to instantiate the language model generator, do it as follows:

LanguageModelGenerator *lmGenerator = [[LanguageModelGenerator alloc] init];
In the method where you want to create your language model (for instance your viewDidLoad method), add the following method call (replacing the placeholders like "WORD" and "A PHRASE" with actual words and phrases you want to be able to recognize):

NSArray *words = [NSArray arrayWithObjects:@"WORD", @"STATEMENT", @"OTHER WORD", @"A PHRASE", nil];
NSString *name = @"NameIWantForMyLanguageModelFiles";
NSError *err = [lmGenerator generateLanguageModelFromArray:words withFilesNamed:name];

NSDictionary *languageGeneratorResults = nil;

NSString *lmPath = nil;
NSString *dicPath = nil;

if([err code] == noErr) {
	languageGeneratorResults = [err userInfo];
	lmPath = [languageGeneratorResults objectForKey:@"LMPath"];
	dicPath = [languageGeneratorResults objectForKey:@"DictionaryPath"];
} else {
	NSLog(@"Error: %@",[err localizedDescription]);
}

If you are using the default English-language model generation, it is a requirement to enter your words and phrases in all capital letters, since the model is generated against a dictionary in which the entries are capitalized (meaning that if the words in the array aren't capitalized, they will not match the dictionary and you will not have the widest variety of pronunciations understood for the word you are using).

If you need to create a fixed language model ahead of time instead of creating it dynamically in your app, just use this method (or generateLanguageModelFromTextFile:withFilesNamed:) to submit your full language model using the Simulator, then use the Simulator documents folder script to get the language model and dictionary file out of the documents folder, and add them to your app bundle, referencing them from there.
Method Documentation
- (NSError *) generateLanguageModelFromArray:(NSArray *)languageModelArray withFilesNamed:(NSString *)fileName
Generate a language model from an array of NSStrings which are the words and phrases you want PocketsphinxController or PocketsphinxController+RapidEars to understand. Putting a phrase in as a single string makes it somewhat more probable that the phrase will be recognized as a phrase when spoken. fileName is the way you want the output files to be named; for instance, if you enter "MyDynamicLanguageModel" you will receive files output to your Documents directory titled MyDynamicLanguageModel.dic, MyDynamicLanguageModel.arpa, and MyDynamicLanguageModel.DMP. When generation succeeds, the NSError that this method returns has a code of noErr and contains the paths to the created files in its userInfo. The words and phrases in languageModelArray must be written exclusively in capital letters; for instance, "word" must appear in the array as "WORD".
- (NSError *) generateLanguageModelFromTextFile:(NSString *)pathToTextFile withFilesNamed:(NSString *)fileName
Generate a language model from a text file containing the words and phrases you want PocketsphinxController to understand. The file should be formatted with every word or contiguous phrase on its own line, with a line break afterwards. Putting a phrase on its own line makes it somewhat more probable that the phrase will be recognized as a phrase when spoken. Give the correct full path to the text file as a string. fileName is the way you want the output files to be named; for instance, if you enter "MyDynamicLanguageModel" you will receive files output to your Documents directory titled MyDynamicLanguageModel.dic, MyDynamicLanguageModel.arpa, and MyDynamicLanguageModel.DMP. When generation succeeds, the NSError that this method returns has a code of noErr and contains the paths to the created files in its userInfo. The words and phrases in the text file must be written exclusively in capital letters; for instance, "word" must appear in the file as "WORD".
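A sketch of the text-file variant, assuming a hypothetical file named vocabulary.txt (one word or phrase per line, all capitals) has been added to your app bundle:

```objc
// Generate a language model from a bundled text file rather than an NSArray.
NSString *pathToTextFile = [[NSBundle mainBundle] pathForResource:@"vocabulary" ofType:@"txt"]; // hypothetical file
NSError *err = [lmGenerator generateLanguageModelFromTextFile:pathToTextFile withFilesNamed:@"MyTextFileLanguageModel"];

if([err code] == noErr) {
	NSString *lmPath = [[err userInfo] objectForKey:@"LMPath"];
	NSString *dicPath = [[err userInfo] objectForKey:@"DictionaryPath"];
	NSLog(@"Language model: %@, dictionary: %@", lmPath, dicPath);
} else {
	NSLog(@"Error: %@", [err localizedDescription]);
}
```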
Property Documentation
Set this to TRUE to get verbose output.

Advanced: turn this off if the words in your input array or text file aren't in English and you are using a custom dictionary file.

Advanced: if you have your own pronunciation dictionary that you want to use instead of CMU07a.dic, you can assign its full path to this property before running the language model generation.
Detailed Description
OpenEarsEventsObserver provides a large set of delegate methods that allow you to receive information about the events in OpenEars from anywhere in your app. You can create as many OpenEarsEventsObservers as you need and receive information using them simultaneously. All of the documentation for the use of OpenEarsEventsObserver is found in the section OpenEarsEventsObserverDelegate.
Property Documentation
To use the OpenEarsEventsObserverDelegate methods, assign this delegate to the class hosting OpenEarsEventsObserver and then use the delegate methods documented under OpenEarsEventsObserverDelegate. There is a complete example of how to do this explained under the OpenEarsEventsObserverDelegate documentation.
Detailed Description
A singleton which turns logging on or off for the entire framework. The type of logging is related to overall framework functionality such as the audio session and timing operations. Please turn OpenEarsLogging on for any issue you encounter. It will probably show the problem, but if not you can show the log on the forum and get help.
Warning
The individual classes such as PocketsphinxController and LanguageModelGenerator have their own verbose flags which are separate from OpenEarsLogging.
Method Documentation
+ (id) startOpenEarsLogging
This just turns on logging. If you don't want logging in your session, don't send the startOpenEarsLogging message.
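A minimal sketch of turning logging on; the header name OpenEarsLogging.h is an assumption that follows the same naming pattern as the other OpenEars headers:

```objc
#import <OpenEars/OpenEarsLogging.h>

// Somewhere early in your app's lifecycle, before starting other OpenEars functionality:
[OpenEarsLogging startOpenEarsLogging]; // turns on framework-wide logging
```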
Detailed Description
The class that controls local speech recognition in OpenEars.
Usage examples
Preparing to use the class:
To use PocketsphinxController, you need a language model and a phonetic dictionary for it. These files define which words PocketsphinxController is capable of recognizing. They are created using LanguageModelGenerator, as shown above.
Add the following lines to your header (the .h file).

Under the imports at the very top:

#import <OpenEars/PocketsphinxController.h>

In the middle part where instance variables go:

PocketsphinxController *pocketsphinxController;

In the bottom part where class properties go:

@property (strong, nonatomic) PocketsphinxController *pocketsphinxController;
Add the following to your implementation (the .m file).

Under the @implementation keyword at the top:

@synthesize pocketsphinxController;

Among the other methods of the class, add this lazy accessor method for confident memory management of the object:

- (PocketsphinxController *)pocketsphinxController {
	if (pocketsphinxController == nil) {
		pocketsphinxController = [[PocketsphinxController alloc] init];
	}
	return pocketsphinxController;
}
In the method where you want to recognize speech (to test this out, add it to your viewDidLoad method), add the following method call:
[self.pocketsphinxController startListeningWithLanguageModelAtPath:lmPath dictionaryAtPath:dicPath languageModelIsJSGF:NO];
Warning
There can only be one PocketsphinxController instance in your app.
Method Documentation
- (void) startListeningWithLanguageModelAtPath:(NSString *)languageModelPath dictionaryAtPath:(NSString *)dictionaryPath languageModelIsJSGF:(BOOL)languageModelIsJSGF
Start the speech recognition engine up. You provide the full paths to a language model and a dictionary file, which are created using LanguageModelGenerator.
- (void) stopListening
Shut down the engine. You must do this before releasing a parent view controller that contains PocketsphinxController.
- (void) suspendRecognition
Keep the engine going but stop listening to speech until resumeRecognition is called. Takes effect instantly.
- (void) resumeRecognition
Resume listening for speech after suspendRecognition has been called.
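As a sketch, the suspend/resume pair might bracket a stretch of your app during which live speech should be ignored (for example, while your own audio is playing):

```objc
// Stop processing incoming speech without shutting the engine down.
[self.pocketsphinxController suspendRecognition];

// ...do work during which speech should be ignored...

// Then start processing speech again. Both calls take effect instantly.
[self.pocketsphinxController resumeRecognition];
```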
- (void) changeLanguageModelToFile:(NSString *)languageModelPathAsString withDictionary:(NSString *)dictionaryPathAsString
Change from one language model to another. This lets you change which words you are listening for depending on the context in your app.
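A sketch of switching vocabularies on the fly, assuming you generated a second model and dictionary with LanguageModelGenerator beforehand (the variable names here are hypothetical):

```objc
// Paths taken from a second LanguageModelGenerator run (hypothetical results dictionary).
NSString *secondLmPath = [secondLanguageGeneratorResults objectForKey:@"LMPath"];
NSString *secondDicPath = [secondLanguageGeneratorResults objectForKey:@"DictionaryPath"];

// Swap the active vocabulary while the engine keeps listening.
[self.pocketsphinxController changeLanguageModelToFile:secondLmPath withDictionary:secondDicPath];
```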
- (Float32) pocketsphinxInputLevel
Gives the volume of the incoming speech. This is a UI hook. Don't read it on the main thread, or it will block.
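Since the level must not be read on the main thread, one possible sketch is to poll it from a background thread and marshal the value back to the UI. The helper method names and the shouldPollLevels flag here are hypothetical:

```objc
// Hypothetical helper: kick off polling of the input level off the main thread.
- (void) startLevelPolling {
	[self performSelectorInBackground:@selector(pollInputLevel) withObject:nil];
}

- (void) pollInputLevel { // runs on a background thread
	@autoreleasepool {
		while(self.shouldPollLevels) { // hypothetical BOOL property you manage
			Float32 level = [self.pocketsphinxController pocketsphinxInputLevel];
			[self performSelectorOnMainThread:@selector(updateMeterWithLevel:)
			                       withObject:[NSNumber numberWithFloat:level]
			                    waitUntilDone:NO]; // update your meter UI on the main thread
			[NSThread sleepForTimeInterval:0.1]; // roughly 10 updates per second
		}
	}
}
```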
- (void) runRecognitionOnWavFileAtPath:(NSString *)wavPath usingLanguageModelAtPath:(NSString *)languageModelPath dictionaryAtPath:(NSString *)dictionaryPath languageModelIsJSGF:(BOOL)languageModelIsJSGF
You can use this to run recognition on an already-recorded WAV file for testing. The WAV file has to be 16-bit and 16000 samples per second.
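A sketch of testing recognition against a bundled recording; the file name test.wav is a placeholder, the file must be 16-bit at 16000 samples per second, and lmPath/dicPath are the paths produced by LanguageModelGenerator as shown earlier:

```objc
// Run recognition over a prerecorded WAV file instead of live audio.
NSString *wavPath = [[NSBundle mainBundle] pathForResource:@"test" ofType:@"wav"]; // hypothetical file
[self.pocketsphinxController runRecognitionOnWavFileAtPath:wavPath
                                  usingLanguageModelAtPath:lmPath
                                          dictionaryAtPath:dicPath
                                       languageModelIsJSGF:NO];
```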
Property Documentation
This is how long PocketsphinxController should wait after speech ends before attempting to recognize the speech. It defaults to .7 seconds.

Advanced: set this to TRUE to receive n-best results.

Advanced: the number of n-best results to return. This is a maximum; if there are null hypotheses, fewer than this number will be returned.

How long to calibrate for, in seconds. This can only be one of the values '1', '2', or '3'. Defaults to 1.

Turn on verbose output. Do this any time you encounter an issue and any time you need to report an issue on the forums.

By default, PocketsphinxController won't return a hypothesis if for some reason the hypothesis is null (this can happen if the perceived sound was just noise). If you need even empty hypotheses to be returned, you can set this to TRUE before starting PocketsphinxController.
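The property names were lost from this page's formatting. As a sketch, configuring the silence timeout and n-best behavior might look like the following; the names secondsOfSilenceToDetect, returnNbest and nBestNumber are assumptions and should be checked against the OpenEars headers:

```objc
// Configure recognition behavior before starting to listen. Property names are assumptions.
self.pocketsphinxController.secondsOfSilenceToDetect = 0.5; // wait half a second of silence before recognizing
self.pocketsphinxController.returnNbest = TRUE;             // ask for n-best results
self.pocketsphinxController.nBestNumber = 5;                // return at most five hypotheses
[self.pocketsphinxController startListeningWithLanguageModelAtPath:lmPath dictionaryAtPath:dicPath languageModelIsJSGF:NO];
```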
Detailed Description
OpenEarsEventsObserver provides a large set of delegate methods that allow you to receive information about the events in OpenEars from anywhere in your app. You can create as many OpenEarsEventsObservers as you need and receive information using them simultaneously.
Usage examples
Add the following lines to your header (the .h file).

Under the imports at the very top:

#import <OpenEars/OpenEarsEventsObserver.h>

At the @interface declaration, add the OpenEarsEventsObserverDelegate protocol. An example of this for a view controller called ViewController would look like this:

@interface ViewController : UIViewController <OpenEarsEventsObserverDelegate> {

In the middle part where instance variables go:

OpenEarsEventsObserver *openEarsEventsObserver;

In the bottom part where class properties go:

@property (strong, nonatomic) OpenEarsEventsObserver *openEarsEventsObserver;
Add the following to your implementation (the .m file).

Under the @implementation keyword at the top:

@synthesize openEarsEventsObserver;

Among the other methods of the class, add this lazy accessor method for confident memory management of the object:

- (OpenEarsEventsObserver *)openEarsEventsObserver {
	if (openEarsEventsObserver == nil) {
		openEarsEventsObserver = [[OpenEarsEventsObserver alloc] init];
	}
	return openEarsEventsObserver;
}

Then, right before you start your first OpenEars functionality (for instance, right before your first self.fliteController say:withVoice: message or right before your first self.pocketsphinxController startListeningWithLanguageModelAtPath:dictionaryAtPath:languageModelIsJSGF: message), send this message:

[self.openEarsEventsObserver setDelegate:self];
Add these delegate methods of OpenEarsEventsObserver to your class:
- (void) pocketsphinxDidReceiveHypothesis:(NSString *)hypothesis recognitionScore:(NSString *)recognitionScore utteranceID:(NSString *)utteranceID {
	NSLog(@"The received hypothesis is %@ with a score of %@ and an ID of %@", hypothesis, recognitionScore, utteranceID);
}

- (void) pocketsphinxDidStartCalibration {
	NSLog(@"Pocketsphinx calibration has started.");
}

- (void) pocketsphinxDidCompleteCalibration {
	NSLog(@"Pocketsphinx calibration is complete.");
}

- (void) pocketsphinxDidStartListening {
	NSLog(@"Pocketsphinx is now listening.");
}

- (void) pocketsphinxDidDetectSpeech {
	NSLog(@"Pocketsphinx has detected speech.");
}

- (void) pocketsphinxDidDetectFinishedSpeech {
	NSLog(@"Pocketsphinx has detected a period of silence, concluding an utterance.");
}

- (void) pocketsphinxDidStopListening {
	NSLog(@"Pocketsphinx has stopped listening.");
}

- (void) pocketsphinxDidSuspendRecognition {
	NSLog(@"Pocketsphinx has suspended recognition.");
}

- (void) pocketsphinxDidResumeRecognition {
	NSLog(@"Pocketsphinx has resumed recognition.");
}

- (void) pocketsphinxDidChangeLanguageModelToFile:(NSString *)newLanguageModelPathAsString andDictionary:(NSString *)newDictionaryPathAsString {
	NSLog(@"Pocketsphinx is now using the following language model: \n%@ and the following dictionary: %@", newLanguageModelPathAsString, newDictionaryPathAsString);
}

- (void) pocketSphinxContinuousSetupDidFail { // This can let you know that something went wrong with the recognition loop startup. Turn on OpenEarsLogging to learn why.
	NSLog(@"Setting up the continuous recognition loop has failed for some reason, please turn on OpenEarsLogging to learn more.");
}
Method Documentation
There was an interruption.

The interruption ended.

The input became unavailable.

The input became available again.

The audio route changed.

Pocketsphinx isn't listening yet, but it has started calibration.

Pocketsphinx isn't listening yet, but calibration has completed.

Pocketsphinx isn't listening yet, but it has entered the main recognition loop.

Pocketsphinx is now listening.

Pocketsphinx heard speech and is about to process it.

Pocketsphinx detected a second of silence, indicating the end of an utterance.

Pocketsphinx has a hypothesis.

Pocketsphinx has an n-best hypothesis dictionary.

Pocketsphinx has exited the continuous listening loop.

Pocketsphinx has not exited the continuous listening loop, but it will not attempt recognition.

Pocketsphinx has not exited the continuous listening loop, and it will now start attempting recognition again.

Pocketsphinx switched language models inline.

Some aspect of setting up the continuous loop failed; turn on OpenEarsLogging for more info.

Flite started speaking. You probably don't have to do anything about this.

Flite finished speaking. You probably don't have to do anything about this.