Offline Speech Synthesis on Android with the Built-in TextToSpeech Class

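Scenario

An Android app needs to synthesize speech from a piece of text and play it aloud.

The whole process can run offline with no network connection, and it does not rely on third-party speech synthesis SDKs or tools such as iFlytek or Baidu.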

Note:

Blog:
https://blog.csdn.net/badao_liumang_qizhi
Follow the WeChat official account
霸道的程序猿
for programming e-books, tutorial updates, and free downloads.

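Implementation

TextToSpeech

TextToSpeech converts a piece of text into speech.

It is a class that ships with the Android framework, so no additional library has to be added.

Result

Download link: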

https://files.cnblogs.com/files/badaoliumangqizhi/%E8%AF%AD%E9%9F%B3%E6%92%AD%E6%8A%A5.rar

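Page layout

To build a test demo, add a Plain Text field (an EditText) and a Button to one of the pages in layout.

Give both widgets an id attribute.

Then, in the onCreate method of the corresponding Activity: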

  // Look up the two widgets by id.
  Button button = (Button) findViewById(R.id.button);
  final EditText editText = (EditText) findViewById(R.id.editTextTextPersonName2);
  button.setOnClickListener(new View.OnClickListener() {
      @Override
      public void onClick(View v) {
          // Inside the anonymous listener "this" is the listener, so the Activity is named explicitly.
          SpeechUtils.getInstance(LoginActivity.this).speakText(editText.getText().toString());
      }
  });

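This code looks up the two widgets, reads the text of the Plain Text field, and sets the click listener on the Button.

The click handler calls speakText, a method of the utility class SpeechUtils.

SpeechUtils wraps speech synthesis and playback for convenience: pass the String to be spoken to speakText and nothing else is needed.

The utility class is designed as a singleton.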
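A lazily created singleton of this kind is usually written with double-checked locking, which in Java is only safe when the instance field is declared volatile. The following is a minimal sketch of the pattern in plain Java; LazySingleton is a hypothetical name used only for illustration and is not the SpeechUtils class itself.

```java
/** Hypothetical helper that illustrates double-checked locking. */
final class LazySingleton {
    // volatile is required: without it another thread could observe
    // a reference to an object whose construction has not finished.
    private static volatile LazySingleton instance;

    private LazySingleton() {
        // Private constructor so that getInstance() is the only way in.
    }

    static LazySingleton getInstance() {
        LazySingleton local = instance;      // one volatile read on the fast path
        if (local == null) {
            synchronized (LazySingleton.class) {
                local = instance;            // re-check while holding the lock
                if (local == null) {
                    local = new LazySingleton();
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

Every call returns the same object, and the lock is only taken until the instance has been created.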

Create a package for utility classes somewhere in the project and add a class named SpeechUtils with the following code:

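import android.content.Context;
import android.speech.tts.TextToSpeech;
import android.util.Log;

import java.util.Locale;

public class SpeechUtils {

    private static final String TAG = "SpeechUtils";

    // volatile is required for the double-checked locking below to be safe.
    private static volatile SpeechUtils singleton;

    private TextToSpeech textToSpeech; // the TTS object

    public static SpeechUtils getInstance(Context context) {
        if (singleton == null) {
            synchronized (SpeechUtils.class) {
                if (singleton == null) {
                    singleton = new SpeechUtils(context);
                }
            }
        }
        return singleton;
    }

    private SpeechUtils(Context context) {
        // Use the application context so the long-lived singleton does not hold on to an Activity.
        Context appContext = context.getApplicationContext();
        textToSpeech = new TextToSpeech(appContext, new TextToSpeech.OnInitListener() {
            @Override
            public void onInit(int status) {
                if (status == TextToSpeech.SUCCESS) {
                    // textToSpeech.setLanguage(Locale.US);
                    int result = textToSpeech.setLanguage(Locale.CHINA);
                    if (result == TextToSpeech.LANG_MISSING_DATA
                            || result == TextToSpeech.LANG_NOT_SUPPORTED) {
                        Log.w(TAG, "Chinese is not available on this engine, result=" + result);
                    }
                    // Pitch: higher values sound sharper, lower values sound deeper; 1.0 is normal.
                    textToSpeech.setPitch(1.5f);
                    // Speech rate: 1.0 is normal, so 0.5 is half speed.
                    textToSpeech.setSpeechRate(0.5f);
                } else {
                    Log.e(TAG, "TextToSpeech initialization failed, status=" + status);
                }
            }
        });
    }

    public void speakText(String text) {
        if (textToSpeech != null) {
            // Four-argument overload (API level 21+); the three-argument HashMap overload is deprecated.
            textToSpeech.speak(text, TextToSpeech.QUEUE_FLUSH, null, "SpeechUtilsUtterance");
        }
    }

}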
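One caveat with the class above: the TextToSpeech constructor returns before the engine has finished initializing, so a speakText call made immediately after the first getInstance call can arrive before onInit has fired. One possible way to handle this is to hold the text back until the engine reports success. The sketch below models that idea in plain Java so that it runs without an Android device. Speaker and PendingSpeechBuffer are hypothetical names, and Speaker stands in for whatever actually calls textToSpeech.speak(...).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the real call to TextToSpeech.speak(...). */
interface Speaker {
    void speak(String text);
}

/** Remembers text requested before initialization and replays it once the engine is ready. */
final class PendingSpeechBuffer {
    private final Speaker speaker;
    private final List<String> pending = new ArrayList<>();
    private boolean ready = false;

    PendingSpeechBuffer(Speaker speaker) {
        this.speaker = speaker;
    }

    /** Intended to be called from onInit when the status is TextToSpeech.SUCCESS. */
    synchronized void onEngineReady() {
        ready = true;
        for (String text : pending) {
            speaker.speak(text);
        }
        pending.clear();
    }

    /** Speaks immediately if the engine is ready, otherwise keeps the text for later. */
    synchronized void speakOrDefer(String text) {
        if (ready) {
            speaker.speak(text);
        } else {
            pending.add(text);
        }
    }
}
```

In SpeechUtils, speakText would call speakOrDefer and the success branch of onInit would call onEngineReady. Whether deferred text should be replayed or dropped is a design choice that depends on the app.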

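Notes:

1. The class above is a singleton, so it is called directly with

SpeechUtils.getInstance(LoginActivity.this).speakText(editText.getText().toString());

The argument to getInstance is a Context. Inside an anonymous listener in an Activity, a bare this refers to the listener rather than the Activity, so write ActivityName.this (here LoginActivity.this).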

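2. In the utility class, the import

import android.speech.tts.TextToSpeech;

shows that TextToSpeech comes straight from the android package: it is part of the platform, and no third-party dependency is introduced.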

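3. On language support: many older posts online say Chinese is not supported, which was probably true only of very old versions. The language setting lists many supported languages, and Chinese is now among them, so it is enough to call

textToSpeech.setLanguage(Locale.CHINA);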

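4. Other properties that can be set

textToSpeech.setPitch(1.5f);      // pitch: higher values sound sharper, lower values sound deeper; 1.0 is normal
textToSpeech.setSpeechRate(0.5f); // speech rate: 1.0 is normal, so 0.5 is half speed

5. For more of the API, see the official Android documentation:

https://developer.android.google.cn/reference/android/speech/tts/TextToSpeech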
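Regarding note 3: setLanguage reports its outcome through an int status code rather than an exception, so it can be worth checking that code before speaking. The helper below is a sketch in plain Java. The constant values are copied from the LANG_* constants in the documentation excerpt, while LanguageStatus, isUsable, and describe are hypothetical names that are not part of the Android API.

```java
/** Mirrors the LANG_* result codes of android.speech.tts.TextToSpeech. */
final class LanguageStatus {
    static final int LANG_COUNTRY_VAR_AVAILABLE = 2;
    static final int LANG_COUNTRY_AVAILABLE = 1;
    static final int LANG_AVAILABLE = 0;
    static final int LANG_MISSING_DATA = -1;
    static final int LANG_NOT_SUPPORTED = -2;

    private LanguageStatus() {
    }

    /** True for any of the three *_AVAILABLE codes, false for missing data or unsupported. */
    static boolean isUsable(int result) {
        return result >= LANG_AVAILABLE;
    }

    /** Short description of a result code, e.g. for a log message. */
    static String describe(int result) {
        switch (result) {
            case LANG_COUNTRY_VAR_AVAILABLE:
                return "available (language, country, variant)";
            case LANG_COUNTRY_AVAILABLE:
                return "available (language, country)";
            case LANG_AVAILABLE:
                return "available (language only)";
            case LANG_MISSING_DATA:
                return "language data missing";
            case LANG_NOT_SUPPORTED:
                return "not supported";
            default:
                return "unknown code " + result;
        }
    }
}
```

In onInit this could be used as LanguageStatus.isUsable(textToSpeech.setLanguage(Locale.CHINA)). Whether Chinese is actually usable offline depends on the TTS engine installed on the device and on its voice data.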

 

Excerpt from the official documentation:

TextToSpeech


public class TextToSpeech
extends Object

java.lang.Object
   ↳android.speech.tts.TextToSpeech


 


Synthesizes speech from text for immediate playback or to create a sound file.

A TextToSpeech instance can only be used to synthesize text once it has completed its initialization. Implement the TextToSpeech.OnInitListener to be notified of the completion of the initialization.
When you are done using the TextToSpeech instance, call the shutdown() method to release the native resources used by the TextToSpeech engine. Apps targeting Android 11 that use text-to-speech should declare TextToSpeech.Engine#INTENT_ACTION_TTS_SERVICE in the queries elements of their manifest:

 

 <queries>
   ...
  <intent>
      <action android:name="android.intent.action.TTS_SERVICE" />
  </intent>
 </queries>
 

 

 

Summary

Nested classes

class TextToSpeech.Engine: Constants and parameter names for controlling text-to-speech.

class TextToSpeech.EngineInfo: Information about an installed text-to-speech engine.

interface TextToSpeech.OnInitListener: Interface definition of a callback to be invoked indicating the completion of the TextToSpeech engine initialization.

interface TextToSpeech.OnUtteranceCompletedListener: This interface was deprecated in API level 18. Use UtteranceProgressListener instead.

Constants

String ACTION_TTS_QUEUE_PROCESSING_COMPLETED: Broadcast Action: The TextToSpeech synthesizer has completed processing of all the text in the speech queue.

int ERROR: Denotes a generic operation failure.

int ERROR_INVALID_REQUEST: Denotes a failure caused by an invalid request.

int ERROR_NETWORK: Denotes a failure caused by network connectivity problems.

int ERROR_NETWORK_TIMEOUT: Denotes a failure caused by network timeout.

int ERROR_NOT_INSTALLED_YET: Denotes a failure caused by an unfinished download of the voice data.

int ERROR_OUTPUT: Denotes a failure related to the output (audio device or a file).

int ERROR_SERVICE: Denotes a failure of a TTS service.

int ERROR_SYNTHESIS: Denotes a failure of a TTS engine to synthesize the given input.

int LANG_AVAILABLE: Denotes the language is available for the language by the locale, but not the country and variant.

int LANG_COUNTRY_AVAILABLE: Denotes the language is available for the language and country specified by the locale, but not the variant.

int LANG_COUNTRY_VAR_AVAILABLE: Denotes the language is available exactly as specified by the locale.

int LANG_MISSING_DATA: Denotes the language data is missing.

int LANG_NOT_SUPPORTED: Denotes the language is not supported.

int QUEUE_ADD: Queue mode where the new entry is added at the end of the playback queue.

int QUEUE_FLUSH: Queue mode where all entries in the playback queue (media to be played and text to be synthesized) are dropped and replaced by the new entry.

int STOPPED: Denotes a stop requested by a client.

int SUCCESS: Denotes a successful operation.

Public constructors

TextToSpeech(Context context, TextToSpeech.OnInitListener listener)

The constructor for the TextToSpeech class, using the default TTS engine.

TextToSpeech(Context context, TextToSpeech.OnInitListener listener, String engine)

The constructor for the TextToSpeech class, using the given TTS engine.

Public methods

int addEarcon(String earcon, String packagename, int resourceId): Adds a mapping between a string of text and a sound resource in a package.

int addEarcon(String earcon, String filename): This method was deprecated in API level 21. As of API level 21, replaced by addEarcon(java.lang.String, java.io.File).

int addEarcon(String earcon, File file): Adds a mapping between a string of text and a sound file.

int addSpeech(CharSequence text, File file): Adds a mapping between a CharSequence (may be spanned with TtsSpans) and a sound file.

int addSpeech(String text, String packagename, int resourceId): Adds a mapping between a string of text and a sound resource in a package.

int addSpeech(CharSequence text, String packagename, int resourceId): Adds a mapping between a CharSequence (may be spanned with TtsSpans) of text and a sound resource in a package.

int addSpeech(String text, String filename): Adds a mapping between a string of text and a sound file.

boolean areDefaultsEnforced(): Checks whether the user's settings should override settings requested by the calling application.

Set<Locale> getAvailableLanguages(): Query the engine about the set of available languages.

String getDefaultEngine(): Gets the package name of the default speech synthesis engine.

Locale getDefaultLanguage(): This method was deprecated in API level 21. As of API level 21, use getDefaultVoice().getLocale() (getDefaultVoice())

Voice getDefaultVoice(): Returns a Voice instance that's the default voice for the default Text-to-speech language.

List<TextToSpeech.EngineInfo> getEngines(): Gets a list of all installed TTS engines.

Set<String> getFeatures(Locale locale): This method was deprecated in API level 21. As of API level 21, please use voices. In order to query features of the voice, call getVoices() to retrieve the list of available voices and Voice#getFeatures() to retrieve the set of features.

Locale getLanguage(): This method was deprecated in API level 21. As of API level 21, please use getVoice().getLocale() (getVoice()).

static int getMaxSpeechInputLength(): Limit of length of input string passed to speak and synthesizeToFile.

Voice getVoice(): Returns a Voice instance describing the voice currently being used for synthesis requests sent to the TextToSpeech engine.

Set<Voice> getVoices(): Query the engine about the set of available voices.

int isLanguageAvailable(Locale loc): Checks if the specified language as represented by the Locale is available and supported.

boolean isSpeaking(): Checks whether the TTS engine is busy speaking.

int playEarcon(String earcon, int queueMode, HashMap<String, String> params): This method was deprecated in API level 21. As of API level 21, replaced by playEarcon(java.lang.String, int, android.os.Bundle, java.lang.String).

int playEarcon(String earcon, int queueMode, Bundle params, String utteranceId): Plays the earcon using the specified queueing mode and parameters.

int playSilence(long durationInMs, int queueMode, HashMap<String, String> params): This method was deprecated in API level 21. As of API level 21, replaced by playSilentUtterance(long, int, java.lang.String).

int playSilentUtterance(long durationInMs, int queueMode, String utteranceId): Plays silence for the specified amount of time using the specified queue mode.

int setAudioAttributes(AudioAttributes audioAttributes): Sets the audio attributes to be used when speaking text or playing back a file.

int setEngineByPackageName(String enginePackageName): This method was deprecated in API level 15. This doesn't inform callers when the TTS engine has been initialized. TextToSpeech(android.content.Context, android.speech.tts.TextToSpeech.OnInitListener, java.lang.String) can be used with the appropriate engine name. Also, there is no guarantee that the engine specified will be loaded. If it isn't installed or disabled, the user / system wide defaults will apply.

int setLanguage(Locale loc): Sets the text-to-speech language.

int setOnUtteranceCompletedListener(TextToSpeech.OnUtteranceCompletedListener listener): This method was deprecated in API level 15. Use setOnUtteranceProgressListener(android.speech.tts.UtteranceProgressListener) instead.

int setOnUtteranceProgressListener(UtteranceProgressListener listener): Sets the listener that will be notified of various events related to the synthesis of a given utterance.

int setPitch(float pitch): Sets the speech pitch for the TextToSpeech engine.

int setSpeechRate(float speechRate): Sets the speech rate.

int setVoice(Voice voice): Sets the text-to-speech voice.

void shutdown(): Releases the resources used by the TextToSpeech engine.

int speak(CharSequence text, int queueMode, Bundle params, String utteranceId): Speaks the text using the specified queuing strategy and speech parameters, the text may be spanned with TtsSpans.

int speak(String text, int queueMode, HashMap<String, String> params): This method was deprecated in API level 21. As of API level 21, replaced by speak(java.lang.CharSequence, int, android.os.Bundle, java.lang.String).

int stop(): Interrupts the current utterance (whether played or rendered to file) and discards other utterances in the queue.

int synthesizeToFile(CharSequence text, Bundle params, ParcelFileDescriptor fileDescriptor, String utteranceId): Synthesizes the given text to a ParcelFileDescriptor using the specified parameters.

int synthesizeToFile(CharSequence text, Bundle params, File file, String utteranceId): Synthesizes the given text to a file using the specified parameters.

int synthesizeToFile(String text, HashMap<String, String> params, String filename): This method was deprecated in API level 21. As of API level 21, replaced by synthesizeToFile(java.lang.CharSequence, android.os.Bundle, java.io.File, java.lang.String).


Constants

ACTION_TTS_QUEUE_PROCESSING_COMPLETED

Added in API level 4

public static final String ACTION_TTS_QUEUE_PROCESSING_COMPLETED

Broadcast Action: The TextToSpeech synthesizer has completed processing of all the text in the speech queue. Note that this notifies callers when the engine has finished processing text data. Audio playback might not have completed (or even started) at this point. If you wish to be notified when this happens, see OnUtteranceCompletedListener.

 

Constant Value: "android.speech.tts.TTS_QUEUE_PROCESSING_COMPLETED"

ERROR

Added in API level 4

public static final int ERROR

Denotes a generic operation failure.

 

Constant Value: -1 (0xffffffff)

ERROR_INVALID_REQUEST

Added in API level 21

public static final int ERROR_INVALID_REQUEST

Denotes a failure caused by an invalid request.

 

Constant Value: -8 (0xfffffff8)

ERROR_NETWORK

Added in API level 21

public static final int ERROR_NETWORK

Denotes a failure caused by network connectivity problems.

 

Constant Value: -6 (0xfffffffa)

ERROR_NETWORK_TIMEOUT

Added in API level 21

public static final int ERROR_NETWORK_TIMEOUT

Denotes a failure caused by network timeout.

 

Constant Value: -7 (0xfffffff9)

ERROR_NOT_INSTALLED_YET

Added in API level 21

public static final int ERROR_NOT_INSTALLED_YET

Denotes a failure caused by an unfinished download of the voice data.

 


Constant Value: -9 (0xfffffff7)

ERROR_OUTPUT

Added in API level 21

public static final int ERROR_OUTPUT

Denotes a failure related to the output (audio device or a file).

 

Constant Value: -5 (0xfffffffb)

ERROR_SERVICE

Added in API level 21

public static final int ERROR_SERVICE

Denotes a failure of a TTS service.

 

Constant Value: -4 (0xfffffffc)

ERROR_SYNTHESIS

Added in API level 21

public static final int ERROR_SYNTHESIS

Denotes a failure of a TTS engine to synthesize the given input.

 

Constant Value: -3 (0xfffffffd)
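These error codes are plain ints, which makes log output hard to read. As a sketch, the lookup below turns a code into its constant name. It is plain Java with the values copied from the constants listed above. TtsErrorNames and nameOf are hypothetical names, and the assumption is that the code comes from a callback such as UtteranceProgressListener.onError(String, int).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Maps the ERROR_* values of TextToSpeech to their constant names. */
final class TtsErrorNames {
    private static final Map<Integer, String> NAMES = new LinkedHashMap<>();

    static {
        NAMES.put(-1, "ERROR");
        NAMES.put(-3, "ERROR_SYNTHESIS");
        NAMES.put(-4, "ERROR_SERVICE");
        NAMES.put(-5, "ERROR_OUTPUT");
        NAMES.put(-6, "ERROR_NETWORK");
        NAMES.put(-7, "ERROR_NETWORK_TIMEOUT");
        NAMES.put(-8, "ERROR_INVALID_REQUEST");
        NAMES.put(-9, "ERROR_NOT_INSTALLED_YET");
    }

    private TtsErrorNames() {
    }

    /** Returns the constant name for a known code, or UNKNOWN(code) otherwise. */
    static String nameOf(int errorCode) {
        String name = NAMES.get(errorCode);
        return name != null ? name : "UNKNOWN(" + errorCode + ")";
    }
}
```

For an offline setup, ERROR_NETWORK and ERROR_NETWORK_TIMEOUT are the codes to watch for, since they suggest the selected voice is trying to synthesize over the network.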

LANG_AVAILABLE

Added in API level 4

public static final int LANG_AVAILABLE

Denotes the language is available for the language by the locale, but not the country and variant.

 

Constant Value: 0 (0x00000000)

LANG_COUNTRY_AVAILABLE

Added in API level 4

public static final int LANG_COUNTRY_AVAILABLE

Denotes the language is available for the language and country specified by the locale, but not the variant.

 

Constant Value: 1 (0x00000001)

LANG_COUNTRY_VAR_AVAILABLE

Added in API level 4

public static final int LANG_COUNTRY_VAR_AVAILABLE

Denotes the language is available exactly as specified by the locale.

 

Constant Value: 2 (0x00000002)

LANG_MISSING_DATA

Added in API level 4

public static final int LANG_MISSING_DATA

Denotes the language data is missing.

 

Constant Value: -1 (0xffffffff)

LANG_NOT_SUPPORTED

Added in API level 4

public static final int LANG_NOT_SUPPORTED

Denotes the language is not supported.

 

Constant Value: -2 (0xfffffffe)

QUEUE_ADD

Added in API level 4

public static final int QUEUE_ADD

Queue mode where the new entry is added at the end of the playback queue.

 

Constant Value: 1 (0x00000001)

QUEUE_FLUSH

Added in API level 4

public static final int QUEUE_FLUSH

Queue mode where all entries in the playback queue (media to be played and text to be synthesized) are dropped and replaced by the new entry. Queues are flushed with respect to a given calling app. Entries in the queue from other callees are not discarded.

 

Constant Value: 0 (0x00000000)
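The difference between the two queue modes is easiest to see on a toy queue. The sketch below is plain Java and only models the behavior described above; it is not how the engine is implemented. ToySpeechQueue and its methods are hypothetical names, and the two constants carry the same values as QUEUE_FLUSH and QUEUE_ADD.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy model of a playback queue that supports the two queue modes. */
final class ToySpeechQueue {
    static final int QUEUE_FLUSH = 0; // same value as TextToSpeech.QUEUE_FLUSH
    static final int QUEUE_ADD = 1;   // same value as TextToSpeech.QUEUE_ADD

    private final Deque<String> queue = new ArrayDeque<>();

    /** Enqueues text; QUEUE_FLUSH first drops everything that is still waiting. */
    void speak(String text, int queueMode) {
        if (queueMode == QUEUE_FLUSH) {
            queue.clear();
        }
        queue.addLast(text);
    }

    /** Snapshot of the entries that are still waiting, oldest first. */
    List<String> pending() {
        return new ArrayList<>(queue);
    }
}
```

SpeechUtils uses QUEUE_FLUSH, so each button press replaces whatever was still queued. QUEUE_ADD would instead let the sentences play one after another.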

STOPPED

Added in API level 21

public static final int STOPPED

Denotes a stop requested by a client. It's used only on the service side of the API, client should never expect to see this result code.

 

Constant Value: -2 (0xfffffffe)

SUCCESS

Added in API level 4

public static final int SUCCESS

Denotes a successful operation.

 

Constant Value: 0 (0x00000000)

Public constructors

TextToSpeech

Added in API level 4

public TextToSpeech (Context context, 
                TextToSpeech.OnInitListener listener)

The constructor for the TextToSpeech class, using the default TTS engine. This will also initialize the associated TextToSpeech engine if it isn't already running.

 

Parameters
context (Context): The context this instance is running in.

listener (TextToSpeech.OnInitListener): The TextToSpeech.OnInitListener that will be called when the TextToSpeech engine has initialized. In a case of a failure the listener may be called immediately, before TextToSpeech instance is fully constructed.

 

TextToSpeech

Added in API level 14

public TextToSpeech (Context context, 
                TextToSpeech.OnInitListener listener, 
                String engine)

The constructor for the TextToSpeech class, using the given TTS engine. This will also initialize the associated TextToSpeech engine if it isn't already running.

 

Parameters
context (Context): The context this instance is running in.

listener (TextToSpeech.OnInitListener): The TextToSpeech.OnInitListener that will be called when the TextToSpeech engine has initialized. In a case of a failure the listener may be called immediately, before TextToSpeech instance is fully constructed.

engine (String): Package name of the TTS engine to use.

 

Public methods

addEarcon

Added in API level 4

public int addEarcon (String earcon, 
                String packagename, 
                int resourceId)

Adds a mapping between a string of text and a sound resource in a package. Use this to add custom earcons.

 

Parameters
earcon (String): The name of the earcon. Example: "[tick]"

packagename (String): the package name of the application that contains the resource. This can for instance be the package name of your own application. Example: "com.google.marvin.compass"
The package name can be found in the AndroidManifest.xml of the application containing the resource.

<manifest xmlns:android="..." package="com.google.marvin.compass">

resourceId (int): Example: R.raw.tick_snd

Returns
int: Code indicating success or failure. See ERROR and SUCCESS.

addEarcon

Added in API level 4
Deprecated in API level 21

public int addEarcon (String earcon, 
                String filename)

 

This method was deprecated in API level 21.
As of API level 21, replaced by addEarcon(java.lang.String, java.io.File).

Adds a mapping between a string of text and a sound file. Use this to add custom earcons.

 

Parameters
earcon (String): The name of the earcon. Example: "[tick]"

filename (String): The full path to the sound file (for example: "/sdcard/mysounds/tick.wav")

Returns
int: Code indicating success or failure. See ERROR and SUCCESS.

addEarcon

Added in API level 21

public int addEarcon (String earcon, 
                File file)

Adds a mapping between a string of text and a sound file. Use this to add custom earcons.

 

Parameters
earcon (String): The name of the earcon. Example: "[tick]"

file (File): File object pointing to the sound file.

Returns
int: Code indicating success or failure. See ERROR and SUCCESS.

addSpeech

Added in API level 21

public int addSpeech (CharSequence text, 
                File file)

Adds a mapping between a CharSequence (may be spanned with TtsSpans) and a sound file. Using this, it is possible to add custom pronunciations for a string of text. After a call to this method, subsequent calls to speak(java.lang.CharSequence, int, android.os.Bundle, java.lang.String) will play the specified sound resource if it is available, or synthesize the text if it is missing.

Parameters
text (CharSequence): The string of text. Example: "south_south_east"

file (File): File object pointing to the sound file.

Returns
int: Code indicating success or failure. See ERROR and SUCCESS.

 

addSpeech

Added in API level 4

public int addSpeech (String text, 
                String packagename, 
                int resourceId)

Adds a mapping between a string of text and a sound resource in a package. After a call to this method, subsequent calls to speak(java.lang.CharSequence, int, android.os.Bundle, java.lang.String) will play the specified sound resource if it is available, or synthesize the text if it is missing.

Parameters
text (String): The string of text. Example: "south_south_east"

packagename (String): Pass the packagename of the application that contains the resource. If the resource is in your own application (this is the most common case), then put the packagename of your application here.
Example: "com.google.marvin.compass"
The packagename can be found in the AndroidManifest.xml of your application.

<manifest xmlns:android="..." package="com.google.marvin.compass">

resourceId (int): Example: R.raw.south_south_east

Returns
int: Code indicating success or failure. See ERROR and SUCCESS.

 

addSpeech

Added in API level 21

public int addSpeech (CharSequence text, 
                String packagename, 
                int resourceId)

Adds a mapping between a CharSequence (may be spanned with TtsSpans) of text and a sound resource in a package. After a call to this method, subsequent calls to speak(java.lang.CharSequence, int, android.os.Bundle, java.lang.String) will play the specified sound resource if it is available, or synthesize the text if it is missing.

Parameters
text (CharSequence): The string of text. Example: "south_south_east"

packagename (String): Pass the packagename of the application that contains the resource. If the resource is in your own application (this is the most common case), then put the packagename of your application here.
Example: "com.google.marvin.compass"
The packagename can be found in the AndroidManifest.xml of your application.

<manifest xmlns:android="..." package="com.google.marvin.compass">

resourceId (int): Example: R.raw.south_south_east

Returns
int: Code indicating success or failure. See ERROR and SUCCESS.

 

addSpeech

Added in API level 4

public int addSpeech (String text, 
                String filename)

Adds a mapping between a string of text and a sound file. Using this, it is possible to add custom pronunciations for a string of text. After a call to this method, subsequent calls to speak(java.lang.CharSequence, int, android.os.Bundle, java.lang.String) will play the specified sound resource if it is available, or synthesize the text if it is missing.

Parameters
text (String): The string of text. Example: "south_south_east"

filename (String): The full path to the sound file (for example: "/sdcard/mysounds/hello.wav")

Returns
int: Code indicating success or failure. See ERROR and SUCCESS.

 

areDefaultsEnforced

Added in API level 8
Deprecated in API level 21

public boolean areDefaultsEnforced ()

Checks whether the user's settings should override settings requested by the calling application. As of the Ice cream sandwich release, user settings never forcibly override the app's settings.

 

Returns
boolean

 

getAvailableLanguages

Added in API level 21

public Set<Locale> getAvailableLanguages ()

Query the engine about the set of available languages.

 

Returns
Set<Locale>

 

getDefaultEngine

Added in API level 8

public String getDefaultEngine ()

Gets the package name of the default speech synthesis engine.

 

Returns
String: Package name of the TTS engine that the user has chosen as their default.

 

getDefaultLanguage

Added in API level 18
Deprecated in API level 21

public Locale getDefaultLanguage ()

 

This method was deprecated in API level 21.
As of API level 21, use getDefaultVoice().getLocale() (getDefaultVoice())

Returns a Locale instance describing the language currently being used as the default Text-to-speech language. The locale object returned by this method is NOT a valid one. It has identical form to the one in getLanguage(). Please refer to getLanguage() for more information.

 

Returns
Locale: language, country (if any) and variant (if any) used by the client stored in a Locale instance, or null on error.

 

getDefaultVoice

Added in API level 21

public Voice getDefaultVoice ()

Returns a Voice instance that's the default voice for the default Text-to-speech language.

 

Returns
Voice: The default voice instance for the default language, or null if not set or on error.

 

getEngines

Added in API level 14

public List<TextToSpeech.EngineInfo> getEngines ()

Gets a list of all installed TTS engines.

 

Returns
List<TextToSpeech.EngineInfo>: A list of engine info objects. The list can be empty, but never null.

 

getFeatures

Added in API level 15
Deprecated in API level 21

public Set<String> getFeatures (Locale locale)

 

This method was deprecated in API level 21.
As of API level 21, please use voices. In order to query features of the voice, call getVoices() to retrieve the list of available voices and Voice#getFeatures() to retrieve the set of features.

Queries the engine for the set of features it supports for a given locale. Features can either be framework defined, e.g. TextToSpeech.Engine#KEY_FEATURE_NETWORK_SYNTHESIS or engine specific. Engine specific keys must be prefixed by the name of the engine they are intended for. These keys can be used as parameters to TextToSpeech#speak(String, int, java.util.HashMap) and TextToSpeech#synthesizeToFile(String, java.util.HashMap, String). Features values are strings and their values must meet restrictions described in their documentation.

 

Parameters
locale (Locale): The locale to query features for.

Returns
Set<String>: Set instance. May return null on error.

 

getLanguage

Added in API level 4
Deprecated in API level 21

public Locale getLanguage ()

 

This method was deprecated in API level 21.
As of API level 21, please use getVoice().getLocale() (getVoice()).

Returns a Locale instance describing the language currently being used for synthesis requests sent to the TextToSpeech engine. In Android 4.2 and before (API <= 17) this function returns the language that is currently being used by the TTS engine. That is the last language set by this or any other client by a TextToSpeech#setLanguage call to the same engine. In Android versions after 4.2 this function returns the language that is currently being used for the synthesis requests sent from this client. That is the last language set by a TextToSpeech#setLanguage call on this instance. If a voice is set (by setVoice(android.speech.tts.Voice)), getLanguage will return the language of the currently set voice. Please note that the Locale object returned by this method is NOT a valid Locale object. Its language field contains a three-letter ISO 639-2/T code (where a proper Locale would use a two-letter ISO 639-1 code), and the country field contains a three-letter ISO 3166 country code (where a proper Locale would use a two-letter ISO 3166-1 code).

 

Returns
Locale: language, country (if any) and variant (if any) used by the client stored in a Locale instance, or null on error.

 

getMaxSpeechInputLength

Added in API level 18

public static int getMaxSpeechInputLength ()

Limit of length of input string passed to speak and synthesizeToFile.

 

Returns
int
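Text longer than this limit has to be split before it is passed to speak. The helper below is a deliberately simple sketch in plain Java that cuts a string into consecutive pieces of at most maxLength characters. SpeechChunker and split are hypothetical names, and the limit is passed in as a parameter so the example runs without Android; on a device it would come from TextToSpeech.getMaxSpeechInputLength().

```java
import java.util.ArrayList;
import java.util.List;

/** Splits long text into pieces that fit a maximum input length. */
final class SpeechChunker {
    private SpeechChunker() {
    }

    /** Returns consecutive substrings of text, each at most maxLength characters long. */
    static List<String> split(String text, int maxLength) {
        if (maxLength <= 0) {
            throw new IllegalArgumentException("maxLength must be positive");
        }
        List<String> chunks = new ArrayList<>();
        for (int start = 0; start < text.length(); start += maxLength) {
            int end = Math.min(text.length(), start + maxLength);
            chunks.add(text.substring(start, end));
        }
        return chunks;
    }
}
```

The first piece could be spoken with QUEUE_FLUSH and the remaining ones with QUEUE_ADD so that they play in order. Cutting at a fixed character count can split a sentence in the middle, so a real app would probably prefer to break at punctuation.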

 


getVoice

Added in API level 21

public Voice getVoice ()

Returns a Voice instance describing the voice currently being used for synthesis requests sent to the TextToSpeech engine.

 

Returns
Voice: Voice instance used by the client, or null if not set or on error.

getVoices

Added in API level 21

public Set<Voice> getVoices ()

Query the engine about the set of available voices. Each TTS Engine can expose multiple voices for each locale, each with a different set of features.

 

Returns
Set<Voice>

 


isLanguageAvailable

Added in API level 4

public int isLanguageAvailable (Locale loc)

Checks if the specified language as represented by the Locale is available and supported.

 

Parameters
loc (Locale): The Locale describing the language to be used.

Returns
int: Code indicating the support status for the locale. See LANG_AVAILABLE, LANG_COUNTRY_AVAILABLE, LANG_COUNTRY_VAR_AVAILABLE, LANG_MISSING_DATA and LANG_NOT_SUPPORTED.

 

isSpeaking

Added in API level 4

public boolean isSpeaking ()

Checks whether the TTS engine is busy speaking. Note that a speech item is considered complete once its audio data has been sent to the audio mixer, or written to a file. There might be a finite lag between this point, and when the audio hardware completes playback.

Returns
boolean: true if the TTS engine is speaking.

 

playEarcon

Added in API level 4
Deprecated in API level 21

public int playEarcon (String earcon, 
                int queueMode, 
                HashMap<String, String> params)

 

This method was deprecated in API level 21.
As of API level 21, replaced by playEarcon(java.lang.String, int, android.os.Bundle, java.lang.String).

Plays the earcon using the specified queueing mode and parameters. The earcon must already have been added with addEarcon(java.lang.String, java.lang.String) or addEarcon(java.lang.String, java.lang.String, int). This method is asynchronous, i.e. the method just adds the request to the queue of TTS requests and then returns. The synthesis might not have finished (or even started!) at the time when this method returns. In order to reliably detect errors during synthesis, we recommend setting an utterance progress listener (see setOnUtteranceProgressListener(UtteranceProgressListener)) and using the Engine#KEY_PARAM_UTTERANCE_ID parameter.

 

Parameters
earcon (String): The earcon that should be played

queueMode (int): QUEUE_ADD or QUEUE_FLUSH.

params (HashMap): Parameters for the request. Can be null. Supported parameter names: Engine#KEY_PARAM_STREAM, Engine#KEY_PARAM_UTTERANCE_ID. Engine specific parameters may be passed in but the parameter keys must be prefixed by the name of the engine they are intended for. For example the keys "com.svox.pico_foo" and "com.svox.pico:bar" will be passed to the engine named "com.svox.pico" if it is being used.

Returns
int: ERROR or SUCCESS of queuing the playEarcon operation.

 

playEarcon

Added in API level 21

public int playEarcon (String earcon, 
                int queueMode, 
                Bundle params, 
                String utteranceId)

Plays the earcon using the specified queueing mode and parameters. The earcon must already have been added with addEarcon(java.lang.String, java.lang.String) or addEarcon(java.lang.String, java.lang.String, int). This method is asynchronous, i.e. the method just adds the request to the queue of TTS requests and then returns. The synthesis might not have finished (or even started!) at the time when this method returns. In order to reliably detect errors during synthesis, we recommend setting an utterance progress listener (see setOnUtteranceProgressListener(UtteranceProgressListener)) and using the Engine#KEY_PARAM_UTTERANCE_ID parameter.

 

Parameters
earcon (String): The earcon that should be played

queueMode (int): QUEUE_ADD or QUEUE_FLUSH.

params (Bundle): Parameters for the request. Can be null. Supported parameter names: Engine#KEY_PARAM_STREAM. Engine specific parameters may be passed in but the parameter keys must be prefixed by the name of the engine they are intended for. For example the keys "com.svox.pico_foo" and "com.svox.pico:bar" will be passed to the engine named "com.svox.pico" if it is being used.

utteranceId (String)

Returns
int: ERROR or SUCCESS of queuing the playEarcon operation.

 

playSilence

Added in API level 4
Deprecated in API level 21

public int playSilence (long durationInMs, 
                int queueMode, 
                HashMap<String, String> params)

 

This method was deprecated in API level 21.
As of API level 21, replaced by playSilentUtterance(long, int, java.lang.String).

Plays silence for the specified amount of time using the specified queue mode. This method is asynchronous, i.e. the method just adds the request to the queue of TTS requests and then returns. The synthesis might not have finished (or even started!) at the time when this method returns. In order to reliably detect errors during synthesis, we recommend setting an utterance progress listener (see setOnUtteranceProgressListener(UtteranceProgressListener)) and using the Engine#KEY_PARAM_UTTERANCE_ID parameter.

 

Parameters
durationInMs long: The duration of the silence.

queueMode int: QUEUE_ADD or QUEUE_FLUSH.

params HashMap: Parameters for the request. Can be null. Supported parameter names: Engine#KEY_PARAM_UTTERANCE_ID. Engine specific parameters may be passed in, but the parameter keys must be prefixed by the name of the engine they are intended for. For example, the keys "com.svox.pico_foo" and "com.svox.pico:bar" will be passed to the engine named "com.svox.pico" if it is being used.

Returns
int: ERROR or SUCCESS of queuing the playSilence operation.

 

playSilentUtterance

Added in API level 21

public int playSilentUtterance (long durationInMs, 
                int queueMode, 
                String utteranceId)

Plays silence for the specified amount of time using the specified queue mode. This method is asynchronous, i.e. the method just adds the request to the queue of TTS requests and then returns. The synthesis might not have finished (or even started!) at the time when this method returns. In order to reliably detect errors during synthesis, we recommend setting an utterance progress listener (see setOnUtteranceProgressListener(UtteranceProgressListener)) and using the Engine#KEY_PARAM_UTTERANCE_ID parameter.

 

Parameters
durationInMs long: The duration of the silence.

queueMode int: QUEUE_ADD or QUEUE_FLUSH.

utteranceId String: A unique identifier for this request.

Returns
int: ERROR or SUCCESS of queuing the playSilentUtterance operation.
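
How the two queue modes interact with silent utterances is easier to see in a toy model. The following is a plain-Java sketch that plays nothing and is not Android code: `TtsQueueModel` is a hypothetical class, and only the two constants mirror the values of TextToSpeech.QUEUE_FLUSH and TextToSpeech.QUEUE_ADD.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the TTS request queue; it only records what would be queued.
final class TtsQueueModel {
    // Same values as TextToSpeech.QUEUE_FLUSH and TextToSpeech.QUEUE_ADD.
    static final int QUEUE_FLUSH = 0;
    static final int QUEUE_ADD = 1;

    private final List<String> pending = new ArrayList<>();

    // Models speak(): the text is queued according to the queue mode.
    void speak(String text, int queueMode) {
        enqueue("speak:" + text, queueMode);
    }

    // Models playSilentUtterance(): silence is queued like any other request.
    void playSilentUtterance(long durationInMs, int queueMode) {
        enqueue("silence:" + durationInMs + "ms", queueMode);
    }

    // Models stop(): everything still in the queue is discarded.
    void stop() {
        pending.clear();
    }

    // Returns a copy of the pending requests, oldest first.
    List<String> pending() {
        return new ArrayList<>(pending);
    }

    private void enqueue(String entry, int queueMode) {
        if (queueMode == QUEUE_FLUSH) {
            pending.clear(); // QUEUE_FLUSH drops all earlier entries
        }
        pending.add(entry);  // QUEUE_ADD (and FLUSH after clearing) appends
    }
}
```

In this model, queuing "Hello" with QUEUE_FLUSH, then 500 ms of silence and "world" with QUEUE_ADD, leaves three pending entries in that order; anything queued before the flush is gone.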

 

setAudioAttributes

Added in API level 21

public int setAudioAttributes (AudioAttributes audioAttributes)

Sets the audio attributes to be used when speaking text or playing back a file.

 

Parameters
audioAttributes AudioAttributes: Valid AudioAttributes instance.

Returns
int: ERROR or SUCCESS.

 

setEngineByPackageName

Added in API level 8
Deprecated in API level 15

public int setEngineByPackageName (String enginePackageName)

 

This method was deprecated in API level 15.
This doesn't inform callers when the TTS engine has been initialized. TextToSpeech(android.content.Context, android.speech.tts.TextToSpeech.OnInitListener, java.lang.String) can be used with the appropriate engine name. Also, there is no guarantee that the engine specified will be loaded. If it isn't installed or is disabled, the user / system wide defaults will apply.

Sets the TTS engine to use.

 

Parameters
enginePackageName String: The package name for the synthesis engine (e.g. "com.svox.pico").

Returns
int: ERROR or SUCCESS.

 

setLanguage

Added in API level 4

public int setLanguage (Locale loc)

Sets the text-to-speech language. The TTS engine will try to use the closest match to the specified language as represented by the Locale, but there is no guarantee that the exact same Locale will be used. Use isLanguageAvailable(java.util.Locale) to check the level of support before choosing the language to use for the next utterances. This method sets the current voice to the default one for the given Locale; getVoice() can be used to retrieve it.

 

Parameters
loc Locale: The locale describing the language to be used.

Returns
int: Code indicating the support status for the locale. See LANG_AVAILABLE, LANG_COUNTRY_AVAILABLE, LANG_COUNTRY_VAR_AVAILABLE, LANG_MISSING_DATA and LANG_NOT_SUPPORTED.
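
Because setLanguage returns a status code rather than throwing, callers usually need to interpret the result. The sketch below is plain Java with no Android dependency: `LanguageSupport` is a hypothetical helper, and its constants are local copies that mirror the values of the TextToSpeech.LANG_* fields.

```java
// Hypothetical helper; the constants mirror TextToSpeech.LANG_* values.
final class LanguageSupport {
    static final int LANG_COUNTRY_VAR_AVAILABLE = 2;
    static final int LANG_COUNTRY_AVAILABLE = 1;
    static final int LANG_AVAILABLE = 0;
    static final int LANG_MISSING_DATA = -1;
    static final int LANG_NOT_SUPPORTED = -2;

    private LanguageSupport() {}

    // Any non-negative code means the engine can speak the language.
    static boolean isUsable(int result) {
        return result >= LANG_AVAILABLE;
    }

    // Turns a status code into a short message suitable for logging.
    static String describe(int result) {
        switch (result) {
            case LANG_COUNTRY_VAR_AVAILABLE:
                return "language, country and variant are available";
            case LANG_COUNTRY_AVAILABLE:
                return "language and country are available";
            case LANG_AVAILABLE:
                return "language is available";
            case LANG_MISSING_DATA:
                return "language data is missing";
            case LANG_NOT_SUPPORTED:
                return "language is not supported";
            default:
                return "unknown status code " + result;
        }
    }
}
```

On a device, the value passed to `isUsable` or `describe` would be the result of `textToSpeech.setLanguage(...)`, for example inside the `onInit` callback of the `SpeechUtils` class shown earlier.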

 

setOnUtteranceCompletedListener

Added in API level 4
Deprecated in API level 15

public int setOnUtteranceCompletedListener (TextToSpeech.OnUtteranceCompletedListener listener)

 

This method was deprecated in API level 15.
Use setOnUtteranceProgressListener(android.speech.tts.UtteranceProgressListener) instead.

Sets the listener that will be notified when synthesis of an utterance completes.

 

Parameters
listener TextToSpeech.OnUtteranceCompletedListener: The listener to use.

Returns
int: ERROR or SUCCESS.

 

setOnUtteranceProgressListener

Added in API level 15

public int setOnUtteranceProgressListener (UtteranceProgressListener listener)

Sets the listener that will be notified of various events related to the synthesis of a given utterance. See UtteranceProgressListener and TextToSpeech.Engine#KEY_PARAM_UTTERANCE_ID.

 

Parameters
listener UtteranceProgressListener: The listener to use.

Returns
int: ERROR or SUCCESS.
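
A progress listener is only useful if each request carries an utterance ID that the app can match against later. The following plain-Java sketch shows one possible bookkeeping scheme. `UtteranceTracker` and its method names are hypothetical; on Android, its `onDone` and `onError` methods would be called from the corresponding UtteranceProgressListener callbacks, which are not shown here. Concurrent collections are used because those callbacks may arrive on a different thread from the one that queued the request.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical bookkeeping class; it has no Android dependency.
final class UtteranceTracker {
    private final AtomicLong counter = new AtomicLong();
    private final Set<String> inFlight = ConcurrentHashMap.newKeySet();
    private final Set<String> failed = ConcurrentHashMap.newKeySet();

    // Creates a unique ID to pass as the utteranceId argument and
    // records it as pending.
    String newUtteranceId() {
        String id = "utt-" + counter.incrementAndGet();
        inFlight.add(id);
        return id;
    }

    // To be called when the listener reports that the utterance finished.
    void onDone(String utteranceId) {
        inFlight.remove(utteranceId);
    }

    // To be called when the listener reports an error for the utterance.
    void onError(String utteranceId) {
        if (inFlight.remove(utteranceId)) {
            failed.add(utteranceId);
        }
    }

    boolean isPending(String utteranceId) {
        return inFlight.contains(utteranceId);
    }

    boolean hasFailed(String utteranceId) {
        return failed.contains(utteranceId);
    }
}
```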

 

setPitch

Added in API level 4

public int setPitch (float pitch)

Sets the speech pitch for the TextToSpeech engine. This has no effect on any pre-recorded speech.

 

Parameters
pitch float: Speech pitch. 1.0 is the normal pitch, lower values lower the tone of the synthesized voice, greater values increase it.

Returns
int: ERROR or SUCCESS.

 

setSpeechRate

Added in API level 4

public int setSpeechRate (float speechRate)

Sets the speech rate. This has no effect on any pre-recorded speech.

 

Parameters
speechRate float: Speech rate. 1.0 is the normal speech rate, lower values slow down the speech (0.5 is half the normal speech rate), greater values accelerate it (2.0 is twice the normal speech rate).

Returns
int: ERROR or SUCCESS.
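
The effect of the rate multiplier on playback time can be worked out with simple arithmetic. The sketch below is plain Java, not Android code: `SpeechRateEstimate` is a hypothetical helper, and it assumes the engine scales duration linearly with the rate, which real engines may only approximate.

```java
// Hypothetical helper: estimates playback time for a given speech rate.
final class SpeechRateEstimate {
    private SpeechRateEstimate() {}

    // normalDurationMs is the playback time at rate 1.0.
    // Assumption: duration is inversely proportional to the speech rate.
    static long estimatedDurationMs(long normalDurationMs, float speechRate) {
        if (speechRate <= 0f) {
            throw new IllegalArgumentException("speechRate must be positive");
        }
        return Math.round(normalDurationMs / (double) speechRate);
    }
}
```

Under that assumption, an utterance that takes 3000 ms at the normal rate takes about 6000 ms at a rate of 0.5 and about 1500 ms at a rate of 2.0.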

 

setVoice

Added in API level 21

public int setVoice (Voice voice)

Sets the text-to-speech voice.

 

Parameters
voice Voice: One of the objects returned by getVoices().

Returns
int: ERROR or SUCCESS.

 

shutdown

Added in API level 4

public void shutdown ()

Releases the resources used by the TextToSpeech engine. It is good practice, for instance, to call this method in the onDestroy() method of an Activity so that the TextToSpeech engine can be cleanly stopped.

 

speak

Added in API level 21

public int speak (CharSequence text, 
                int queueMode, 
                Bundle params, 
                String utteranceId)

Speaks the text using the specified queuing strategy and speech parameters; the text may be spanned with TtsSpans. This method is asynchronous, i.e. the method just adds the request to the queue of TTS requests and then returns. The synthesis might not have finished (or even started!) at the time when this method returns. In order to reliably detect errors during synthesis, we recommend setting an utterance progress listener (see setOnUtteranceProgressListener(UtteranceProgressListener)) and using the Engine#KEY_PARAM_UTTERANCE_ID parameter.

 

Parameters
text CharSequence: The string of text to be spoken. No longer than getMaxSpeechInputLength() characters.

queueMode int: The queuing strategy to use, QUEUE_ADD or QUEUE_FLUSH.

params Bundle: Parameters for the request. Can be null. Supported parameter names: Engine#KEY_PARAM_STREAM, Engine#KEY_PARAM_VOLUME, Engine#KEY_PARAM_PAN. Engine specific parameters may be passed in, but the parameter keys must be prefixed by the name of the engine they are intended for. For example, the keys "com.svox.pico_foo" and "com.svox.pico:bar" will be passed to the engine named "com.svox.pico" if it is being used.

utteranceId String: A unique identifier for this request.

Returns
int: ERROR or SUCCESS of queuing the speak operation.
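
Since the text passed to speak must not exceed getMaxSpeechInputLength() characters, longer input has to be split before it is queued. The sketch below shows one way to do that in plain Java. `SpeechChunker` is a hypothetical helper, and the `maxLength` argument stands in for the value that getMaxSpeechInputLength() would return on a device. It breaks after whitespace where it can and falls back to a hard cut otherwise (for example in Chinese text without spaces); it does not try to keep surrogate pairs together.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits long text into pieces short enough to queue.
final class SpeechChunker {
    private SpeechChunker() {}

    // Splits text into chunks of at most maxLength characters each.
    static List<String> split(String text, int maxLength) {
        if (maxLength <= 0) {
            throw new IllegalArgumentException("maxLength must be positive");
        }
        List<String> chunks = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            int end = Math.min(start + maxLength, text.length());
            if (end < text.length()) {
                // Prefer to break just after the last whitespace in the window.
                for (int i = end - 1; i > start; i--) {
                    if (Character.isWhitespace(text.charAt(i))) {
                        end = i + 1;
                        break;
                    }
                }
            }
            chunks.add(text.substring(start, end));
            start = end;
        }
        return chunks;
    }
}
```

When queuing the result, a reasonable pattern is to pass QUEUE_FLUSH for the first chunk and QUEUE_ADD for the rest, each with its own utterance ID, so the chunks play back to back.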

 

speak

Added in API level 4
Deprecated in API level 21

public int speak (String text, 
                int queueMode, 
                HashMap<String, String> params)

 

This method was deprecated in API level 21.
As of API level 21, replaced by speak(java.lang.CharSequence, int, android.os.Bundle, java.lang.String).

Speaks the string using the specified queuing strategy and speech parameters. This method is asynchronous, i.e. the method just adds the request to the queue of TTS requests and then returns. The synthesis might not have finished (or even started!) at the time when this method returns. In order to reliably detect errors during synthesis, we recommend setting an utterance progress listener (see setOnUtteranceProgressListener(UtteranceProgressListener)) and using the Engine#KEY_PARAM_UTTERANCE_ID parameter.

 

Parameters
text String: The string of text to be spoken. No longer than getMaxSpeechInputLength() characters.

queueMode int: The queuing strategy to use, QUEUE_ADD or QUEUE_FLUSH.

params HashMap: Parameters for the request. Can be null. Supported parameter names: Engine#KEY_PARAM_STREAM, Engine#KEY_PARAM_UTTERANCE_ID, Engine#KEY_PARAM_VOLUME, Engine#KEY_PARAM_PAN. Engine specific parameters may be passed in, but the parameter keys must be prefixed by the name of the engine they are intended for. For example, the keys "com.svox.pico_foo" and "com.svox.pico:bar" will be passed to the engine named "com.svox.pico" if it is being used.

Returns
int: ERROR or SUCCESS of queuing the speak operation.

 

stop

Added in API level 4

public int stop ()

Interrupts the current utterance (whether played or rendered to file) and discards other utterances in the queue.

 

Returns
int: ERROR or SUCCESS.

 

synthesizeToFile

Added in API level 30

public int synthesizeToFile (CharSequence text, 
                Bundle params, 
                ParcelFileDescriptor fileDescriptor, 
                String utteranceId)

Synthesizes the given text to a ParcelFileDescriptor using the specified parameters. This method is asynchronous, i.e. the method just adds the request to the queue of TTS requests and then returns. The synthesis might not have finished (or even started!) at the time when this method returns. In order to reliably detect errors during synthesis, we recommend setting an utterance progress listener (see setOnUtteranceProgressListener(UtteranceProgressListener)).

 

Parameters
text CharSequence: The text that should be synthesized. No longer than getMaxSpeechInputLength() characters. This value cannot be null.

params Bundle: Parameters for the request. Engine specific parameters may be passed in, but the parameter keys must be prefixed by the name of the engine they are intended for. For example, the keys "com.svox.pico_foo" and "com.svox.pico:bar" will be passed to the engine named "com.svox.pico" if it is being used. This value cannot be null.

fileDescriptor ParcelFileDescriptor: ParcelFileDescriptor to write the generated audio data to. This value cannot be null.

utteranceId String: A unique identifier for this request. This value cannot be null.

Returns
int: ERROR or SUCCESS of queuing the synthesizeToFile operation.

 

synthesizeToFile

Added in API level 21

public int synthesizeToFile (CharSequence text, 
                Bundle params, 
                File file, 
                String utteranceId)

Synthesizes the given text to a file using the specified parameters. This method is asynchronous, i.e. the method just adds the request to the queue of TTS requests and then returns. The synthesis might not have finished (or even started!) at the time when this method returns. In order to reliably detect errors during synthesis, we recommend setting an utterance progress listener (see setOnUtteranceProgressListener(UtteranceProgressListener)).

 

Parameters
text CharSequence: The text that should be synthesized. No longer than getMaxSpeechInputLength() characters.

params Bundle: Parameters for the request. Cannot be null. Engine specific parameters may be passed in, but the parameter keys must be prefixed by the name of the engine they are intended for. For example, the keys "com.svox.pico_foo" and "com.svox.pico:bar" will be passed to the engine named "com.svox.pico" if it is being used.

file File: File to write the generated audio data to.

utteranceId String: A unique identifier for this request.

Returns
int: ERROR or SUCCESS of queuing the synthesizeToFile operation.

 

synthesizeToFile

Added in API level 4
Deprecated in API level 21

public int synthesizeToFile (String text, 
                HashMap<String, String> params, 
                String filename)

 

This method was deprecated in API level 21.
As of API level 21, replaced by synthesizeToFile(java.lang.CharSequence, android.os.Bundle, java.io.File, java.lang.String).

Synthesizes the given text to a file using the specified parameters. This method is asynchronous, i.e. the method just adds the request to the queue of TTS requests and then returns. The synthesis might not have finished (or even started!) at the time when this method returns. In order to reliably detect errors during synthesis, we recommend setting an utterance progress listener (see setOnUtteranceProgressListener(UtteranceProgressListener)) and using the Engine#KEY_PARAM_UTTERANCE_ID parameter.

 

Parameters
text String: The text that should be synthesized. No longer than getMaxSpeechInputLength() characters.

params HashMap: Parameters for the request. Cannot be null. Supported parameter names: Engine#KEY_PARAM_UTTERANCE_ID. Engine specific parameters may be passed in, but the parameter keys must be prefixed by the name of the engine they are intended for. For example, the keys "com.svox.pico_foo" and "com.svox.pico:bar" will be passed to the engine named "com.svox.pico" if it is being used.

filename String: Absolute filename to write the generated audio data to. It should be something like "/sdcard/myappsounds/mysound.wav".

Returns
int: ERROR or SUCCESS of queuing the synthesizeToFile operation.