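5. How ICU BreakIterator Is Implemented
The performance tests in the previous part surfaced some patterns in time cost and memory usage. For an open-source project, though, the most efficient way to explain them is to read the source code. Next, we dig into the ICU source for a brief look at how word segmentation is implemented.
5.1 Initialization: Obtaining a BreakIterator Instance
Take the call used in our example, which obtains the WordInstance for the current default Locale: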
java.text.BreakIterator.getWordInstance() ->
java.text.BreakIterator.getWordInstance(Locale.getDefault()) ->
java.text.IcuIteratorWrapper new IcuIteratorWrapper() ->
android.icu.text.BreakIterator.getWordInstance(ULocale where) ->
android.icu.text.BreakIterator.getBreakInstance(where, KIND_WORD)
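For reference, the entry point at the top of this chain is typically used as shown below. This is only a minimal sketch built on the standard java.text.BreakIterator API. The class name WordBreakDemo and the helper words() are hypothetical names introduced for illustration, and the letter-or-digit filter is an assumption about what counts as a "word", not something ICU does for you. On a desktop JVM the same API is backed by the JDK's own implementation rather than android.icu.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class WordBreakDemo {
    // Returns the segments between consecutive word boundaries,
    // keeping only segments that contain at least one letter or digit
    // (this drops whitespace and punctuation segments).
    static List<String> words(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String segment = text.substring(start, end);
            if (segment.codePoints().anyMatch(Character::isLetterOrDigit)) {
                out.add(segment);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(words("Hello, ICU world!", Locale.ENGLISH));
    }
}
```

Each call to getWordInstance() goes through the lookup traced above, so code that segments many strings can reuse one iterator and call setText() repeatedly instead of requesting a new instance every time.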
OK, this brings us to the first key piece of logic:
android.icu.text.BreakIterator
/**
 * Returns a particular kind of BreakIterator for a locale.
 * Avoids writing a switch statement with getXYZInstance(where) calls.
 * @internal
 * @deprecated This API is ICU internal only.
 */
@Deprecated
public static BreakIterator getBreakInstance(ULocale where, int kind) {
    if (where == null) {
        throw new NullPointerException("Specified locale is null");
    }
    if (iterCache[kind] != null) {
        BreakIteratorCache cache = (BreakIteratorCache)iterCache[kind].get();
        if (cache != null) {
            if (cache.getLocale().equals(where)) {
                return cache.createBreakInstance();
            }
        }
    }
    // sigh, all to avoid linking in ICULocaleData...
    BreakIterator result = getShim().createBreakIterator(where, kind);
    BreakIteratorCache cache = new BreakIteratorCache(where, result);
iterCache[kind] = new SoftReference<Brea