Lua's undocumented frontier pattern %f is quite useful

See: http://lua-users.org/wiki/FrontierPattern

Frontier Pattern

The "frontier" expression pattern %f is undocumented in the standard Lua references (for reasons why see LuaList:2006-12/msg00536.html).

I would like to present here the usefulness of it, in an attempt to show how it can be used, and why it should be retained.

Let's consider a fairly straightforward task: to find all words in upper-case in a string.

First attempt: %u+

string.gsub ("the QUICK brown fox", "%u+", print)

QUICK

That looks OK, found a word in all caps. But look at this:

string.gsub ("the QUICK BROwn fox", "%u+", print)

QUICK
BRO

We also found a word which was partially capitalised.

Second attempt: %u+%A

string.gsub ("the QUICK BROwn fox", "%u+%A", print)

QUICK

The detection of non-letters correctly excluded the partially capitalised word. But wait! How about this:

string.gsub ("the QUICK brOWN fox", "%u+%A", print)

QUICK
OWN

We also have a second problem:

string.gsub ("the QUICK. brown fox", "%u+%A", print)

QUICK.

The punctuation after the word is now part of the captured string, which is not wanted.

Third attempt: %A%u+%A

string.gsub ("the QUICK brOWN FOx jumps", "%A%u+%A", print)

QUICK

This correctly excludes the two partially capitalised words, but still leaves the punctuation in, like this:

string.gsub ("the (QUICK) brOWN FOx jumps", "%A%u+%A", print)

(QUICK)

Also, there is another problem, apart from capturing the non-letters at the sides. Look at this:

string.gsub ("THE (QUICK) brOWN FOx JUMPS", "%A%u+%A", print)

(QUICK)

The correctly capitalised words at the start and end of the string are not detected.
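The printed results above hide any spaces that were swallowed by %A. As a minimal sketch (the helper name show is hypothetical, not part of the wiki page), the three attempts can be compared on the same input with each match wrapped in brackets, so the consumed separators become visible:

```lua
-- Hypothetical helper: return all matches of pat in s, each wrapped in [ ],
-- so that consumed spaces and punctuation are visible.
local function show(s, pat)
  local out = {}
  for m in s:gmatch(pat) do
    out[#out + 1] = "[" .. m .. "]"
  end
  return table.concat(out, " ")
end

local s = "THE (QUICK) brOWN FOx JUMPS"
print(show(s, "%u+"))      --> [THE] [QUICK] [OWN] [FO] [JUMPS]
print(show(s, "%u+%A"))    --> [THE ] [QUICK)] [OWN ]
print(show(s, "%A%u+%A"))  --> [(QUICK)]
```

None of the three patterns yields exactly THE, QUICK and JUMPS.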

The solution: The Frontier pattern: %f

string.gsub ("THE (QUICK) brOWN FOx JUMPS", "%f[%a]%u+%f[%A]", print)

THE
QUICK
JUMPS

The frontier pattern %f followed by a set detects the transition from "not in set" to "in set". The source string boundary qualifies as "not in set" so it also matches the word at the very start of the string to be matched.

The second frontier pattern is also matched at the end of the string, so our final word is also captured.
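The same pattern also works with string.gmatch when the words are needed as data rather than printed. The following is only a sketch; upper_words is a hypothetical helper name chosen for illustration:

```lua
-- Hypothetical helper: collect every word written entirely in upper case.
local function upper_words(s)
  local words = {}
  -- %f[%a] : previous char is not a letter, current char is a letter
  -- %u+    : one or more upper-case letters
  -- %f[%A] : previous char is a letter, current char is not (or end of string)
  for w in s:gmatch("%f[%a]%u+%f[%A]") do
    words[#words + 1] = w
  end
  return words
end

local found = upper_words("THE (QUICK) brOWN FOx JUMPS")
print(table.concat(found, ","))  --> THE,QUICK,JUMPS
```

Because a frontier matches an empty string, neither the parentheses nor the spaces end up in the results.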

Alternatives without the frontier pattern

Without the frontier pattern, one might resort to things like this:

s = "THE (QUICK) brOWN FOx JUMPS"
s = "\0" .. s:gsub("(%A)(%u)", "%1\0%2")
            :gsub("(%u)(%A)", "%1\0%2") .. "\0"
s = s:gsub("%z(%u+)%z", print)
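The snippet above relies on %z, which exists because Lua 5.1 patterns cannot contain an embedded "\0"; %z was deprecated in Lua 5.2, where "\0" may appear in a pattern directly. A version-independent sketch of the same marker trick is shown below. It assumes the byte "\1" never occurs in the input, and the names MARK and upper_words_no_frontier are hypothetical:

```lua
-- Sentinel used to mark word boundaries. Assumption: the input text never
-- contains this byte. Unlike "\0", it is valid inside a pattern in Lua 5.1 too.
local MARK = "\1"

-- Hypothetical helper: all-uppercase words of s, found without %f.
local function upper_words_no_frontier(s)
  -- Insert a marker wherever a non-letter is followed by an upper-case letter,
  -- and wherever an upper-case letter is followed by a non-letter.
  local marked = s:gsub("(%A)(%u)", "%1" .. MARK .. "%2")
                  :gsub("(%u)(%A)", "%1" .. MARK .. "%2")
  -- The string boundaries count as non-letters, so mark them as well.
  marked = MARK .. marked .. MARK
  local words = {}
  for w in marked:gmatch(MARK .. "(%u+)" .. MARK) do
    words[#words + 1] = w
  end
  return words
end

print(table.concat(upper_words_no_frontier("THE (QUICK) brOWN FOx JUMPS"), ","))
--> THE,QUICK,JUMPS
```

This needs two extra passes over the string and a reserved byte, which is the overhead the frontier pattern avoids.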

This page brought to you by NickGammon. Last edited July 7, 2007 7:17 pm GMT.


Take a look at the code in lstrlib.c:

case 'f': {  /* frontier? */
  const char *ep; char previous;
  p += 2;
  if (*p != '[')
    luaL_error(ms->L, "missing " LUA_QL("[") " after "
                      LUA_QL("%%f") " in pattern");
  ep = classend(ms, p);  /* points to what is next */
  previous = (s == ms->src_init) ? '\0' : *(s-1);
  if (matchbracketclass(uchar(previous), p, ep-1) ||
      !matchbracketclass(uchar(*s), p, ep-1)) return NULL;
  p=ep; goto init;  /* else return match(ms, s, ep); */
}

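From this it is clear how to use it: %f[set] matches the empty string at a position where the previous character is not in the set and the current character is in the set. At the start of the subject the previous character is taken to be '\0', and at the end of the subject the current character read is the string's terminating '\0'.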
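To make the C logic concrete, here is a rough Lua emulation of the test performed at a single position. It is a sketch for illustration only, not how the library is implemented; is_frontier is a hypothetical name, and set is assumed to be a bracket class such as "[%a]":

```lua
-- Sketch of the %f[set] test at 1-based position i of s, mirroring the C code:
-- the character before i must NOT be in the set, and the character at i must
-- be in the set. Positions outside the string count as "\0".
local function is_frontier(s, i, set)
  local function char_at(k)
    if k < 1 or k > #s then return "\0" end
    return s:sub(k, k)
  end
  local function in_set(c)
    -- Match the single character c against the bracket class.
    return c:find("^" .. set) ~= nil
  end
  return not in_set(char_at(i - 1)) and in_set(char_at(i))
end

print(is_frontier("THE (QUICK)", 1, "[%a]"))  --> true  (start of string, then "T")
print(is_frontier("THE (QUICK)", 2, "[%a]"))  --> false ("T" precedes "H")
print(is_frontier("THE (QUICK)", 4, "[%A]"))  --> true  ("E" then space)
print(is_frontier("JUMPS", 6, "[%A]"))        --> true  ("S" then end of string)
```

The last call shows why %f[%A] succeeds after the final word: the position just past the end is treated as "\0", which is not a letter.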