php数组bool,php - 如何使用PHP从 bool 表示法(DSL)创建数组树

最新推荐文章于 2023-12-09 18:58:43 发布

凌冽的风

最新推荐文章于 2023-12-09 18:58:43 发布

阅读量137

点赞数

文章标签： php数组bool

我的输入很简单：$input = '( ( "M" AND ( "(" OR "AND" ) ) OR "T" )';

其中(在树上开始一个新节点并)结束它。and和or单词是为布尔运算保留的，因此在它们不在“”标记内之前，它们有一个特殊的含义。在我的dsl中，and和or子句按节点级别变化，因此在级别上只能有and或or子句。如果和在或之后，它将启动一个新的子节点。“”中的所有字符都应视为它们本身。最后像往常一样“可以用”逃走。

在php中，有什么好的方法可以使翻译句子看起来像这样：

$output = array(array(array("M" , array("(", "AND")) , "T"), FALSE);

请注意，false是一个指示器，表示根级别具有或关键字。如果输入是：

( ( "M" AND ( "(" OR "AND" ) ) AND "T" )

那么输出将是：

$output = array(array(array("M", array("(", "AND")), "T"), TRUE);

使用replace('('、'array(')；和eval代码是很有诱惑力的，但是转义字符和包装文字将成为一个问题。

目前我还没有在dsl上实现not boolean运算符。

谢谢你的帮助。javasript代码也可以。

Python示例：

在使用php和javascript之前，我用python做了一些测试。我所做的是：

使用正则表达式查找字符串文本

用生成的键替换文本

将文字存储到关联列表

用括号将输入拆分为单级列表

查找根级布尔运算符

去掉布尔运算符和空白

用存储的值替换文本键

它可能有用，但我相信一定有更复杂的方法来做到这一点。

http://codepad.org/PdgQLviI

最佳答案

这是我的辅助项目的库修改。它应该处理这些字符串-执行一些压力测试，让我知道它是否在某处断裂。

需要使用tokenizer类型类来提取和标记变量，这样它们就不会干扰语法分析和tokenize parethesis，这样它们就可以直接匹配(lazy-计算内容不会捕获嵌套级别，greedy将覆盖同一级别上的所有上下文)。它也有一些关键字语法(比需要的稍微多一些，因为它只会被解析为根级别)。当试图用错误的键访问变量注册表时抛出InvalidArgumentException，当括号不匹配时抛出RuntimeException。class TokenizedInput

{

const VAR_REGEXP = '\"(?P.*?)\"';

const BLOCK_OPEN_REGEXP = '\(';

const BLOCK_CLOSE_REGEXP = '\)';

const KEYWORD_REGEXP = '(?OR|AND)';

// Token: $id

const TOKEN_DELIM_LEFT = '

const TOKEN_DELIM_RIGHT = '>';

const VAR_TOKEN = 'VAR';

const KEYWORD_TOKEN = 'KEYWORD';

const BLOCK_OPEN_TOKEN = 'BLOCK';

const BLOCK_CLOSE_TOKEN = 'ENDBLOCK';

const ID_DELIM = ':';

const ID_REGEXP = '[0-9]+';

private $original;

private $tokenized;

private $data = [];

private $blockLevel = 0;

private $varTokenId = 0;

protected $procedure = [

'varTokens' => self::VAR_REGEXP,

'keywordToken' => self::KEYWORD_REGEXP,

'blockTokens' => '(?P' . self::BLOCK_OPEN_REGEXP . ')|(?P' . self::BLOCK_CLOSE_REGEXP . ')'

];

private $tokenMatch;

public function __construct($input) {

$this->original = (string) $input;

}

public function string() {

isset($this->tokenized) or $this->tokenize();

return $this->tokenized;

}

public function variable($key) {

isset($this->tokenized) or $this->tokenize();

if (!isset($this->data[$key])) {

throw new InvalidArgumentException("Variable id:($key) does not exist.");

}

return $this->data[$key];

}

public function tokenSearchRegexp() {

if (!isset($this->tokenMatch)) {

$strings = $this->stringSearchRegexp();

$blocks = $this->blockSearchRegexp();

$this->tokenMatch = '#(?:' . $strings . '|' . $blocks . ')#';

}

return $this->tokenMatch;

}

public function stringSearchRegexp($id = null) {

$id = $id ?: self::ID_REGEXP;

return preg_quote(self::TOKEN_DELIM_LEFT . self::VAR_TOKEN . self::ID_DELIM)

. '(?P' . $id . ')'

. preg_quote(self::TOKEN_DELIM_RIGHT);

}

public function blockSearchRegexp($level = null) {

$level = $level ?: self::ID_REGEXP;

$block_open = preg_quote(self::TOKEN_DELIM_LEFT . self::BLOCK_OPEN_TOKEN . self::ID_DELIM)

. '(?P' . $level . ')'

. preg_quote(self::TOKEN_DELIM_RIGHT);

$block_close = preg_quote(self::TOKEN_DELIM_LEFT . self::BLOCK_CLOSE_TOKEN . self::ID_DELIM)

. '\k'

. preg_quote(self::TOKEN_DELIM_RIGHT);

return $block_open . '(?P.*)' . $block_close;

}

public function keywordSearchRegexp($keyword = null) {

$keyword = $keyword ? '(?P' . $keyword . ')' : self::KEYWORD_REGEXP;

return preg_quote(self::TOKEN_DELIM_LEFT . self::KEYWORD_TOKEN . self::ID_DELIM)

. $keyword

. preg_quote(self::TOKEN_DELIM_RIGHT);

}

private function tokenize() {

$current = $this->original;

foreach ($this->procedure as $method => $pattern) {

$current = preg_replace_callback('#(?:' . $pattern . ')#', [$this, $method], $current);

}

if ($this->blockLevel) {

throw new RuntimeException("Syntax error. Parenthesis mismatch." . $this->blockLevel);

}

$this->tokenized = $current;

}

protected function blockTokens($match) {

if (isset($match['close'])) {

$token = self::BLOCK_CLOSE_TOKEN . self::ID_DELIM . --$this->blockLevel;

} else {

$token = self::BLOCK_OPEN_TOKEN . self::ID_DELIM . $this->blockLevel++;

}

return $this->addDelimiters($token);

}

protected function varTokens($match) {

$this->data[$this->varTokenId] = $match[1];

return $this->addDelimiters(self::VAR_TOKEN . self::ID_DELIM . $this->varTokenId++);

}

protected function keywordToken($match) {

return $this->addDelimiters(self::KEYWORD_TOKEN . self::ID_DELIM . $match[1]);

}

private function addDelimiters($token) {

return self::TOKEN_DELIM_LEFT . $token . self::TOKEN_DELIM_RIGHT;

}

parser type类对标记化字符串执行匹配-提取已注册的变量，并通过clonig本身递归进入嵌套上下文。

运算符类型处理是不寻常的，这使得它更像是一个派生类，但无论如何，在解析器的世界中很难实现令人满意的抽象。

class ParsedInput

{

private $input;

private $result;

private $context;

public function __construct(TokenizedInput $input) {

$this->input = $input;

}

public function result() {

if (isset($this->result)) { return $this->result; }

$this->parse($this->input->string());

$this->addOperator();

return $this->result;

}

private function parse($string, $context = 'root') {

$this->context = $context;

preg_replace_callback(

$this->input->tokenSearchRegexp(),

[$this, 'buildStructure'],

$string

);

return $this->result;

}

protected function buildStructure($match) {

if (isset($match['contents'])) { $this->parseBlock($match['contents'], $match['level']); }

elseif (isset($match['id'])) { $this->parseVar($match['id']); }

}

protected function parseVar($id) {

$this->result[] = $this->input->variable((int) $id);

}

protected function parseBlock($contents, $level) {

$nested = clone $this;

$this->result[] = $nested->parse($contents, (int) $level);

}

protected function addOperator() {

$subBlocks = '#' . $this->input->blockSearchRegexp(1) . '#';

$rootLevel = preg_replace($subBlocks, '', $this->input->string());

$rootKeyword = '#' . $this->input->keywordSearchRegexp('AND') . '#';

return $this->result[] = (preg_match($rootKeyword, $rootLevel) === 1);

}

public function __clone() {

$this->result = [];

}

示例用法：

$input = '( ( "M" AND ( "(" OR "AND" ) ) AND "T" )';

$tokenized = new TokenizedInput($input);

$parsed = new ParsedInput($tokenized);

$result = $parsed->result();

我删除了名称空间/imports/intrefaces，所以您可以根据需要调整它们。也不想翻阅(现在可能无效)评论，所以也删除了它们。

凌冽的风

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
复制链接

分享到 QQ

分享到新浪微博

扫一扫