最近研究多模式匹配算法,有个算法里面提到需要用一个有向无环字图,在网上找不到关于这方面的东西,经过多方面的努力,找到了建图的原理,以及建立的算法。自己编码实现了。
原理及其分析篇幅有点长,需要的可以给我邮件,现在提供算法,以及实现。
builddawg(S)
1. Create a node named source.
2. Let activenode be source.
3. For each word w of S do:*
A. For each letter a of w do:
Let activenode be update (activenode, a).
B. Let activenode be source.
4. Return source.
update (activenode, a)
1. If activenode has an outgoing edge labeled a, then*
A. Let newactivenode be the node that this edge leads to.*
B. If this edge is primary, return newactivenode.*
C. Else, return split (activenode, newactivenode).*
2. Else
A. Create a node named newactivenode.
B. Create a primary edge labeled a from activenode to newactivenode.
C. Let currentnode be activenode.
D. Let suflxnode be undefined.
E. While currentnode isn’t source and sufixnode is undefined do:
i. Let currentnode be the node pointed to by the sufftx pointer of currentnode.
ii. If currentnode has a primary outgoing edge labeled a, then let sufixnode be the
node that this edge leads to.
iii. Else, if currentnode has a secondary outgoing edge labeled a then
a. Let childnode be the node that this edge leads to.
b. Let suffixnode be split (currentnode, childnode).
iv. Else, create a secondary edge from currentnode to newactivenode labeled a
F. If sufixnode is still undefined, let suffixnode be source.
G. Set the suffix pointer of newactivenode to point to sufixnode.
H. Return newactivenode.
split (parentnode, childnode)
1. Create a node called newchildnode.
2. Make the secondary edge from parentnode to childnode into a primary edge from
parentnode to newchildnode (with the same label).
3. For every primary and secondary outgoing edge of childnode, create a secondary outgoing
edge of newchildnode with the same label and leading to the same node.
4. Set the suffix pointer of newchildnode equal to that of childnode.
5. Reset the suffix pointer of childnode to point to newchildnode.
6. Let currentnode be parentnode.
7. While currentnode isn’t source do:
A. Let currentnode be the node pointed to by the suffrx pointer of currentnode.
B. If currentnode has a secondary edge to childnode, then make it a secondary edge to
newchildnode (with the same label).
C. Else, break out of the while loop.
8. Return newchildnode.
实现:
CDawg.h
#define _DAWG_GRAPHIC_H
#define ALPHABET_SIZE 256
#define MAX_STATE 100
#define PATTERN_LEN 50
#include < map >
#include < iostream >
using namespace std;
struct CDawg_node
{
short node_order;
short next[ALPHABET_SIZE];
map < char , char > edge_type;
CDawg_node * suffix_node;
CDawg_node( short order)
{
node_order = order;
memset(next, 0 , ALPHABET_SIZE * sizeof ( short ));
suffix_node = 0 ;
}
void set_next_state( char accept_char, int next_state, char is_primary)
{
next[accept_char] = next_state;
edge_type[accept_char] = is_primary;
}
};
class CDawg<