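I. Objectives
1. Master the principles and algorithms of deterministic finite automaton (DFA) minimization, in particular Hopcroft's algorithm (presented in class as the "distinguishing method").
2. Learn how to decide whether two DFA states are equivalent, and understand the splitting and merging strategies used during minimization.
3. Implement a DFA minimization algorithm and verify that the minimized DFA is correct.
4. Continue the design of the previous two labs so that the data structures carry through the whole automaton lab series.
5. Improve algorithm optimization and programming skills, and deepen the understanding of compiler principles.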
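II. Content and Requirements
1. Theoretical background:
DFA minimization reduces a DFA to the smallest possible number of states by merging equivalent states, which yields the most compact state-machine representation of the language. Hopcroft's algorithm is an efficient implementation of the distinguishing method: it maintains a partition of the states and uses a fast lookup mechanism to speed up the refinement.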
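The worklist idea behind Hopcroft's algorithm can be sketched as follows. This is a simplified illustration, not the code used in this report: it assumes a complete DFA whose states are 0..n-1 and whose symbols are 0..k-1, the name hopcroftClasses is hypothetical, and the predecessor set X is found by a plain scan instead of the inverse-transition index that the full algorithm needs for its O(kn log n) bound.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper, not part of the lab code: worklist-based partition
// refinement on a complete DFA. delta[s][c] is the successor of state s on
// symbol c. Returns, for every state, the index of its equivalence class.
std::vector<int> hopcroftClasses(const std::vector<std::vector<int>>& delta,
                                 const std::vector<bool>& accepting) {
    const int n = static_cast<int>(delta.size());
    const int k = n == 0 ? 0 : static_cast<int>(delta[0].size());

    std::set<int> acc, rej;
    for (int s = 0; s < n; ++s) (accepting[s] ? acc : rej).insert(s);

    std::vector<std::set<int>> blocks;            // current partition
    if (!acc.empty()) blocks.push_back(acc);
    if (!rej.empty()) blocks.push_back(rej);
    std::vector<std::set<int>> work = blocks;     // splitters still to try

    while (!work.empty()) {
        std::set<int> splitter = work.back();
        work.pop_back();
        for (int c = 0; c < k; ++c) {
            // X = states that reach the splitter on symbol c
            std::set<int> X;
            for (int s = 0; s < n; ++s)
                if (splitter.count(delta[s][c])) X.insert(s);

            std::vector<std::set<int>> next;
            for (const std::set<int>& Y : blocks) {
                std::set<int> in, out;
                for (int s : Y) (X.count(s) ? in : out).insert(s);
                if (in.empty() || out.empty()) { next.push_back(Y); continue; }
                next.push_back(in);
                next.push_back(out);
                // If Y is still waiting in the worklist, replace it by both
                // halves; otherwise queue only the smaller half.
                bool replaced = false;
                for (std::size_t i = 0; i < work.size(); ++i) {
                    if (work[i] == Y) {
                        work[i] = in;
                        work.push_back(out);
                        replaced = true;
                        break;
                    }
                }
                if (!replaced) work.push_back(in.size() <= out.size() ? in : out);
            }
            blocks = next;
        }
    }
    std::vector<int> cls(n, -1);
    for (std::size_t b = 0; b < blocks.size(); ++b)
        for (int s : blocks[b]) cls[s] = static_cast<int>(b);
    return cls;
}
```

Queuing only the smaller half of a split block is what distinguishes the worklist approach from a round-based refinement that re-examines every group in every round.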
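2. Task:
Implement a DFA minimization algorithm that reduces a given DFA to an equivalent DFA with the fewest possible states. Verify that the minimized DFA is correct and compare the number of states before and after minimization.
3. Steps:
○ Understand the basic principle of Hopcroft's algorithm, including the criterion for state equivalence and the way equivalent states are merged.
○ Implement Hopcroft's algorithm to reduce the original DFA to an equivalent minimal DFA.
○ Design suitable data structures for the minimized DFA that stay consistent with the NFA and DFA data structures of the previous two labs.
○ Verify the minimized DFA by checking that it accepts the same language as the original DFA.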
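One way to carry out the last step is to explore pairs of states of the two automata in parallel and look for a reachable pair that disagrees on acceptance. The sketch below is only an illustration under stated assumptions: SimpleDFA and sameLanguage are hypothetical names that do not appear in the lab code, states are integers, and a missing transition is treated as a move to an implicit dead state.

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <utility>

// Hypothetical compact DFA used only for this sketch (not the lab's structs).
// A missing entry in `next` means the input is rejected from that state.
struct SimpleDFA {
    int start;
    std::set<int> accepting;
    std::map<std::pair<int, char>, int> next;
};

// Returns true when both DFAs accept exactly the same strings over `alphabet`.
// State -1 stands for the implicit dead state of a partial DFA.
bool sameLanguage(const SimpleDFA& a, const SimpleDFA& b, const std::string& alphabet) {
    auto step = [](const SimpleDFA& d, int s, char c) -> int {
        if (s < 0) return -1;
        auto it = d.next.find(std::make_pair(s, c));
        return it == d.next.end() ? -1 : it->second;
    };
    std::set<std::pair<int, int>> seen;
    std::queue<std::pair<int, int>> todo;
    todo.push(std::make_pair(a.start, b.start));
    seen.insert(std::make_pair(a.start, b.start));
    while (!todo.empty()) {
        std::pair<int, int> cur = todo.front();
        todo.pop();
        // A reachable pair that disagrees on acceptance is a counterexample.
        if ((a.accepting.count(cur.first) > 0) != (b.accepting.count(cur.second) > 0))
            return false;
        for (char c : alphabet) {
            std::pair<int, int> nxt(step(a, cur.first, c), step(b, cur.second, c));
            if (seen.insert(nxt).second) todo.push(nxt);
        }
    }
    return true;
}
```

Because every pair is visited at most once, the check finishes after at most (|Q1|+1)·(|Q2|+1) pairs, which is cheap for the small automata used in the lab.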
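4. Input and output
Input: a DFA (state set, transition table, start state and set of accepting states).
Output: the state set and transitions of the minimized DFA, together with the number of states and the transitions before and after minimization.
5. Algorithm requirements
Implement Hopcroft's algorithm, which minimizes the DFA by splitting the state set with the help of a fast lookup mechanism. Support the state-equivalence test and the merging of equivalent states.
6. Data structure requirements
Design efficient data structures suited to Hopcroft's algorithm, such as the sets that record the partition of the states and the transition table of the merged states.
Keep them consistent with the data structures of the previous two labs so that the whole automaton lab series fits together.
7. Program requirements
Write the program in C/C++, Java, Python or a similar language, with a clear code structure and good comments. Provide a detailed lab report covering algorithm design, implementation, test results and problem analysis.
8. Lab report requirements [to be merged into the final combined report of all labs, with a table of contents]
○ Describe the objectives and content of the lab.
○ Explain the principle and implementation steps of Hopcroft's algorithm and the reasoning behind the data structure design.
○ Give test cases and results, and analyze the differences before and after minimization.
○ Summarize what was learned and which challenges were encountered.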
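III. Design and Algorithm Description
1. Structures used in this lab
1.1 Group: stores one state of the minimized DFA, that is, a group of equivalent states of the original DFA.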
struct Group{
vector<DFAState> groupStates;
string groupName;
int isFinalState=0;
int isStartState=0;
};
1.2 GroupRelation: stores one transition between states of the minimized DFA.
struct GroupRelation{
Group fromState;
Group toState;
char transitionSymbol;
};
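2. Functions used in this lab
2.1 bool isGroupRelationInVector(Group fromState, Group toState, char symbol,vector<GroupRelation> groupRelations);
Checks whether a transition already exists in the transition set of the minimized DFA, so that it is not added twice.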
bool isGroupRelationInVector(Group fromState, Group toState, char symbol,vector<GroupRelation> groupRelations){
for(GroupRelation groupRelation : groupRelations){
if(groupRelation.fromState.groupName == fromState.groupName && groupRelation.toState.groupName == toState.groupName && groupRelation.transitionSymbol == symbol){
return true;
}
}
return false;
}
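The function above receives the whole vector by value and scans it linearly on every call. A possible alternative, shown here only as a sketch, keeps the transitions already seen in an ordered set keyed by (source group name, symbol, target group name). The class TransitionIndex and its member names are hypothetical and not part of the lab code.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <tuple>

// Hypothetical alternative to the linear scan: remember every
// (fromGroupName, symbol, toGroupName) triple in an ordered set.
class TransitionIndex {
public:
    // Returns true if the triple was new and has been recorded,
    // false if the same transition had been recorded before.
    bool addIfNew(const std::string& from, char symbol, const std::string& to) {
        return seen_.insert(std::make_tuple(from, symbol, to)).second;
    }
    std::size_t size() const { return seen_.size(); }

private:
    std::set<std::tuple<std::string, char, std::string>> seen_;
};
```

With such an index, a transition would be appended to the result vector only when addIfNew returns true, so the membership test and the bookkeeping happen in one step.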
2.2 bool findState(Group group, DFAState State);
Checks whether a given DFA state belongs to a given group.
bool findState(Group group, DFAState State){
for(DFAState dfaState : group.groupStates){
if(dfaState.stateName == State.stateName){
return true;
}
}
return false;
}
2.3 void findallpair(DFAState dfaState,vector<DFATransition> dfaTransitions,vector<pair<char,DFAState>>& dfaStateTransitions);
Finds every transition whose source state is dfaState and records each transition symbol together with its toState as a pair.
void findallpair(DFAState dfaState,vector<DFATransition> dfaTransitions,vector<pair<char,DFAState>>& dfaStateTransitions){
for(DFATransition dfaTransition : dfaTransitions){
if(dfaTransition.fromState.stateName == dfaState.stateName){
dfaStateTransitions.push_back({dfaTransition.transitionSymbol,dfaTransition.toState});
}
}
}
2.4 bool isInSameGroup(DFAState dfaState1,DFAState dfaState2,vector<DFATransition> dfaTransitions,vector<vector<DFAState>> thegroup)
Decides whether two DFAState values are equivalent with respect to the current partition thegroup.
bool isInSameGroup(DFAState dfaState1,DFAState dfaState2,vector<DFATransition> dfaTransitions,vector<vector<DFAState>> thegroup){
//Collect the (symbol, target state) pairs of every transition leaving dfaState1 and dfaState2
vector<pair<char,DFAState>> dfaState1Transitions;
vector<pair<char,DFAState>> dfaState2Transitions;
findallpair(dfaState1,dfaTransitions,dfaState1Transitions);
findallpair(dfaState2,dfaTransitions,dfaState2Transitions);
//Print dfaState1Transitions (debug output)
cout << "dfaState1Transitions:" << endl;
for(pair<char,DFAState> dfaState1Transition : dfaState1Transitions){
cout << dfaState1Transition.first << " " << dfaState1Transition.second.stateName << endl;
}
//Print dfaState2Transitions (debug output)
cout << "dfaState2Transitions:" << endl;
for(pair<char,DFAState> dfaState2Transition : dfaState2Transitions){
cout << dfaState2Transition.first << " " << dfaState2Transition.second.stateName << endl;
}
//compare2pair only checks that every transition of its first argument is matched,
//so test both directions; otherwise a state without outgoing transitions
//would be treated as equivalent to any other state of its group
return compare2pair(dfaState1Transitions,dfaState2Transitions,thegroup)
&& compare2pair(dfaState2Transitions,dfaState1Transitions,thegroup);
}
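Main steps
1. Collect the outgoing transitions of dfaState1 and dfaState2
• findallpair is called twice to collect from dfaTransitions every transition that leaves dfaState1 and dfaState2.
• The results are stored in two vectors:
o dfaState1Transitions: every input symbol of dfaState1 together with its target state.
o dfaState2Transitions: every input symbol of dfaState2 together with its target state.
2. Compare the transitions of the two states
compare2pair decides whether, for each input symbol, dfaState1 and dfaState2 move to equivalent target states, that is, to states in the same group.
Key helper functions
• findallpair: given a state and the transition set, finds every input symbol of that state and the corresponding target state.
o Input: the source state, the transition set, and the vector that receives the result.
o Output: a collection of <input symbol, target state> pairs.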
void findallpair(DFAState dfaState,vector<DFATransition> dfaTransitions,vector<pair<char,DFAState>>& dfaStateTransitions){
for(DFATransition dfaTransition : dfaTransitions){
if(dfaTransition.fromState.stateName == dfaState.stateName){
dfaStateTransitions.push_back({dfaTransition.transitionSymbol,dfaTransition.toState});
}
}
}
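• compare2pair: compares the transitions of two states.
o The core logic checks whether, on the same input symbol, the two states move into the same equivalence group (as given by thegroup). The check is one-directional: it only requires every transition of the first state to have a matching transition of the second state, so a symmetric test needs a second call with the arguments swapped.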
bool compare2pair(vector<pair<char,DFAState>> dfaState1Transitions,vector<pair<char,DFAState>> dfaState2Transitions,vector<vector<DFAState>> thegroup){
for(pair<char,DFAState> dfaState1Transition : dfaState1Transitions){
int flag = 0;
for(pair<char,DFAState> dfaState2Transition : dfaState2Transitions){
if(dfaState1Transition.first == dfaState2Transition.first){
if(findsame(dfaState1Transition.second,dfaState2Transition.second,thegroup)){
flag = 1;
break;
}
}
}
if(flag == 0){
return false;
}
}
return true;
}
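The same test can also be phrased as a comparison of signatures, which covers both directions in one step. The sketch below is an illustration only: the aliases Delta and GroupOf and the functions signatureOf and sameSignature are hypothetical, and states are identified by name as in the lab code.

```cpp
#include <map>
#include <string>

// Hypothetical types for this sketch: delta[state][symbol] is the target
// state, and groupOf[state] is the index of the group the state is in.
using Delta = std::map<std::string, std::map<char, std::string>>;
using GroupOf = std::map<std::string, int>;

// Signature of a state: for each symbol it can read, the group it moves to.
std::map<char, int> signatureOf(const std::string& state, const Delta& delta,
                                const GroupOf& groupOf) {
    std::map<char, int> sig;
    auto row = delta.find(state);
    if (row == delta.end()) return sig;   // no outgoing transitions
    for (const auto& edge : row->second)
        sig[edge.first] = groupOf.at(edge.second);
    return sig;
}

// Two states stay together only if their signatures are identical, which
// requires the same set of symbols and the same target group per symbol.
bool sameSignature(const std::string& s1, const std::string& s2,
                   const Delta& delta, const GroupOf& groupOf) {
    return signatureOf(s1, delta, groupOf) == signatureOf(s2, delta, groupOf);
}
```

A state with no outgoing transitions has an empty signature, so it is separated from any state in its group that can still read a symbol.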
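2.5 Minimizing the DFA by partition refinement
Core steps
1. Initialize the partition
Split the DFA states into two groups:
• group1, the accepting group: all accepting states.
• group2, the non-accepting group: all non-accepting states.
Both groups are then added to the overall partition group.
2. Refine the partition
The DFA is minimized by splitting the groups repeatedly:
• Outer loop: each round compares newGroup (the partition before the round) with group (the partition after it) and stops once the number of groups no longer changes.
• Inner loop: every group is examined and split further if necessary.
3. Splitting rule
For each group:
• If the group contains a single state, keep it unchanged.
• Otherwise split it into at most two new groups according to the transitions of its states.
• isInSameGroup decides whether a state stays in the same group as the first state of the group.
4. Build the states of the minimized DFA
Create one Group for each group of the final partition and record:
• whether it is an accepting state (it contains an accepting state of the original DFA);
• whether it is the start state (it contains the start state of the original DFA).
5. Build the transitions of the minimized DFA
For each transition of the original DFA:
• Find the new groups that contain its source state and its target state.
• If the resulting transition is not yet recorded in minimizeDFAtransitions, add it.
Important helper functions
• isFinalState: decides whether a state is an accepting state.
• isInSameGroup: decides whether two states belong in the same group.
• findState: checks whether a state belongs to a particular group.
• isGroupRelationInVector: checks whether a transition is already in the minimized transition set.
• printGroup: prints the final grouping of the states.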
void dfaMinimize(const vector<DFAState> dfaStates,const vector<DFATransition> dfaTransitions,const elem NFA_Elem, vector<Group> &dfaFinalStates, vector<GroupRelation> &minimizeDFAtransitions){
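//Initialize the partition
vector<vector<DFAState>> group;//all groups of the current partition
vector<DFAState> group1;//accepting states
vector<DFAState> group2;//non-accepting states
//Separate accepting states from non-accepting states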
for(DFAState dfaState : dfaStates){
if(isFinalState(dfaState,NFA_Elem)){
group1.push_back(dfaState);
}
else{
group2.push_back(dfaState);
}
}
group.push_back(group1);
group.push_back(group2);
//Refine the partition
vector<vector<DFAState>> newGroup;
while(newGroup.size() != group.size()){//the previous round produced new groups
newGroup = group;
group.clear();
for(vector<DFAState> dfaStateGroup : newGroup){
if(dfaStateGroup.size() == 1){
group.push_back(dfaStateGroup);//a single state cannot be split
}
else{
//Split dfaStateGroup into two groups
vector<DFAState> group1;
vector<DFAState> group2;
for(DFAState dfaState : dfaStateGroup){//visit every state of the group
//Compare dfaState with the first state of the group
if(isInSameGroup(dfaState,dfaStateGroup[0],dfaTransitions,newGroup)){//equivalent to the first state: goes to group1, otherwise to group2
group1.push_back(dfaState);
}
else{
group2.push_back(dfaState);
}
}
if(group1.size() != 0) group.push_back(group1);
if(group2.size() != 0) group.push_back(group2);
}
}
}
//Build the states of the minimized DFA
int i = 0;
for(vector<DFAState> dfaStateGroup : group){
Group group;
group.groupStates = dfaStateGroup;
group.groupName = "T" + to_string(i);
//Accepting if the group contains an accepting state
for(DFAState dfaState : dfaStateGroup){
if(isFinalState(dfaState,NFA_Elem)){
group.isFinalState = 1;
break;
}
}
//Start state if the group contains the start state of the original DFA
for(DFAState dfaState : dfaStateGroup){
if(dfaState.stateName == dfaStates.front().stateName){
group.isStartState = 1;
break;
}
}
dfaFinalStates.push_back(group);
i++;
}
//Build the transitions of the minimized DFA
for(DFATransition dfaTransition : dfaTransitions){
Group fromState;
Group toState;
for(Group group : dfaFinalStates){
if(findState(group,dfaTransition.fromState)){
fromState = group;//group that contains fromState
}
if(findState(group,dfaTransition.toState)){
toState = group;//group that contains toState
}
}
//Skip transitions already recorded in minimizeDFAtransitions
if(isGroupRelationInVector(fromState,toState,dfaTransition.transitionSymbol,minimizeDFAtransitions)){
continue;
}
//Otherwise record the new transition
minimizeDFAtransitions.push_back({fromState,toState,dfaTransition.transitionSymbol});
}
cout << "DFA Minimize Done!" << endl;
cout << "The Final States are:" << endl;
printGroup(dfaFinalStates);
}
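The input DFA is produced by buildDFAFromNFA, which builds the DFA state set and transitions step by step and finally yields a DFA equivalent to the NFA.
1. Initialization:
⚫ Create the NFA start-state set nfaInitialStateSet, which contains only the start state of the NFA.
⚫ Use eClosure to compute the ε-closure of the NFA start state, which gives the DFA start state dfaInitialState.
⚫ Add dfaInitialState to the DFA state set dfaStates and to the temporary state set tempdfaStates.
2. Building the DFA:
⚫ The temporary set tempdfaStates holds the DFA states that still have to be processed.
⚫ While tempdfaStates is not empty, process its states one by one:
1. Take one dfaState out of tempdfaStates.
2. Iterate over all edges of the NFA and consider the input symbol symbol of each edge.
3. Call move on the current dfaState to compute nextState, the set of target states on input symbol.
4. Compute dfaNextState, the ε-closure of nextState.
5. If dfaNextState is a new state (it has not appeared in dfaStates), add it to dfaStates and tempdfaStates.
6. If the transition has not appeared in dfaTransitions, add the new DFA transition.
3. Updating the DFA states and transitions:
dfaStates and dfaTransitions keep growing until every state has been processed.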
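The steps above can be sketched compactly as follows. This is an illustration under stated assumptions rather than the lab's buildDFAFromNFA: NFA states are integers, Edge, StateSet, closure and subsetConstruction are hypothetical names, and '#' marks an ε-edge as in the lab code.

```cpp
#include <map>
#include <queue>
#include <set>
#include <utility>
#include <vector>

// Hypothetical NFA for this sketch: integer states, '#' marks an epsilon
// edge, and every edge is a (from, symbol, to) triple.
struct Edge { int from; char symbol; int to; };
using StateSet = std::set<int>;

// Epsilon closure: everything reachable from `states` through '#' edges only.
StateSet closure(StateSet states, const std::vector<Edge>& edges) {
    std::vector<int> stack(states.begin(), states.end());
    while (!stack.empty()) {
        int s = stack.back();
        stack.pop_back();
        for (const Edge& e : edges)
            if (e.from == s && e.symbol == '#' && states.insert(e.to).second)
                stack.push_back(e.to);
    }
    return states;
}

// Subset construction. Returns the DFA transitions keyed by
// (NFA state set, symbol); the DFA start state is closure({start}).
std::map<std::pair<StateSet, char>, StateSet>
subsetConstruction(int start, const std::vector<Edge>& edges) {
    std::map<std::pair<StateSet, char>, StateSet> dfa;
    std::set<StateSet> known;
    std::queue<StateSet> todo;
    StateSet first = closure(StateSet{start}, edges);
    known.insert(first);
    todo.push(first);
    while (!todo.empty()) {
        StateSet current = todo.front();
        todo.pop();
        // move(current, c) for every non-epsilon symbol c leaving `current`
        std::map<char, StateSet> moved;
        for (const Edge& e : edges)
            if (e.symbol != '#' && current.count(e.from)) moved[e.symbol].insert(e.to);
        for (const auto& m : moved) {
            StateSet target = closure(m.second, edges);
            dfa[std::make_pair(current, m.first)] = target;
            if (known.insert(target).second) todo.push(target);
        }
    }
    return dfa;
}
```

Keying the result by (state set, symbol) makes duplicate transitions impossible by construction, so no separate "already present" test is needed.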
2.6 Generating the .dot file
void generateDotFile_minDFA(vector<Group> dfaFinalStates,vector<GroupRelation> minimizeDFAtransitions){
std::ofstream dotFile("minimizeDFA_graph.dot");
if (dotFile.is_open()) {
dotFile << "digraph MinimizeDFA {\n";
dotFile << " rankdir=LR; // left-to-right layout\n\n";
dotFile << " node [shape = circle]; // default node shape\n\n";
// Emit every state of the minimized DFA and mark accepting states
for (const auto& group : dfaFinalStates) {
dotFile << " " << group.groupName;
dotFile << " [label=\"Group " << group.groupName;
// Mark the start state
if (group.isStartState == 1) {
dotFile << "\\n(startState)";
}
// Mark accepting states with a double circle
if (group.isFinalState == 1) {
dotFile << "\\n(endState)";
dotFile << "\", shape=doublecircle];\n";
} else {
dotFile << "\"];\n";
}
}
dotFile << "\n";
// Emit the transitions of the minimized DFA
for (const auto& groupRelation : minimizeDFAtransitions) {
dotFile <<" " << groupRelation.fromState.groupName << " -> "
<< groupRelation.toState.groupName << " [label=\"" << groupRelation.transitionSymbol << "\"];\n";
}
dotFile << "}\n";
dotFile.close();
std::cout << "Minimize DFA DOT file generated successfully.\n";
}
else {
std::cerr << "Unable to open DOT file.\n";
}
}
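Building the DOT text in memory first makes the output easy to inspect before it is written to a file. The following sketch assumes simplified inputs: DotNode, DotEdge and toDot are hypothetical names and are not part of the lab code.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical minimal inputs for this sketch (not the lab's Group structs).
struct DotNode { std::string name; bool isStart; bool isFinal; };
struct DotEdge { std::string from; std::string to; char symbol; };

// Builds the DOT text in memory so it can be inspected or tested
// before being written to a file.
std::string toDot(const std::vector<DotNode>& nodes, const std::vector<DotEdge>& edges) {
    std::ostringstream out;
    out << "digraph MinimizeDFA {\n";
    out << "  rankdir=LR;\n";
    out << "  node [shape=circle];\n";
    for (const DotNode& n : nodes) {
        out << "  " << n.name << " [label=\"" << n.name;
        if (n.isStart) out << "\\n(start)";   // literal \n is a DOT line break
        out << "\"";
        if (n.isFinal) out << ", shape=doublecircle";
        out << "];\n";
    }
    for (const DotEdge& e : edges)
        out << "  " << e.from << " -> " << e.to << " [label=\"" << e.symbol << "\"];\n";
    out << "}\n";
    return out.str();
}
```

The returned string can then be written with std::ofstream, and the resulting file rendered with Graphviz, for example with dot -Tpng.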
IV. Test Results
V. Source Code
1. main.cpp
#include "head.h"
int main() {
string Regular_Expression;
elem NFA_Elem;
input(Regular_Expression);
if (Regular_Expression.length() > 1) Regular_Expression = add_join_symbol(Regular_Expression);
infixToPostfix Solution(Regular_Expression);
//Convert the infix expression to postfix
cout << "Postfix expression: ";
Regular_Expression = Solution.getResult();
cout << Regular_Expression << endl;
//Convert the postfix expression to an NFA
NFA_Elem = express_to_NFA(Regular_Expression);
//Display the NFA
Display(NFA_Elem);
//Generate the NFA dot file
generateDotFile_NFA(NFA_Elem);
// Initialize the DFA state set and transitions
vector<DFAState> dfaStates; //all DFA states
vector<DFATransition> dfaTransitions; //transitions between DFA states
vector<Group> dfaFinalStates; //states (groups) of the minimized DFA
//transitions of the minimized DFA
vector<GroupRelation> minimizeDFAtransitions;
buildDFAFromNFA(NFA_Elem, dfaStates, dfaTransitions);//build the DFA from the NFA
// Display the DFA
displayDFA(dfaStates, dfaTransitions);
//Generate the DFA dot file
generateDotFile_DFA(dfaStates,dfaTransitions,NFA_Elem);
//Minimize the DFA by partition refinement
dfaMinimize(dfaStates,dfaTransitions,NFA_Elem,dfaFinalStates, minimizeDFAtransitions);
//Display the minimized DFA
printGroupRelation(minimizeDFAtransitions);
//Generate the dot file of the minimized DFA
generateDotFile_minDFA(dfaFinalStates,minimizeDFAtransitions);
return 0;
}
2.head.h
#ifndef HEAD_H
#define HEAD_H
#include <iostream>
#include <cstdio>
#include <cstdlib>
#include <cctype>
#include <stack>
#include <string>
#include <map>
#include <set>
#include <vector>
#include<iterator>
#include <fstream>
using namespace std;
/*Structures needed to build the NFA and the DFA*/
//NFA node
struct node
{
string nodeName;
};
//NFA edge
struct edge
{
node startName; //source node
node endName; //target node
char tranSymbol; //transition symbol
};
//NFA building block; a large NFA is assembled from smaller units by the construction rules
struct elem
{
int edgeCount; //number of edges
edge edgeSet[100]; //edges of this NFA
node startName; //start state
node endName; //end state
};
// DFA state
struct DFAState {
set<string> nfaStates; //the set of NFA states this DFA state stands for
string stateName;
bool operator==(const DFAState& other) const {
return this->stateName == other.stateName;
// Compare further members here if more are added
}
};
// DFA transition
struct DFATransition {
DFAState fromState;
DFAState toState;
char transitionSymbol;
};
//One group of the minimized DFA: a set of equivalent DFA states that forms one minimized state
struct Group{
vector<DFAState> groupStates;
string groupName;
int isFinalState=0;
int isStartState=0;
};
//Transition between groups of the minimized DFA
struct GroupRelation{
Group fromState;
Group toState;
char transitionSymbol;
};
//Check whether a group transition, e.g. a->b, is already in the transition set
bool isGroupRelationInVector(Group fromState, Group toState, char symbol,vector<GroupRelation> groupRelations);
//Print helpers for groups and group transitions
void printGroup(vector<Group> group);
void printGroupRelation(vector<GroupRelation> groupRelations);
bool findState(Group group, DFAState State);
void findallpair(DFAState dfaState,vector<DFATransition> dfaTransitions,vector<pair<char,DFAState>>& dfaStateTransitions);
bool compare2pair(vector<pair<char,DFAState>> dfaState1Transitions,vector<pair<char,DFAState>> dfaState2Transitions,vector<vector<DFAState>> group);
bool findsame(DFAState dfaState1Transition, DFAState toState,vector<vector<DFAState>> dfaStateGroup);
//DFA minimization
void dfaMinimize(const vector<DFAState> dfaStates,const vector<DFATransition> dfaTransitions,const elem NFA_Elem, vector<Group> &dfaFinalStates, vector<GroupRelation> &minimizeDFAtransitions);
bool isInSameGroup(DFAState dfaState1,DFAState dfaState2,vector<DFATransition> dfaTransitions,vector<vector<DFAState>> group);
void generateDotFile_minDFA(vector<Group> dfaFinalStates,vector<GroupRelation> minimizeDFAtransitions);
/*Main functions for the conversion to a DFA*/
// Compute the epsilon closure of a set of NFA states
DFAState eClosure(const set<string>& nfaStates, elem nfa);
// Compute the move of a DFA state on a symbol
DFAState move(const DFAState& dfaState, char transitionSymbol,elem nfa);
// Check whether a DFA state is in the state set
bool isDFAStateInVector(const vector<DFAState>& dfaStates, const DFAState& targetState);
//Check whether a transition, e.g. a->b, is already in the transition set
bool isTransitionInVector(DFAState, DFAState, char,vector<DFATransition>);
//Convert the NFA to a DFA
void buildDFAFromNFA(const elem& NFA_Elem, vector<DFAState>& dfaStates, vector<DFATransition>& dfaTransitions);
// Display the DFA states and transitions
void displayDFA(const vector<DFAState>& dfaStates, const vector<DFATransition>& dfaTransitions);
//Generate the dot file
void generateDotFile_DFA(vector<DFAState>& dfaStates, vector<DFATransition>& dfaTransitions, const elem& nfa);
bool isFinalState(const DFAState& state, const elem& nfa);
int isAcceptedByDFA(vector<DFAState>& dfaStates,vector<DFATransition>& dfaTransitions, string &testStr,elem nfa);
/*Main functions for building the NFA*/
//Create a new node
node new_node();
//Handle a single symbol a
elem act_Elem(char);
//Handle a|b
elem act_Unit(elem, elem);
//Copy the edges of one unit into another
void elem_copy(elem&, elem);
//Handle ab
elem act_join(elem, elem);
//Handle a*
elem act_star(elem);
void input(string&);
string add_join_symbol(string); //adjacent units are joined by an explicit '+', e.g. ab becomes a+b
class infixToPostfix {
public:
infixToPostfix(const string& infix_expression);
int is_letter(char check);
int ispFunc(char c);
int icpFunc(char c);
void infToPost();
string getResult();
private:
string infix;
string postfix;
map<char, int> isp;
map<char, int> icp;
};
elem express_to_NFA(string);
void Display(elem);
int is_letter(char check);
void generateDotFile_NFA(const elem& nfa);
#endif
3.Func.cpp
#include "head.h"
int nodeNum = 0;
//DFA minimization by partition refinement
void dfaMinimize(const vector<DFAState> dfaStates,const vector<DFATransition> dfaTransitions,const elem NFA_Elem, vector<Group> &dfaFinalStates, vector<GroupRelation> &minimizeDFAtransitions){
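//Initialize the partition
vector<vector<DFAState>> group;//all groups of the current partition
vector<DFAState> group1;//accepting states
vector<DFAState> group2;//non-accepting states
//Separate accepting states from non-accepting states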
for(DFAState dfaState : dfaStates){
if(isFinalState(dfaState,NFA_Elem)){
group1.push_back(dfaState);
}
else{
group2.push_back(dfaState);
}
}
group.push_back(group1);
group.push_back(group2);
//Refine the partition
vector<vector<DFAState>> newGroup;
while(newGroup.size() != group.size()){//the previous round produced new groups
newGroup = group;
group.clear();
for(vector<DFAState> dfaStateGroup : newGroup){
if(dfaStateGroup.size() == 1){
group.push_back(dfaStateGroup);//a single state cannot be split
}
else{
//Split dfaStateGroup into two groups
vector<DFAState> group1;
vector<DFAState> group2;
for(DFAState dfaState : dfaStateGroup){//visit every state of the group
//Compare dfaState with the first state of the group
if(isInSameGroup(dfaState,dfaStateGroup[0],dfaTransitions,newGroup)){//equivalent to the first state: goes to group1, otherwise to group2
group1.push_back(dfaState);
}
else{
group2.push_back(dfaState);
}
}
if(group1.size() != 0) group.push_back(group1);
if(group2.size() != 0) group.push_back(group2);
}
}
}
//Build the states of the minimized DFA
int i = 0;
for(vector<DFAState> dfaStateGroup : group){
Group group;
group.groupStates = dfaStateGroup;
group.groupName = "T" + to_string(i);
//Accepting if the group contains an accepting state
for(DFAState dfaState : dfaStateGroup){
if(isFinalState(dfaState,NFA_Elem)){
group.isFinalState = 1;
break;
}
}
//Start state if the group contains the start state of the original DFA
for(DFAState dfaState : dfaStateGroup){
if(dfaState.stateName == dfaStates.front().stateName){
group.isStartState = 1;
break;
}
}
dfaFinalStates.push_back(group);
i++;
}
//Build the transitions of the minimized DFA
for(DFATransition dfaTransition : dfaTransitions){
Group fromState;
Group toState;
for(Group group : dfaFinalStates){
if(findState(group,dfaTransition.fromState)){
fromState = group;//group that contains fromState
}
if(findState(group,dfaTransition.toState)){
toState = group;//group that contains toState
}
}
//Skip transitions already recorded in minimizeDFAtransitions
if(isGroupRelationInVector(fromState,toState,dfaTransition.transitionSymbol,minimizeDFAtransitions)){
continue;
}
//Otherwise record the new transition
minimizeDFAtransitions.push_back({fromState,toState,dfaTransition.transitionSymbol});
}
cout << "DFA Minimize Done!" << endl;
cout << "The Final States are:" << endl;
printGroup(dfaFinalStates);
}
//Check whether a group contains a given state
bool findState(Group group, DFAState State){
for(DFAState dfaState : group.groupStates){
if(dfaState.stateName == State.stateName){
return true;
}
}
return false;
}
//Decide whether two DFAState values belong in the same group
bool isInSameGroup(DFAState dfaState1,DFAState dfaState2,vector<DFATransition> dfaTransitions,vector<vector<DFAState>> thegroup){
//Collect the (symbol, target state) pairs of every transition leaving dfaState1 and dfaState2
vector<pair<char,DFAState>> dfaState1Transitions;
vector<pair<char,DFAState>> dfaState2Transitions;
findallpair(dfaState1,dfaTransitions,dfaState1Transitions);
findallpair(dfaState2,dfaTransitions,dfaState2Transitions);
//Print dfaState1Transitions (debug output)
cout << "dfaState1Transitions:" << endl;
for(pair<char,DFAState> dfaState1Transition : dfaState1Transitions){
cout << dfaState1Transition.first << " " << dfaState1Transition.second.stateName << endl;
}
//Print dfaState2Transitions (debug output)
cout << "dfaState2Transitions:" << endl;
for(pair<char,DFAState> dfaState2Transition : dfaState2Transitions){
cout << dfaState2Transition.first << " " << dfaState2Transition.second.stateName << endl;
}
//compare2pair only checks that every transition of its first argument is matched,
//so test both directions; otherwise a state without outgoing transitions
//would be treated as equivalent to any other state of its group
return compare2pair(dfaState1Transitions,dfaState2Transitions,thegroup)
&& compare2pair(dfaState2Transitions,dfaState1Transitions,thegroup);
}
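//Check that every transition in the first list has a transition on the same symbol in the second list whose target is in the same group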
bool compare2pair(vector<pair<char,DFAState>> dfaState1Transitions,vector<pair<char,DFAState>> dfaState2Transitions,vector<vector<DFAState>> thegroup){
for(pair<char,DFAState> dfaState1Transition : dfaState1Transitions){
int flag = 0;
for(pair<char,DFAState> dfaState2Transition : dfaState2Transitions){
if(dfaState1Transition.first == dfaState2Transition.first){
if(findsame(dfaState1Transition.second,dfaState2Transition.second,thegroup)){
flag = 1;
break;
}
}
}
if(flag == 0){
return false;
}
}
return true;
}
//Collect every transition leaving dfaState
void findallpair(DFAState dfaState,vector<DFATransition> dfaTransitions,vector<pair<char,DFAState>>& dfaStateTransitions){
for(DFATransition dfaTransition : dfaTransitions){
if(dfaTransition.fromState.stateName == dfaState.stateName){
dfaStateTransitions.push_back({dfaTransition.transitionSymbol,dfaTransition.toState});
}
}
}
bool findsame(DFAState dfaState1Transition, DFAState toState,vector<vector<DFAState>> group){
for(vector<DFAState> dfaStateGroup : group){
int flag1 = 0;
int flag2 = 0;
for(DFAState dfaState : dfaStateGroup){
if(dfaState.stateName == dfaState1Transition.stateName){
flag1 = 1;
}
if(dfaState.stateName == toState.stateName){
flag2 = 1;
}
}
if(flag1 == 1 && flag2 == 1){
return true;
}
}
return false;
}
//Check whether a group transition, e.g. a->b, is already in the transition set
bool isGroupRelationInVector(Group fromState, Group toState, char symbol,vector<GroupRelation> groupRelations){
for(GroupRelation groupRelation : groupRelations){
if(groupRelation.fromState.groupName == fromState.groupName && groupRelation.toState.groupName == toState.groupName && groupRelation.transitionSymbol == symbol){
return true;
}
}
return false;
}
//Print the groups
void printGroup(vector<Group> group){
for(Group g : group){
cout << "Group Name: " << g.groupName << " ";
if(g.isFinalState == 1){
cout << " is Final State" << endl;
}
else{
cout << " is Not Final State" << endl;
}
}
}
//Print the group transitions
void printGroupRelation(vector<GroupRelation> groupRelations){
for(GroupRelation groupRelation : groupRelations){
cout << groupRelation.fromState.groupName << "-----" << groupRelation.transitionSymbol<<"-----" <<groupRelation.toState.groupName<< endl;
}
}
//Generate the dot file of the minimized DFA
void generateDotFile_minDFA(vector<Group> dfaFinalStates,vector<GroupRelation> minimizeDFAtransitions){
std::ofstream dotFile("minimizeDFA_graph.dot");
if (dotFile.is_open()) {
dotFile << "digraph MinimizeDFA {\n";
dotFile << " rankdir=LR; // left-to-right layout\n\n";
dotFile << " node [shape = circle]; // default node shape\n\n";
// Emit every state of the minimized DFA and mark accepting states
for (const auto& group : dfaFinalStates) {
dotFile << " " << group.groupName;
dotFile << " [label=\"Group " << group.groupName;
// Mark the start state
if (group.isStartState == 1) {
dotFile << "\\n(startState)";
}
// Mark accepting states with a double circle
if (group.isFinalState == 1) {
dotFile << "\\n(endState)";
dotFile << "\", shape=doublecircle];\n";
} else {
dotFile << "\"];\n";
}
}
dotFile << "\n";
// Emit the transitions of the minimized DFA
for (const auto& groupRelation : minimizeDFAtransitions) {
dotFile <<" " << groupRelation.fromState.groupName << " -> "
<< groupRelation.toState.groupName << " [label=\"" << groupRelation.transitionSymbol << "\"];\n";
}
dotFile << "}\n";
dotFile.close();
std::cout << "Minimize DFA DOT file generated successfully.\n";
}
else {
std::cerr << "Unable to open DOT file.\n";
}
}
// Compute the epsilon closure of a set of NFA states
DFAState eClosure(const set<string>& nfaStates,elem nfa) {
DFAState eClosureState;
eClosureState.nfaStates = nfaStates;
stack<string> stateStack;
// Seed the stack with the given states; at the very beginning nfaStates holds only NFA_Elem.startName
for (const string& nfaState_name : nfaStates) {
stateStack.push(nfaState_name);
}
while (!stateStack.empty()) {
string currentState = stateStack.top();
stateStack.pop();
// Scan the edges of the NFA
for (int i = 0; i < nfa.edgeCount; i++) {
edge currentEdge = nfa.edgeSet[i];
// An edge that leaves the current state on '#' (epsilon) adds its target to the closure
if (currentEdge.startName.nodeName == currentState && currentEdge.tranSymbol == '#') {
// Skip targets that are already in the closure
if (eClosureState.nfaStates.find(currentEdge.endName.nodeName) == eClosureState.nfaStates.end()) {
eClosureState.nfaStates.insert(currentEdge.endName.nodeName);
// Push the target so that its own epsilon edges are followed too
stateStack.push(currentEdge.endName.nodeName);
}
}
}
}
// Name the closure by concatenating the names of its NFA states
for (const string& nfaState_name : eClosureState.nfaStates) {
eClosureState.stateName += nfaState_name;
}
return eClosureState;
}
//move function
DFAState move(const DFAState& dfaState, char transitionSymbol,elem nfa) {
DFAState nextState;
// Visit every NFA state contained in the DFA state
for (const string& nfaState_name : dfaState.nfaStates) {
// Scan all edges of the NFA
for (int i = 0; i < nfa.edgeCount; i++) {
edge currentEdge = nfa.edgeSet[i];
// An edge that leaves the current NFA state on the input symbol (and is not an epsilon edge) adds its target to nextState
if (currentEdge.startName.nodeName == nfaState_name && currentEdge.tranSymbol == transitionSymbol && currentEdge.tranSymbol != '#') {
nextState.nfaStates.insert(currentEdge.endName.nodeName);
}
}
}
// Name nextState by concatenating the names of its NFA states
for (const string& nfaState_name : nextState.nfaStates) {
nextState.stateName += nfaState_name;
}
return nextState;
}
// Check whether targetState is already contained in dfaStates
bool isDFAStateInVector(const vector<DFAState>& dfaStates, const DFAState& targetState) {
for (const DFAState& state : dfaStates) {
if (state.stateName == targetState.stateName) {
return true; // matching state found
}
}
return false; // no matching state
}
//Check whether a transition, e.g. a->b, is already in the transition set
bool isTransitionInVector(DFAState dfaState, DFAState dfaNextState, char symbol,vector<DFATransition> dfaTransitions)
{
for (const DFATransition& transition : dfaTransitions) {
if (transition.fromState.stateName == dfaState.stateName && transition.toState.stateName == dfaNextState.stateName && symbol == transition.transitionSymbol) {
return true; //matching transition found
}
}
return false;
}
void buildDFAFromNFA(const elem& NFA_Elem, vector<DFAState>& dfaStates, vector<DFATransition>& dfaTransitions) {
// Initialize the DFA state set and transitions
set<string> nfaInitialStateSet;
nfaInitialStateSet.insert(NFA_Elem.startName.nodeName);
DFAState dfaInitialState = eClosure(nfaInitialStateSet, NFA_Elem); // epsilon closure of the NFA start state
dfaStates.push_back(dfaInitialState);
// Build the DFA
vector<DFAState> tempdfaStates;
tempdfaStates.push_back(dfaInitialState);
while(tempdfaStates.size()!=0)
{
DFAState dfaState = *tempdfaStates.begin();
tempdfaStates.erase(tempdfaStates.begin());
for (int i = 0; i < NFA_Elem.edgeCount; i++) {
char symbol = NFA_Elem.edgeSet[i].tranSymbol;
DFAState nextState = move(dfaState, symbol, NFA_Elem);
DFAState dfaNextState = eClosure(nextState.nfaStates, NFA_Elem);
if (!nextState.nfaStates.empty()) {
if (!isDFAStateInVector(dfaStates, dfaNextState)) {
dfaStates.push_back(dfaNextState);
tempdfaStates.push_back(dfaNextState);
}
if (!isTransitionInVector(dfaState, dfaNextState, symbol, dfaTransitions)) {
dfaTransitions.push_back({ dfaState, dfaNextState, symbol });
}
}
}
}
}
// Display the DFA states and transitions, including the start and final states
void displayDFA(const vector<DFAState>& dfaStates, const vector<DFATransition>& dfaTransitions) {
cout << "DFA States:" << endl;
for (const DFAState& state : dfaStates) {
cout << "State " << state.stateName << " (NFA States: ";
for (const string& nfaState_name : state.nfaStates) {
cout << nfaState_name << " ";
}
cout << ")";
if (state.stateName == dfaStates.front().stateName) {
cout << " (Initial State)";
}
if (state.stateName == dfaStates.back().stateName) {
cout << " (Final State)";
}
cout << endl;
}
cout << "DFA Transitions:" << endl;
for (const DFATransition& transition : dfaTransitions) {
cout << "State " << transition.fromState.stateName << " --(" << transition.transitionSymbol << ")--> State " << transition.toState.stateName << endl;
}
}
//Generate the dot file of the DFA
void generateDotFile_DFA(vector<DFAState>& dfaStates, vector<DFATransition>& dfaTransitions, const elem& nfa) {
std::ofstream dotFile("dfa_graph.dot");
if (dotFile.is_open()) {
dotFile << "digraph DFA {\n";
dotFile << " rankdir=LR; // left-to-right layout\n\n";
dotFile << " node [shape = circle]; // default node shape\n\n";
// Emit every DFA state and mark accepting states
for (const auto& state : dfaStates) {
dotFile << " " << state.stateName;
dotFile << " [label=\"State " << state.stateName;
// Mark the start state
if (state.stateName == dfaStates.front().stateName) {
dotFile << "\\n(startState)";
}
// Mark accepting states with a double circle
if (isFinalState(state,nfa)) {
dotFile << "\\n(endState)";
dotFile << "\", shape=doublecircle];\n";
} else {
dotFile << "\"];\n";
}
}
dotFile << "\n";
// Emit the DFA transitions
for (const auto& transition : dfaTransitions) {
dotFile <<" " << transition.fromState.stateName << " -> "
<< transition.toState.stateName << " [label=\"" << transition.transitionSymbol << "\"];\n";
}
dotFile << "}\n";
dotFile.close();
std::cout << "DFA DOT file generated successfully.\n";
}
else {
std::cerr << "Unable to open DOT file.\n";
}
}
void generateDotFile_minimizeDFA();
// Decide whether a DFA state is an accepting state
bool isFinalState(const DFAState& state, const elem& nfa) {
//Name of the NFA end state
string finalStateName = nfa.endName.nodeName;
//A DFA state is accepting if it contains the end state of the NFA
if(state.nfaStates.find(finalStateName) != state.nfaStates.end()) {
return true;
}
return false;
}
int isAcceptedByDFA(vector<DFAState>& dfastates,vector<DFATransition>& dfaTransitions, string& testStr,elem nfa) {
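//Start from the initial DFA state
DFAState CurrentState = *dfastates.begin();
//Consume the string one character at a time
int flag = 0;
for(char c : testStr){
//Scan dfaTransitions
for(DFATransition dfaTransition : dfaTransitions){
//The transition leaves the current state and its symbol matches the current character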
if(dfaTransition.fromState.stateName == CurrentState.stateName && dfaTransition.transitionSymbol == c){
CurrentState = dfaTransition.toState;
flag = 1;
break;
}
}
if(flag == 0){
return 0;
}
flag = 0;
}
//Accept only if the state reached at the end is an accepting state
if(isFinalState(CurrentState,nfa)){
return 1;
}
else{
return 0;
}
}
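/*Main functions for building the NFA*/
//Create a new node
node new_node()
{
node newNode;
newNode.nodeName = nodeNum + 65;//name nodes with capital letters starting at 'A'
nodeNum++;
return newNode;
}
//Read the regular expression
void input(string& RE)
{
cout << "Enter a regular expression (operators: ( ) * |; alphabet: a~z A~Z)" << endl;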
cin >> RE;
}
//Append the edges of source to dest
void elem_copy(elem& dest, elem source)
{
for (int i = 0; i < source.edgeCount; i++) {
dest.edgeSet[dest.edgeCount + i] = source.edgeSet[i];
}
dest.edgeCount += source.edgeCount;
}
//Handle a single symbol a
elem act_Elem(char c)
{
//New nodes
node startNode = new_node();
node endNode = new_node();
//New edge
edge newEdge;
newEdge.startName = startNode;
newEdge.endName = endNode;
newEdge.tranSymbol = c;
//New NFA unit (a small NFA building block)
elem newElem;
newElem.edgeCount = 0; //start with no edges
newElem.edgeSet[newElem.edgeCount++] = newEdge;
newElem.startName = newElem.edgeSet[0].startName;
newElem.endName = newElem.edgeSet[0].endName;
return newElem;
}
//Handle a|b
elem act_Unit(elem fir, elem sec)
{
elem newElem;
newElem.edgeCount = 0;
edge edge1, edge2, edge3, edge4;
//Create the new state nodes
node startNode = new_node();
node endNode = new_node();
//Build e1 (from the new start node to the start of fir)
edge1.startName = startNode;
edge1.endName = fir.startName;
edge1.tranSymbol = '#';
//Build e2 (from the new start node to the start of sec)
edge2.startName = startNode;
edge2.endName = sec.startName;
edge2.tranSymbol = '#';
//Build e3 (from the end of fir to the new end node)
edge3.startName = fir.endName;
edge3.endName = endNode;
edge3.tranSymbol = '#';
//Build e4 (from the end of sec to the new end node)
edge4.startName = sec.endName;
edge4.endName = endNode;
edge4.tranSymbol = '#';
//Merge the edges of fir and sec
elem_copy(newElem, fir);
elem_copy(newElem, sec);
//Add the 4 new edges
newElem.edgeSet[newElem.edgeCount++] = edge1;
newElem.edgeSet[newElem.edgeCount++] = edge2;
newElem.edgeSet[newElem.edgeCount++] = edge3;
newElem.edgeSet[newElem.edgeCount++] = edge4;
newElem.startName = startNode;
newElem.endName = endNode;
return newElem;
}
//Handle the concatenation N(s)N(t)
elem act_join(elem fir, elem sec)
{
//Merge the end state of fir with the start state of sec, copy the edges of sec into fir and return fir
//Redirect every edge of sec that touches the start state of sec
for (int i = 0; i < sec.edgeCount; i++) {
if (sec.edgeSet[i].startName.nodeName.compare(sec.startName.nodeName) == 0)
{
sec.edgeSet[i].startName = fir.endName; //the edge left the start state of N(t); it now leaves the end state of N(s)
}
else if (sec.edgeSet[i].endName.nodeName.compare(sec.startName.nodeName) == 0) {
sec.edgeSet[i].endName = fir.endName; //the edge entered the start state of N(t); it now enters the end state of N(s)
}
}
sec.startName = fir.endName;
elem_copy(fir, sec);
//The end state of fir becomes the end state of sec
fir.endName = sec.endName;
return fir;
}
//Handle a*
elem act_star(elem Elem)
{
elem newElem;
newElem.edgeCount = 0;
edge edge1, edge2, edge3, edge4;
//Create the new state nodes
node startNode = new_node();
node endNode = new_node();
//e1
edge1.startName = startNode;
edge1.endName = endNode;
edge1.tranSymbol = '#'; //the closure accepts the empty string
//e2
edge2.startName = Elem.endName;
edge2.endName = Elem.startName;
edge2.tranSymbol = '#';
//e3
edge3.startName = startNode;
edge3.endName = Elem.startName;
edge3.tranSymbol = '#';
//e4
edge4.startName = Elem.endName;
edge4.endName = endNode;
edge4.tranSymbol = '#';
//Build the unit
elem_copy(newElem, Elem);
//Add the four new edges to edgeSet
newElem.edgeSet[newElem.edgeCount++] = edge1;
newElem.edgeSet[newElem.edgeCount++] = edge2;
newElem.edgeSet[newElem.edgeCount++] = edge3;
newElem.edgeSet[newElem.edgeCount++] = edge4;
//Set the start state and end state of newElem
newElem.startName = startNode;
newElem.endName = endNode;
return newElem;
}
int is_letter(char check) {
if (check >= 'a' && check <= 'z' || check >= 'A' && check <= 'Z')
return true;
return false;
}
//Insert an explicit concatenation operator '+' between adjacent units, e.g. ab becomes a+b
string add_join_symbol(string add_string)
{
string result; //std::string replaces the manually allocated buffer, which was never freed
int length = add_string.size();
for (int i = 0; i < length; i++)
{
char first = add_string.at(i);
result += first;
if (i + 1 == length) break; //the last character has no successor
char second = add_string.at(i + 1);
//A '+' may be needed in cases such as ab, *b, a( and )b
//If the second character is a letter and the first is neither '(' nor '|', insert '+'
if (first != '(' && first != '|' && is_letter(second))
{
result += '+';
}
//If the second character is '(' and the first is neither '|' nor '(', insert '+' as well
else if (second == '(' && first != '|' && first != '(')
{
result += '+';
}
}
cout << "Expression with '+' inserted: " << result << endl;
return result;
}
// Definitions of the infixToPostfix class members
infixToPostfix::infixToPostfix(const string& infix_expression) : infix(infix_expression), postfix("") {
isp = { {'+', 3}, {'|', 5}, {'*', 7}, {'(', 1}, {')', 8}, {'#', 0} };
icp = { {'+', 2}, {'|', 4}, {'*', 6}, {'(', 8}, {')', 1}, {'#', 0} };
}
int infixToPostfix::is_letter(char check) {
if ((check >= 'a' && check <= 'z') || (check >= 'A' && check <= 'Z'))
return true;
return false;
}
int infixToPostfix::ispFunc(char c) {
int priority = isp.count(c) ? isp[c] : -1;
if (priority == -1) {
cerr << "error: unknown symbol encountered!" << endl;
exit(1); // abnormal exit
}
return priority;
}
int infixToPostfix::icpFunc(char c) {
int priority = icp.count(c) ? icp[c] : -1;
if (priority == -1) {
cerr << "error: unknown symbol encountered!" << endl;
exit(1); // abnormal exit
}
return priority;
}
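void infixToPostfix::infToPost() {
string infixWithHash = infix + "#"; // '#' marks the end of the input
stack<char> opStack; // operator stack (renamed so it no longer shadows the stack type)
size_t loc = 0;
while (!opStack.empty() || loc < infixWithHash.size()) {
if (is_letter(infixWithHash[loc])) {
// Operands go straight to the output
postfix += infixWithHash[loc];
loc++;
}
else {
char c1 = (opStack.empty()) ? '#' : opStack.top();
char c2 = infixWithHash[loc];
if (ispFunc(c1) < icpFunc(c2)) {
// The incoming operator binds tighter: push it
opStack.push(c2);
loc++;
}
else if (ispFunc(c1) > icpFunc(c2)) {
// The operator on the stack binds tighter: emit it
postfix += c1;
opStack.pop();
}
else {
// Equal priority: either both are '#' (finished) or a '(' meets its ')'
if (c1 == '#' && c2 == '#') {
break;
}
opStack.pop();
loc++;
}
}
}
}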
string infixToPostfix::getResult() {
postfix = ""; // clear any previous result
infToPost();
return postfix;
}
/** Converts a postfix regular expression to an NFA and returns the final NFA element
*/
elem express_to_NFA(string expression)
{
int length = expression.size();
char element;
elem Elem, fir, sec;
stack<elem> STACK;
for (int i = 0; i < length; i++)
{
element = expression.at(i);
switch (element)
{
case '|':
sec = STACK.top();
STACK.pop();
fir = STACK.top();
STACK.pop();
Elem = act_Unit(fir, sec);
STACK.push(Elem);
break;
case '*':
fir = STACK.top();
STACK.pop();
Elem = act_star(fir);
STACK.push(Elem);
break;
case '+':
sec = STACK.top();
STACK.pop();
fir = STACK.top();
STACK.pop();
Elem = act_join(fir, sec);
STACK.push(Elem);
break;
default:
// Any other character is an operand: build a single-symbol NFA
Elem = act_Elem(element);
STACK.push(Elem);
break;
}
}
cout << "Regular expression converted to NFA!" << endl;
Elem = STACK.top();
STACK.pop();
return Elem;
}
// Print the NFA
void Display( elem Elem) {
cout << "NFA States:" << endl;
cout << "Start State: " << Elem.startName.nodeName << endl;
cout << "End State: " << Elem.endName.nodeName << endl;
cout << "NFA Transitions:" << endl;
for (int i = 0; i < Elem.edgeCount; i++) {
cout << "Edge " << i + 1 << ": ";
cout << Elem.edgeSet[i].startName.nodeName << " --(" << Elem.edgeSet[i].tranSymbol << ")--> ";
cout << Elem.edgeSet[i].endName.nodeName << endl;
}
cout << "End" << endl;
}
// Generate the DOT file for the NFA
void generateDotFile_NFA(const elem& nfa) {
std::ofstream dotFile("nfa_graph.dot");
if (dotFile.is_open()) {
dotFile << "digraph NFA {\n";
dotFile << " rankdir=LR; // horizontal layout\n\n";
dotFile << " node [shape = circle]; // state nodes\n\n";
dotFile << nfa.endName.nodeName << " [shape=doublecircle];\n";
// Add the NFA states
dotFile << " " << nfa.startName.nodeName << " [label=\"Start State: " << nfa.startName.nodeName << "\"];\n";
dotFile << " " << nfa.endName.nodeName << " [label=\"End State: " << nfa.endName.nodeName << "\"];\n";
// Add the NFA transitions
for (int i = 0; i < nfa.edgeCount; i++) {
const edge& currentEdge = nfa.edgeSet[i];
dotFile << " " << currentEdge.startName.nodeName << " -> " << currentEdge.endName.nodeName << " [label=\"" << currentEdge.tranSymbol << "\"];\n";
}
dotFile << "}\n";
dotFile.close();
std::cout << "NFA DOT file generated successfully.\n";
}
else {
std::cerr << "Unable to open NFA DOT file.\n";
}
}
4.photo.py
import graphviz
# Render the minimized DFA graph
with open("minimizeDFA_graph.dot") as f:
    dot_graph = f.read()
dot = graphviz.Source(dot_graph)
dot.view()
VI. Takeaways and Challenges Encountered
II. Challenges Encountered
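1. Incorrect splitting of the current groups of equivalent states
While reviewing the final graph, I found that the generated minimized DFA was wrong. Stepping through the program with breakpoints showed that the fault lay in the function that decides whether a DFA state is equivalent to another state. I rewrote that function so that it collects all transition edges of both states and checks whether the DFAState objects they lead to are themselves equivalent. After this change the program produced the correct result.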
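2. The final DFA states produced by splitting are not ordered
At first I treated the first of the final DFA states as the StartState and the last one as the EndState, but the resulting graph was incorrect. The DFA states produced by splitting are not pushed onto the stack in any particular order, so the final states of the minimized DFA are not ordered either. I therefore added two flags to the struct that record whether a group is a final state or the start state. When generating the dot file it is enough to check these two flags, and the start and final states of the minimized DFA graph are then drawn correctly.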