rule_解析`````````````````

代码小鬼才

已于 2024-08-05 17:28:38 修改

阅读量946

点赞数 17

文章标签： python 开发语言

于 2024-08-02 17:08:39 首次发布

本文链接：https://blog.csdn.net/qq_34171593/article/details/140875452

版权

"""编写一个Python函数，该函数接受两个参数：
一个文本文件【rule】和一个字符串【input】，
请根据文件中【rule】的替换规则返回一个字符串【output】。
def apply_replacement_rules(rules, input_string):
# 你的代码【rule】

解析要求：
1、文本文件【rule】包含按顺序应用的替换规则。
2、每条规则是由空格分隔的两个字符串：s1 s2，其中s1和s2是不包含任何空格字符的字符串。
其含义是：对于输入字符串中的每个子字符串s1，
如果该子字符串没有被之前的任何规则更改过，
则将其替换为s2。

举例：
考虑文件rule.txt的内容如下（包含3条规则）：
abc cde
ab b
c a

它在输入字符串babcabdcdc上的工作过程如下：
1.第一条规则：b + cde + abdcdc #abc cde
2.第二条规则：b + cde + b + dcdc #ab b
3.第三条规则：b + cde + b + d + a + d + a #c a
（注意，尽管“cde”包含“c”，但它不会被第三条规则出发，因为它是第一条规则产生的新字符串，而不是来自原始字符串）
所以输出是 bcdebdada

请注意：参数规则【rule】不是固定的，你的程序应该适用于任何顺序与任何替换规则，
如：c ab
abc cde
ab b
或
abcd ab
cde bcd
ade bc
请你尝试使用最优的时间复杂度来解决这个问题

解题思路：
一、先按行读取rule.txt文件，因其格式固定为每行一条规则，且每行格式为s1 s2，
读取时将其存为字典，该字典rule_dict格式为{n：[s1,s2]},n为行数
二、构造状态字典，用于存储字符串修改的历史状态及当前状态，该状态字典随规则顺序流动而更新。
如：输入字符串babcabdcdc：
1.第一条规则：b + cde + abdcdc #abc cde
此时该字典str_status为：
{1:{"source_str":"babcabdcdc","new_str":"bcdeabdcdc","change_index":["1,3"]}}
字段说明：键1表示第一条规则，键1对应的值表示第一条规则运行后的状态，source_str表示
执行第一条规则之前的输入字符串，new_str表示执行第一条规则之后的输出字符串，change_index"表示
new_str中哪一段索引是被改变的索引
2.第二条规则：b + cde + b + dcdc #ab b
此时该字典str_status为：
{1:{"source_str":"babcabdcdc","new_str":"bcdeabdcdc","change_index":["1,3"]},
2:{"source_str":"bcdeabdcdc","new_str":"bcdebdcdc","change_index":["1,3","4"]},
}
3.第三条规则：b + cde + b + d + a + d + a #c a
此时该字典str_status为：
{1:{"source_str":"babcabdcdc","new_str":"bcdeabdcdc","change_index":["1,2,3"]},
2:{"source_str":"bcdeabdcdc","new_str":"bcdebdcdc","change_index":["1,2,3","4"]},
3:{"source_str":"bcdebdcdc","new_str":"bcdebdada","change_index":["1,2,3","4","6,8"]},
}
三、字符串按规则处理函数
传入rule_dict和原生字符串。按照规则进行处理：对于输入字符串中的每个子字符串s1，
如果该子字符串没有被之前的任何规则更改过（通过str_status判断有没有更改过），则将其替换为s2，
之后更新str_status"""

"""编写一个Python函数，该函数接受两个参数：
一个文本文件【rule】和一个字符串【input】，
请根据文件中【rule】的替换规则返回一个字符串【output】。
def apply_replacement_rules(rules, input_string):
	# 你的代码【rule】

解析要求：
1、文本文件【rule】包含按顺序应用的替换规则。
2、每条规则是由空格分隔的两个字符串：s1 s2，其中s1和s2是不包含任何空格字符的字符串。
其含义是：对于输入字符串中的每个子字符串s1，
如果该子字符串没有被之前的任何规则更改过，
则将其替换为s2。

举例：
考虑文件rule.txt的内容如下（包含3条规则）：
abc cde
ab b
c a

它在输入字符串babcabdcdc上的工作过程如下：
1.第一条规则：b + cde + abdcdc #abc cde
2.第二条规则：b + cde + b + dcdc #ab b
3.第三条规则：b + cde + b + d + a + d + a #c a
（注意，尽管“cde”包含“c”，但它不会被第三条规则出发，因为它是第一条规则产生的新字符串，而不是来自原始字符串）
所以输出是 bcdebdada


请注意：参数规则【rule】不是固定的， 你的程序应该适用于任何顺序与任何替换规则，
如：c ab
abc cde
ab b
或
abcd ab
cde bcd
ade bc
请你尝试使用最优的时间复杂度来解决这个问题


解题思路：
一、先按行读取rule.txt文件，因其格式固定为每行一条规则，且每行格式为s1 s2，
读取时将其存为字典，该字典rule_dict格式为{n：[s1,s2]},n为行数
二、构造状态字典，用于存储字符串修改的历史状态及当前状态，该状态字典随规则顺序流动而更新。
如：输入字符串babcabdcdc：
1.第一条规则：b + cde + abdcdc #abc cde
此时该字典str_status为：
{1:{"source_str":"babcabdcdc","new_str":"bcdeabdcdc","change_index":["1,3"]}}
字段说明：键1表示第一条规则，键1对应的值表示第一条规则运行后的状态，source_str表示
执行第一条规则之前的输入字符串，new_str表示执行第一条规则之后的输出字符串，change_index"表示
new_str中哪一段索引是被改变的索引
2.第二条规则：b + cde + b + dcdc #ab b
此时该字典str_status为：
{1:{"source_str":"babcabdcdc","new_str":"bcdeabdcdc","change_index":["1,3"]},
2:{"source_str":"bcdeabdcdc","new_str":"bcdebdcdc","change_index":["1,3","4"]},
}
3.第三条规则：b + cde + b + d + a + d + a #c a
此时该字典str_status为：
{1:{"source_str":"babcabdcdc","new_str":"bcdeabdcdc","change_index":["1,2,3"]},
2:{"source_str":"bcdeabdcdc","new_str":"bcdebdcdc","change_index":["1,2,3","4"]},
3:{"source_str":"bcdebdcdc","new_str":"bcdebdada","change_index":["1,2,3","4","6,8"]},
}
三、字符串按规则处理函数
传入rule_dict和原生字符串。按照规则进行处理：对于输入字符串中的每个子字符串s1，
如果该子字符串没有被之前的任何规则更改过（通过str_status判断有没有更改过），则将其替换为s2，
之后更新str_status"""
import re


def read_rules_file(file_path):
    rule_dict = {}
    if "\n" in file_path:
        for n, line in enumerate(file_path.split("\n")):
            s1, s2 = line.strip().split()
            rule_dict[n] = [s1, s2]
    else:
        with open(file_path, 'r', encoding='utf-8') as file:
            for n, line in enumerate(file, start=1):
                s1, s2 = line.strip().split()
                rule_dict[n] = [s1, s2]
    return rule_dict


def find_substring(main_string, sub_string):
    indexes = [m.start() for m in re.finditer('(?={})'.format(re.escape(sub_string)), main_string)]
    target_index = []
    for index in indexes:
        if main_string.find(sub_string) == -1:
            return False
        else:
            target = list(range(index, index + len(sub_string)))
            str_target = ",".join(map(str, target))
            target_index.append(str_target)
    return target_index


def replace_by_indices(input_str, indices_list, replacement):
    new_str = list(input_str)
    for i, indices in enumerate(indices_list):
        #获取起始位置和结束位置
        start, end = map(int, [indices.split(',')[0], indices.split(',')[-1]])
        #考虑切片长度与要替换的字符串长度不等的情况
        len_qie = len(replacement) - (end + 1 - start)
        start = start + len_qie * i
        end = end + len_qie * i
        new_str[start:end + 1] = replacement

    return ''.join(new_str)


def parse_number_string(s: str):
    """将连续数字字符串转换为整数列表"""
    return [int(n) for n in s.split(',')]


def check_relationship(s1: str, s2: str):
    """
    判断两个连续数字字符串是否存在交叉或包含关系。

    :param s1: 第一个连续数字字符串
    :param s2: 第二个连续数字字符串
    :return: 返回关系类型，'none', 'cross', 或 'contain'
    """
    # 将字符串转换为整数列表
    list1 = parse_number_string(s1)
    list2 = parse_number_string(s2)

    # 获取每个列表的边界值
    min1, max1 = min(list1), max(list1)
    min2, max2 = min(list2), max(list2)

    # 检查包含关系
    if min1 >= min2 and max1 <= max2:
        return True
    elif min2 >= min1 and max2 <= max1:
        return True

    # 检查交叉关系
    if max1 >= min2 and min1 <= max2:
        return True
    elif max2 >= min1 and min2 <= max1:
        return True

    # 如果既不交叉也不包含，返回 'none'
    return False


def apply_replacement_rules(rules_file, input_string: str):
    rule_dict = read_rules_file(rules_file)
    # 规则条数
    len_rule_dict = len(rule_dict)
    str_status = {}
    current_str = input_string
    for rule_num, (s1, s2) in rule_dict.items():
        if not str_status:  # 如果为空，则为第一次规则
            # 记录第一条规则执行前的s1的所有位置
            before_pro = find_substring(input_string, s1)
            if before_pro:  # 只有s1存在于input_string中时才考虑替换
                current_str = input_string.replace(s1, s2)
                if len(s1) == len(s2):  # 当s1等于s2的长度时直接替换并更新状态字典
                    str_status[rule_num] = {
                        "source_str": input_string,
                        "new_str": current_str,
                        "change_index": before_pro
                    }
                else:  # 如果不等
                    len_d = len(s2) - len(s1)
                    before_pro_new = []
                    for index_ren, index_replace in enumerate(before_pro):
                        list_index = index_replace.split(",")
                        if len_d < 0:  # 如果小于
                            for i in range(abs(len_d)):
                                list_index.pop(-1)
                        else:  # 如果大于
                            N_ = 0
                            for i in range(abs(len_d)):
                                N_ += 1
                                list_index.append(str(int(list_index[-1]) + N_ + abs(len_d) * index_ren))
                            list_index[0] = str(int(list_index[0]) + abs(len_d) * index_ren)
                        list_index = ",".join(list_index)
                        before_pro_new.append(list_index)
                    str_status[rule_num] = {
                        "source_str": input_string,
                        "new_str": current_str,
                        "change_index": before_pro_new
                    }
        else:
            # 上次执行规则后的字符串
            new_str = str_status[sorted(list(str_status.keys()))[-1]]["new_str"]
            # 准备改动的位置
            before_pro = find_substring(new_str, s1)
            # 准备改动的位置，缓存
            before_pro_temp = find_substring(new_str, s1)
            # 已经改动的位置
            change_index = str_status[sorted(list(str_status.keys()))[-1]]["change_index"]
            # 只有s1存在于input_string中时才考虑替换
            if before_pro:
                # 每一个准备改动的位置
                for pre_chage in before_pro_temp:
                    # 每一个已经改动的位置
                    for change_part in change_index:
                        # 如果准备改动的位置与已经改动的位置存在交叉、包含关系
                        if check_relationship(pre_chage, change_part):
                            # 将不可改动的位置从准备改动的位置移除
                            before_pro.pop(before_pro.index(pre_chage))
                            # 将可以改动的位置按索引进行替换
                current_str = replace_by_indices(new_str, before_pro, s2)
                # 状态更新
                before_pro_new = change_index + before_pro
                str_status[rule_num] = {
                    "source_str": new_str,
                    "new_str": current_str,
                    "change_index": before_pro_new
                }

    return current_str, str_status


# Example usage
'''常规情况'''
# rules_file = "rule.txt"
# rules_file = "abc cde\nab b\nc a"#bcdebdada
# rules_file = "c ab\nabc cde\nab b"  # bbabbdabdab
# input_string = "babcabdcdc"
# output_string, status = apply_replacement_rules(rules_file, input_string)
# print("Output String:", output_string)
# print("Status:", status)

'''特殊情况'''
# rules_file = "abcd ab\ncde bcd\nade bc"#babcabdcdc
# input_string = "babcabdcdc"
# output_string, status = apply_replacement_rules(rules_file, input_string)
# print("Output String:", output_string)
# print("Status:", status)

'''自测试'''
# rules_file = "ab aby\ncde bcid\nade bc\nb g"  # gabycdabydbcidc
# input_string = "babcdabdcdec"
# output_string, status = apply_replacement_rules(rules_file, input_string)
# print("Output String:", output_string)
# print("Status:", status)

'''问题数据'''
# rules_file = "bcd ac\ncd ba\nab dd\nc a"  # ddaddaaacba
# input_string = "abcabcabcdcd"
# output_string, status = apply_replacement_rules(rules_file, input_string)
# print("Output String:", output_string)
# print("Status:", status)

代码小鬼才

关注

17
点赞
踩
5

收藏

觉得还不错? 一键收藏
0
评论
rule_解析`````````````````

（注意，尽管“cde”包含“c”，但它不会被第三条规则出发，因为它是第一条规则产生的新字符串，而不是来自原始字符串）执行第一条规则之前的输入字符串，new_str表示执行第一条规则之后的输出字符串，change_index"表示。字段说明：键1表示第一条规则，键1对应的值表示第一条规则运行后的状态，source_str表示。3.第三条规则：b + cde + b + d + a + d + a #c a。3.第三条规则：b + cde + b + d + a + d + a #c a。
复制链接

扫一扫