[Perl script] Batch processing of similar content within a single file

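Preface

The name of this script is a bit of a mouthful: batch processing of similar content within a single file. What it means is that we sometimes need to apply the same operation to a large number of similar lines in a log. Doing that with vim alone is painful, so I wrote this script.

I actually wrote this script years ago and simply never published it; it has been my secret weapon, haha. Looking back, finishing it was the first time I really felt how much a script can improve work efficiency, and that is what set me on this road of no return. That is why today's write-up is unusually detailed, even though the script itself was written without comments~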

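Use case

This script is useful in many situations. Here is an example from work:

Early in a project the value of mem_ctrl_bus is often unstable, and we may need to force all the mem_ctrl_bus signals to a constant value in bulk to avoid simulation errors triggered by mem_warning. The names of such ctrl_bus signals are easy to obtain with Verdi, which gives us a file like this: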

********************************************
* VIA App: getModSignal.tcl
* Report : getModSignal.log
* Date   : 2021-10-21 10:17:28
********************************************

================================================================================
Target Modules: reg_unit
================================================================================

SIGNAL 1: GET SIGNAL RISC_SPM.U_PROCESSING0.U_REG_Z.sysn_mem_ctrl[7:0] in RISC_ASPM

SIGNAL 2: GET SIGNAL RISC_SPM.U_PROCESSING1.U_REG_Z.sysn_mem_ctrl[7:0] in RISC_ASPM

SIGNAL 3: GET SIGNAL RISC_SPM.U_PROCESSING2.U_REG_Z.sysn_mem_ctrl[7:0] in RISC_ASPM

SIGNAL 4: GET SIGNAL RISC_SPM.U_PROCESSING0.U_REG_Y.sysn_mem_ctrl[7:0] in RISC_ASPM

SIGNAL 5: GET SIGNAL RISC_SPM.U_PROCESSING0.U_R3.sysn_mem_ctrl[7:0] in RISC_ASPM

SIGNAL 6: GET SIGNAL RISC_SPM.U_PROCESSING1.U_R3.sysn_mem_ctrl[7:0] in RISC_ASPM

SIGNAL 7: GET SIGNAL RISC_SPM.U_PROCESSING2.U_R3.sysn_mem_ctrl[7:0] in RISC_ASPM

================================================================================
Target Modules: reg_unit
================================================================================

What we need, however, is a file like this:

force RISC_SPM.U_PROCESSING0.U_REG_Z.sysn_mem_ctrl[7:0] = '0;
force RISC_SPM.U_PROCESSING1.U_REG_Z.sysn_mem_ctrl[7:0] = '0;
force RISC_SPM.U_PROCESSING2.U_REG_Z.sysn_mem_ctrl[7:0] = '0;
force RISC_SPM.U_PROCESSING0.U_REG_Y.sysn_mem_ctrl[7:0] = '0;
force RISC_SPM.U_PROCESSING0.U_R3.sysn_mem_ctrl[7:0] = '0;
force RISC_SPM.U_PROCESSING1.U_R3.sysn_mem_ctrl[7:0] = '0;
force RISC_SPM.U_PROCESSING2.U_R3.sysn_mem_ctrl[7:0] = '0;
release RISC_SPM.U_PROCESSING0.U_REG_Z.sysn_mem_ctrl[7:0];
release RISC_SPM.U_PROCESSING1.U_REG_Z.sysn_mem_ctrl[7:0];
release RISC_SPM.U_PROCESSING2.U_REG_Z.sysn_mem_ctrl[7:0];
release RISC_SPM.U_PROCESSING0.U_REG_Y.sysn_mem_ctrl[7:0];
release RISC_SPM.U_PROCESSING0.U_R3.sysn_mem_ctrl[7:0];
release RISC_SPM.U_PROCESSING1.U_R3.sysn_mem_ctrl[7:0];
release RISC_SPM.U_PROCESSING2.U_R3.sysn_mem_ctrl[7:0];

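With only a few entries this is easy enough to do by hand, but with hundreds or thousands of them it becomes really tedious. At first I handled it with vim macro recording, and then I thought: why not write a script for it?

So I named it: the single-file similar-content batch-processing script!

Behavior analysis

If you compare the source content with the target content carefully, a pattern emerges. We are really doing just one thing: converting a string of the form ABC into the form DBE. In the example above: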

A: GET SIGNAL

B: RISC_SPM.U_PROCESSING0.U_REG_Z.sysn_mem_ctrl[7:0]

C: in RISC_ASPM

D: force

B: RISC_SPM.U_PROCESSING0.U_REG_Z.sysn_mem_ctrl[7:0]

E: = '0;
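To make the ABC to DBE mapping concrete, here is a minimal sketch with A, C, D and E hard-coded as a regular expression. This is only an illustration of the mapping, not how the script works (the script derives these pieces from a ref file), and the helper name abc_to_dbe is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only: A, C, D and E are hard-coded
# here, whereas the script in this post derives them from a ref file.
sub abc_to_dbe {
    my ($line) = @_;
    # A = "GET SIGNAL ", B = the captured signal path, C = " in RISC_ASPM"
    return unless $line =~ /GET SIGNAL (\S+) in RISC_ASPM/;
    # D = "force ", E = " = '0;"
    return "force $1 = '0;";
}

print abc_to_dbe("SIGNAL 1: GET SIGNAL RISC_SPM.U_PROCESSING0.U_REG_Z.sysn_mem_ctrl[7:0] in RISC_ASPM"), "\n";
# prints: force RISC_SPM.U_PROCESSING0.U_REG_Z.sysn_mem_ctrl[7:0] = '0;
```

Hard-coding works for a single log format, but every new format needs a new regex, which is the chore the ref-file approach is meant to avoid.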

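Seen this way, what we need to do is clear:

1. From one of the similar lines, select the part in which only the "core part" differs from the other lines and everything else is identical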

SIGNAL 1: GET SIGNAL RISC_SPM.U_PROCESSING0.U_REG_Z.sysn_mem_ctrl[7:0] in RISC_ASPM

SIGNAL 2: GET SIGNAL RISC_SPM.U_PROCESSING1.U_REG_Z.sysn_mem_ctrl[7:0] in RISC_ASPM

SIGNAL 3: GET SIGNAL RISC_SPM.U_PROCESSING2.U_REG_Z.sysn_mem_ctrl[7:0] in RISC_ASPM

...

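For example, in the content above we select GET SIGNAL RISC_SPM.U_PROCESSING0.U_REG_Z.sysn_mem_ctrl[7:0] in RISC_ASPM from the first line. Apart from the "core part" RISC_SPM.U_PROCESSING0.U_REG_Z.sysn_mem_ctrl[7:0], the remaining text (GET SIGNAL and in RISC_ASPM) is the same on every line to be processed. Note that the leading SIGNAL 1/2/3/4... must not be included in the selection, because it differs from line to line.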
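A quick way to see why the counter has to stay out of the selection is to count how many lines contain a given fixed prefix. The sketch below uses a plain substring test; count_matching is a hypothetical helper written for this example.

```perl
use strict;
use warnings;

# Count how many lines contain the given fixed text (plain substring match).
sub count_matching {
    my ($needle, @lines) = @_;
    return scalar grep { index($_, $needle) != -1 } @lines;
}

my @lines = (
    "SIGNAL 1: GET SIGNAL RISC_SPM.U_PROCESSING0.U_REG_Z.sysn_mem_ctrl[7:0] in RISC_ASPM",
    "SIGNAL 2: GET SIGNAL RISC_SPM.U_PROCESSING1.U_REG_Z.sysn_mem_ctrl[7:0] in RISC_ASPM",
    "SIGNAL 3: GET SIGNAL RISC_SPM.U_PROCESSING2.U_REG_Z.sysn_mem_ctrl[7:0] in RISC_ASPM",
);

print count_matching("SIGNAL 1: GET SIGNAL ", @lines), "\n";  # prefix that includes the counter: 1
print count_matching("GET SIGNAL ", @lines), "\n";            # prefix shared by every line: 3
```

With the counter included, only the line it was copied from still matches, so the other lines would be skipped.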

2. Write out the final target format for that line

In this case the target format is

force RISC_SPM.U_PROCESSING0.U_REG_Z.sysn_mem_ctrl[7:0] = '0;

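3. Work out how the target differs from the "core part"

Here the change is that "force " is added in front and " = '0;" is appended at the end.

4. Apply the same treatment to the other similar lines in the source file (that is, lines of the form GET SIGNAL xxxxx in RISC_ASPM): extract the "core part" and add the target format's text before and after it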
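The four steps can be sketched compactly as follows. This is a condensed illustration rather than the script itself: the helper names longest_common, split_around and make_converter are invented for this example, and the longest shared substring is found with a simple early-exit scan. One detail worth noticing is that the spaces around the signal name are shared by both reference lines, so they end up inside the "core part" rather than in the surrounding text.

```perl
use strict;
use warnings;

# Longest substring shared by $x and $y (the first one found wins on ties).
sub longest_common {
    my ($x, $y) = @_;
    my $best = "";
    for my $i (0 .. length($x) - 1) {
        # Only try candidates longer than the best match so far.
        for (my $len = length($best) + 1; $i + $len <= length($x); $len++) {
            my $cand = substr($x, $i, $len);
            last if index($y, $cand) == -1;  # a longer candidate from $i cannot match either
            $best = $cand;
        }
    }
    return $best;
}

# Text before and after the first occurrence of $ker in $str.
sub split_around {
    my ($str, $ker) = @_;
    my $pos = index($str, $ker);
    return (substr($str, 0, $pos), substr($str, $pos + length($ker)));
}

# Steps 1-3: learn A, C, D, E from the reference pair.
# Step 4: return a closure that applies them to any other line.
sub make_converter {
    my ($from, $to) = @_;
    my $ker = longest_common($from, $to);
    my ($from_pre, $from_post) = split_around($from, $ker);
    my ($to_pre,   $to_post)   = split_around($to,   $ker);
    return sub {
        my ($line) = @_;
        my $s = index($line, $from_pre);
        return if $s == -1;
        $s += length($from_pre);
        my $e = $from_post eq "" ? length($line) : index($line, $from_post, $s);
        return if $e == -1;
        return $to_pre . substr($line, $s, $e - $s) . $to_post;
    };
}

my $convert = make_converter(
    "GET SIGNAL RISC_SPM.U_PROCESSING0.U_REG_Z.sysn_mem_ctrl[7:0] in RISC_ASPM",
    "force RISC_SPM.U_PROCESSING0.U_REG_Z.sysn_mem_ctrl[7:0] = '0;",
);
for my $line (
    "SIGNAL 5: GET SIGNAL RISC_SPM.U_PROCESSING0.U_R3.sysn_mem_ctrl[7:0] in RISC_ASPM",
    "Target Modules: reg_unit",
) {
    my $out = $convert->($line);
    print "$out\n" if defined $out;
}
# prints: force RISC_SPM.U_PROCESSING0.U_R3.sysn_mem_ctrl[7:0] = '0;
```

Lines that do not contain the learned surrounding text, such as the Target Modules header, are skipped.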

Hmm, the more I explain the more I tie myself in knots, so let's just look at the result!

Results

Create a ref file with gvim containing the following (this one is for release; the idea is the same either way):

GET SIGNAL RISC_SPM.U_PROCESSING0.U_REG_Z.sysn_mem_ctrl[7:0] in RISC_ASPM
release RISC_SPM.U_PROCESSING0.U_REG_Z.sysn_mem_ctrl[7:0];

Run the script:

[xiaotu@xiaotu-eda ~/Desktop/rsic/src]$lot_replace ref path.log
release RISC_SPM.U_PROCESSING0.U_REG_Z.sysn_mem_ctrl[7:0];
release RISC_SPM.U_PROCESSING1.U_REG_Z.sysn_mem_ctrl[7:0];
release RISC_SPM.U_PROCESSING2.U_REG_Z.sysn_mem_ctrl[7:0];
release RISC_SPM.U_PROCESSING0.U_REG_Y.sysn_mem_ctrl[7:0];
release RISC_SPM.U_PROCESSING0.U_R3.sysn_mem_ctrl[7:0];
release RISC_SPM.U_PROCESSING1.U_R3.sysn_mem_ctrl[7:0];
release RISC_SPM.U_PROCESSING2.U_R3.sysn_mem_ctrl[7:0];

And here is the same thing for force (the first two lines are the ref file, followed by the run):

GET SIGNAL RISC_SPM.U_PROCESSING0.U_REG_Z.sysn_mem_ctrl[7:0] in RISC_ASPM
force RISC_SPM.U_PROCESSING0.U_REG_Z.sysn_mem_ctrl[7:0] = '0;
[xiaotu@xiaotu-eda ~/Desktop/rsic/src]$lot_replace ref path.log
force RISC_SPM.U_PROCESSING0.U_REG_Z.sysn_mem_ctrl[7:0] = '0;
force RISC_SPM.U_PROCESSING1.U_REG_Z.sysn_mem_ctrl[7:0] = '0;
force RISC_SPM.U_PROCESSING2.U_REG_Z.sysn_mem_ctrl[7:0] = '0;
force RISC_SPM.U_PROCESSING0.U_REG_Y.sysn_mem_ctrl[7:0] = '0;
force RISC_SPM.U_PROCESSING0.U_R3.sysn_mem_ctrl[7:0] = '0;
force RISC_SPM.U_PROCESSING1.U_R3.sysn_mem_ctrl[7:0] = '0;
force RISC_SPM.U_PROCESSING2.U_R3.sysn_mem_ctrl[7:0] = '0;

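Closing notes

The key to using this script is getting the first line of the ref file right. To summarize, the similar lines in the source file should look like this: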

xxxA(B0)Cyyy

xxxxA(B1)Cyyyy

xxyxA(B2)Cxyyy

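Be sure to identify A(B0)C and put it on the first line, then write D(B0)E on the second line, and you are done. It sounds harder than it is. Take this kind of content in a VCS log, for example: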

Path “xxxxxxxxxxx.v” MEM WARNING harness.aaa.sync at time 00000ns;

...

Path “xxxyyyxxxxxx.v” MEM WARNING harness.bbb.ccc.sync at time 00110ns;

Then the ref file can be written directly as:

MEM WARNING harness.aaa.sync at time

set ignore path {harness.aaa.sync};

It really is that simple!!!
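For comparison, the same extraction can be written with a regular expression once the fixed texts around the core part are known. This is a different technique from the index/substr approach the script uses: \Q...\E quotes the fixed texts so that characters such as . or { are matched literally. The helper name rewrap is hypothetical, and the sketch assumes both fixed texts are non-empty.

```perl
use strict;
use warnings;

# Extract whatever sits between the fixed texts $pre and $post and wrap it
# in $new_pre / $new_post. \Q...\E quotes regex metacharacters in the fixed
# texts, so they are matched literally. Assumes $pre and $post are non-empty.
sub rewrap {
    my ($line, $pre, $post, $new_pre, $new_post) = @_;
    return unless $line =~ /\Q$pre\E(.*?)\Q$post\E/;
    return $new_pre . $1 . $new_post;
}

my @log = (
    'Path "xxxxxxxxxxx.v" MEM WARNING harness.aaa.sync at time 00000ns;',
    'Path "xxxyyyxxxxxx.v" MEM WARNING harness.bbb.ccc.sync at time 00110ns;',
);
for my $line (@log) {
    my $out = rewrap($line, "MEM WARNING ", " at time", "set ignore path {", "};");
    print "$out\n" if defined $out;
}
# prints:
# set ignore path {harness.aaa.sync};
# set ignore path {harness.bbb.ccc.sync};
```

The four fixed texts passed in here are exactly what the two-line ref file above encodes, which is why the script can work them out on its own.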

Appendix

The full code. It was originally written without comments; next time I will definitely write them!

#!/usr/bin/perl
use strict;
use warnings;

# Usage: lot_replace <ref_file> <input_file>
#   ref_file line 1: one source fragment of the form A(B0)C
#   ref_file line 2: the wanted result for it, of the form D(B0)E

#main func
my $min_len = 5;    # shortest common substring accepted as the kernel B0

die "usage: $0 <ref_file> <input_file>\n" if @ARGV < 2;
my ($ref, $inp) = @ARGV;

open(my $ref_fh, '<', $ref) or die "cannot open $ref: $!\n";
open(my $inp_fh, '<', $inp) or die "cannot open $inp: $!\n";

# Only the first two lines of the ref file are used.
my ($from, $to) = <$ref_fh>;
close $ref_fh;
die "$ref must contain two lines\n" unless defined $from and defined $to;
chomp $from;
chomp $to;

# Split both reference lines around their shared kernel.
my ($to_pre, $to_post, $from_pre, $from_post) = get_match_str($from, $to);

while (my $line = <$inp_fh>) {
    chomp $line;

    # Find A, then look for C only after the end of A.
    my $start = index($line, $from_pre);
    next if $start == -1;
    my $end = index($line, $from_post, $start + length($from_pre));
    next if $end == -1;

    # An empty C means the kernel runs to the end of the line.
    $end = length($line) if $from_post eq "";

    # Cut out A(Bn)C, strip A and C, then wrap Bn in D and E.
    $line = substr($line, $start, $end - $start + length($from_post));
    my $ker = get_kernel($line, $from_pre, $from_post);
    print $to_pre . $ker . $to_post . "\n";
}

close $inp_fh;

#sub define

# Return every substring of $str that is at least $min characters long.
sub get_sub_str {
    my ($str, $min) = @_;

    my $len = length($str);
    my @sub_list;

    for (my $start = 0; $start <= $len - $min; $start++) {
        for (my $sub = $min; $sub + $start <= $len; $sub++) {
            push @sub_list, substr($str, $start, $sub);
        }
    }
    return @sub_list;
}

# Find the longest substring shared by $from and $to (the kernel) and return
# the text around it: ($to_pre, $to_post, $from_pre, $from_post).
sub get_match_str {
    my ($from, $to) = @_;

    my @subs = get_sub_str($from, $min_len);

    my $len   = 0;
    my $start = 0;
    my $ker   = "";
    for my $sub (@subs) {
        my $pos = index($to, $sub);
        if ($pos != -1 and length($sub) > $len) {
            $ker   = $sub;
            $len   = length($sub);
            $start = $pos;
        }
    }
    die "no common part of at least $min_len characters in the ref file\n" if $ker eq "";

    my $from_start = index($from, $ker);
    my $to_pre     = substr($to, 0, $start);
    my $to_post    = substr($to, $start + $len);
    my $from_pre   = substr($from, 0, $from_start);
    my $from_post  = substr($from, $from_start + $len);

    return $to_pre, $to_post, $from_pre, $from_post;
}

# Strip the leading $pre and trailing $post from $org, leaving the kernel.
sub get_kernel {
    my ($org, $pre, $post) = @_;
    return substr($org, length($pre), length($org) - length($pre) - length($post));
}