Python in Practice: Tidying Up Linux Source Code

1-programmer

Category: Life is short, I use Python. Tags: Linux source cleanup, removing comments and blank lines, Python script tools

Article link: https://blog.csdn.net/u012520854/article/details/53455751

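I. The goal

While writing up my notes on the Linux TCP/IP protocol stack, I need to quote source code. The source contains many comments and blank lines, which hurts the page layout. I want to strip all comments and blank lines from a source file while keeping each remaining line's original line number, so that readers can follow along in the source. That is the effect shown on the right in the figure below. The yellow line numbers are displayed by vim; the red ones are added by the script.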

II. How to do it

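The desired effect is really just simple file processing. The idea is straightforward:

1 Open the source file and the output file.

2 Keep a line count and process the current line: check whether it is a comment line or a blank line. If it is neither, write the current line number (its number in the source file) and the line itself to the output file.

3 Close both files.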
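
As a rough illustration of step 2, the sketch below applies the same idea to an in-memory list of lines. It is a simplification and not the script from section III: the function name `keep_code_lines` and the sample input are made up for this example, and it uses plain string tests instead of regular expressions.

```python
def keep_code_lines(lines):
    """Return (line_number, text) pairs for lines that are neither
    blank nor part of a /* ... */ comment."""
    kept = []
    in_block = False  # True while inside a multi-line comment
    for number, line in enumerate(lines, 1):
        stripped = line.strip()
        if in_block:
            # Still inside a block comment; it ends on the line containing */
            if "*/" in stripped:
                in_block = False
            continue
        if not stripped:
            continue  # blank line
        if stripped.startswith("/*"):
            # A comment line; it opens a block unless it is closed right here
            if "*/" not in stripped:
                in_block = True
            continue
        kept.append((number, line.rstrip("\n")))
    return kept


# Hypothetical sample input, numbered 1..9 in the "source file"
sample = [
    "/*\n",
    " * Add two integers.\n",
    " */\n",
    "int add(int a, int b)\n",
    "{\n",
    "\n",
    "\t/* trivial */\n",
    "\treturn a + b;\n",
    "}\n",
]
for number, text in keep_code_lines(sample):
    print(number, text)
```

Only lines 4, 5, 8 and 9 survive, and each keeps the number it had in the original input, which is what lets a reader map the trimmed listing back to the real file.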

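III. Implementation (code)

Thanks to the consistent coding style of the Linux source, three regular expressions are enough to do the matching and classification. The code is shared below: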

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import re
import sys

if len(sys.argv) != 2:
    print('''usage: python thisfile src.c
the result is written to src.c-post''')
    sys.exit(1)

input_file_name = sys.argv[1]
input_file = open(input_file_name)
output_file = open("%s-post" % input_file_name, 'w')
note_pattern1 = re.compile(r"[ \t]*/\*.*\*/")     # /* ... */ on a single line
note_pattern2_start = re.compile(r"[ \t]*/\*\n")  # /* alone on a line, opening a block
note_pattern2_end = re.compile(r"[ \t]*\*/")      # */ closing a block
index = 0             # line number in the source file
line_in_note = False  # True while inside a comment block

for read_in_line in input_file:
    index += 1
    if note_pattern1.match(read_in_line):
        continue
    if note_pattern2_start.match(read_in_line):
        line_in_note = True
        continue
    if note_pattern2_end.match(read_in_line):
        line_in_note = False
        continue
    if line_in_note:  # inside a comment block, so skip the line
        continue
    if read_in_line.strip():  # skip blank and whitespace-only lines
        output_file.write("%d %s" % (index, read_in_line))

input_file.close()
output_file.close()
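
To see how the three patterns divide up the work, the small check below runs them against a few typical lines. The `classify` helper and the sample lines are made up for this illustration, and the patterns are written with optional leading whitespace in all three, which is an assumption about the intended behaviour.

```python
import re

single_line = re.compile(r"[ \t]*/\*.*\*/")  # /* ... */ on one line
block_start = re.compile(r"[ \t]*/\*\n")     # /* alone on a line
block_end = re.compile(r"[ \t]*\*/")         # */ closing a block


def classify(line):
    """Report which of the three patterns, if any, matches at line start."""
    if single_line.match(line):
        return "single-line comment"
    if block_start.match(line):
        return "block start"
    if block_end.match(line):
        return "block end"
    return "other"


samples = [
    "/* one-liner */\n",
    "/*\n",
    " * body text\n",
    " */\n",
    "\tint x = 0;\n",
    "\tint y; /* trailing */\n",
]
for line in samples:
    print("%-20s %r" % (classify(line), line))
```

Two things stand out. The body lines of a block comment match none of the patterns, so it is the `line_in_note` flag, not a regex, that removes them. And because `match` anchors at the start of the line, a code line with a trailing comment is classified as "other" and kept in full, comment included.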

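A function that took 103 lines before processing now takes only 53, which is much better for layout. This is a script for my own use, so I have not added error handling or thorough argument checking :).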
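
For anyone who wants those missing pieces, here is one possible sketch of argument parsing and error handling using the standard `argparse` module. It is not part of the original script: the `-o/--output` option, the function names, and the `process` placeholder (which only drops blank lines) are assumptions made for illustration.

```python
import argparse
import sys


def parse_args(argv=None):
    """Parse the command line; -o/--output is a hypothetical extra option."""
    parser = argparse.ArgumentParser(
        description="Strip comments and blank lines, keeping line numbers.")
    parser.add_argument("source", help="C source file to process")
    parser.add_argument("-o", "--output",
                        help="output file (default: SOURCE-post)")
    args = parser.parse_args(argv)
    if args.output is None:
        args.output = "%s-post" % args.source
    return args


def process(src, dst):
    """Placeholder for the filtering loop; here it only drops blank lines."""
    for number, line in enumerate(src, 1):
        if line.strip():
            dst.write("%d %s" % (number, line))


def main(argv=None):
    """Return 0 on success, 1 if a file cannot be opened, read or written."""
    args = parse_args(argv)
    try:
        # The source is opened first, so a missing input never
        # creates an empty output file.
        with open(args.source) as src, open(args.output, "w") as dst:
            process(src, dst)
    except OSError as err:
        print("error: %s" % err, file=sys.stderr)
        return 1
    return 0
```

Calling `sys.exit(main())` at the bottom of the file turns this into a command-line tool. A missing or extra argument makes `argparse` print a usage message and exit on its own, and the `with` statement closes both files even when an error occurs partway through.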