How to filter Chinese characters in Linux: how to keep only the printable characters of a file in Bash (Linux) or Python?...

Latest recommended article published on 2022-12-28 17:50:58

墨尔本情人


The hexdump shows that the dot in .[16D is actually the escape character, \x1b.
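The dot appears because the text column of a hex dump replaces every non-printable byte with a period. The following is a minimal sketch of that rendering rule, assuming the usual printable ASCII range 0x20 to 0x7e; the helper name ascii_column is hypothetical and not part of any tool.

```python
def ascii_column(data: bytes) -> str:
    """Render bytes the way the right-hand text column of a hex dump does."""
    # Printable ASCII runs from 0x20 (space) to 0x7e (~); anything else becomes '.'
    return ''.join(chr(b) if 0x20 <= b <= 0x7e else '.' for b in data)


sample = b'E TEST\x1b[16D'
rendered = ascii_column(sample)
print(rendered)                 # E TEST.[16D
print(sample.index(b'\x1b'))    # 6, the offset of the real ESC byte
```

A literal period and an ESC byte look identical in that column, so the hex column is the place to check which one is really in the file.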

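Esc[nD is the ANSI escape code that moves the cursor n columns to the left (Cursor Back); it erases nothing by itself. Here Esc[16D moves the cursor back 16 characters, and the spaces that follow overwrite the text at those positions, which explains the cat output.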
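To see why cat shows only HELLO, the effect of Esc[nD can be mimicked on a single line of text. This is a minimal sketch, assuming the sequence behaves as the standard Cursor Back control (the cursor moves left and later characters overwrite what is already there). The names TOKEN and render_line are hypothetical, and the function understands only this one sequence.

```python
import re

# Matches either a cursor-back sequence ESC [ n D or any single character.
TOKEN = re.compile(r'\x1b\[(\d*)D|(.)', re.DOTALL)


def render_line(text: str) -> str:
    """Apply cursor-back sequences to one line and return the visible text."""
    cells = []   # characters currently on the line
    col = 0      # cursor column
    for match in TOKEN.finditer(text):
        count, char = match.groups()
        if char is None:
            # Cursor Back: a missing or zero count means 1; stop at column 0
            col = max(0, col - max(1, int(count or 1)))
        elif col == len(cells):
            cells.append(char)       # writing at the end of the line
            col += 1
        else:
            cells[col] = char        # overwriting an existing character
            col += 1
    return ''.join(cells).rstrip()


# Same bytes as the hex dump: 16 spaces after the first sequence, 2 after the second
data = 'HELLO THIS IS THE TEST\x1b[16D' + ' ' * 16 + '\x1b[16D' + ' ' * 2
print(repr(render_line(data)))   # 'HELLO'
```

The 22 characters of text leave the cursor at column 22; moving back 16 puts it at column 6, and the 16 spaces blank out everything after "HELLO ".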

There are several ways to remove ANSI escape codes from a file, either with Bash commands (for example with sed, as in Anubhava's answer) or in Python.
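As one possible Python version of the stripping approach, the sketch below removes CSI sequences (ESC [ followed by parameters and a final byte) with a regular expression. The names CSI and strip_csi are hypothetical, and the pattern covers only CSI sequences, not every kind of escape sequence.

```python
import re

# CSI sequence: ESC [, parameter bytes 0x30-0x3f, intermediate bytes 0x20-0x2f,
# then one final byte in the range 0x40-0x7e.
CSI = re.compile(r'\x1b\[[0-?]*[ -/]*[@-~]')


def strip_csi(text: str) -> str:
    """Delete CSI escape sequences without interpreting them."""
    return CSI.sub('', text)


# Same bytes as the hex dump: 16 spaces after the first sequence, 2 after the second
data = 'HELLO THIS IS THE TEST\x1b[16D' + ' ' * 16 + '\x1b[16D' + ' ' * 2
stripped = strip_csi(data)
print(repr(stripped.rstrip()))   # 'HELLO THIS IS THE TEST'
```

Stripping keeps the text that the control sequences were meant to overwrite, so the result differs from what a terminal would display.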

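In this case, however, it is better to run the file through a terminal emulator, which interprets any editing control sequences present in the file. Once those sequences have been applied, you get the result the file's author intended.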

One way to do this in Python is with pyte, a Python module that implements a simple VTxxx-compatible terminal emulator. It can be installed with pip, and its documentation is hosted on readthedocs.

Here is a simple demo program that interprets the data given in the question. It was written for Python 2 but adapts easily to Python 3; with the bytes literal and the parenthesized print used below, the same source should run on both. pyte supports Unicode and its standard Stream class expects Unicode strings, but this example uses ByteStream so that a plain byte string can be passed to it.

    #!/usr/bin/env python

    ''' pyte VTxxx terminal emulator demo

        Interpret a byte string containing text and ANSI / VTxxx control sequences

        Code adapted from the demo script in the pyte tutorial at
        http://pyte.readthedocs.org/en/latest/tutorial.html#tutorial

        Posted to http://stackoverflow.com/a/30571342/4014959

        Written by PM 2Ring 2015.06.02
    '''

    import pyte

    #hex dump of data
    #00000000  48 45 4c 4c 4f 20 54 48  49 53 20 49 53 20 54 48  |HELLO THIS IS TH|
    #00000010  45 20 54 45 53 54 1b 5b  31 36 44 20 20 20 20 20  |E TEST.[16D     |
    #00000020  20 20 20 20 20 20 20 20  20 20 20 1b 5b 31 36 44  |           .[16D|
    #00000030  20 20                                             |  |

    #16 spaces follow the first escape sequence and 2 follow the second
    data = b'HELLO THIS IS THE TEST\x1b[16D' + b' ' * 16 + b'\x1b[16D' + b' ' * 2

    #Create a default sized screen that tracks changed lines
    #(DiffScreen is the class from the 2015 pyte release; newer releases
    #may track dirty lines on pyte.Screen itself)
    screen = pyte.DiffScreen(80, 24)
    screen.dirty.clear()
    stream = pyte.ByteStream()
    stream.attach(screen)
    stream.feed(data)

    #Get index of last line containing text
    last = max(screen.dirty)

    #Gather lines, stripping trailing whitespace
    lines = [screen.display[i].rstrip() for i in range(last + 1)]

    print('\n'.join(lines))

Output

    HELLO

Hex dump of the output

    00000000  48 45 4c 4c 4f 0a                                 |HELLO.|
