27.7. Text Processing

27.7.1. iconv - Convert encoding of given files from one encoding to another
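
As a minimal sketch of basic iconv usage (the sample text and the variable name are illustrative assumptions, not taken from the original), the following converts an ISO-8859-1 byte sequence to UTF-8 and prints the resulting bytes in hex:

```shell
# Assumption: the input is the ISO-8859-1 string "café".
# \351 is the octal escape for byte 0xE9, which is é in ISO-8859-1.
utf8_hex=$(printf 'caf\351' | iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1 | tr -d ' \n')

# In UTF-8 the single byte e9 becomes the two-byte sequence c3 a9.
echo "$utf8_hex"
```

The -f and -t options name the source and target encodings; `iconv -l` lists the names the local installation accepts.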

27.7.1.1. cconv - An iconv-based Simplified/Traditional Chinese conversion tool

cconv is built on top of iconv. It converts UTF-8 text directly and adds phrase-level conversion.

sudo apt-get install cconv
			

To convert Simplified Chinese to Traditional Chinese with cconv:

cconv -f UTF8-CN -t UTF8-HK zh-cn.txt -o zh-hk.txt
			
27.7.1.2. uconv - convert data from one encoding to another

Installation

sudo apt-get install libicu-dev
			

Example

$ uconv -f cp1252 -t UTF-8 -o file_in_utf8.txt file_in_cp1252_encoding.txt
			

27.7.2. expr - string processing command

		
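Overview of the expr string-processing command:
Name: expr
Purpose: evaluates an expression and prints its value.
Syntax: expr Expression
Examples:
Example 1: string length
shell>> expr length "this is a test content";
22
Example 2: remainder
shell>> expr 20 % 9
2
Example 3: extract a substring starting at a given position (start 3, length 5)
shell>> expr substr "this is a test content" 3 5
is is
Example 4: position of the first occurrence of a character
shell>> expr index "testforthegame" s
3
Example 5: treat a token as a literal string (only some expr implementations accept the quote keyword; GNU coreutils writes this as "expr + TOKEN")
shell>> expr quote thisisatestformela
thisisatestformela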
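
The length, substr and index operators are extensions rather than POSIX features, so a script that has to be portable might prefer forms like the ones below. This is only a sketch; the variable names are made up for illustration.

```shell
s="this is a test content"

# POSIX way to get the length: ':' matches a regex anchored at the start
# of the string and prints the number of characters matched.
len_expr=$(expr "$s" : '.*')

# Shell parameter expansion gives the same number without running expr.
len_shell=${#s}

# Arithmetic expansion replaces 'expr 20 % 9'.
rem=$((20 % 9))

# Substring (start 3, length 5) via cut, which counts characters from 1.
sub=$(printf '%s\n' "$s" | cut -c 3-7)

echo "$len_expr $len_shell $rem [$sub]"
```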
		
		

27.7.3. cat - concatenate files and print on the standard output

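-b	Number only non-blank lines.
-e	Mark the end of each line with $ (in GNU cat this is the same as -vE).
-n	Number all output lines, starting at 1.
-q	Quiet operation (suppress error messages). Not accepted by GNU cat; found in some other implementations such as AIX.
-r	Collapse runs of blank lines into a single blank line. Not accepted by GNU cat, which uses -s for this.
-t	Show tabs as ^I (in GNU cat this is the same as -vT).
-u	Do not buffer output (accepted but ignored by GNU cat).
-v	Show non-printing control characters in visible form.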
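
To make the difference between -n and -b concrete, here is a small sketch on a three-line sample whose middle line is empty (the sample input and variable names are assumptions for illustration):

```shell
# Count how many output lines carry a line number under each option.
count_n=$(printf 'one\n\ntwo\n' | cat -n | grep -c '[0-9]')
count_b=$(printf 'one\n\ntwo\n' | cat -b | grep -c '[0-9]')

# -n numbers the empty line too; -b skips it.
echo "-n numbered $count_n lines, -b numbered $count_b lines"
```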
		
27.7.3.1. -s, --squeeze-blank suppress repeated empty output lines

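-s squeezes runs of blank lines into a single blank line. GNU cat spells this option with a lowercase -s; the -S and -r spellings belong to other implementations.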

			
$ cat >> /tmp/test <<EOF
Line1

Line2


Line3




Line4


Line5

EOF

$ cat -s /tmp/test
Line1

Line2

Line3

Line4

Line5

			
			
27.7.3.2. -v, --show-nonprinting use ^ and M- notation, except for LFD and TAB

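Shows control characters in visible form (TAB and newline are passed through unchanged). The example below checks a file's line endings: a trailing ^M is a carriage return, which indicates DOS/Windows (CRLF) line endings.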

			
[neo@netkiller ~]# cat -v file.txt
GRANT USAGE ON *.* TO 'esauser'@'localhost' IDENTIFIED BY xxxxxxx; ^M
^M
file^M
2059^M
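
The same check can be reproduced without a real file. The sketch below feeds cat -v a two-line sample in which only the first line ends in CR LF, then counts the affected lines with grep (sample text and variable names are illustrative):

```shell
# cat -v renders the carriage return as ^M and leaves the newline alone.
shown=$(printf 'dos line\r\nunix line\n' | cat -v)
printf '%s\n' "$shown"

# Count lines that end with a carriage return.
crlf_lines=$(printf 'dos line\r\nunix line\n' | grep -c "$(printf '\r')\$")
echo "$crlf_lines"
```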
			
			

27.7.4. nl - number lines of files

$ nl /etc/issue
     1  CentOS release 5.4 (Final)
     2  Kernel \r on an \m
		

27.7.5. od - dump files in octal and other formats

27.7.5.1. Hexadecimal
$ echo "helloworld" | od -x
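
Note that -x groups the input into two-byte words, so on a little-endian machine the two bytes of each pair appear swapped. A sketch of a byte-by-byte dump, which avoids that (the variable name is illustrative):

```shell
# -An drops the offset column; -tx1 prints each byte as its own hex pair.
hex=$(printf 'hello' | od -An -tx1 | tr -d ' \n')
echo "$hex"
```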
			

27.7.6. tr - translate or delete characters

Replace ":" with a newline ("\n")

$ cat /etc/passwd |tr ":" "\n"
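
A few more tr forms on fixed input, as a sketch (the sample strings and variable names are assumptions for illustration):

```shell
# Split one passwd-style record into one field per line.
fields=$(printf 'root:x:0:0\n' | tr ':' '\n')
printf '%s\n' "$fields"

# Translate lower case to upper case.
upper=$(printf 'hello\n' | tr 'a-z' 'A-Z')

# -d deletes the listed characters instead of translating them.
digits_removed=$(printf 'a1b2c3\n' | tr -d '0-9')

echo "$upper $digits_removed"
```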
		

27.7.7. cut - remove sections from each line of files

Field (column) operations

$ last | grep  'neo' | cut -d ' ' -f1
        
$ cat /etc/passwd | cut -d ':' -f1
root
daemon
bin
sys
sync
games
man
lp
mail
news
uucp
proxy

$ cat /etc/passwd | cut -d ':' -f1,3,4

# cat /etc/passwd | cut -d ':' -f1,6
root:/root
bin:/bin
daemon:/sbin
adm:/var/adm
lp:/var/spool/lpd
sync:/sbin
shutdown:/sbin
halt:/sbin
mail:/var/spool/mail
uucp:/var/spool/uucp
operator:/root
games:/usr/games
gopher:/var/gopher
ftp:/var/ftp
nobody:/
vcsa:/dev
saslauth:/var/empty/saslauth
postfix:/var/spool/postfix
sshd:/var/empty/sshd
rpc:/var/cache/rpcbind
rpcuser:/var/lib/nfs
nfsnobody:/var/lib/nfs
ntp:/etc/ntp
nagios:/var/log/nagios

        

Character operations (-c selects by character position)

$ cat /etc/passwd | cut -c 1-4
root
daem
bin:
sys:
sync
game
man:

$ echo "No such file or directory"| cut -c4-7
such

$ echo "No such file or directory"| cut -c -8
No such

$ echo "No such file or directory"| cut -c-8
No such
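
The field and character selections above can be tried on a single fixed record. This is a sketch; the sample line and variable names are illustrative:

```shell
line='root:x:0:0:root:/root:/bin/bash'

# Fields 1 and 7 (login name and shell), with ':' as the delimiter.
name_shell=$(printf '%s\n' "$line" | cut -d ':' -f 1,7)

# Everything from field 6 to the end of the line.
tail_fields=$(printf '%s\n' "$line" | cut -d ':' -f 6-)

# Characters 1 through 4.
first4=$(printf '%s\n' "$line" | cut -c 1-4)

echo "$name_shell | $tail_fields | $first4"
```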

        

27.7.8. printf - format and print data

printf "%d\n" 1234
		
$ printf "\033[1;33m TEST COLOR \n\033[m"
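
Two further printf behaviours that are often useful, shown as a small sketch (the arguments and variable names are made up for illustration):

```shell
# %-8s left-justifies in 8 columns, %5d right-justifies in 5, %x prints hex.
row=$(printf '%-8s|%5d|%x' name 42 255)
echo "$row"

# The format string is reused until all arguments are consumed.
list=$(printf '[%s]' a b c)
echo "$list"
```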
		

27.7.9. recode - converts files between various character sets and surfaces

The following commands convert text files between DOS, Mac, and Unix line-ending styles:

		
$ recode /cl../cr <dos.txt >mac.txt
$ recode /cr.. <mac.txt >unix.txt
$ recode ../cl <unix.txt >dos.txt
		
		

27.7.10. /dev/urandom - random strings

		
[neo@test .deploy]$ echo `< /dev/urandom tr -dc A-Z-a-z-0-9 | head -c 8`
GidAuuNN
[neo@test .deploy]$ echo `< /dev/urandom tr -dc A-Z-a-z-0-9 | head -c 8`
UyGaWSKr
		
		

I often use random strings like these as initial passwords.

		
[neo@test .deploy]$ echo `< /dev/urandom tr -dc '[:alnum:]' | head -c 8`
xig8Meym
[neo@test .deploy]$ echo `< /dev/urandom tr -dc '[:alnum:]' | head -c 8`
23Ac1vZg
[neo@test .deploy]$ echo `< /dev/urandom tr -dc '[:digit:]' | head -c 8`
73652314
[neo@test .deploy]$ echo `< /dev/urandom tr -dc '[:graph:]' | head -c 8`
GO_o>OnJ
[neo@test .deploy]$ echo `< /dev/urandom tr -dc '[:graph:]' | head -c 10`
iGy0FS/aO5
[neo@test .deploy]$ echo `< /dev/urandom tr -dc '[:graph:]' | head -c 50`
;`E^{5(T4v~5$YovW.?%_?9la<`+qPcRh@7mD\!Whx;MJZVQ\K
[neo@test .deploy]$ echo `< /dev/urandom tr -dc '[:print:]' | head -c 50`
fy$[#:'(')jt'gp1/g-)d~p]8 :r9i;MO2d!8M<?Qs3t:QgK$O
[neo@test .deploy]$ echo `< /dev/urandom tr -dc '[:graph:]' | head -c 50`
6SivJ5y$/FTi8mf}rrqE&s0"WkA}r;uK-=MT!Wp0UlL_lF0|bL
		
		

Batch generation

		
for i in {1..10}
do
echo `< /dev/urandom tr -dc A-Z-a-z-0-9 | head -c 8`
done
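
For use in scripts, the one-liner can be wrapped in a function. The sketch below is one possible shape, not the original author's code; the function name randpw and its length argument are hypothetical.

```shell
# Hypothetical helper: print N random alphanumeric characters (default 8).
# LC_ALL=C makes tr treat the input as raw bytes, and quoting '[:alnum:]'
# keeps the shell from expanding it as a glob pattern.
randpw() {
    LC_ALL=C tr -dc '[:alnum:]' < /dev/urandom | head -c "${1:-8}"
}

pw=$(randpw 12)
echo "$pw"
```

Unlike the A-Z-a-z-0-9 set used above, which also lets a literal "-" through, the [:alnum:] class yields letters and digits only.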
		
		
		
# cat /dev/urandom | tr -cd '[:alnum:]' | fold -w30 | head -n 20
AVqROzjF6ZATJGv2J6PzDHp3jLpKV4
ONt68UFNDwgXpSnLBV7oRDX3VLRYsX
EZTWCGvZc3mIEeuw9sxMtV8ZkzVRJv
BhUiv0a7utsjZFLYpKGZrY5aDXcZL4
5YfUl2hmDT1O9X61DRYg4wSp4lXoXX
ykyPJxH47PzxnNGlujIUF98ZtB01H0
QyP53mksQN8bCNNo1fSD3RtqhhEGfa
u2RkT1M9GUQF4a6O18tG5WD97OOXze
Whm5X7398Q8L9BONN8k2oLy9CL37JO
TmGQz7WB6WnkjhyB4wrBHBJ3HMIRyf
hww43yvddUDYUnbNOKjhv3sLhCA4YD
uY6zQtBC6miwLUl3jkCVVA0Xu8ASgj
jv58qu46VW7LvRIq4txNE8bG9NBlZl
pzaMkydAiCHCF5H2oQVqMn4DTTYgNL
yoN2A9LyrCwLfjP1ad9HMAwxExJL5i
J27iy2L90m9dpcPLJ8tl46GGb9xqmQ
6YwFCvuPHyyEwnctUTpqLFcvUafVZ2
Nuq9XgIgRQGynjlVqGLMOpO0MkGpsn
tChkRG7eoRuKVXgW7ccTGx45E54K3Y
qPv48XqdGlOrdULCOGZ45kwJ1v5kVX		
		
		

27.7.11. col - filter reverse line feeds from input

Strip ^M (carriage return) characters

$ cat oldfile | col -b > newfile
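
col is not installed everywhere and, by default, may also rewrite runs of spaces as tabs. As an alternative sketch (a swapped-in technique, not the method shown above), tr can delete the carriage returns directly:

```shell
cr=$(printf '\r')

# Remove every carriage return from a two-line CRLF sample.
cleaned=$(printf 'line1\r\nline2\r\n' | tr -d '\r')

# Check whether any carriage return survived.
case "$cleaned" in
    *"$cr"*) has_cr=yes ;;
    *)       has_cr=no ;;
esac
echo "carriage returns left: $has_cr"
```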
		

27.7.12. apg - generates several random passwords

sudo apt-get install apg

$ apg

Please enter some random data (only first 16 are significant)
(eg. your old password):>
imlogNukcel5 (im-log-Nuk-cel-FIVE)
Drocdaf1 (Droc-daf-ONE)
fagJook0 (fag-Jook-ZERO)
heabugJer4 (heab-ug-Jer-FOUR)
5OsEsudy (FIVE-Os-Es-ud-y)
IrjOgneagOc9 (Irj-Og-neag-Oc-NINE)


$ apg -M SNCL -m 16
WoidWemFut6dryn,
byRowpEus-Flutt0
|QuogCagFaycsic0
ojHoadCyct4Freg_
Vir9blir`orhohoo
bapOip?Ibreawov2
		

27.7.13. head/tail

Chaining head and tail picks out a single byte of standard input, here the 17th:

head -c 17 | tail -c 1
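
A sketch of the same pipeline on known input, plus the line-oriented variant (sample data and variable names are illustrative):

```shell
# The 17th letter of the alphabet is q.
byte17=$(printf 'abcdefghijklmnopqrstuvwxyz' | head -c 17 | tail -c 1)
echo "$byte17"

# The same idea with lines: print only line 3 of the input.
line3=$(printf '1\n2\n3\n4\n5\n' | head -n 3 | tail -n 1)
echo "$line3"
```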
		

27.7.14. Reversing strings or file contents

rev - reverse lines of a file or files

Reverse a string

# echo hello | rev
olleh

# echo "hello world" | rev
dlrow olleh
		

Reverse the characters of every line in a file (the order of the lines is unchanged)

# rev /etc/passwd
hsab/nib/:toor/:toor:0:0:x:toor
nigolon/nibs/:nib/:nib:1:1:x:nib
nigolon/nibs/:nibs/:nomead:2:2:x:nomead
nigolon/nibs/:mda/rav/:mda:4:3:x:mda
nigolon/nibs/:dpl/loops/rav/:pl:7:4:x:pl
cnys/nib/:nibs/:cnys:0:5:x:cnys
nwodtuhs/nibs/:nibs/:nwodtuhs:0:6:x:nwodtuhs
tlah/nibs/:nibs/:tlah:0:7:x:tlah
nigolon/nibs/:liam/loops/rav/:liam:21:8:x:liam
nigolon/nibs/:pcuu/loops/rav/:pcuu:41:01:x:pcuu
nigolon/nibs/:toor/:rotarepo:0:11:x:rotarepo
nigolon/nibs/:semag/rsu/:semag:001:21:x:semag
nigolon/nibs/:rehpog/rav/:rehpog:03:31:x:rehpog
nigolon/nibs/:ptf/rav/:resU PTF:05:41:x:ptf
nigolon/nibs/:/:ydoboN:99:99:x:ydobon
nigolon/nibs/:ved/:renwo yromem elosnoc lautriv:96:96:x:ascv
nigolon/nibs/:ptn/cte/::83:83:x:ptn
nigolon/nibs/:htualsas/ytpme/rav/:"resu dhtualsaS":67:994:x:htualsas
nigolon/nibs/:xiftsop/loops/rav/::98:98:x:xiftsop
nigolon/nibs/:dhss/ytpme/rav/:HSS detarapes-egelivirP:47:47:x:dhss
hsab/nib/:lqsym/bil/rav/:revres LQSyM:994:894:x:lqsym
hsab/nib/:www/:noitacilppA beW:08:08:x:www
nigolon/nibs/:xnign/ehcac/rav/:resu xnign:894:794:x:xnign
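
rev works within each line. To reverse the order of the lines instead, GNU coreutils provides tac, which this section does not otherwise cover. A short sketch contrasting the two on made-up input:

```shell
# rev flips the characters of each line; the line order is unchanged.
by_char=$(printf 'ab\ncd\n' | rev)

# tac prints the lines last to first; each line is left intact.
by_line=$(printf 'ab\ncd\n' | tac)

printf '%s\n---\n%s\n' "$by_char" "$by_line"
```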
		

27.7.15. Handling TABs and spaces

27.7.15.1. expand - convert tabs to spaces
Convert TAB characters to spaces
root@netkiller /var/log % yum --showduplicates list httpd | expand
Repository epel is listed more than once in the configuration
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
Available Packages
httpd.x86_64                    2.4.6-67.el7.centos                      os     
httpd.x86_64                    2.4.6-67.el7.centos.2                    updates
			
27.7.15.2. unexpand - convert spaces to tabs
Convert spaces to TABs
root@netkiller /var/log % cat /etc/fstab | unexpand -t 16
/dev/vda1	     /	          ext3	     noatime,acl,user_xattr 1 1
proc	     /proc	          proc	     defaults	           0 0
sysfs	     /sys	          sysfs	     noauto	           0 0
debugfs	     /sys/kernel/debug    debugfs    noauto	           0 0
devpts	     /dev/pts	          devpts     mode=0620,gid=5       0 0
			

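With -t 16, tab stops are placed every 16 columns, and a run of spaces that ends at a tab stop is replaced by a single TAB character.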
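
A round trip on a tiny input shows how the tab stops drive both tools. This is a sketch with tab stops every 4 columns; the sample text and variable names are assumptions for illustration.

```shell
tab=$(printf '\t')

# expand: with tab stops every 4 columns, the TAB after "a" becomes 3 spaces.
spaced=$(printf 'a\tb\n' | expand -t 4)

# unexpand -a: the run of spaces ending at column 4 is turned back into a TAB.
tabbed=$(printf 'a   b\n' | unexpand -a -t 4)

[ "$spaced" = "a   b" ] && echo "expand ok"
[ "$tabbed" = "a${tab}b" ] && echo "unexpand ok"
```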




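Source: Netkiller 系列 手札 (Netkiller series notes)
Author: 陈景峯
For reprints, please contact the author, and be sure to credit the original source, the author, and this notice.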
