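Common Linux Processing Commands - CSDN Blog

Article link: https://blog.csdn.net/iteye_1177/article/details/82486656

These notes collect common text-processing methods, to build a habitual workflow for this kind of problem. The steps are as follows: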

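1. Searching: grep

a. Plain search, e.g. find the lines containing 'dd': grep 'dd' dt.tx (tip: -v selects the lines that do not match)

b. Regular expressions: sets [a-z], negated sets [^a-z], start and end anchors ^ and $, and so on. These largely follow ordinary regex rules (http://deerchao.net/tutorials/regex/regex.htm, a 30-minute introduction to regular expressions)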
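
The searches above can be sketched as follows. This is a minimal illustration run against a throwaway file; the file contents and the variable names are made up for the example.

```shell
# Sample data (hypothetical, for illustration only)
tmp=$(mktemp)
printf 'add\nodd one\nplain\n42 dd\n' > "$tmp"

# Lines containing 'dd'
with_dd=$(grep 'dd' "$tmp")

# -v inverts the match: lines that do NOT contain 'dd'
without_dd=$(grep -v 'dd' "$tmp")

# Anchors plus a set: lines that start with a lowercase letter and end with 'd'
anchored=$(grep '^[a-z].*d$' "$tmp")

# Negated set: lines whose first character is not a lowercase letter
negated=$(grep '^[^a-z]' "$tmp")

printf '%s\n' "$with_dd" "---" "$without_dd" "---" "$anchored" "---" "$negated"
rm -f "$tmp"
```

Note that the pattern is written in straight single quotes, so the shell passes it to grep unchanged.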

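2. String extraction

a. sed [sed 'action'; the action should be written inside single quotes] http://linux.vbird.org/linux_basic/0330regularex.php#sed

- Purpose: insert lines, delete lines, replace strings within a line

- Insert a line (ll | sed '1a hello'), delete lines (ll | sed '1,2d'), substitute (ll | sed 's/bohan/xiaopang/g')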
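
The three sed actions can be tried on fixed input instead of the output of ll. This is a rough sketch that assumes GNU sed (the one-line '1a text' form is a GNU extension); the sample text is invented.

```shell
# Sample input (made up); the notes pipe the output of ll instead
input=$(printf 'line1 bohan\nline2\nline3 bohan bohan\n')

# Append a line after line 1 (one-line "1a text" form: GNU sed)
appended=$(printf '%s\n' "$input" | sed '1a hello')

# Delete lines 1 through 2
deleted=$(printf '%s\n' "$input" | sed '1,2d')

# Replace every occurrence of bohan; without the trailing g,
# only the first match on each line would be replaced
replaced=$(printf '%s\n' "$input" | sed 's/bohan/xiaopang/g')

printf '%s\n' "$appended" "---" "$deleted" "---" "$replaced"
```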

b. awk [awk -F 'separator' '{action}']

- Purpose: extracting content by field, plus programmability

- Examples


1. Programmability
grep "act_store_code" ./wlb-gateway.log.2013-04-09 | awk -F '<' '{res = ""; for(i=1; i<NF; i++){if($i ~ /^order_code>/){t=i+1; res = res " " substr($t,9,20)} if($i ~ /^act_store_code>/){t=i+1; res = res " " substr($t,9,6); print res}}}'

2. Using BEGIN/END to initialize and report a global variable
grep "act_store_code" ./wlb-gateway.log.2013-04-09 | awk -F 'act_store_code' 'BEGIN{count=0} {count++} END{print count}'

3. Regular expressions
a. /regex/: uses the extended set of metacharacters
Note: the pattern must be written between / /, otherwise awk reports the error: unexpected newline or end of string
b. ~ matches a regular expression
e.g. $ awk '$1 ~ /^root/' test prints the lines of the file test whose first field starts with root
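
The three awk features above (field loops, BEGIN/END, and regex matching with ~) can be sketched on a few lines of passwd-style data. The data and the variable names are hypothetical and unrelated to the gateway log in the notes.

```shell
# Hypothetical passwd-style sample data
data=$(printf 'root:x:0\nrooter:x:1\nadmin:x:2\n')

# -F sets the field separator; ~ tests a field against a regex.
# Print the first field of lines whose first field starts with "root".
names=$(printf '%s\n' "$data" | awk -F ':' '$1 ~ /^root/ {print $1}')

# BEGIN runs before any input is read, END after the last line
count=$(printf '%s\n' "$data" | awk 'BEGIN{count=0} {count++} END{print count}')

# Programmable: loop over every field of every line and count the "x" fields
x_fields=$(printf '%s\n' "$data" |
  awk -F ':' '{for(i=1; i<=NF; i++) if($i == "x") n++} END{print n}')

printf '%s\n' "$names" "$count" "$x_fields"
```

The loop in the notes stops at i<NF rather than i<=NF because it reads field i+1 inside the loop body.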

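3. Substring extraction

a. substr, e.g. expr substr "hello" 2 2 prints el with GNU expr (in awk, substr is called directly as a function, e.g. substr($t,9,20); not examined in detail)

b. cut, e.g. echo $PATH | cut -d ':' -f 1 or echo $PATH | cut -c1-8 (-c: each character counts as one position)
c. ${}, e.g. str="hello"; echo ${str%e*} prints h, and str="hello"; echo ${str:2:3} prints llo. ${"hello":2:2} is an error, because the expansion needs a variable name, not a literal string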
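
A small sketch of these extraction methods on fixed strings follows. It sticks to portable constructs, so the bash-only ${str:2:3} form is shown through its cut equivalent; the sample path list is invented.

```shell
str="hello"

# cut -c selects by 1-based character position: characters 2 through 3
by_char=$(printf '%s\n' "$str" | cut -c2-3)

# cut -d/-f splits on a delimiter and selects a field (made-up PATH-like value)
first_dir=$(printf '%s\n' "/usr/bin:/bin:/sbin" | cut -d ':' -f 1)

# ${var%pattern} removes the shortest suffix matching the glob pattern e*
prefix=${str%e*}

# In bash, ${str:2:3} takes 3 characters from 0-based offset 2;
# cut -c3-5 selects the same characters using 1-based positions
middle=$(printf '%s\n' "$str" | cut -c3-5)

printf '%s\n' "$by_char" "$first_dir" "$prefix" "$middle"
```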

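4. Helper commands

a. sort

b. uniq (sort first, then uniq: uniq only merges adjacent duplicate lines, so for input such as 'a b a', running uniq -c directly gives 1:a, 1:b, 1:a)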
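
The difference between sorted and unsorted input can be checked with the same 'a b a' input, one value per line. This is a minimal sketch; the awk step only normalizes the padding that uniq -c puts in front of each count.

```shell
input=$(printf 'a\nb\na\n')

# uniq only collapses ADJACENT duplicates, so unsorted input keeps all 3 lines
unsorted_lines=$(printf '%s\n' "$input" | uniq | wc -l | tr -d ' ')

# Sorting first makes the duplicates adjacent
sorted_unique=$(printf '%s\n' "$input" | sort | uniq)

# uniq -c prefixes each line with its count; print it as count:value
counts=$(printf '%s\n' "$input" | sort | uniq -c | awk '{print $1 ":" $2}')

printf '%s\n' "$unsorted_lines" "$sorted_unique" "$counts"
```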

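5. find

-maxdepth: search depth; 1 = the current directory only

-type: the type to search for; d = directory, f = regular file

-exec: runs the given command on each match, e.g. find . -name 'cei*' -exec ls -l {} \; (the command must be terminated with an escaped semicolon)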

http://www.cnblogs.com/peida/archive/2012/11/14/2769248.html
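
The options above can be exercised in a throwaway directory tree. This sketch assumes a find that supports -maxdepth (GNU and BSD find both do); the file and directory names are invented.

```shell
# Throwaway directory tree (names are made up for illustration)
dir=$(mktemp -d)
mkdir -p "$dir/sub"
touch "$dir/cei_top.txt" "$dir/other.txt" "$dir/sub/cei_deep.txt"

# -maxdepth 1 stays in the starting directory; -type f keeps regular files
top_only=$(find "$dir" -maxdepth 1 -type f -name 'cei*' | wc -l | tr -d ' ')

# Without -maxdepth the search also descends into sub/
all_levels=$(find "$dir" -type f -name 'cei*' | wc -l | tr -d ' ')

# -type d matches directories: the starting directory itself and sub
dirs=$(find "$dir" -type d | wc -l | tr -d ' ')

# -exec runs ls -l once per match; {} is the matched path, \; ends the command
find "$dir" -name 'cei*' -exec ls -l {} \;

rm -rf "$dir"
```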

6. xargs

http://zh.wikipedia.org/wiki/Xargs

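It splits an argument list into smaller chunks and passes them to another command as that command's arguments. For example, find the files matching app* under the current directory and copy them to /home/admin/ (GNU xargs deprecates -i in favor of -I):

find . -name "app*" | xargs -I {} cp {} /home/admin/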
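
The same find and xargs combination can be tried safely with temporary directories standing in for the current directory and /home/admin/. This is a sketch with invented file names, and it assumes the paths contain no spaces or newlines.

```shell
# Temporary source and destination directories (stand-ins, not real paths)
src=$(mktemp -d)
dest=$(mktemp -d)
touch "$src/app1.conf" "$src/app2.conf" "$src/readme.txt"

# -I {} runs cp once per input line, substituting the line for {}
find "$src" -name 'app*' | xargs -I {} cp {} "$dest/"

# List what arrived in the destination, space separated
copied=$(ls "$dest" | sort | tr '\n' ' ')
printf '%s\n' "$copied"

rm -rf "$src" "$dest"
```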

7. Finding the process that occupies a port

e.g. for port 80: netstat -tlnp | grep ':80 ' (a bare grep 80 would also match ports such as 8080)

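8. Quickly converting a long timestamp to a date

- Timestamp to date: date -d @1385637775

- Date to timestamp: date +'%s' (this prints the current time; for a specific date, add -d 'DATE')

Worked examples:

1. Analyzing one parameter in an access log; the log format is as follows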
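
Both directions of the conversion can be sketched as below. The sketch assumes GNU date, since -d and the @SECONDS form are GNU extensions, and it pins the time zone to UTC with -u so the result does not depend on the local zone.

```shell
# Timestamp to a human-readable date (UTC)
human=$(date -u -d @1385637775 '+%Y-%m-%d %H:%M:%S')

# A specific date back to a timestamp (interpreted as UTC because of -u)
stamp=$(date -u -d '2013-11-28 11:22:55' +%s)

# The current time as a timestamp
now=$(date +%s)

printf '%s\n' "$human" "$stamp" "$now"
```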


125.45.237.121 6934 - [08/Nov/2013:14:00:00 +0800] "GET http://show.re.taobao.com/feature.htm?cb=tbcc_items_discounts_1383890395777&auction_ids=24036252328,35082783349,26188964127,19958196760,20227913682,13339039483,35307696046,21395884635,19398785205,35281581852,3497389366,10321832189,17599831517,35599869467,19783554337,20137738201&feature_names=promoPrice,promoOtherNeed" 200 304 "http://trade.taobao.com/trade/trade/itemlist/list_bought_items.htm?spm=a1z02.1.5864393.d4912065.drn0Y0&event_submit_do_query=1&action=itemlist%2FQueryAction&user_type=0&_fmt.q._0.c=I_HAS_NOT_COMMENT&_fmt.q._0.au=ALL&nekot=g%2Cyphgy33wmu2tknjvgu1383890360087&tracelog=mytaobao_daipingjia" "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)"
117.30.219.156 6310 - [08/Nov/2013:14:00:00 +0800] "GET http://show.re.taobao.com/feature_v1.htm?auction_ids=10742949701%2C15142995790%2C20293050154%2C13523039125%2C17449331778%2C18301552348%2C12831153959%2C16140753495%2C19269241460%2C17295571991%2C17625334117%2C35415756198%2C35656272072&_ksTS=1383890399004_138&cb=jsonp139&feature_names=promoName,promoPrice,promoOtherNeed,coinTips&from=taobao_search&t=1383890399003" 200 453 "http://s.taobao.com/search?initiative_id=staobaoz_20131108&jc=1&q=%D4%CB%B6%AF%BF%E3%CA%D5%BF%DA%C4%E1%CB%BF%B7%C4&stats_click=search_radio_all%3A1" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.89 Safari/537.1"

cat /home/a/logs/tengine/show.re.taobao.com-access_2013110715 | grep "promoPrice" | grep -o 'auction_ids=[^& "]*' | cut -d = -f 2 | grep '%2C' -v > f1
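
To see what each stage of that pipeline contributes, it can be run on two shortened log lines. The lines below are made up in the same shape as the sample (example.com and the short ids are placeholders), and grep -o is assumed to be available, as it is in GNU and BSD grep.

```shell
# Two shortened, made-up log lines in the same shape as the access log
log=$(mktemp)
cat > "$log" <<'EOF'
1.2.3.4 10 - [08/Nov/2013:14:00:00 +0800] "GET http://example.com/feature.htm?cb=x&auction_ids=111,222&feature_names=promoPrice" 200 304
5.6.7.8 11 - [08/Nov/2013:14:00:00 +0800] "GET http://example.com/feature_v1.htm?auction_ids=333%2C444&cb=y&feature_names=promoName,promoPrice" 200 453
EOF

# 1) keep the lines that mention promoPrice
# 2) grep -o prints only the auction_ids=... token, up to the next & space or quote
# 3) cut drops the "auction_ids=" key and keeps the value
# 4) grep -v discards values whose commas are URL-encoded as %2C
ids=$(grep 'promoPrice' "$log" |
  grep -o 'auction_ids=[^& "]*' |
  cut -d = -f 2 |
  grep -v '%2C')

printf '%s\n' "$ids"
rm -f "$log"
```

Only the first line survives here, because the second line carries its id list in URL-encoded form.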

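------- Divider [temporary notes] -------

1. Batch-extracting archives ({} appears to be a placeholder for the host number, as in saber1.cm3)

for tgzfile in saber{}.cm3/*.tar.gz; do tar -xvf "$tgzfile" -C saber{}.cm3/; done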

2. Batch-copying files

echo {1..4} | sed 's/ /\n/g'| xargs -t -L 1 -I {} scp bohan.sj@saber{}.cm3:/home/a/project/output/logs/saber/output/*.log.2014-04-19.* ./saber_logs/saber{}.cm3/

3. Batch-creating directories

mkdir home/saber{1..3}.cm6

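4. On the command line: jump to the start of the line with ctrl+a, jump to the end of the line with ctrl+e

--- Temporary commands ---

1. Checking disk space

du: shows the size of each folder under the current directory

df: shows disk usage for whole file systems, at mount-point granularity

2. yum

yum list: lists all packages (both installed and not installed)

yum list installed: lists the installed packages

yum install / remove: installs or removes a package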