Python Study Notes 15: Regular Expressions


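1. Common special elements in regular expressions:

Pattern     Description
^           Matches the start of the string
$           Matches the end of the string
[...]       A set of characters, listed individually: [amk] matches 'a', 'm' or 'k'
[^...]      Any character not in the brackets: [^abc] matches any character except a, b and c
re*         Matches 0 or more repetitions of the preceding expression
re+         Matches 1 or more repetitions of the preceding expression
re?         Matches 0 or 1 occurrence of the preceding expression (greedy). Placed after another quantifier (*?, +?, ??, {n,m}?), the ? makes that quantifier non-greedy
re{n}       Matches exactly n repetitions of the preceding expression. For example, o{2} does not match the "o" in "Bob" but does match the two o's in "food"
re{n,}      Matches n or more repetitions of the preceding expression. For example, o{2,} does not match the "o" in "Bob" but matches all the o's in "foooood". o{1,} is equivalent to o+, and o{0,} is equivalent to o*
re{n,m}     Matches n to m repetitions of the preceding expression, greedy
a|b         Matches a or b
(re)        Matches the expression inside the parentheses and captures it as a group
\w          Matches letters, digits and the underscore
\W          Matches any character that is not a letter, digit or underscore
\s          Matches any whitespace character
\S          Matches any non-whitespace character
\d          Matches any digit; equivalent to [0-9] for ASCII input (or with the re.ASCII flag)
\D          Matches any non-digit character
\A          Matches only at the start of the string
\Z          Matches only at the end of the string. In Python's re module \Z does not stop before a trailing newline; that is how $ behaves
\z          End of the string in Perl-style engines. Python's re module has traditionally used \Z for this
\G          Position where the previous match ended (a Perl/PCRE feature, not supported by Python's re module)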
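The difference between greedy and non-greedy matching, and between the {n} forms, is easier to see on a concrete string. The following is a small sketch only; the sample strings are made up for illustration.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy: .* takes as much as it can, so a single match spans every tag.
greedy = re.findall(r"<.*>", text)

# Non-greedy: the trailing ? makes .* stop at the first closing bracket.
lazy = re.findall(r"<.*?>", text)

# {n} and {n,} limit how many repetitions are accepted.
words = "Bob food foooood"
exactly_two = re.findall(r"o{2}", words)
two_or_more = re.findall(r"o{2,}", words)

print(greedy)       # ['<b>bold</b> and <i>italic</i>']
print(lazy)         # ['<b>', '</b>', '<i>', '</i>']
print(exactly_two)  # ['oo', 'oo', 'oo']
print(two_or_more)  # ['oo', 'ooooo']
```

Note that o{2} finds two separate pairs inside "foooood" because findall returns non-overlapping matches from left to right.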
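2. Regular expression examples

Pattern        Matches
python         "python"
[Pp]ython      "Python" or "python"
rub[ye]        "ruby" or "rube"
[aeiou]        Any one of the letters inside the brackets
[0-9]          Any digit; the same as [0123456789]
[a-z]          Any lowercase letter
[A-Z]          Any uppercase letter
[a-zA-Z0-9]    Any letter or digit
[^aeiou]       Any character other than a, e, i, o, u
[^0-9]         Any character that is not a digit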
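A few of these character classes can be tried directly with re.findall. This is a minimal sketch; the sample sentence is invented for the example.

```python
import re

sample = "Python and python; ruby or rube? Call 555-0199."

pythons = re.findall(r"[Pp]ython", sample)   # either capitalisation
rubies = re.findall(r"rub[ye]", sample)      # "ruby" or "rube"
numbers = re.findall(r"[0-9]+", sample)      # runs of digits

# A negated class: everything that is neither a vowel nor whitespace.
consonants = re.findall(r"[^aeiou\s]", "a quiet zoo")

print(pythons)     # ['Python', 'python']
print(rubies)      # ['ruby', 'rube']
print(numbers)     # ['555', '0199']
print(consonants)  # ['q', 't', 'z']
```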
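3. Review of the re module
re.match tries to match only at the start of the string and returns None if the pattern does not match there.
re.search scans the whole string and returns the first successful match.
re.findall finds every substring matched by the pattern and returns them as a list, or an empty list if nothing matches. When the pattern contains capture groups, the list holds the groups instead of the full matches.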
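The three functions behave quite differently on the same input. A short sketch, using a made-up string:

```python
import re

text = "order 66 shipped, order 77 pending"
pattern = re.compile(r"\d+")

at_start = pattern.match(text)    # None: the string does not start with a digit
first = pattern.search(text)      # Match object for the first run of digits
all_hits = pattern.findall(text)  # every non-overlapping match, as strings
no_hits = pattern.findall("no digits here")

print(at_start)       # None
print(first.group())  # 66
print(all_hits)       # ['66', '77']
print(no_hits)        # []
```

Because match and search return None on failure, it is usually worth checking the result before calling .group() on it.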
4. Exercise 1
Extract the value of skuid and the value of skuimgurl.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time : 2018\5\4 0004 0:01
# @Author : xiexiaolong
# @File : demon1.py
import re
import requests

# The original post does not give the page address; set url before running.
url = ""
session = requests.session()
r = session.get(url)
html = r.text
# print(html)
# Group 1 captures the skuid digits, group 2 captures the skuimgurl value.
reg = re.compile(r'"skuid":"(\d+)",\s+"spuid":"\d+",\s+"skuurl":"\S+",\s+"skuimgurl":"(\S+)",')
result = reg.findall(html)
print(result)
Result:
D:\python\venv\Scripts\python.exe D:/python/0503/demon1.py
[('26878432382', 'https://img30.360buyimg.com/n7/jfs/t18226/169/1318243724/390477/5b0718ff/5ac44edcNa350dbd9.jpg'),
 ('5327182', 'https://img30.360buyimg.com/n7/jfs/t17461/138/1837663326/68820/5f8da5cd/5ad9b1e2N42bce837.jpg'),
 ('11731514723', 'https://img30.360buyimg.com/n7/jfs/t19231/337/2147939016/196162/4210a6ae/5aea6250N0235cd05.jpg'),
 ('19588651151', 'https://img30.360buyimg.com/n7/jfs/t11341/60/1553062810/120774/ab9534ff/5a02c3f4Naebe34b7.jpg'),
 ('15495544751', 'https://img30.360buyimg.com/n7/jfs/t18088/43/2048465630/167669/dd3c8b7b/5ae12c40N57c98ea8.jpg'),
 ('1780924', 'https://img30.360buyimg.com/n7/jfs/t17167/97/1957869461/43204/d064647b/5adda3e0Ne1d3aa86.jpg'),
 ('16675691362', 'https://img30.360buyimg.com/n7/jfs/t18490/21/2141098141/120513/b3ca521a/5ae90247N3b4909ae.jpg'),
 ('26222795271', 'https://img30.360buyimg.com/n7/jfs/t19441/291/1597121495/310550/9bc2e141/5ad05fc0N1510cae5.jpg'),
 ('4813030', 'https://img30.360buyimg.com/n7/jfs/t19198/83/1908967366/189260/7538e84b/5adda865N8f547981.jpg'),
 ('26348513019', 'https://img30.360buyimg.com/n7/jfs/t14857/240/2643838980/220943/c982fda1/5aaf2002Ndd25bc52.jpg'),
 ('26016197600', 'https://img30.360buyimg.com/n7/jfs/t19894/76/195725612/190103/23c60ca1/5aeabb94N3e0266bc.jpg'),
 ('27036535156', 'https://img30.360buyimg.com/n7/jfs/t19399/140/2175516321/123017/41e6d6a8/5aea87d3N9736cc9d.jpg'),
 ('25168000024', 'https://img30.360buyimg.com/n7/jfs/t17629/301/2062161127/434152/aa3560a5/5ae319f9N1ae1146c.jpg'),
 ('25965247088', 'https://img30.360buyimg.com/n7/jfs/t19270/67/2232771964/253207/25f41fd9/5aea61b0Nfd21a809.jpg'),
 ('10123099847', 'https://img30.360buyimg.com/n7/jfs/t15511/14/1469153129/729958/b0af0ca1/5a533063N15fea56c.jpg'),
 ('20000220615', 'https://img30.360buyimg.com/n7/jfs/t16426/172/2638358261/151693/87020840/5ab869ddN30621fec.jpg'),
 ('15904713681', 'https://img30.360buyimg.com/n7/jfs/t17287/197/2249621651/366556/d36ae213/5aeadb4cN97f413f3.jpg'),
 ('10114188069', 'https://img30.360buyimg.com/n7/jfs/t17110/210/2010165830/417320/31273aeb/5ae2c90eNe7caf222.jpg'),
 ('10503200866', 'https://img30.360buyimg.com/n7/jfs/t18139/246/1628563908/114414/9315ac7c/5ad0647eNa9f1e2af.jpg'),
 ('1658610413', 'https://img30.360buyimg.com/n7/jfs/t19411/79/1017814440/108641/1b185d6d/5ab8b479Nd2417e97.jpg')]

Process finished with exit code 0
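The same pattern can be tried without a network request. The sketch below runs it against a small hand-written string; the sample data and the example.com addresses are hypothetical stand-ins for the downloaded page, not real output.

```python
import re

# Hypothetical snippet standing in for the downloaded page (not real data).
html = '''
"skuid":"1001",
"spuid":"2001",
"skuurl":"https://example.com/item/1001.html",
"skuimgurl":"https://example.com/img/1001.jpg",
"skuid":"1002",
"spuid":"2002",
"skuurl":"https://example.com/item/1002.html",
"skuimgurl":"https://example.com/img/1002.jpg",
'''

# Two capture groups: the skuid digits and the skuimgurl value.
reg = re.compile(
    r'"skuid":"(\d+)",\s+'
    r'"spuid":"\d+",\s+'
    r'"skuurl":"\S+",\s+'
    r'"skuimgurl":"(\S+)",'
)

# With two groups, findall returns one (skuid, skuimgurl) tuple per match.
pairs = reg.findall(html)
for skuid, img in pairs:
    print(skuid, img)
```

Since each element is a two-item tuple, dict(pairs) would give a skuid-to-image mapping if that shape is more convenient.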
5. Exercise 2
Extract every upstream block from the file together with its contents, and write each block to its own file, using the upstream name as the file name.

upstream orderCenter.ga10.wms5.jd.local {
    server 10.46.0.161:8023 weight=10 max_fails=2 fail_timeout=30s;
    server 10.46.0.162:8023 weight=10 max_fails=2 fail_timeout=30s;
}

upstream opperftrace.ga10.wms5.jd.local {
    server 10.46.0.164:8060 weight=10 max_fails=2 fail_timeout=30s;
}

upstream taskassign-c.ga10.wms5.jd.local {
    server 10.46.0.162:8005 weight=10 max_fails=2 fail_timeout=30s;
}

upstream smartQuery.ga10.wms5.jd.local {
    server 10.46.0.164:8013 weight=10 max_fails=2 fail_timeout=30s;
}

upstream center.ga10.wms5.jd.local {
    server 10.46.0.164:9020 weight=10 max_fails=2 fail_timeout=30s;
    server 10.46.0.163:9020 weight=10 max_fails=2 fail_timeout=30s;
}
upstream aps.wms5.jd.local {
  server 10.46.0.161:8001 weight=10 max_fails=2 fail_timeout=10s;
  server 10.46.0.162:8001 weight=10 max_fails=2 fail_timeout=10s;
}

upstream master.wms5.jd.local {
  server 10.46.0.163:8004 weight=10 max_fails=2 fail_timeout=10s;
  server 10.46.0.164:8004 weight=10 max_fails=2 fail_timeout=10s;
}
upstream clover.jd.local {
  server 10.46.0.163:1601 weight=10 max_fails=2 fail_timeout=10s;
  server 10.46.0.164:1601 weight=10 max_fails=2 fail_timeout=10s;
}

upstream backbone.web.wms5.jd.local {
  server 10.46.0.163:8006 weight=10 max_fails=2 fail_timeout=10s;
}
upstream wump-heartbeat.wms5.jd.local {
    server 10.46.0.130:8001 weight=10 max_fails=2 fail_timeout=10s;
}
upstream dec.wms5.jd.local {
        server 10.46.0.161:8012 weight=10 max_fails=2 fail_timeout=10s;
        server 10.46.0.162:8011 weight=10 max_fails=2 fail_timeout=10s;
}


# ##############################################################################

server
{
    listen                   80;
    server_name               ga10.wms5.jd.com 10.46.0.217 10.46.0.161;
    access_log               /export/servers/nginx/logs/ga10.wms5.jd.com/ga10.wms5.jd.com_access.log main;
    error_log                /export/servers/nginx/logs/ga10.wms5.jd.com/ga10.wms5.jd.com_error.log warn;
    #chunkin on;
    error_page 411 = @my_error;
    location @my_error {
        #chunkin_resume;
    }


        location /logs/ {
                autoindex       off;
                deny all;
        }

    # frontend    ########################################
    rewrite_log on;
    #more_set_headers "Foo: bar";
location /dec/ {
                proxy_next_upstream     http_500 http_502 http_503 http_504 error timeout invalid_header;
                proxy_set_header        Host  $host;
                proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
                expires                 0;
                proxy_pass http://dec.wms5.jd.local/ ;
        }
location ~ /pickingplan/((?:services/)?taskassign_.+) {
    proxy_next_upstream     http_500 http_502 http_503 http_504 error timeout invalid_header;
    proxy_set_header        Host  $host;
    proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
    expires                 0;
    rewrite /pickingplan/((?:services/)?taskassign_.+) /$1 break;
}

location /wump-heartbeat/ {
        proxy_next_upstream     http_500 http_502 http_503 http_504 error timeout invalid_header;
        proxy_set_header        Host  $host;
        proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
        expires                 0;
        proxy_pass http://wump-heartbeat.wms5.jd.local/ ;
}



    location /orderCenter/ {
        proxy_next_upstream     http_500 http_502 http_503 http_504 error timeout invalid_header;
        proxy_set_header        Host  $host;
        proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
        expires                 0;
        proxy_pass http://orderCenter.ga10.wms5.jd.local/ ;
    }
}
Code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 2018\5\4 0004 0:52
# @Author  : xiexiaolong
# @File    : demon2.py
import os
import codecs
import re

# Group 1 is the whole upstream block, group 2 is the upstream name.
reg = re.compile(r"(upstream\s+(\S+)\s+\{[^}]+\})")
with codecs.open("test1.txt", encoding="utf-8") as f:
    relist = reg.findall(f.read())
    if not os.path.exists("upstream"):
        os.mkdir("upstream")
    os.chdir("upstream")
    for name in relist:
        # name is a (block, upstream_name) tuple
        file = name[1] + ".location.conf"
        with codecs.open(file, "w", encoding="utf-8") as wr:
            wr.write(name[0])
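Result: one <upstream name>.location.conf file per upstream block is created in the upstream directory.
Analysis: relist = reg.findall(f.read()) stores the matches in relist. Because the pattern has two capture groups, findall returns a list of tuples, one per upstream block. In each tuple, name[0] is the full text of the block (the outer group) and name[1] is the upstream name (the inner group), so file = name[1] + ".location.conf" builds the file name and wr.write(name[0]) writes the block into it.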
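The tuple structure described in the analysis can be checked on a small in-memory sample. The sketch below uses two of the upstream blocks from the configuration above and writes into a temporary directory instead of changing the working directory; using tempfile and pathlib here is a choice made for this illustration, not part of the original script.

```python
import re
import tempfile
from pathlib import Path

conf = """
upstream aps.wms5.jd.local {
  server 10.46.0.161:8001 weight=10 max_fails=2 fail_timeout=10s;
}
upstream clover.jd.local {
  server 10.46.0.163:1601 weight=10 max_fails=2 fail_timeout=10s;
}
"""

# Outer group: the whole block. Inner group: the upstream name.
reg = re.compile(r"(upstream\s+(\S+)\s+\{[^}]+\})")
blocks = reg.findall(conf)
names = [name for _block, name in blocks]
print(names)  # ['aps.wms5.jd.local', 'clover.jd.local']

# Write one file per block into a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp) / "upstream"
    out_dir.mkdir()
    for block, name in blocks:
        (out_dir / f"{name}.location.conf").write_text(block, encoding="utf-8")
    written = sorted(p.name for p in out_dir.iterdir())

print(written)
```

Unpacking each tuple as block, name in the loop avoids the name[0] / name[1] indexing and makes it harder to mix the two up.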
