Day 06: Reflection and Regular Expressions

weixin_34029680

Tags: python

Original link: http://www.cnblogs.com/liuhailong-py-way/p/5593365.html

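Python Reflection and Regular Expressions

In this chapter:

1. Recursion

2. Reflection

3. Reflection in depth: function paths

4. The hashlib hashing module

5. The regular expression module (re)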

I. Recursion

A function is recursive if it contains a call to itself.

Example:

def func(n):
    if n == 1:
        return 1
    return n*func(n-1)

n = func(3)
print(n)

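Code walkthrough:

This defines a function func that takes a number. When the argument is 1 it returns 1; otherwise it returns the argument multiplied by the result of calling func on the argument minus 1. A function that calls itself in this way is recursive. Here func(3) evaluates 3 * 2 * 1, so the program prints 6.

Some notes on recursion, collected from the web:
(1) Recursion means calling a procedure or function from within itself.
(2) A recursive strategy must have a clear termination condition, known as the exit of the recursion (the base case).

Recursive algorithms are generally used for three kinds of problems:
(1) The data is defined recursively (for example, the Fibonacci function).
(2) The solution is implemented with a recursive algorithm (backtracking).
(3) The data structure is defined recursively (for example, tree traversal and graph search).

Drawbacks of recursion: recursive solutions tend to run less efficiently. During the recursive calls the system stores the return point, local variables, and so on for every level on the stack, so too many levels of recursion can easily cause a stack overflow.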
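To make the stack-depth drawback concrete, here is a minimal sketch. The names factorial_recursive and factorial_iterative are illustrative and do not come from the original post, and the depth at which recursion fails depends on the interpreter's configured limit.

```python
import sys

def factorial_recursive(n):
    """Recursive factorial; every call adds one stack frame."""
    if n <= 1:
        return 1
    return n * factorial_recursive(n - 1)

def factorial_iterative(n):
    """Same result with a loop, so the stack depth stays constant."""
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

print(factorial_recursive(5))   # 120
print(factorial_iterative(5))   # 120

# CPython caps the recursion depth; read the current limit instead of assuming it.
limit = sys.getrecursionlimit()
try:
    factorial_recursive(limit + 100)
except RecursionError:
    print("recursion too deep for limit", limit)

# The loop version handles the same input without extra stack frames.
print(factorial_iterative(limit + 100) > 0)   # True
```

When the depth of the recursion grows with the input, rewriting it as a loop like this is usually the simpler fix than raising the limit.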

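II. Reflection

Reflection in Python is provided by four built-in functions: hasattr, getattr, setattr, and delattr. They operate on the members of an object: checking whether a member exists, getting a member, setting a member, and deleting a member, respectively.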

class Foo(object):
 
    def __init__(self):
        self.name = 'liuhailong'
 
    def func(self):
        return 'func'
 
obj = Foo()
 
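# #### Check whether a member exists ####
hasattr(obj, 'name')     # True
hasattr(obj, 'func')     # True

# #### Get a member ####
getattr(obj, 'name')     # 'liuhailong'
getattr(obj, 'func')     # the bound method obj.func

# #### Set a member ####
setattr(obj, 'age', 18)
setattr(obj, 'show', lambda num: num + 1)

# #### Delete a member ####
delattr(obj, 'name')
# func is defined on the class, not on the instance, so
# delattr(obj, 'func') would raise AttributeError; delete it from the class.
delattr(Foo, 'func')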

Detailed walkthrough:

Normally we access the members of an object like this:

class Foo(object):
 
    def __init__(self):
        self.name = 'alex'
 
    def func(self):
        return 'func'
 
obj = Foo()
 
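# Access a field
obj.name
# Call a method
obj.func()

This raises a few questions.

a. What are name and func in the member accesses above?

Answer: they are variable names.

b. What does obj.xxx mean?

Answer: obj.xxx looks for the variable name xxx in obj, and then in its class, and returns the value that the name refers to in memory.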
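The lookup order described in the answer (the instance first, then the class) can be sketched as follows. The class attribute kind is a hypothetical addition used only for illustration; it is not part of the original example.

```python
class Foo(object):
    kind = 'class-level'          # class attribute (hypothetical, for illustration)

    def __init__(self):
        self.name = 'alex'        # instance attribute

obj = Foo()

# Only instance attributes are stored in the instance's __dict__.
print(obj.__dict__)               # {'name': 'alex'}
print('kind' in obj.__dict__)     # False

# obj.kind is not found on the instance, so the lookup falls back to the class.
print(obj.kind)                   # class-level

# Assigning through the instance creates an instance attribute that shadows
# the class attribute; the class itself is unchanged.
obj.kind = 'instance-level'
print(obj.kind)                   # instance-level
print(Foo.kind)                   # class-level
```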

c. Requirement: get the value "alex" that the name variable of obj refers to, using some other way of access.

class Foo(object):
 
    def __init__(self):
        self.name = 'alex'
 
# obj.name is not allowed
obj = Foo()

Answer: there are two ways, as follows:

Method 1

class Foo(object):

    def __init__(self):
        self.name = 'alex'

    def func(self):
        return 'func'

# obj.name is not allowed
obj = Foo()

print(obj.__dict__['name'])

Method 2

class Foo(object):

    def __init__(self):
        self.name = 'alex'

    def func(self):
        return 'func'

# obj.name is not allowed
obj = Foo()

print(getattr(obj, 'name'))

d. Compare the three ways of access

obj.name
obj.__dict__['name']
getattr(obj, 'name')

Answer: the first compared with the others: ...
The second compared with the third: ...

#!/usr/bin/env python
# coding: utf-8
from wsgiref.simple_server import make_server

class Handler(object):

    def index(self):
        return 'index'

    def news(self):
        return 'news'


def RunServer(environ, start_response):
    url = environ['PATH_INFO']
    temp = url.split('/')[1]
    obj = Handler()
    # Look the handler up by name; reject private names such as __class__
    if not temp.startswith('_') and hasattr(obj, temp):
        func = getattr(obj, temp)
        ret = func()
        status = '200 OK'
    else:
        ret = '404 not found'
        status = '404 Not Found'
    start_response(status, [('Content-Type', 'text/html')])
    # WSGI on Python 3 expects an iterable of bytes, not a str
    return [ret.encode('utf-8')]

if __name__ == '__main__':
    httpd = make_server('', 8001, RunServer)
    print("Serving HTTP on port 8001...")
    httpd.serve_forever()
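The routing idea in the server above can be tried without starting a server. The following is a minimal sketch: the helper name dispatch is hypothetical, and the rule of refusing underscore-prefixed names is an added safeguard rather than something the original post specifies.

```python
class Handler(object):

    def index(self):
        return 'index'

    def news(self):
        return 'news'


def dispatch(path):
    """Map a URL path such as '/news' to a Handler method by name."""
    name = path.strip('/').split('/')[0]
    # Refuse empty and underscore-prefixed names so that internals such as
    # __class__ or __init__ cannot be reached from a URL.
    if not name or name.startswith('_'):
        return '404 not found'
    # getattr with a default avoids a separate hasattr call.
    func = getattr(Handler(), name, None)
    if not callable(func):
        return '404 not found'
    return func()


print(dispatch('/index'))       # index
print(dispatch('/news'))        # news
print(dispatch('/missing'))     # 404 not found
print(dispatch('/__class__'))   # 404 not found
```

Adding a new page then only requires adding a method to Handler; the dispatch code does not change.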

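III. Reflection in depth: function paths

To keep a program readable we usually split it by function into several directories. Code in one directory sometimes needs to call functions that live in another, for example to reuse code, and that is where import paths come in. This part is basic, but it is also essential.

Example 1:

Code: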

import sys
import os
project_path =os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.append(project_path)
from commons import authentication

if __name__ == '__main__':
    # Look the function up by its name (a string), then call it
    if hasattr(authentication, "login"):
        login = getattr(authentication, "login")
        login()
    # Equivalent direct call
    authentication.login()

main.py
___________________________________

def login():
    print('Login page')

def logout():
    print('Logout page')

def home():
    print('Entering the home directory')

authentication.py

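Example 2

Code walkthrough:

__file__ holds the path of the current file as it was loaded. It is a module-level variable, not a built-in function, and when a script is run through a relative path it may contain only the file name rather than an absolute path;

os.path.abspath(__file__) returns the absolute path of the file, not just the file name;

__import__ is where the magic comes from: this built-in function imports a module whose name is given as a string;

m,f = inp.split('/') unpacks the input into two variables, so the input must consist of exactly two parts separated by '/';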
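The code for this example is not included in the text, so the following is only a sketch of the technique the walkthrough describes, under stated assumptions: the function name call_by_path is hypothetical, importlib.import_module is used as the string-based import (the built-in __import__ would also work for a simple module name), and the standard-library os module stands in for a project module such as commons.authentication.

```python
import importlib

def call_by_path(inp):
    """Call the function named by a 'module/function' string."""
    m, f = inp.split('/')                  # ValueError unless there is exactly one '/'
    module = importlib.import_module(m)    # import the module from its name (a string)
    if not hasattr(module, f):
        return '404'
    func = getattr(module, f)
    return func()

# 'os' is a stand-in for a project module here.
print(call_by_path('os/getcwd'))
print(call_by_path('os/no_such_function'))   # 404
```

For dotted names such as 'commons.authentication', __import__ returns the top-level package unless a fromlist is passed, which is one reason importlib.import_module is the more convenient choice in a sketch like this.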

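Summary: reflection means using strings to operate on (look up / check / delete / set) the members of an object (or module).

Reflection is a very convenient way to call functions across your own modules.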

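IV. The hashlib hashing module

From the source: hashlib module - A common interface to many hash functions.

hashlib replaces the older md5 and sha modules and gives them a consistent API. It is backed by OpenSSL and supports the following algorithms: md5, sha1, sha224, sha256, sha384, sha512.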

Hash objects have these methods:
 - update(arg): Update the hash object with the bytes in arg. Repeated calls
                are equivalent to a single call with the concatenation of all
                the arguments.
 - digest():    Return the digest of the bytes passed to the update() method
                so far.
 - hexdigest(): Like digest() except the digest is returned as a unicode
                object of double length, containing only hexadecimal digits.
 - copy():      Return a copy (clone) of the hash object. This can be used to
                efficiently compute the digests of strings that share a common
                initial substring.

For example, to obtain the digest of the string 'Nobody inspects the
spammish repetition':

    >>> import hashlib
    >>> m = hashlib.md5()
    >>> m.update(b"Nobody inspects")
    >>> m.update(b" the spammish repetition")
    >>> m.digest()
    b'\xbbd\x9c\x83\xdd\x1e\xa5\xc9\xd9\xde\xc9\xa1\x8d\xf0\xff\xe9'

Python 3 and Python 2 differ here:

import hashlib

obj = hashlib.md5(bytes("abcd",encoding='utf-8'))
obj.update(bytes('123',encoding="utf-8"))
result = obj.hexdigest()

print(result)

In Python 3 the data to be hashed must be passed as bytes.
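A short sketch of the two points above: repeated update() calls are equivalent to hashing the concatenated data, and Python 3 rejects str input. The variable names are illustrative.

```python
import hashlib

# Feeding the data in pieces gives the same digest as hashing it all at once.
piecewise = hashlib.md5(b"abcd")
piecewise.update(b"123")
at_once = hashlib.md5(b"abcd123")
print(piecewise.hexdigest() == at_once.hexdigest())   # True

# Passing str instead of bytes raises TypeError in Python 3.
try:
    hashlib.md5("abcd")
except TypeError as exc:
    print("TypeError:", exc)

# The same interface works for the other algorithms, for example SHA-256.
sha = hashlib.sha256("abcd123".encode("utf-8"))
print(len(sha.hexdigest()))   # 64
```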

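V. The regular expression module (re)

This module provides regular expression matching operations similar to those found in Perl. It works with Unicode strings as well.

Regular expressions use the backslash " \ " to indicate special forms or to escape characters. This collides with Python's own string syntax, so matching a literal " \ " takes " \\\\ " in an ordinary Python string: the regular expression needs " \\ " to match one backslash, and Python's string syntax in turn requires each of those backslashes to be escaped, which gives " \\\\ ".

That is tedious to write. To make regular expressions more readable, Python provides raw strings: prefix the string with 'r', so that r"\n" is the two characters "\" and "n" rather than a newline. This is the recommended form for writing regular expressions in Python. Be careful with raw strings for file paths, though, because there is a pitfall: for example, a raw string cannot end with a single backslash.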
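The difference between ordinary and raw strings can be checked directly. This is a minimal sketch; the sample path is made up for illustration.

```python
import re

# An ordinary string turns \n into one newline character;
# a raw string keeps the backslash and the letter n.
print(len("\n"))      # 1
print(len(r"\n"))     # 2

# Two spellings of the same two-character pattern that matches one backslash.
print("\\\\" == r"\\")    # True

path = "C:\\temp\\file.txt"          # the text C:\temp\file.txt
backslashes = re.findall(r"\\", path)
print(len(backslashes))   # 2

# \d written in a raw string reaches the re module unchanged.
print(re.findall(r"\d+", "one1two22"))   # ['1', '22']
```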

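Most regular expression operations are available both as module-level functions and as RegexObject methods. The module-level functions do not require you to compile a regular expression object first, but they lack some of the fine-tuning parameters.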

Support for regular expressions (RE).

This module provides regular expression matching operations similar to
those found in Perl.  It supports both 8-bit and Unicode strings; both
the pattern and the strings being processed can contain null bytes and
characters outside the US ASCII range.

Regular expressions can contain both special and ordinary characters.
Most ordinary characters, like "A", "a", or "0", are the simplest
regular expressions; they simply match themselves.  You can
concatenate ordinary characters, so last matches the string 'last'.

The special characters are:
    "."      Matches any character except a newline.
    "^"      Matches the start of the string.
    "$"      Matches the end of the string or just before the newline at
             the end of the string.
    "*"      Matches 0 or more (greedy) repetitions of the preceding RE.
             Greedy means that it will match as many repetitions as possible.
    "+"      Matches 1 or more (greedy) repetitions of the preceding RE.
    "?"      Matches 0 or 1 (greedy) of the preceding RE.
    *?,+?,?? Non-greedy versions of the previous three special characters.
    {m,n}    Matches from m to n repetitions of the preceding RE.
    {m,n}?   Non-greedy version of the above.
    "\\"     Either escapes special characters or signals a special sequence.
    []       Indicates a set of characters.
             A "^" as the first character indicates a complementing set.
    "|"      A|B, creates an RE that will match either A or B.
    (...)    Matches the RE inside the parentheses.
             The contents can be retrieved or matched later in the string.
    (?aiLmsux) Set the A, I, L, M, S, U, or X flag for the RE (see below).
    (?:...)  Non-grouping version of regular parentheses.
    (?P<name>...) The substring matched by the group is accessible by name.
    (?P=name)     Matches the text matched earlier by the group named name.
    (?#...)  A comment; ignored.
    (?=...)  Matches if ... matches next, but doesn't consume the string.
    (?!...)  Matches if ... doesn't match next.
    (?<=...) Matches if preceded by ... (must be fixed length).
    (?<!...) Matches if not preceded by ... (must be fixed length).
    (?(id/name)yes|no) Matches yes pattern if the group with id/name matched,
                       the (optional) no pattern otherwise.

The special sequences consist of "\\" and a character from the list
below.  If the ordinary character is not on the list, then the
resulting RE will match the second character.
    \number  Matches the contents of the group of the same number.
    \A       Matches only at the start of the string.
    \Z       Matches only at the end of the string.
    \b       Matches the empty string, but only at the start or end of a word.
    \B       Matches the empty string, but not at the start or end of a word.
    \d       Matches any decimal digit; equivalent to the set [0-9] in
             bytes patterns or string patterns with the ASCII flag.
             In string patterns without the ASCII flag, it will match the whole
             range of Unicode digits.
    \D       Matches any non-digit character; equivalent to [^\d].
    \s       Matches any whitespace character; equivalent to [ \t\n\r\f\v] in
             bytes patterns or string patterns with the ASCII flag.
             In string patterns without the ASCII flag, it will match the whole
             range of Unicode whitespace characters.
    \S       Matches any non-whitespace character; equivalent to [^\s].
    \w       Matches any alphanumeric character; equivalent to [a-zA-Z0-9_]
             in bytes patterns or string patterns with the ASCII flag.
             In string patterns without the ASCII flag, it will match the
             range of Unicode alphanumeric characters (letters plus digits
             plus underscore).
             With LOCALE, it will match the set [0-9_] plus characters defined
             as letters for the current locale.
    \W       Matches the complement of \w.
    \\       Matches a literal backslash.

This module exports the following functions:
    match     Match a regular expression pattern to the beginning of a string.
    fullmatch Match a regular expression pattern to all of a string.
    search    Search a string for the presence of a pattern.
    sub       Substitute occurrences of a pattern found in a string.
    subn      Same as sub, but also return the number of substitutions made.
    split     Split a string by the occurrences of a pattern.
    findall   Find all occurrences of a pattern in a string.
    finditer  Return an iterator yielding a match object for each match.
    compile   Compile a pattern into a RegexObject.
    purge     Clear the regular expression cache.
    escape    Backslash all non-alphanumerics in a string.

Some of the functions in this module takes flags as optional parameters:
    A  ASCII       For string patterns, make \w, \W, \b, \B, \d, \D
                   match the corresponding ASCII character categories
                   (rather than the whole Unicode categories, which is the
                   default).
                   For bytes patterns, this flag is the only available
                   behaviour and needn't be specified.
    I  IGNORECASE  Perform case-insensitive matching.
    L  LOCALE      Make \w, \W, \b, \B, dependent on the current locale.
    M  MULTILINE   "^" matches the beginning of lines (after a newline)
                   as well as the string.
                   "$" matches the end of lines (before a newline) as well
                   as the end of the string.
    S  DOTALL      "." matches any character at all, including the newline.
    X  VERBOSE     Ignore whitespace and comments for nicer looking RE's.
    U  UNICODE     For compatibility only. Ignored for string patterns (it
                   is the default), and forbidden for bytes patterns.

This module also defines an exception 'error'.

Example 1:

match: re.match(pattern, string, flags=0)
flags    Compilation flags that modify how the pattern is matched, such as case sensitivity, multi-line matching, and so on.
re.match('com', 'comwww.runcomoob').group()

re.match('com', 'Comwww.runComoob',re.I).group()

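Example 2

search: re.search(pattern, string, flags=0)
re.search(r'\dcom', 'www.4comrunoob.5com').group()
Note:
re.match('com', 'comwww.runcomoob')
re.search(r'\dcom', 'www.4comrunoob.5com')
A successful match returns a match object, which has the following methods:
group()    returns the string matched by the RE
start()    returns the position where the match starts
end()      returns the position where the match ends
span()     returns a tuple containing the (start, end) positions of the match
group() returns the whole match by default; you can also pass one or more group numbers to get the strings matched by those groups.

import re
a = "123abc456"
re.search("([0-9]*)([a-z]*)([0-9]*)",a).group(0)   # 123abc456, the whole match
re.search("([0-9]*)([a-z]*)([0-9]*)",a).group(1)   # 123
re.search("([0-9]*)([a-z]*)([0-9]*)",a).group(2)   # abc
re.search("([0-9]*)([a-z]*)([0-9]*)",a).group(3)   # 456

group(1) returns the part matched by the first pair of parentheses, group(2) the part matched by the second, and group(3) the part matched by the third.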
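The match object methods listed above can be exercised together. This sketch uses named groups, (?P<name>...) from the special-character table; the group names head, word, and tail are illustrative.

```python
import re

m = re.search(r"(?P<head>[0-9]+)(?P<word>[a-z]+)(?P<tail>[0-9]+)", "123abc456")

print(m.group(0))        # 123abc456
print(m.group(1, 3))     # ('123', '456')
print(m.groups())        # ('123', 'abc', '456')
print(m.group('word'))   # abc
print(m.span(2))         # (3, 6)

# match() and search() return None when nothing matches,
# so check the result before calling .group() on it.
no_match = re.match(r"com", "www.com")
print(no_match)          # None
```

Calling .group() on a failed match raises AttributeError, which is why one-liners such as re.match(...).group() are only safe when the match is known to succeed.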

Example 3

findall:
re.findall returns all non-overlapping matches in the string as a list. For example:

p = re.compile(r'\d+')
print(p.findall('one1two2three3four4'))

re.findall(r'\w*oo\w*', text) returns every word in the string that contains 'oo'.

import re
text = "JGood is a  handsome boy,he is handsome and cool,clever,and so on ...."
print(re.findall(r'\w*oo\w*',text)) # Result: ['JGood', 'cool']
#print(re.findall(r'(\w)*oo(\w)*',text)) # () marks a subexpression. Result: [('G', 'd'), ('c', 'l')]

finditer():
>>> p = re.compile(r'\d+')
>>> iterator = p.finditer('12 drumm44ers drumming, 11 ... 10 ...')
>>> for match in iterator:
...     print(match.group(), match.span())

Example 4

sub and subn:

re.sub(pattern, repl, string, count=0, flags=0)
re.sub("g.t","have",'I get A,  I got B ,I gut C')
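A short sketch of how sub and subn behave on the sentence above, including the count parameter and a function as the replacement. The lambda that doubles numbers is an illustrative addition.

```python
import re

text = 'I get A,  I got B ,I gut C'

# Replace every match.
print(re.sub(r"g.t", "have", text))            # I have A,  I have B ,I have C

# count limits how many matches are replaced.
print(re.sub(r"g.t", "have", text, count=2))   # I have A,  I have B ,I gut C

# subn also reports how many substitutions were made.
print(re.subn(r"g.t", "have", text))           # ('I have A,  I have B ,I have C', 3)

# repl may be a function that receives each match object.
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), "a1b22")
print(doubled)                                 # a2b44
```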

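re.I makes matching case-insensitive
re.L makes matching locale-aware
re.M multi-line matching; affects ^ and $
re.S makes . match any character, including a newline
>>> re.findall(".","abc\nde")
>>> re.findall(".","abc\nde",re.S)
re.U interprets characters according to the Unicode character set. This flag affects \w, \W, \b, \B.
re.X lets you lay the regular expression out more flexibly so that it is easier to read.

re.S: . will match a newline; by default the dot does not match a newline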
>>> re.findall(r"a(\d+)b.+a(\d+)b","a23b\na34b")
[]
>>> re.findall(r"a(\d+)b.+a(\d+)b","a23b\na34b",re.S)
[('23','34')]
>>>
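re.M: ^ and $ match at the start and end of every line; by default ^ matches only at the start of the string, and $ only at the end of the string (or just before a trailing newline)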
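The effect of re.M on ^ and $ can be seen on a three-line string. This is a minimal sketch with a made-up sample text.

```python
import re

text = "first line\nsecond line\nthird line"

# Without re.M, ^ anchors only at the very start of the string.
print(re.findall(r"^\w+", text))          # ['first']
# With re.M, ^ also anchors right after every newline.
print(re.findall(r"^\w+", text, re.M))    # ['first', 'second', 'third']

# Without re.M, $ anchors only at the end of the string.
print(re.findall(r"\w+$", text))          # ['line']
# With re.M, $ also anchors right before every newline.
print(re.findall(r"\w+$", text, re.M))    # ['line', 'line', 'line']
```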

Reposted from: https://www.cnblogs.com/liuhailong-py-way/p/5593365.html