Some uses of Python strings
Online documentation: http://docs.python.org/3.3/library/stdtypes.html#str
1. str.split(sep=None, maxsplit=-1)
http://docs.python.org/3.3/library/stdtypes.html#str.split
sep: the separator string;
maxsplit: the maximum number of splits; -1 means no limit, so the whole string is split
For example:
' 1 2 3 '.split() returns ['1', '2', '3']
' 1 2 3 '.split(None, 1) returns ['1', '2 3 '].
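A minimal sketch of how maxsplit behaves with an explicit separator. The sample record and its field layout are made up for illustration.

```python
# Hypothetical sample record; the field layout is an assumption for illustration.
line = "2013-01-05,ERROR,disk full, retrying"

# maxsplit=2 performs at most two splits, so the result has at most three items
# and the comma inside the message is left untouched.
date, level, message = line.split(",", 2)
print(date)     # 2013-01-05
print(level)    # ERROR
print(message)  # disk full, retrying

# With the default maxsplit=-1 every comma is a split point.
print(line.split(","))  # ['2013-01-05', 'ERROR', 'disk full', ' retrying']
```

Limiting the number of splits is handy when only the first few fields have a fixed format and the rest is free text.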
Code 1
>>> str1 = ("i am a worker, and you are a student !")
>>> print(str1)
i am a worker, and you are a student !
>>> items = str1.split()
>>> print(items)
['i', 'am', 'a', 'worker,', 'and', 'you', 'are', 'a', 'student', '!']
Code 2
>>> str1 = "abcbdbebfbbh"
>>> items = str1.split("b")
>>> print(items)
['a', 'c', 'd', 'e', 'f', '', 'h']
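Code 1 shows that when sep is omitted (or None), the string is split on runs of whitespace (spaces, tabs, newlines), and leading and trailing whitespace is ignored.
Code 2 shows that when sep is given explicitly, nothing is stripped or merged: every occurrence of the separator is a split point.
That is why item -2 of the returned list is the empty string: it is the (empty) content between the two adjacent b characters in "bb".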
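To compare splitting with and without an explicit separator side by side, here is a small sketch. The sample string is arbitrary, and filtering out empty strings at the end is only one possible way to clean up the result.

```python
text = "  a  b   c "

# No separator: runs of whitespace count as one separator, and
# leading/trailing whitespace produces no empty strings.
by_whitespace = text.split()
print(by_whitespace)  # ['a', 'b', 'c']

# Explicit separator " ": every single space is a split point, so
# consecutive spaces and the spaces at the ends yield empty strings.
by_space = text.split(" ")
print(by_space)  # ['', '', 'a', '', 'b', '', '', 'c', '']

# Dropping the empty strings afterwards, if they are not wanted.
non_empty = [item for item in by_space if item]
print(non_empty)  # ['a', 'b', 'c']
```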
Example: find the encoding of a downloaded web page
#!/usr/bin/env python
# 3.py
# use UTF-8
# Python 3.3.0
def extract(text, sub1, sub2):
    """
    extract a substring from text between first
    occurrences of substrings sub1 and sub2
    """
    return text.split(sub1, 1)[-1].split(sub2, 1)[0]

html = '...nt="text/html;charset=utf-8"> <title>...'  # simulates part of a downloaded HTML file
print(extract(html, "charset=", '"'))  # split point 1 is "charset=", split point 2 is '"'
# text.split(sub1, 1)[-1] returns 'utf-8"> <title>...'
# text.split(sub1, 1)[-1].split(sub2, 1)[0] returns 'utf-8'
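The split-based extract above returns a misleading value when a marker is missing, because split then hands back the whole text unchanged. One possible alternative, sketched here with str.partition instead of str.split, reports a missing marker as None. The names extract_or_none and html are hypothetical and used only for this illustration.

```python
def extract_or_none(text, start, end):
    """Return the text between the first `start` and the next `end`, or None."""
    # partition returns (head, separator, tail); separator is '' when not found.
    _, found_start, rest = text.partition(start)
    if not found_start:
        return None
    value, found_end, _ = rest.partition(end)
    if not found_end:
        return None
    return value


# Same simulated HTML fragment as in the example above.
html = '...nt="text/html;charset=utf-8"> <title>...'
print(extract_or_none(html, "charset=", '"'))   # utf-8
print(extract_or_none(html, "encoding=", '"'))  # None
```

For real pages an HTML parser is usually the safer choice; this only illustrates the string methods.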
Example: split a string, then join the pieces with ":"
#!/usr/bin/env python
# 3.py
# use UTF-8
# Python 3.3.0
text = "i am a worker, and you are a student !"  # avoid naming a variable str, which shadows the built-in type
items = text.split()
print(items)
# join the items back into one string
sep = ":"
joined = sep.join(items)
print(joined)
# Output
# ['i', 'am', 'a', 'worker,', 'and', 'you', 'are', 'a', 'student', '!']
# i:am:a:worker,:and:you:are:a:student:!
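Two further points about join may be worth a short sketch: it only accepts string items, and combined with split() it can normalize whitespace. The sample values below are arbitrary.

```python
# join only accepts strings, so non-string items are converted first.
numbers = [1, 2, 3]
joined_numbers = ":".join(str(n) for n in numbers)
print(joined_numbers)  # 1:2:3

# split() followed by " ".join(...) collapses runs of whitespace.
messy = "i   am  a \t worker"
normalized = " ".join(messy.split())
print(normalized)  # i am a worker
```

Passing the integers directly, as in ":".join(numbers), raises TypeError.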