Python Summary: String Handling

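(Note: I recently had to give a training session for colleagues in testing, so I needed to summarize some commonly used string-handling techniques. Much of this is drawn from the Python Cookbook, so strictly speaking it is not original work, but it does include some of my own understanding, so I am tagging it as original anyway.)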

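1. Processing a string one character at a time

(1) Call the built-in list function with the string as its argument.
theList = list(theString)

(2) Process the characters one by one in a for loop.
(3) Use a list comprehension.
(4) Use the built-in map function.

A concrete example: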

def do_something(char):
    print char, ' '
    return True

thestring = "fuqiang"

for c in thestring:
    do_something(c)

results = [do_something(c) for c in thestring]
print results

results2 = map(do_something, thestring)
print results2
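
The four approaches above can also be sketched side by side. This is only an illustrative sketch written in Python 3 syntax (the examples in this article are Python 2); char_code is a hypothetical helper that returns a value instead of printing, so the results are easy to compare.

```python
def char_code(char):
    # Hypothetical helper: return the character's code point instead of printing.
    return ord(char)

the_string = "fuqiang"

# (1) list(): one list element per character
chars = list(the_string)

# (2) explicit for loop
codes_loop = []
for c in the_string:
    codes_loop.append(char_code(c))

# (3) list comprehension
codes_comp = [char_code(c) for c in the_string]

# (4) map(); in Python 3 it returns a lazy iterator, so wrap it in list()
codes_map = list(map(char_code, the_string))

print(chars)
print(codes_comp)
```

All three ways of applying the function give the same list; the comprehension is usually the most readable when a result list is wanted, and the plain for loop when only the side effect matters.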


2. Converting between characters and their numeric values

a = ord('a')
print a

b = chr(97)
print b

c = ord(u'\u2020')
print c

print map(ord, 'FuQiang')

Output:
>>> 
97
a
8224
[70, 117, 81, 105, 97, 110, 103]
>>> 

There are also the octal and hexadecimal functions:
>>> oct(10)
'012'
>>> hex(10)
'0xa'

oct(x): Convert an integer number (of any size) to an octal string

ord(c): Given a string of length one, return an integer representing the Unicode code point of the character when the argument is a unicode object, or the value of the byte when the argument is an 8-bit string.
hex(x): Convert an integer number (of any size) to a hexadecimal string. The result is a valid Python expression.
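
As a rough illustration of how these functions fit together, the sketch below round-trips characters and numbers. It is written in Python 3 syntax, which is an assumption on my part since the article uses Python 2: in Python 3, chr() covers all of Unicode and oct() uses the '0o' prefix rather than a leading '0'.

```python
# Round trip: character -> code point -> character
round_trip_ok = all(chr(ord(ch)) == ch for ch in 'FuQiang')

dagger = chr(0x2020)   # in Python 3, chr() accepts any Unicode code point
code = ord(dagger)

octal_text = oct(10)   # Python 3 spells the prefix '0o'
hex_text = hex(10)

# int() with an explicit base parses the strings back into numbers
back_from_octal = int(octal_text, 8)
back_from_hex = int(hex_text, 16)

print(round_trip_ok, code, octal_text, hex_text, back_from_octal, back_from_hex)
```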


3. Stripping whitespace from both ends of a string

x = '    hej    '

print '|', x.lstrip(),'|'
print '|', x.rstrip(),'|'
print '|', x.strip(), '|'

Output:
>>> 
| hej     |
|     hej |
| hej |
>>>
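
The strip methods also accept an optional argument naming the characters to remove, which the example above does not show. A small sketch in Python 3 syntax (the article itself uses Python 2); the sample strings are made up for illustration:

```python
x = 'xyxxyy hejyx yyx'

# The argument is treated as a set of characters, not as a prefix or suffix:
# any leading or trailing 'x' or 'y' is removed, and stripping stops at the
# first character that is not in the set (here, the spaces).
stripped = x.strip('xy')

# Without an argument, all kinds of whitespace are stripped, not just spaces.
cleaned = '\t hej \n'.strip()

print('|' + stripped + '|')
print('|' + cleaned + '|')
```

Note that the 'yx' inside the string is left alone, because strip only works from the two ends.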

4. Joining strings

pieces = ['1','2','3', '4', '5','6']
largeString = ''.join(pieces)
print largeString

name1 = 'Fu'
name2 = 'Qiang'
name3 = 'Walle'
fullName = 'My name is %s%s, my English name is %s.' % (name1, name2, name3)
print fullName

Output:
>>> 
123456
My name is FuQiang, my English name is Walle.

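Note: for a handful of strings, '+' is fine. When many pieces have to be concatenated, however, avoid '+': strings are immutable objects, so each '+' builds a new temporary string, and a long chain of them creates many temporary objects and performs poorly. Collecting the pieces in a list and calling join is the better choice.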
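
To make the comparison concrete, here is a minimal sketch of both styles in Python 3 syntax (the article uses Python 2). It only shows that the results are identical; it does not measure speed, and the variable names are my own.

```python
pieces = [str(n) for n in range(1, 7)]

# Repeated '+': each step produces a new intermediate string
slow = ''
for piece in pieces:
    slow = slow + piece

# join(): the whole result is built in a single call
fast = ''.join(pieces)

# The string that join() is called on is used as the separator
csv_line = ','.join(pieces)

print(slow, fast, csv_line)
```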


5. Reversing a string

fullName = 'My name is FuQiang, my English name is Walle.'
print fullName

reverseStr = fullName[::-1]
print reverseStr

# Reverse word by word
rewords = fullName.split()
rewords.reverse()
rewords = ' '.join(rewords)

print rewords

rewords2 = ' '.join(fullName.split()[::-1])
print rewords2

rewords3 = ' '.join(reversed(fullName.split()))
print rewords3

import re
rewords4 = ''.join(reversed(re.split(r'(\s+)', fullName)))
print rewords4

Output:
>>> 
My name is FuQiang, my English name is Walle.
.ellaW si eman hsilgnE ym ,gnaiQuF si eman yM
Walle. is name English my FuQiang, is name My
Walle. is name English my FuQiang, is name My
Walle. is name English my FuQiang, is name My
Walle. is name English my FuQiang, is name My
>>> 
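
With single spaces between words, all four word-reversal variants print the same thing, so the point of the re.split version is easy to miss. The sketch below (Python 3 syntax, whereas the article uses Python 2; the sample text is made up) uses irregular spacing to show the difference.

```python
import re

text = 'hello   big  world'

# split() with no argument collapses every run of whitespace,
# so the original spacing is lost when the words are joined again.
collapsed = ' '.join(reversed(text.split()))

# A capturing group makes re.split() keep the separators in the result,
# so the original runs of whitespace survive the reversal.
preserved = ''.join(reversed(re.split(r'(\s+)', text)))

print(repr(collapsed))
print(repr(preserved))
```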


6. Checking whether a string contains characters from a given set

def containsAny(seq, aset):
    for c in seq:
        if c in aset:
            return True
    return False

def containsAny2(seq, aset):
    return bool(set(aset).intersection(seq))


sequence = 'I am a engineer.'
charSet =  'a.$'
charSet2 =  'k$'

isAnyIn = containsAny(sequence, charSet)
print isAnyIn

isAnyIn2 = containsAny2(sequence, charSet2)
print isAnyIn2


def containsOnly(seq, aset):
    for c in seq:
        if c not in aset:
            return False
    return True


isOnlyIn = containsOnly(sequence, charSet)
print isOnlyIn

name = 'Fu'
fullName = 'FuQiang'
isOnlyIn = containsOnly(name, fullName)
print isOnlyIn


def containsAll(seq, aset):
    return not set(aset).difference(seq)

isAllIn = containsAll(name, fullName)
print isAllIn

isAllIn = containsAll(fullName, name)
print isAllIn


# Explanation of containsAll(seq, aset)
L1 = [1, 2, 3]
L2 = [1, 2, 3, 4]

# Elements that are in L1 but not in L2, returned as a set.
a = set(L1).difference(L2)

# Elements that are in L2 but not in L1, returned as a set.
b = set(L2).difference(L1)

print a
print b
print not a


Output:
>>> 
True
False
False
True
False
True
set([])
set([4])
True
>>>
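
The same three checks can also be written with the built-ins any and all and with set.issubset. This is just an alternative sketch in Python 3 syntax (the article uses Python 2), with the function names changed to snake_case; it is not the Cookbook's own code.

```python
def contains_any(seq, aset):
    # True as soon as one character of seq is found in aset
    return any(c in aset for c in seq)

def contains_only(seq, aset):
    # True only if every character of seq is found in aset
    return all(c in aset for c in seq)

def contains_all(seq, aset):
    # True if every character of aset occurs somewhere in seq
    return set(aset).issubset(seq)

sequence = 'I am a engineer.'
print(contains_any(sequence, 'a.$'))
print(contains_any(sequence, 'k$'))
print(contains_only('Fu', 'FuQiang'))
print(contains_all('FuQiang', 'Fu'))
```

Both any and all stop at the first character that decides the answer, just like the explicit loops above.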


7. Changing case

>>> name = 'Fu Qiang'
>>> big = name.upper()
>>> big
'FU QIANG'
>>> little = name.lower()
>>> little
'fu qiang'
>>> cap = name.capitalize()
>>> cap
'Fu qiang'
>>> title = name.title()
>>> title
'Fu Qiang'


To check whether a string is capitalized (first letter uppercase, the rest lowercase):
def isCapitalized(s):
    return s == s.capitalize()

str1 = 'Title'
str2 = 'name'
str3 = 'Fu Qiang'

isCap = isCapitalized(str1)
print isCap

isCap = isCapitalized(str2)
print isCap

isCap = isCapitalized(str3)
print isCap
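
The results of the three calls are not shown above, and the simple comparison has a few edge cases worth knowing. The sketch below is in Python 3 syntax (the article uses Python 2); is_capitalized_strict is a hypothetical stricter variant that I added for illustration, not part of the original.

```python
def is_capitalized(s):
    return s == s.capitalize()

def is_capitalized_strict(s):
    # Hypothetical variant: additionally require an uppercase first character,
    # which rejects the empty string and strings with no letters at the start.
    return s[:1].isupper() and s == s.capitalize()

for sample in ['Title', 'name', 'Fu Qiang', '', '123']:
    print(repr(sample), is_capitalized(sample), is_capitalized_strict(sample))
```

'Fu Qiang' fails both checks because capitalize lowercases everything after the first character, giving 'Fu qiang'. The empty string and '123' pass the simple check, since capitalize leaves them unchanged, but fail the strict one.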



8. Accessing substrings

(1) Using slices

line = 'This is a good story!'
partField = line[3:8]
print partField

To get the data in groups of five characters:
fives = [line[k:k+5] for k in xrange(0, len(line), 5)]
print fives
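
For reference, here is what the fixed-width split produces, wrapped in a small reusable function. This is a sketch in Python 3 syntax (range instead of the Python 2 xrange used above), and the name chunks is my own.

```python
def chunks(text, size):
    # Slice text into consecutive pieces of `size` characters;
    # the last piece is shorter when len(text) is not a multiple of size.
    return [text[k:k + size] for k in range(0, len(text), size)]

line = 'This is a good story!'
fives = chunks(line, 5)
print(fives)
```

Slicing past the end of a string does not raise an error, which is why the final, shorter piece needs no special handling.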

To cut a string into columns of custom widths:

theline = 'Here is an example from a business reviews website.'
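# A string can be converted to a list in which each character is a separate element
# (shown for reference; the slicing below works on the string directly).
char = list(theline)
# Positions at which to cut
cuts = [8, 14, 20, 28, 35]
# Use zip to pair each start index with its end index. The first and last intervals
# need special handling: the first is (0, cuts[0]) and the last is (cuts[-1], None),
# i.e. from the last cut to the end of the string.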
pieces = [theline[i:j] for i,j in zip([0]+cuts, cuts+[None])]
print pieces
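
To see how the zip pairing works, the sketch below wraps the same idea in a function and prints the index pairs for a short string. It is written in Python 3 syntax (the article uses Python 2), and the name split_at and the sample data are assumptions made for illustration.

```python
def split_at(text, cuts):
    # Pair each start index with the following cut; None as the final end
    # index means "up to the end of the string".
    starts = [0] + cuts
    ends = cuts + [None]
    return [text[i:j] for i, j in zip(starts, ends)]

sample = 'abcdefghij'
sample_cuts = [2, 5]

pairs = list(zip([0] + sample_cuts, sample_cuts + [None]))
parts = split_at(sample, sample_cuts)

print(pairs)
print(parts)
```

Joining the pieces back together always gives the original string, which is a handy sanity check when the cut positions come from external data.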




