Checking whether a substring exists in a Chinese string in Python

卡卡kkscn

Article link: https://blog.csdn.net/weixin_42840612/article/details/104159590

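First, note that Python 3 has two string-like types: str and bytes. A str is a sequence of Unicode characters, while bytes is a sequence of raw byte values.
I am using Python 3.7.2, where an ordinary string literal (with or without the u prefix) creates a str.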

>>> type('abc')
<class 'str'>
>>> type(b'abc')
<class 'bytes'>
>>> type(u'abc')
<class 'str'>
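
To make the difference concrete, here is a small sketch comparing the two types on the same Chinese text. The variable names are only for illustration, and UTF-8 is assumed as the encoding.

```python
# Compare how str and bytes measure the same Chinese text.
text = '中国'                  # str: two Unicode characters
data = text.encode('utf-8')    # bytes: the UTF-8 encoding of the same text

char_count = len(text)         # len() on str counts characters
byte_count = len(data)         # len() on bytes counts bytes

print(type(text).__name__, char_count)
print(type(data).__name__, byte_count)
```

In UTF-8 each of these two characters takes three bytes, so the two lengths differ even though both objects represent the same text.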

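Chinese text has to be written as a str literal. As the error message below shows, a bytes literal may only contain ASCII characters; to get Chinese text as bytes, encode a str instead.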

>>> type(u'中国')
<class 'str'>
>>> type(b'中国')
  File "<stdin>", line 1
SyntaxError: bytes can only contain ASCII literal characters.
>>> type('中国')
<class 'str'>
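
Although a bytes literal cannot contain Chinese characters directly, the encoded form can still be produced. The following is a minimal sketch, assuming UTF-8; the variable names are hypothetical.

```python
# '中国' cannot appear inside b'...', but its encoded bytes can.
encoded = '中国'.encode('utf-8')
literal = b'\xe4\xb8\xad\xe5\x9b\xbd'   # the same six bytes written with escapes

same_bytes = (encoded == literal)       # both spell the UTF-8 form of '中国'
decoded = literal.decode('utf-8')       # convert back to a str

print(encoded)
print(same_bytes, decoded)
```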

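To check for a substring you can use the str.find(sub[, start[, end]]) method. It returns the lowest index at which sub occurs in the string, or -1 if sub is not found. The optional start and end arguments restrict the search to that slice of the string; they are passed positionally, not as keywords.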

s = '我爱中国'
sub = '中国'
if s.find(sub) != -1:
	print('found')      # sub occurs somewhere in s
else:
	print('not found')  # find() returned -1
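
As a short illustration of the optional range arguments, the sketch below searches a sample sentence (made up for this example) in several ways. It also shows the in operator, which is usually the simpler choice when only a yes/no answer is needed and the position does not matter.

```python
s = '我爱中国，中国很大'
sub = '中国'

first = s.find(sub)              # index of the first occurrence
second = s.find(sub, first + 1)  # search again, starting just after the first hit
limited = s.find(sub, 0, 3)      # only look inside s[0:3], which is '我爱中'
present = sub in s               # plain membership test, returns a bool

print(first, second, limited, present)
```

Because the slice s[0:3] ends in the middle of the first occurrence, the limited search reports -1 even though the substring exists elsewhere in s.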

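The string being searched and sub must be of the same type: both str or both bytes. Mixing the two raises a TypeError. The fix is to convert one of them, using encode() to turn a str into bytes or decode() to turn bytes into a str. Both methods default to UTF-8, so for Chinese text make sure the encoding matches the one the bytes were actually produced with.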

>>> a=u'abc'
>>> b=b'abc'
>>> a.find(b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: must be str, not bytes
>>> b=b.decode()
>>> a.find(b)
0

>>> a=u'abc'
>>> b=b'abc'
>>> b.find(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: argument should be integer or bytes-like object, not 'str'
>>> a=a.encode()
>>> b.find(a)
0
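
The conversions above can be wrapped in a small helper so that callers do not have to care which type they hold. This is only a sketch: to_text and contains are hypothetical names, not standard library functions, and UTF-8 is assumed unless another encoding is passed.

```python
def to_text(value, encoding='utf-8'):
    """Return value as str, decoding it first if it is bytes-like."""
    if isinstance(value, (bytes, bytearray)):
        return bytes(value).decode(encoding)
    return value


def contains(haystack, needle, encoding='utf-8'):
    """Check whether needle occurs in haystack; either may be str or bytes."""
    return to_text(needle, encoding) in to_text(haystack, encoding)


text = '我爱中国'
raw = '中国'.encode('utf-8')

found = contains(text, raw)                       # str searched for bytes
missing = contains(text, '北京'.encode('utf-8'))  # not present in text

# find() counts in the units of its type: characters for str, bytes for bytes.
char_index = text.find('中国')
byte_index = text.encode('utf-8').find(raw)

print(found, missing, char_index, byte_index)
```

Normalizing to str rather than bytes keeps any index you compute in characters, which is usually what is wanted for Chinese text; an offset returned by bytes.find counts bytes and cannot be used to slice the original str.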