String Method Practice

Latest recommended article published on 2022-05-04 00:55:11

y15518325965

Article link: https://blog.csdn.net/y15518325965/article/details/79369273

# coding:utf-8

string = (
    '<div class="item-list ni-list"><ul><li  class="first">'
    '<a href="http://www.tepintehui.com/detail/57185?ce" title='
    '"明星同款| 钟基欧巴穿的小脏鞋5折辣!" ><span>明星同款| 钟基欧巴穿的小脏鞋5折辣!'
    '</span></a></li><li><a href="http://www.tepintehui.com/detail/56847?ce"'
    ' title="装逼| 你们见过凌晨四点钟的洛杉矶吗?" ><span>装逼| 你们见过凌晨四点钟的洛'
    '杉矶吗?</span></a></li><li  ><a href="http://www.tepintehui.com/detail/'
    '57127?ce" title="反人类| 世界上最干净的纸竟然是黄色的!" ><span>反人类| 世界上'
    '最干净的纸竟然是黄色的</span></a></li><li><a href="http://www.tepintehui.com/'
    'detail/57120?ce" title="科普| 吃了避孕药之后怀的孩子能要吗?" >'
    '<span>科普| 吃了避孕药之后怀的孩子能要吗?</span></a></li><li>'
    '<a href="http://www.tepintehui.com/detail/57125?ce" title='
    '"真假| 9年义务升为12年制,是要取消高考吗" ><span>真假| 9年义务升为12年制,'
    '是要取消高考吗</span></a></li><li><a href="http://www.tepintehui.com'
    '/detail/57124?ce" title="土豪| 揭秘迪士尼见不得光的33号俱乐部" >'
    '<span>土豪| 揭秘迪士尼见不得光的33号俱乐部</span></a></li><li  >'
    '<a href="http://www.tepintehui.com/detail/41008?ce" '
    'title="吐槽| 男人单身太久会没感觉?" ><span>吐槽| 男人单身太久会没感觉?'
    '</span></a></li><li  ><a href="http://www.tepintehui.com/detail/'
    '23488?ce" title="冷知识| 为什么镜子是左右颠倒不是上下呢" ><span>'
    '冷知识| 为什么镜子是左右颠倒不是上下呢</span></a></li><li  >'
    '<a href="http://www.tepintehui.com/detail/37213?ce" title='
    '"新玩法| 这年头情侣之间种草莓已经out了!" ><span>新玩法| 这年头情侣之'
    '间种草莓已经out了!</span></a></li><li  ><a href="http://www.tepintehui.com'
    '/detail/11411?ce" title="四壁| 老美说凤姐把范冰冰秒成渣,你怎么看" >'
    '<span>四壁| 老美说凤姐把范冰冰秒成渣,你怎么看</span></a></li><li  >'
    '<a href="http://www.tepintehui.com/detail/37456?ce" '
    'title="凭什么| 个人挖墓是盗墓,国家挖是考古?" ><span>凭什么| '
    '个人挖墓是盗墓,国家挖是考古?</span></a></li><li  ><a href="http:'
    '//www.tepintehui.com/detail/40706?ce" title="福利| 要知道加这个'
    '群这么爽！我早进了" ><span>福利| 要知道加这个群这么爽！我早进了</span>'
    '</a></li></ul></div>'
)

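# Declare an empty list to hold every extracted URL.
url_list = []

# Two variables hold the start marker and the end marker to search for.
start_mark = 'href="'
end_mark = '?ce'

# Position where the next search begins; the first search starts at index 0.
record_position = 0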

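while record_position < len(string):
    # First, find the index where href=" begins.
    start_index = string.find(start_mark, record_position)
    if start_index == -1:
        # find() returns -1 once no further href=" exists, so the loop ends here.
        print('No more matches found')
        break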
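    # Next, find the index where ?ce begins. Passing start_index makes the
    # search continue from href=" instead of restarting at the beginning.
    end_index = string.find(end_mark, start_index)
    if end_index == -1:
        # No closing ?ce after this href=", so there is nothing left to slice.
        break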

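    # Note: start_index is only the index of the h in href=", and end_index
    # is only the index of the ? in ?ce.
    # To slice out the URL, add len(start_mark) to skip past href=" and add
    # len(end_mark) so that ?ce is included in the result.
    url = string[start_index+len(start_mark):end_index+len(end_mark)]
    # Store the URL in the list declared above.
    url_list.append(url)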

    # Record where this search ended so the next search continues from there.
    record_position = end_index
    print(url)
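
The same find-and-slice loop can be wrapped in a reusable function. The sketch below is only one possible way to package it: the function name `extract_between`, the `keep_end` parameter, and the `example.com` sample string are hypothetical and not part of the original script.

```python
def extract_between(text, start_mark, end_mark, keep_end=True):
    """Collect every substring that follows start_mark and runs up to end_mark.

    Hypothetical helper: keep_end controls whether end_mark itself is kept
    in each result (the original script keeps ?ce in the URL).
    """
    results = []
    position = 0
    while True:
        # str.find returns -1 when the marker does not occur at or after position.
        start_index = text.find(start_mark, position)
        if start_index == -1:
            break
        content_start = start_index + len(start_mark)
        end_index = text.find(end_mark, content_start)
        if end_index == -1:
            break
        stop = end_index + len(end_mark) if keep_end else end_index
        results.append(text[content_start:stop])
        # Continue searching just past the end marker that was consumed.
        position = end_index + len(end_mark)
    return results


# Made-up sample data with the same shape as the markup in the script.
sample = ('<li><a href="http://example.com/a?ce" title="A"></a></li>'
          '<li><a href="http://example.com/b?ce" title="B"></a></li>')

links = extract_between(sample, 'href="', '?ce')
print(links)
```

Calling `extract_between(sample, 'title="', '"', keep_end=False)` pulls out the title attributes the same way, which shows that only the two markers need to change to extract a different field.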