So cute are you Python 12

1. Installing BeautifulSoup:

**1.1 Download the BeautifulSoup source package

**1.2 Install

From the unpacked source directory, run:

           python setup.py build

           python setup.py install

To import the package, use:

        import bs4

        from bs4 import BeautifulSoup
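
A quick way to confirm the installation worked is to import the package and parse a trivial document. The following is a minimal sketch, assuming bs4 is installed for the interpreter you run it with. The function name `check_bs4` and the test string are made up for illustration, and `html.parser` is the parser bundled with Python.

```python
import bs4
from bs4 import BeautifulSoup


def check_bs4():
    """Return (version string, parsed text) to confirm bs4 imports and parses."""
    # "html.parser" ships with Python, so no extra parser package is needed.
    soup = BeautifulSoup("<p>ok</p>", "html.parser")
    return bs4.__version__, soup.p.string


if __name__ == "__main__":
    version, text = check_bs4()
    print("bs4 version:", version)
    print("parsed text:", text)
```

If the import fails with `ImportError`, the package was most likely installed for a different Python interpreter than the one running the script.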

2. A first BeautifulSoup example:

#!/usr/bin/env python
# coding: utf-8
# FileName: be_learn01.py
# Function: first steps with BeautifulSoup
# History: 25-10-2013
from bs4 import BeautifulSoup


def bea_Demo():
    demoHtml = """
<html>
<body>
<div class="icon_col">
   <h1 class="h1user">Certtt</h1>
</div>
</body>
</html>
"""
    # Name the parser explicitly; "html.parser" ships with Python.
    soup = BeautifulSoup(demoHtml, "html.parser")
    print("type(soup)=", type(soup))
    print("soup=", soup)
    # Find the first <h1> whose class attribute is "h1user".
    h1userSoup = soup.find(name="h1", attrs={"class": "h1user"})
    print("h1userSoup=", h1userSoup)
    # .string is the tag's single text child ("Certtt" here).
    h1userUnicodeStr = h1userSoup.string
    print("h1userUnicodeStr=", h1userUnicodeStr)


if __name__ == '__main__':
    bea_Demo()
Output:

# python be_learn01.py 
type(soup)= <class 'bs4.BeautifulSoup'>
soup= 
<html>
<body>
<div class="icon_col">
<h1 class="h1user">Certtt</h1>
</div>
</body>
</html>

h1userSoup= <h1 class="h1user">Certtt</h1>
h1userUnicodeStr= Certtt
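
The `find` call in the example above is one of several equivalent ways to reach the same tag. The sketch below runs three lookups against the same demo HTML: the attribute dictionary form, the `class_` keyword, and a CSS selector via `select_one`. It assumes a reasonably recent bs4 release (`select_one` is not available in very old versions), and the function name `find_h1_variants` is made up for illustration.

```python
from bs4 import BeautifulSoup

DEMO_HTML = """
<html>
<body>
<div class="icon_col">
   <h1 class="h1user">Certtt</h1>
</div>
</body>
</html>
"""


def find_h1_variants(html):
    """Look up the same <h1> three ways and return the text each one yields."""
    soup = BeautifulSoup(html, "html.parser")

    # 1. Tag name plus an attribute dictionary.
    by_attrs = soup.find(name="h1", attrs={"class": "h1user"})

    # 2. The class_ keyword (trailing underscore because "class" is reserved).
    by_keyword = soup.find("h1", class_="h1user")

    # 3. A CSS selector; select_one returns the first match or None.
    by_css = soup.select_one("div.icon_col h1.h1user")

    # .string is the tag's single text child; get_text() joins all nested text.
    return [by_attrs.string, by_keyword.get_text(), by_css.get_text(strip=True)]


if __name__ == "__main__":
    print(find_h1_variants(DEMO_HTML))
```

All three lookups return the same tag here. `find` and `select_one` both return `None` when nothing matches, so real code should check the result before reading `.string`.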
3. A simple test against a real page:

#!/usr/bin/env python
# coding: utf-8
# FileName: bea_learn02.py
# Function: list the links inside every <ul> of a live page
# History: 25-10-2013
from urllib.request import urlopen

from bs4 import BeautifulSoup


def bea_Demo():
    url = 'http://home.51cto.com/index.php?s=/space/7743046'
    # Download the raw page; BeautifulSoup works out the encoding from the bytes.
    page = urlopen(url).read()
    soup = BeautifulSoup(page, "html.parser")
    print("type(soup)=", type(soup))
    # Collect every <ul>, then print each <a> found inside it.
    for ul in soup.find_all(name="ul"):
        for a in ul.find_all('a'):
            print("***:", a.string, "::", a, "\n")


if __name__ == '__main__':
    bea_Demo()
Output:

$ python bea_learn02.py 

***: 家园 :: <a href="http://home.51cto.com" target="_blank">家园</a> 

***: 学院 :: <a href="http://edu.51cto.com" target="_blank">学院</a> 

***: 博客 :: <a href="http://blog.51cto.com" target="_blank">博客</a> 

***: 论坛 :: <a href="http://bbs.51cto.com" target="_blank">论坛</a> 

***: 下载 :: <a href="http://down.51cto.com" target="_blank">下载</a> 

***: 自测 :: <a href="http://selftest.51cto.com" target="_blank">自测</a> 

***: 门诊 :: <a href="http://doctor.51cto.com" target="_blank">门诊</a> 

***: 周刊 :: <a href="http://blog.51cto.com/newsletter/" target="_blank">周刊</a> 

***: 读书 :: <a href="http://book.51cto.com" target="_blank">读书</a> 

***: 技术圈 :: <a href="http://g.51cto.com" target="_blank">技术圈</a> 
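
The output above prints each whole `<a>` tag. In practice you often want only the link text and its `href`. The sketch below shows one way to pull those out. It parses an inline HTML sample (three of the anchors printed above, wrapped in a `<ul>`) so that it runs without network access; the surrounding `<ul>`/`<li>` markup and the function name `extract_links` are assumptions made for illustration, not the real page's structure.

```python
from bs4 import BeautifulSoup

# Three of the anchors from the output above, wrapped in a <ul> so the sketch
# runs offline; the real page's surrounding markup is an assumption.
SAMPLE_HTML = """
<ul>
  <li><a href="http://home.51cto.com" target="_blank">家园</a></li>
  <li><a href="http://edu.51cto.com" target="_blank">学院</a></li>
  <li><a href="http://blog.51cto.com" target="_blank">博客</a></li>
</ul>
"""


def extract_links(html):
    """Return (text, href) pairs for every <a> inside a <ul>."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for ul in soup.find_all("ul"):
        for a in ul.find_all("a"):
            # get() returns None instead of raising if href is missing.
            links.append((a.get_text(strip=True), a.get("href")))
    return links


if __name__ == "__main__":
    for text, href in extract_links(SAMPLE_HTML):
        print(text, "->", href)
```

Using `a.get("href")` rather than `a["href"]` is a deliberate choice: anchors without an `href` then yield `None` instead of raising `KeyError`.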


