Scraping phonetic transcriptions with Python

 

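# -*- coding: UTF-8 -*-
# Look up each word in words.txt on Oxford Learner's Dictionaries and
# append its North American (NAmE) phonetic transcription to result.txt.
import time

import requests
from bs4 import BeautifulSoup

BASE_URL = "https://www.oxfordlearnersdictionaries.com/definition/english/"
HEADERS = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.75 Safari/537.36'}

# Open both files as UTF-8 text so that IPA symbols are written correctly
# (on Python 3, writing encoded bytes to a text-mode file raises TypeError).
with open('./words.txt', encoding='utf-8') as f, \
        open('./result.txt', 'a', encoding='utf-8') as fw:
    for index, line in enumerate(f, start=1):
        url = BASE_URL + line.strip()
        print(str(index) + ":" + url)
        wbdata = requests.get(url, headers=HEADERS).text
        soup = BeautifulSoup(wbdata, 'html.parser')
        # Select every span.phon that is a direct child of a span.pron-g.
        phon_spans = soup.select("span.pron-g > span.phon")
        result = ''
        for n in phon_spans:
            text = n.get_text()
            # Keep only the North American pronunciations.
            if 'NAmE' in text:
                result += '[' + text.replace('NAmE', '').replace('//', '') + ']'
        print(result)
        fw.write(result + "\n")
        # Pause briefly between requests.
        time.sleep(0.1)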
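
The string handling inside the loop can be pulled out into two small helpers, which makes it possible to check it without any network access. The following is only a minimal sketch: the names build_url and format_phon are hypothetical, the sample texts are an assumption about what get_text() returns for span.phon (inferred from the replace calls in the script, not verified against the live site), and percent-encoding the word with urllib.parse.quote is an addition that the original script does not perform.

```python
from urllib.parse import quote

BASE_URL = "https://www.oxfordlearnersdictionaries.com/definition/english/"


def build_url(word):
    """Return the lookup URL for a word, percent-encoding unsafe characters."""
    # quote() is an addition here; the original script concatenates the raw word.
    return BASE_URL + quote(word.strip())


def format_phon(texts):
    """Join the NAmE transcriptions found in a list of span texts as '[...]' groups."""
    result = ''
    for text in texts:
        # Same filter and clean-up as the script: keep NAmE entries only,
        # then drop the label and the doubled slashes around the transcription.
        if 'NAmE' in text:
            result += '[' + text.replace('NAmE', '').replace('//', '') + ']'
    return result


if __name__ == "__main__":
    # Assumed sample texts, shaped like get_text() output for span.phon.
    samples = ['BrE//ˈæpl//', 'NAmE//ˈæpl//']
    print(build_url(' apple\n'))
    print(format_phon(samples))
```

In the scraping loop, format_phon would receive [n.get_text() for n in soup.select("span.pron-g > span.phon")], so the selector and the clean-up logic can be changed independently.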

 

 

Reposted from: https://my.oschina.net/sfshine/blog/3076588
