How to write Python code after installing Anaconda: Python basics, installing Anaconda and running simple code

weixin_39906906

Published 2020-12-10 16:57:12


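Anaconda can be downloaded from the Tsinghua TUNA mirror in China (see the related links at the bottom of the original post for details), or from the Anaconda download site.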

After installation, run CMD as administrator and enter:

pip install requests
pip install beautifulsoup4
pip install jupyter

Then start Jupyter:

jupyter notebook
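Before opening a notebook, it can be worth confirming that the packages actually landed in the active environment. The snippet below is only a sketch: the helper name `check_installed` is made up for illustration. Note that the beautifulsoup4 package is imported under the name `bs4`, and the `jupyter notebook` command is provided by the `notebook` package.

```python
import importlib.util


def check_installed(module_names):
    """Return {module_name: True/False} depending on whether each module can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in module_names}


# Import names, not pip names: beautifulsoup4 -> bs4, jupyter notebook -> notebook.
status = check_installed(["requests", "bs4", "notebook"])
for name, ok in status.items():
    print(f"{name}: {'installed' if ok else 'missing'}")
```

Anything reported as missing can be installed again with the matching `pip install` command.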

A simple code example, following the online open course "Python网络爬虫实战" (Python Web Scraping in Practice):

import requests

res = requests.get('http://news.sina.com.cn/china/')  # fetch the page with GET and keep the response in res
res.encoding = 'utf-8'  # decode the body as UTF-8 so the Chinese text is not garbled
print(res.text)
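What setting the encoding does can be seen without any network access. The sketch below decodes the same UTF-8 bytes in two ways. The sample string is made up, and ISO-8859-1 is chosen as the wrong guess because requests falls back to it when a text response declares no charset.

```python
raw = "新闻中心".encode("utf-8")    # bytes as a server would send them

garbled = raw.decode("iso-8859-1")  # wrong guess: every byte becomes one Latin-1 character
correct = raw.decode("utf-8")       # right encoding: the original text comes back

print(repr(garbled))
print(correct)
```

Setting `res.encoding` before reading `res.text` tells requests which of these decodings to use.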

With select, prefix an id with # and a class with . to pick out elements such as links.
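A small offline example of the two selector prefixes. The HTML fragment, the id `title` and the class `link` are all made up for illustration; real pages will use their own names.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML fragment, written only to show the selector syntax.
html = """
<div id="main">
  <h1 id="title">Headline</h1>
  <a class="link" href="http://example.com/a">first</a>
  <a class="link" href="http://example.com/b">second</a>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

title = soup.select("#title")[0].text              # '#' selects by id
hrefs = [a["href"] for a in soup.select(".link")]  # '.' selects by class

print(title)
print(hrefs)
```

`select` always returns a list, which is why a single match is read with `[0]`.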

Use requests.get to fetch the page content and bs4 to parse it:

import requests
from bs4 import BeautifulSoup

res = requests.get('http://news.sina.com.cn/china/')
res.encoding = 'utf-8'
soup = BeautifulSoup(res.text, 'html.parser')  # name the parser explicitly; 'html.parser' ships with Python

for news in soup.select('.news-item'):  # found with F12: each story sits in a .news-item element
    if len(news.select('h2')) > 0:  # skip items that have no headline
        h2 = news.select('h2')[0].text  # headline text
        time = news.select('.time')[0].text  # time text
        a = news.select('a')[0]['href']  # link URL
        print(time, h2, a)

That completes a simple crawler.

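Extracting the time:

timesource = soup.select('.time-source')[0].contents[0].strip()  # .contents[0] is the time text; strip() removes surrounding whitespace such as \t

Converting the extracted string to a datetime (string-to-time conversion):

from datetime import datetime

datetime.strptime(timesource, '%Y年%m月%d日%H时%M分')  # directives are case-sensitive: %Y year, %m month, %d day, %H hour, %M minute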
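The format directives are case-sensitive, which is easy to get wrong: `%Y` is a four-digit year while `%y` is two digits, and `%m` is the month while `%M` is the minute. A minimal sketch, assuming the scraped string looks like the sample value below (the real page may use a different layout, in which case the format string has to be adjusted to match):

```python
from datetime import datetime

# Sample value, assumed to have the same shape as the text scraped from the page.
timesource = "2016年09月15日09时10分"

dt = datetime.strptime(timesource, "%Y年%m月%d日%H时%M分")
print(dt)
print(dt.strftime("%Y-%m-%d %H:%M"))  # format the datetime back into a plain string
```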

Extracting the article body:

soup.select('#artibody p')[:-1]  # select the p tags inside #artibody; [:-1] drops the last p tag
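The selected p tags are usually joined into one string afterwards. The sketch below shows one way to do that on a made-up fragment; the markup and the idea that the last paragraph is an editor credit are assumptions for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical article markup; the last <p> stands in for a line that should be dropped.
html = """
<div id="artibody">
  <p> First paragraph. </p>
  <p>Second paragraph.</p>
  <p>Editor: someone</p>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

paragraphs = soup.select("#artibody p")[:-1]             # every <p> except the final one
article = "\n".join(p.text.strip() for p in paragraphs)  # trim each paragraph, one per line
print(article)
```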

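Getting the total comment count:

import json

# comments is assumed to be the response already fetched from the comment API
jd = json.loads(comments.text.strip('var data='))  # strip the leading 'var data=' so that only JSON remains
jd['result']['count']['total']
jd

lstrip('')  # removes the given characters from the left end
rstrip('')  # removes the given characters from the right end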
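One detail worth knowing: strip, lstrip and rstrip treat their argument as a set of characters, not as a prefix or suffix. That works for a body starting with `var data=` followed by `{`, but it can remove more than intended on other input. The sketch below uses a made-up response body and shows an exact-prefix alternative.

```python
import json

# Hypothetical response body in the "var data={...}" shape.
body = 'var data={"result": {"count": {"total": 42}}}'

# Every character of "avatar" is in the set {v, a, r, space, d, t, =}, so nothing is left.
stripped = "avatar".strip("var data=")  # → ''

# Removing the exact prefix is more predictable than stripping a character set.
prefix = "var data="
payload = body[len(prefix):] if body.startswith(prefix) else body

jd = json.loads(payload)
total = jd["result"]["count"]["total"]
print(total)
```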

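Organizing the data with pandas (bundled with Anaconda; its DataFrame is modeled on R's data frames):

import pandas

df = pandas.DataFrame(total)  # total holds the scraped results
df.head()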

Saving the data to a file or database

Excel:

df.to_excel('name.xlsx')

sqlite3:

import sqlite3

with sqlite3.connect('news.sqlite') as db:
    df.to_sql('news', con=db)

import sqlite3

with sqlite3.connect('news.sqlite') as db:
    df2 = pandas.read_sql_query('select * from news', con=db)  # read the table back into a DataFrame
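The write and read steps can be tried end to end without touching the disk. This is a sketch under assumptions: the records are made up, and an in-memory database is used in place of news.sqlite.

```python
import sqlite3

import pandas

# Hypothetical records standing in for the scraped results.
total = [
    {"title": "First headline", "comments": 12},
    {"title": "Second headline", "comments": 3},
]
df = pandas.DataFrame(total)

# ":memory:" keeps the demo database in RAM; a file name such as 'news.sqlite' would persist it.
db = sqlite3.connect(":memory:")
try:
    df.to_sql("news", con=db, index=False)                     # write the DataFrame as table "news"
    df2 = pandas.read_sql_query("select * from news", con=db)  # read it back
finally:
    db.close()

print(df2)
```

Note that `with sqlite3.connect(...) as db` commits or rolls back the transaction but does not close the connection, which is why this version closes it explicitly.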

Scraping a single page from 信管网 (cnitpm.com):

import requests
from bs4 import BeautifulSoup

res = requests.get('http://www.cnitpm.com/pm1/54324.html')
res.encoding = 'UTF-8'
soup = BeautifulSoup(res.text, 'html.parser')

for xiti in soup.select('.newcon'):  # each .newcon element holds page content
    if len(xiti.select('p')) > 0:  # only print blocks that contain paragraphs
        print(xiti.get_text())
        print('\n')

#print(soup.select('p'))
#xiti=soup.find_all('p')
#for li in soup.select('p'):
#    print('试题')
#    print(li.get_text())
#    print('\n')
#xiti2=xiti.find('p')
#xiti = soup.find('div', attrs={'class':'newcon'})
#print(xiti)
