20220114 study notes - py4e lesson 12 HTTP 1-3

Finale_Raky

HTTP stands for HyperText Transfer Protocol.

[N-COUNT] A protocol is a set of rules for exchanging information between computers.

In other words, HTTP is a transfer protocol: the rules both sides must follow when exchanging information.

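A typical URL can be read like this: protocol, then host, then document, as in `http://data.pr4e.org/intro-short.txt`.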
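As a small illustration of reading a URL piece by piece, here is a sketch using the standard library's `urllib.parse.urlparse` on the URL from the assignment further down. Falling back to port 80 when the URL names no port is my own assumption for plain HTTP, not something the lesson notes state.

```python
from urllib.parse import urlparse

parts = urlparse('http://data.pr4e.org/intro-short.txt')

print(parts.scheme)    # the protocol part: http
print(parts.hostname)  # the host part: data.pr4e.org
print(parts.path)      # the document part: /intro-short.txt

# The URL names no port, so parts.port is None; assume the usual HTTP port
port = parts.port or 80
print(port)
```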

1. Common TCP ports
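A few well-known TCP port assignments can be written down as a small lookup table. This is only a sketch: the selection of services is mine, and the helper name `port_for` is hypothetical.

```python
# A few well-known TCP port numbers
COMMON_TCP_PORTS = {
    'ftp': 21,
    'ssh': 22,
    'telnet': 23,
    'smtp': 25,
    'dns': 53,
    'http': 80,
    'pop3': 110,
    'imap': 143,
    'https': 443,
}

def port_for(service):
    """Return the usual TCP port for a service name, or None if it is not listed."""
    return COMMON_TCP_PORTS.get(service.lower())

print(port_for('HTTP'))
print(port_for('https'))
```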

2. Python code

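Line 1 imports socket, line 2 creates the socket, and line 3 connects; in the highlighted code, the purple part is the destination host and the green part is the port.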

Note: line 3 may well raise an exception, for example when the connection cannot be established.
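One way to keep a failed connection from crashing the program is to wrap the connect call in `try`/`except`. The sketch below is my own addition rather than code from the lesson; the helper name `try_connect` and the 5-second timeout are assumptions.

```python
import socket

def try_connect(host, port, timeout=5.0):
    """Return a connected socket, or None if the connection fails."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
    except OSError as err:
        # OSError covers refused connections, failed DNS lookups, and timeouts
        print('Could not connect to', host, port, '-', err)
        sock.close()
        return None
    return sock

# Possible usage (needs network access, so it is left commented out):
# mysock = try_connect('data.pr4e.org', 80)
# if mysock is not None:
#     mysock.close()
```

Returning `None` lets the caller decide what to do next instead of ending with a traceback.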

3. Homework: PY4E - Python for Everybody

Assignment 1

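import socket

# Create a TCP socket and connect to the web server on port 80
mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect(('data.pr4e.org', 80))
# The request ends with a blank line, hence the double \r\n
cmd = 'GET http://data.pr4e.org/intro-short.txt HTTP/1.0\r\n\r\n'.encode()
mysock.send(cmd)

# Read the response in chunks of up to 512 bytes until the server closes the connection
while True:
    data = mysock.recv(512)
    if len(data) < 1:
        break
    print(data.decode(), end='')

mysock.close()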
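The program above prints the response headers and the document body together. To look at the headers separately, the raw bytes can be split at the first blank line. This is a rough sketch: `split_response` is a hypothetical helper, and the sample bytes are made up for illustration, not real output from data.pr4e.org.

```python
def split_response(raw):
    """Split raw HTTP response bytes into (status_line, headers, body)."""
    # Headers and body are separated by the first blank line
    head, _, body = raw.partition(b'\r\n\r\n')
    lines = head.decode('iso-8859-1').split('\r\n')
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(':')
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body

# Made-up sample response, for illustration only
sample = (b'HTTP/1.1 200 OK\r\n'
          b'Content-Type: text/plain\r\n'
          b'Content-Length: 5\r\n'
          b'\r\n'
          b'hello')

status, headers, body = split_response(sample)
print(status)
print(headers['content-type'])
print(body.decode())
```

With the socket program, the chunks returned by `recv` would first need to be joined into one bytes object before calling the helper.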

Assignment 2:

Scraping Numbers from HTML using BeautifulSoup. In this assignment you will write a Python program similar to http://www.py4e.com/code3/urllink2.py. The program will use urllib to read the HTML from the data files below, parse the data, extract the numbers, and compute the sum of the numbers in the file.

# To run this, download the BeautifulSoup zip file
# http://www.py4e.com/code3/bs4.zip
# and unzip it in the same directory as this file

from urllib.request import urlopen
from bs4 import BeautifulSoup
import ssl

# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

#url = input('Enter - ')
url = "http://py4e-data.dr-chuck.net/comments_1452627.html"
html = urlopen(url, context=ctx).read()
soup = BeautifulSoup(html, "html.parser")

# Retrieve all of the span tags and add up the numbers they contain
tags = soup('span')
total = 0
for tag in tags:
    total = total + int(tag.contents[0])
print(total)
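For comparison, the same sum can be computed with only the standard library. The sketch below swaps BeautifulSoup for `html.parser.HTMLParser`, so it is a different technique from the one the assignment asks for. The class name `CommentSum` is hypothetical, and the sample rows (names and numbers) are made up in the shape of `<span class="comments">` cells.

```python
from html.parser import HTMLParser

class CommentSum(HTMLParser):
    """Add up the integers found inside <span class="comments"> elements."""

    def __init__(self):
        super().__init__()
        self.in_span = False
        self.total = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == 'span' and ('class', 'comments') in attrs:
            self.in_span = True

    def handle_endtag(self, tag):
        if tag == 'span':
            self.in_span = False

    def handle_data(self, data):
        if self.in_span and data.strip().isdigit():
            self.total += int(data)

# Made-up sample rows, for illustration only
sample = ('<tr><td>Ann</td><td><span class="comments">90</span></td></tr>'
          '<tr><td>Bob</td><td><span class="comments">7</span></td></tr>')

parser = CommentSum()
parser.feed(sample)
parser.close()
print(parser.total)
```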
