Building a Python Web Scraper

IT界小菜鸡

Published 2024-08-18 11:37:21


Article link: https://blog.csdn.net/qq_55541943/article/details/141297388


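A while back, on a whim, I wanted to build a tool that takes a URL (Uniform Resource Locator) and fetches the content of that web page, so I went ahead and built it.
Once it was done, I decided to share the source code with everyone. You can copy it and use it directly, but note that if any of the libraries the code imports are missing on your machine, you need to install them with pip. For this script that means requests, tqdm, and beautifulsoup4 (imported as bs4).
How to install:
With Python installed, open a command prompt window and run pip for each missing library:

pip install xxxx (library name)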
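If you are not sure which libraries are missing, a small check like the one below can tell you before you run the scraper. This is only a minimal sketch: the `REQUIRED` table and the `missing_packages` helper are hypothetical names introduced here for illustration, not part of the original script. It uses only the standard library.

```python
import importlib.util

# Import names used by the script, mapped to the names pip expects.
# Note that the bs4 module is published on PyPI as beautifulsoup4.
REQUIRED = {
    "requests": "requests",
    "tqdm": "tqdm",
    "bs4": "beautifulsoup4",
}


def missing_packages(required):
    """Return the pip names of modules that cannot be found for import."""
    return [
        pip_name
        for module, pip_name in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Install the missing libraries with: pip install " + " ".join(missing))
    else:
        print("All required libraries are installed.")
```

Running it prints either a ready-to-copy pip command or a confirmation that nothing is missing.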

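import os
import time

import requests
from bs4 import BeautifulSoup
from tqdm import tqdm

# Ask for the page to fetch; the scheme (http:// or https://) must be included.
url = input("Enter the link to fetch (full URL): ").strip()
print(f"Link received: {url}")

# Purely cosmetic 10-second wait, shown as a progress bar.
for _ in tqdm(range(10), desc="Waiting"):
    time.sleep(1)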

# Fetch the page once; stop on a network error or a non-200 status.
try:
    response = requests.get(url, timeout=10)
except requests.RequestException as e:
    raise SystemExit(f"Request failed: {e}")

if response.status_code != 200:
    raise SystemExit(f"Connection failed (HTTP status {response.status_code})")
print("Connection succeeded")

# All output files go to the Desktop folder; create it if it does not exist.
Desktop = os.path.join(os.path.expanduser("~"), "Desktop")
os.makedirs(Desktop, exist_ok=True)

# Save the raw page under a timestamped file name.
Now_time = time.strftime("%Y%m%d_%H%M%S", time.localtime())
TxT_creat = os.path.join(Desktop, f"{Now_time}.txt")
with open(TxT_creat, "w", encoding="utf-8") as w:
    w.write(response.text)  # the with block closes the file automatically

print("Parsing file")
# Cosmetic progress bar; the real parsing happens below.
for _ in tqdm(range(20), desc="Parsing"):
    time.sleep(0.2)

# Keep a second copy of the page source under a fixed name and read it back.
code = os.path.join(Desktop, "source.txt")
with open(code, "w", encoding="utf-8") as w:
    w.write(response.text)
with open(code, "r", encoding="utf-8") as R:
    Read = R.read()

# Extract the text of every <div> and append it to the result file.
# Nested <div>s are each visited, so the same text can appear more than once.
Be = BeautifulSoup(Read, "html.parser")
divs = Be.find_all("div")
parser = os.path.join(Desktop, "parsed_result.txt")
with open(parser, "a", encoding="utf-8") as In:
    for i in divs:
        In.write(i.get_text() + "\n")
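The script asks for a full URL and names the saved page after the current time. Those two steps can be pulled out into small functions that are easy to try without any network access. The sketch below is one possible way to do that, not the original code: `is_full_url` and `timestamped_path` are hypothetical helper names, and treating only http and https links as valid is an assumption made here.

```python
import os
import time
from urllib.parse import urlparse


def is_full_url(url):
    """Return True if url has an http(s) scheme and a host part."""
    parts = urlparse(url.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def timestamped_path(directory, when=None):
    """Build <directory>/<YYYYmmdd_HHMMSS>.txt for the given (or current) time."""
    if when is None:
        when = time.localtime()
    stamp = time.strftime("%Y%m%d_%H%M%S", when)
    return os.path.join(directory, f"{stamp}.txt")


if __name__ == "__main__":
    # A link without a scheme is rejected before any request is made.
    for candidate in ("https://blog.csdn.net/", "blog.csdn.net"):
        print(candidate, "->", is_full_url(candidate))
    print(timestamped_path(os.path.expanduser("~/Desktop")))
```

Checking the URL first means a typo such as a missing https:// is reported right away instead of after the waiting progress bar has finished.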