Building a Crawler System with Django
Django is an open-source web application framework written in Python and used by many successful projects. It follows the MVT design pattern: Model, View, and Template. With this framework we can develop the system quickly and efficiently.
Preface
Below we walk through how to set up a Django project, write crawlers for several job-posting sites, and add crawl-data analysis and data download features.
一、How to install Django?
Option 1: via pip
$ python -m pip install Django
Option 2: via easy_install (legacy, e.g. on CentOS):
yum install python-setuptools
easy_install django
Verify that the installation succeeded:
[root@solar django]# python
Python 3.7.4 (default, May 15 2014, 14:49:08)
[GCC 4.8.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import django
>>> django.VERSION
(3, 0, 6, 'final', 0)
Installation complete!
二、Setting up the project
1. Create the project
$ django-admin startproject mysite
If django-admin does not work, see: https://docs.djangoproject.com/en/3.1/faq/troubleshooting/#troubleshooting-django-admin
The generated layout looks like this (example):
mysite/
    manage.py
    mysite/
        __init__.py
        settings.py
        urls.py
        asgi.py
        wsgi.py
2. Create the crawler application
Now that your environment (a "project") is set up, you can get to work.
Each application you write in Django is a Python package that follows a particular convention. Django ships with a utility that automatically generates an app's basic directory structure, so you can focus on writing code rather than creating directories.
$ python manage.py startapp polls
The generated layout (example):
polls/
    __init__.py
    admin.py
    apps.py
    migrations/
        __init__.py
    models.py
    tests.py
    views.py
At this point the directory structure is in place.
Change into the mysite directory and try running the development server:
python manage.py runserver 0.0.0.0:8000
If the Django welcome page appears, the setup was a complete success.
Addendum: switching the database to MySQL.
Just edit the DATABASES section of settings.py:
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        # 'NAME': os.path.join(BASE_DIR, 'db.sqlite3'),
        'NAME': 'movie',
        'HOST': '127.0.0.1',
        'PORT': '3306',
        'USER': 'root',
        'PASSWORD': 'root',
    }
}
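The django.db.backends.mysql engine also needs a MySQL driver on the Python side; mysqlclient (pip install mysqlclient) is the one Django recommends. If it is hard to build on your machine, a common workaround, shown here as a sketch rather than part of the original setup, is PyMySQL with its MySQLdb compatibility shim:

```python
# mysite/__init__.py — only needed when using PyMySQL instead of mysqlclient
import pymysql

pymysql.install_as_MySQLdb()  # lets Django's MySQL backend import PyMySQL as MySQLdb
```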
Then run:
python manage.py migrate
Django now generates the MySQL tables and columns the project needs automatically.
3. Enable the built-in admin site
Make sure the urls.py inside mysite routes /admin/ to the admin site (the default project template already does this).
Create a superuser:
$ python manage.py createsuperuser
Enter the desired username:
Username: admin
The final step is the password. You will be asked to enter it twice; the second entry confirms that the first is really what you wanted.
Password: **********
Password (again): *********
Superuser created successfully.
Start the server: from the mysite directory, run
$ python manage.py runserver
Then open your browser at the "/admin/" path of your local domain, e.g. http://127.0.0.1:8000/admin/. You should see the admin login screen.
To give the admin a more modern look, install django-simpleui:
pip install django-simpleui
Refresh the page and a brand-new admin interface appears.
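Installing the package alone is not enough: django-simpleui takes effect once it is registered in settings.py. Per its README, 'simpleui' should sit before django.contrib.admin so its templates take precedence (the rest of the app list below is just the default startproject one):

```python
# settings.py (fragment)
INSTALLED_APPS = [
    'simpleui',  # must precede django.contrib.admin to override its templates
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
]
```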
At this point the project skeleton is complete.
4. Build the custom application modules
1. Run:
python manage.py startapp polls
You can now see the polls module generated in the project directory.
2. Define the table models in polls/models.py:
from django.db import models

# Create your models here.

class Test(models.Model):
    name = models.CharField(max_length=20)


class Category(models.Model):
    # id = models.CharField(u'实例ID', max_length=32, blank=False, primary_key=True)
    name = models.CharField(u'岗位', max_length=50)
    add_time = models.CharField(u'添加时间', max_length=50)

    class Meta:
        verbose_name = '职位分类'
        verbose_name_plural = verbose_name

    def __str__(self):
        return self.name


class Record(models.Model):
    # id = models.CharField(u'实例ID', max_length=32, blank=False, primary_key=True)
    record_name = models.CharField(u'记录名称', max_length=50)
    date = models.CharField(u'记录日期', max_length=50)
    recruit_type = models.CharField(u'记录类型', max_length=50)

    class Meta:
        verbose_name = '爬取记录'
        verbose_name_plural = verbose_name

    def __str__(self):
        return self.date


class Liepin(models.Model):
    # id = models.CharField(u'实例ID', max_length=32, blank=False, primary_key=True)
    work = models.CharField(u'岗位', max_length=50)
    edu = models.CharField(u'教育背景', max_length=50)
    district = models.CharField(u'地区', max_length=50)
    compensation = models.CharField(u'薪酬', max_length=50)
    company = models.CharField(u'公司', max_length=50)
    year = models.CharField(u'工作年限', max_length=50)
    create_time = models.CharField(u'创建时间', max_length=50)
    work_type = models.ForeignKey(Category, on_delete=models.CASCADE, verbose_name='分类')
    record = models.ForeignKey(Record, on_delete=models.CASCADE, verbose_name='记录')
    salary = models.CharField(u'收入后一位', max_length=10)

    class Meta:
        verbose_name = '猎聘数据'
        verbose_name_plural = verbose_name


class Qiancheng(models.Model):
    # id = models.CharField(u'实例ID', max_length=32, blank=False, primary_key=True)
    work = models.CharField(u'岗位', max_length=50)
    edu = models.CharField(u'教育背景', max_length=50)
    district = models.CharField(u'地区', max_length=50)
    compensation = models.CharField(u'薪酬', max_length=50)
    company = models.CharField(u'公司', max_length=50)
    year = models.CharField(u'工作年限', max_length=50)
    create_time = models.CharField(u'创建时间', max_length=50)
    work_type = models.ForeignKey(Category, on_delete=models.CASCADE)
    record = models.ForeignKey(Record, on_delete=models.CASCADE, verbose_name='记录')
    salary = models.CharField(u'收入后一位', max_length=10)

    class Meta:
        verbose_name = '前程数据'
        verbose_name_plural = verbose_name


class Lagou(models.Model):
    # id = models.CharField(u'实例ID', max_length=32, blank=False, primary_key=True)
    work = models.CharField(u'岗位', max_length=50)
    edu = models.CharField(u'教育背景', max_length=50)
    district = models.CharField(u'地区', max_length=50)
    compensation = models.CharField(u'薪酬', max_length=50)
    company = models.CharField(u'公司', max_length=50)
    year = models.CharField(u'工作年限', max_length=50)
    create_time = models.CharField(u'创建时间', max_length=50)
    work_type = models.ForeignKey(Category, on_delete=models.CASCADE)
    record = models.ForeignKey(Record, on_delete=models.CASCADE, verbose_name='记录')
    salary = models.CharField(u'收入后一位', max_length=10)

    class Meta:
        verbose_name = '拉钩数据'
        verbose_name_plural = verbose_name


class Data(models.Model):
    id = models.CharField(u'实例ID', max_length=32, blank=False, primary_key=True)
    count = models.CharField(u'次数', max_length=50)
    work_name = models.CharField(u'工作名称', max_length=50)
    category_id = models.CharField(u'分类id', max_length=50)
    status = models.CharField(u'状态', max_length=20)

    class Meta:
        verbose_name = '临时存储数据'
        verbose_name_plural = verbose_name
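Note that the models above store their timestamps (add_time, create_time, date) as CharField columns holding str(int(time.time())) values rather than using DateTimeField. A small helper, hypothetical and not part of the project, can render those epoch strings back into readable dates:

```python
import time

def format_epoch(value):
    """Render an epoch-seconds string (as stored in add_time / create_time)
    as a human-readable UTC timestamp."""
    return time.strftime('%Y-%m-%d %H:%M:%S', time.gmtime(int(value)))

# The epoch origin renders as the start of 1970 (UTC).
print(format_epoch('0'))  # 1970-01-01 00:00:00
```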
Register them in polls/admin.py:
from django.contrib import admin
from django.core.paginator import Paginator
from polls.models import Test, Liepin, Lagou, Category, Qiancheng, Record

# Register your models here.

class LiepinAdmin(admin.ModelAdmin):
    list_display = ('work', 'edu', 'district', 'company', 'compensation', 'year', 'work_type', 'record')
    search_fields = ('work',)
    # pagination: rows per page
    list_per_page = 10
    paginator = Paginator


class LagouAdmin(admin.ModelAdmin):
    list_display = ('work', 'edu', 'district', 'company', 'compensation', 'year', 'work_type', 'record')
    search_fields = ('work',)
    list_per_page = 10
    paginator = Paginator


class QianchengAdmin(admin.ModelAdmin):
    list_display = ('work', 'edu', 'district', 'company', 'compensation', 'year', 'work_type', 'record')
    search_fields = ('work',)
    list_per_page = 10
    paginator = Paginator


class CategoryAdmin(admin.ModelAdmin):
    list_display = ('name', 'add_time')
    list_per_page = 10
    paginator = Paginator


class RecordAdmin(admin.ModelAdmin):
    list_display = ('date', 'recruit_type')
    list_per_page = 10
    paginator = Paginator


admin.site.register(Liepin, LiepinAdmin)
admin.site.register(Lagou, LagouAdmin)
admin.site.register(Qiancheng, QianchengAdmin)
admin.site.register(Record, RecordAdmin)
admin.site.register(Category, CategoryAdmin)
Make sure 'polls' is listed in INSTALLED_APPS in settings.py, then run:
python manage.py makemigrations polls
python manage.py migrate
The database tables are created automatically. Open the admin site and you will see the corresponding table-management modules.
3. Create another app to hold the business logic:
python manage.py startapp reptile
The module is now in place.
5. Write the crawl scripts for each job site
1. Liepin
Create a .py file inside the reptile app:
from django.http import HttpResponse
from django.shortcuts import render
from bs4 import BeautifulSoup
import csv
import time
import random
import requests
import sys
import operator
from polls import models
from urllib.parse import quote
from polls.models import Record


def grad_action(request):
    work_name = request.GET.get('work_name')
    type = request.GET.get('type')
    record_name = request.GET.get('record_name')
    # check whether a crawl task is already running for this category
    status = models.Data.objects.filter(category_id=type)
    if (status[0].status == 0):
        return HttpResponse(-1)
    models.Data.objects.filter(category_id=type).update(status=0)
    # insert a record for this job search
    record = Record(record_name=record_name, date=str(int(time.time())), recruit_type=type)
    record.save()
    record_id = record.id
    # look the job title up in the category table; add it if it is missing
    cate_id = models.Category.objects.filter(name=work_name)
    if (not (cate_id)):
        cate = models.Category(name=work_name, add_time=int(time.time()))
        cate.save()
        cate_id = cate.id
    else:
        cate_id = cate_id[0].id
    # return HttpResponse(1)
    if (int(type) == 1):
        reture = liepin_action(0, 0, work_name, cate_id, record_id)
        return HttpResponse(reture)


# crawl Liepin
def liepin_action(i, sleep_count, work_name, cate_id, record_id):
    # job title
    work_name = work_name
    link = ("https://www.liepin.com/zhaopin/?industries=040&subIndustry=&dqs=050020&salary=&jobKind=&pubTime=&compkind=&compscale=&searchType=1&isAnalysis=&sortFlag=15&d_headId=aaa42964a7680110daf82f6e378267d9&d_ckId=ff5c36a41d1d524cff2692be11bbe61f&d_sfrom=search_prime&d_pageSize=40&siTag=_1WzlG2kKhjWAm3Yf9qrog%7EqdZCMSZU_dxu38HB-h7GFA&key="
            + quote(work_name) + "&curPage=" + str(i))
    user_agent_list = [
        "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36",
        "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36",
        "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36"
    ]
    headers = {"User-Agent": random.choice(user_agent_list)}
    try:
        response = requests.get(link, headers=headers)
        response.encoding = 'utf-8'
        html = response.text
        soup = BeautifulSoup(html, 'html.parser')
        sojob_result = soup.find("div", class_='sojob-result')
        list_r = sojob_result.find_all("li")
    except BaseException:
        if (sleep_count > 9):
            print("Tried for 45 minutes and still cannot reach the site; please retry later or ask for help")
            print("Sorry, the program is exiting")
            models.Data.objects.filter(category_id=1).update(status=1)
            return 0
        print("Crawl failed, possibly due to a verification page or a poor network; sleeping before retrying")
        print("Sleeping for 5 minutes")
        sleep_count = sleep_count + 1
        sys.stdout.flush()
        time.sleep(300)
        return liepin_action(i, sleep_count, work_name, cate_id, record_id)
    if (len(list_r) == 0):
        print("Congratulations, this crawl task is complete")
        models.Data.objects.filter(category_id=1).update(status=1)
        return 1
    # parse each job listing
    sleep_count = 0
    in_data = []
    out_data = []
    for x in range(0, len(list_r)):
        try:
            address = list_r[x].find("a", class_='area').get_text().strip()
        except BaseException:
            address = ''
        work = list_r[x].find("a").get_text().strip()
        edu = list_r[x].find("span", class_='edu').get_text().strip()
        year = list_r[x].find("span", class_='edu').find_next_sibling("span").get_text().strip()
        money = list_r[x].find("span", class_='text-warning').get_text().strip()
        company = list_r[x].find("p", class_='company-name').get_text().strip()
        data = {'work': work, 'edu': edu, 'compensation': money, 'company': company, 'year': year, 'district': address}
        work_data = models.Data.objects.filter(category_id=1)
        in_data = data
        out_data = work_data[0].work_name
        in_data = str(in_data)
        # if the page repeats what we saw last time, the crawl may be finished
        if (operator.eq(in_data, out_data)):
            count = work_data[0].count
            count = int(count)
            if (count > 12):
                print("Congratulations, this crawl task is complete")
                models.Data.objects.filter(category_id=1).update(status=1)
                return 1
        if (money != '面议'):
            try:
                salary = money.split('-')[1][-5:]
                salary_money = money.split('-')[1].replace(salary, '')
            except BaseException:
                salary_money = 0
        else:
            salary_money = 0
        # write to the database
        liepin_data = models.Liepin(work=work, create_time=int(time.time()), edu=edu, compensation=money,
                                    record_id=record_id, work_type_id=cate_id, company=company, year=year,
                                    district=address, salary=salary_money)
        liepin_data.save()
        print(data)
    models.Data.objects.filter(category_id=1).update(work_name=str(in_data))
    models.Data.objects.filter(category_id=1).update(count=str(i))
    sys.stdout.flush()
    time.sleep(random.randint(7, 16))
    return liepin_action(i + 1, sleep_count, work_name, cate_id, record_id)
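The script above retries a failed page fetch by recursing back into liepin_action after a five-minute sleep and gives up after ten attempts. The same pattern can be written as a flat loop, which avoids growing the call stack on long runs; this is a sketch of the idea with an invented stand-in fetch function, not code from the project:

```python
import time

def fetch_with_retry(fetch, max_attempts=10, delay=0.01):
    """Call `fetch` until it succeeds, sleeping `delay` seconds between
    attempts; return None once max_attempts failures have accumulated."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts - 1:
                return None
            time.sleep(delay)

# A flaky stand-in for requests.get: fails twice, then returns a page.
calls = {'n': 0}
def flaky():
    calls['n'] += 1
    if calls['n'] < 3:
        raise IOError('network hiccup')
    return 'page-html'

print(fetch_with_retry(flaky))  # page-html
```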
2. Lagou
from django.http import HttpResponse
from django.shortcuts import render
from bs4 import BeautifulSoup
import csv
import time
import random
import requests
import sys
import operator
import ssl
import json
from urllib import parse
from urllib import request
from polls import models
from urllib.parse import quote
from polls.models import Record


def grad_action(request):
    work_name = request.GET.get('work_name')
    type = request.GET.get('type')
    record_name = request.GET.get('record_name')
    # check whether a crawl task is already running for this category
    status = models.Data.objects.filter(category_id=type)
    if (status[0].status == 0):
        return HttpResponse(-1)
    models.Data.objects.filter(category_id=type).update(status=0)
    # insert a record for this job search
    record = Record(record_name=record_name, date=str(int(time.time())), recruit_type=type)
    record.save()
    record_id = record.id
    # look the job title up in the category table; add it if it is missing
    cate_id = models.Category.objects.filter(name=work_name)
    if (not (cate_id)):
        cate = models.Category(name=work_name, add_time=int(time.time()))
        cate.save()
        cate_id = cate.id
    else:
        cate_id = cate_id[0].id
    # return HttpResponse(cate_id)
    if (int(type) == 2):
        reture = lagou_action(0, work_name, cate_id, record_id)
        return HttpResponse(reture)


# crawl Lagou
def lagou_action(i, work_name, cate_id, record_id):
    try:
        # disable global certificate verification
        ssl._create_default_https_context = ssl._create_unverified_context
        # fetch a listing page first so we can pick up its cookies
        url = 'https://www.lagou.com/jobs/list_%E6%9E%B6%E6%9E%84%E5%B8%88?city=%E5%B9%BF%E5%B7%9E&labelWords=&fromSearch=true&suginput='
        req = request.Request(url, headers={
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36'
        })
        response = request.urlopen(req)
        # extract the cookies from the response headers
        cookie = ''
        for header in response.getheaders():
            if header[0] == 'Set-Cookie':
                cookie = cookie + header[1].split(';')[0] + '; '
        # drop the trailing space
        cookie = cookie[:-1]
        # request the position data
        url = 'https://www.lagou.com/jobs/positionAjax.json?needAddtionalResult=false'
        # build the request headers, adding the cookies extracted above
        headers = {
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36',
            'Cookie': cookie,
            'Referer': 'https://www.lagou.com/jobs/list_%E6%9E%B6%E6%9E%84%E5%B8%88?city=%E5%B9%BF%E5%B7%9E&labelWords=&fromSearch=true&suginput='
        }
        kd = work_name
        data = {
            'first': 'true',
            'pn': i,
            'kd': kd
        }
        req = request.Request(url, data=parse.urlencode(data).encode('utf-8'), headers=headers, method='POST')
        response = request.urlopen(req)
        result = response.read().decode('utf-8')
        result = json.loads(result)
    except IOError:
        models.Data.objects.filter(category_id=2).update(status=1)
        return 0
    if (result['content']['positionResult']['resultSize'] == 0):
        models.Data.objects.filter(category_id=2).update(status=1)
        return 1
    # parse each position
    try:
        for x in range(0, result['content']['positionResult']['resultSize']):
            district = result['content']['positionResult']['result'][x]['city']
            work = result['content']['positionResult']['result'][x]['positionName']
            edu = result['content']['positionResult']['result'][x]['education']
            year = result['content']['positionResult']['result'][x]['workYear']
            money = result['content']['positionResult']['result'][x]['salary']
            company = result['content']['positionResult']['result'][x]['companyFullName']
            create_time = result['content']['positionResult']['result'][x]['createTime']
            data = [work, edu, money, company]
            if district == "广州" or district == "深圳":
                try:
                    salary_money = money.split('-')[1].replace('k', '')
                except BaseException:
                    salary_money = 0
                # write to the database
                lagou_data = models.Lagou(work=work, create_time=int(time.time()), edu=edu, compensation=money,
                                          record_id=record_id, work_type_id=cate_id, company=company, year=year,
                                          district=district, salary=salary_money)
                lagou_data.save()
                print(data)
    except IOError:
        models.Data.objects.filter(category_id=2).update(status=1)
        return 0
    sys.stdout.flush()
    time.sleep(random.randint(15, 40))
    return lagou_action(i + 1, work_name, cate_id, record_id)
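The cookie-harvesting step in lagou_action is easy to test in isolation. The helper below reproduces that loop over (name, value) pairs such as those returned by HTTPResponse.getheaders(); it is a standalone sketch (the function name and sample values are invented), and it trims the trailing separator completely, whereas the original's cookie[:-1] leaves a dangling semicolon:

```python
def join_set_cookies(headers):
    """Collapse Set-Cookie response headers into one Cookie header value."""
    parts = []
    for name, value in headers:
        if name == 'Set-Cookie':
            # keep only the name=value part, dropping Path/HttpOnly attributes
            parts.append(value.split(';')[0])
    return '; '.join(parts)

demo = [('Content-Type', 'text/html'),
        ('Set-Cookie', 'user_trace_token=abc; Path=/'),
        ('Set-Cookie', 'JSESSIONID=xyz; HttpOnly')]
print(join_set_cookies(demo))  # user_trace_token=abc; JSESSIONID=xyz
```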
3. 51job (Qiancheng Wuyou)
from django.http import HttpResponse
from django.shortcuts import render
from bs4 import BeautifulSoup
import csv
import time
import random
import requests
import sys
import operator
import ssl
import json
from urllib import parse
from urllib import request
from polls import models
from urllib.parse import quote
from polls.models import Record


def grad_action(request):
    work_name = request.GET.get('work_name')
    type = request.GET.get('type')
    record_name = request.GET.get('record_name')
    # check whether a crawl task is already running for this category
    status = models.Data.objects.filter(category_id=type)
    if (status[0].status == 0):
        return HttpResponse(-1)
    models.Data.objects.filter(category_id=type).update(status=0)
    # insert a record for this job search
    record = Record(record_name=record_name, date=str(int(time.time())), recruit_type=type)
    record.save()
    record_id = record.id
    # look the job title up in the category table; add it if it is missing
    cate_id = models.Category.objects.filter(name=work_name)
    if (not (cate_id)):
        cate = models.Category(name=work_name, add_time=int(time.time()))
        cate.save()
        cate_id = cate.id
    else:
        cate_id = cate_id[0].id
    if (int(type) == 3):
        # kick off the crawl for this category
        reture = qiancheng_action(1, 0, work_name, cate_id, record_id)
        return HttpResponse(reture)


# crawl 51job
def qiancheng_action(i, sleep_count, work_name, cate_id, record_id):
    # job title
    work_name = work_name
    try:
        link = "https://search.51job.com/list/030200,000000,0000,00,9,99," + quote(work_name) + ",2," + str(i) + ".html?lang=c&stype=&postchannel=0000&workyear=99&cotype=99&degreefrom=99&jobterm=99&companysize=99&providesalary=99&lonlat=0%2C0&radius=-1&ord_field=0&confirmdate=9&fromType=&dibiaoid=0&address=&line=&specialarea=00&from=&welfare="
        headers = {
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36"}
        response = requests.get(link, headers=headers)
        code = response.apparent_encoding
        response.encoding = code
        html = response.text
        soup = BeautifulSoup(html, 'html.parser')
        in_data = []
        out_data = []
        count = 0
        sojob_result = soup.find_all("script", type='text/javascript')
    except BaseException:
        if (sleep_count > 9):
            print("Tried for 45 minutes and still cannot reach the site; please retry later or ask for help")
            print("Sorry, the program is exiting")
            models.Data.objects.filter(category_id=3).update(status=1)
            return 0
        print("Crawl failed, possibly due to a verification page or a poor network; sleeping before retrying")
        print("Sleeping for 5 minutes")
        sleep_count = sleep_count + 1
        sys.stdout.flush()
        time.sleep(300)
        return qiancheng_action(i, sleep_count, work_name, cate_id, record_id)
    try:
        # the listing data is embedded as JSON inside the third <script> block
        a = str(sojob_result[2])
        json_str = json.loads(a[60:-9], strict=False)
        list = json_str['engine_search_result']
    except BaseException:
        sys.stdout.flush()
        time.sleep(3)
        return qiancheng_action(i + 1, sleep_count, work_name, cate_id, record_id)
    if (len(list) == 0):
        print("Congratulations, this crawl task is complete")
        models.Data.objects.filter(category_id=3).update(status=1)
        return 1
    try:
        for x in range(1, len(list)):
            work = list[x]['job_name']
            company = list[x]['company_name']
            address = list[x]['workarea_text']
            money = list[x]['providesalary_text']
            attribute_text = list[x]['attribute_text']
            public_time = list[x]['issuedate']
            data = [work, company, address, money, attribute_text, public_time]
            print(data)
            # required experience
            year = attribute_text[1]
            if ("经验" not in year):
                year = '不限'
            # normalise the education requirement (the original loop overwrote
            # edu on every attribute; breaking on the first match fixes that)
            edu = '未知'
            for a in attribute_text:
                if (a == '大专' or a == '本科' or a == '中专' or a == '高中' or a == '硕士'):
                    edu = a
                    break
            # normalise the salary
            if (money != ''):
                try:
                    salary = money.split('-')[1][-3:]
                    if (salary == '万/月'):
                        salary_money = money.split('-')[1].replace('万/月', '')
                    elif (salary == '万/年'):
                        salary_money = money.split('-')[1].replace('万/年', '')
                    else:
                        salary_money = money.split('-')[1].replace('千/月', '')
                except BaseException:
                    salary_money = 0
            else:
                salary_money = 0
            qiancheng = models.Qiancheng(work=work, create_time=int(time.time()), edu=edu, compensation=money,
                                         record_id=record_id, work_type_id=cate_id, company=company, year=year,
                                         district=address, salary=salary_money)
            qiancheng.save()
            in_data = data
            work_data = models.Data.objects.filter(category_id=3)
            out_data = work_data[0].work_name
            in_data = str(in_data)
            if (operator.eq(in_data, out_data)):
                count = work_data[0].count
                count = int(count)
    except BaseException:
        sys.stdout.flush()
        time.sleep(random.randint(3, 7))
        qiancheng_action(i + 1, sleep_count, work_name, cate_id, record_id)
    sys.stdout.flush()
    time.sleep(random.randint(3, 7))
    if (count > 12):
        print("Congratulations, this crawl task is complete")
        models.Data.objects.filter(category_id=3).update(status=1)
        return 1
    sleep_count = 0
    models.Data.objects.filter(category_id=3).update(work_name=str(in_data))
    models.Data.objects.filter(category_id=3).update(count=str(i))
    return qiancheng_action(i + 1, sleep_count, work_name, cate_id, record_id)
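The salary-normalisation branch in the 51job crawler peels the unit suffix off the upper bound of strings like '1-1.5万/月'. Extracted as a standalone function (a hypothetical helper that converts everything to thousand-yuan per month, rather than the raw strings the crawler stores), the logic is easier to check:

```python
def parse_51job_salary(money):
    """Return the upper bound of a 51job salary string in thousand yuan
    per month, or 0 for '面议' (negotiable) / unrecognised formats."""
    if not money or '-' not in money:
        return 0
    upper = money.split('-')[1]
    try:
        if upper.endswith('万/月'):   # 10k-yuan per month -> thousands per month
            return float(upper[:-3]) * 10
        if upper.endswith('万/年'):   # 10k-yuan per year -> thousands per month
            return float(upper[:-3]) * 10 / 12
        if upper.endswith('千/月'):   # already thousands per month
            return float(upper[:-3])
    except ValueError:
        return 0
    return 0

print(parse_51job_salary('1-1.5万/月'))  # 15.0
```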
4. Wire up the corresponding routes in urls.py and the views become accessible.
"""mysite URL Configuration
The `urlpatterns` list routes URLs to views. For more information please see:
https://docs.djangoproject.com/en/3.0/topics/http/urls/
Examples:
Function views
1. Add an import: from my_app import views
2. Add a URL to urlpatterns: path('', views.home, name='home')
Class-based views
1. Add an import: from other_app.views import Home
2. Add a URL to urlpatterns: path('', Home.as_view(), name='home')
Including another URLconf
1. Import the include() function: from django.urls import include, path
2. Add a URL to urlpatterns: path('blog/', include('blog.urls'))
"""
from django.urls import path
from . import views, recruit_view, grad_view, grad_action, lagou_action, qiancheng_action, download_action, mate_action, grad_all

urlpatterns = [
    path("index/", views.index, name='index'),
    path("recruit_view/<int:type_id>", recruit_view.recruit_record, name='index'),
    path("recruit_view/recruit_index/<int:type_id>/<int:id>", recruit_view.recruit_index, name='index'),
    path("download_action/<int:type_id>/<int:record_id>", download_action.download_action, name='index'),
    path("grad_view/<int:type_id>", grad_view.grad_index, name='index'),
    path("grad_action/", grad_action.grad_action, name='index'),
    path("lagou_action/", lagou_action.grad_action, name='index'),
    path("qiancheng_action/", qiancheng_action.grad_action, name='index'),
    path("mate_action/<int:type_id>/<int:record_id>", mate_action.mate_action, name='index'),
    path("grad_all/", grad_all.grad_all_view, name='index'),
]
Done.
There are also a few front-end pages, which we won't cover in detail here.
三、GitHub
https://github.com/fengyuan1/django_manage.git
四、Summary
This system makes it easy to crawl and analyse job postings from the major recruitment sites.