Building a personal website with Django (6: Baidu Netdisk and Baidu Wenku)

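In some regions, Baidu Netdisk share links are blocked, so the resources cannot be opened, downloaded, or saved. Here I set up a third-party channel for saving them.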

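The link is opened on my server and the login QR code is returned to the page. Once the user scans it and logs in, the resource is saved to their account automatically.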

There is also a Baidu Wenku feature that extracts the text of a document. Both features use selenium.
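Because the server-side browser opens whatever URL the user submits, it may be worth checking the URL before handing it to selenium. The following is a minimal sketch, not part of the original site: the helper name `is_allowed_share_url` and the allow-list `SHARE_HOSTS` are hypothetical, and the two host names are an assumption about which links the site should accept.

```python
from urllib.parse import urlparse

# Assumed allow-list (hypothetical); adjust it to the hosts the site should accept.
SHARE_HOSTS = {"pan.baidu.com", "wenku.baidu.com"}


def is_allowed_share_url(url):
    """Return True only for http(s) URLs whose host is on the allow-list."""
    if not url:
        return False
    parts = urlparse(url.strip())
    return parts.scheme in ("http", "https") and parts.hostname in SHARE_HOSTS


print(is_allowed_share_url("https://pan.baidu.com/s/1abc"))
print(is_allowed_share_url("file:///etc/passwd"))
```

A view could call this on the posted URL and return early with an error message when it yields `False`, so the browser never navigates to an unexpected address.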

HTML:

<div id="wangpan" class="white_content">
    <a href="javascript:void(0)"
       onclick="document.getElementById('wangpan').style.display='none';document.getElementById('fade').style.display='none'">Click here to close this window</a>
    <h3>Enter the Baidu Netdisk URL and the extraction code. A login QR code will then appear; scan it with the Baidu app to log in (valid for 1 minute). (This site does not record any user information.)</h3>
    <input type="text" name="url" id="wpurl" style="width:500px" placeholder="Enter the URL"/><br>
    <input type="text" name="password" id="wppassword" placeholder="Enter the extraction code"/><br>
    <input type="button" value="Submit" id="wpsub"/>
    <div></div>
    <img id="baidu" value="custom" >
    <div></div>
    <button type="button" id="login" style="float: left">Click here after scanning and logging in</button>
    <div id="code"></div>

</div>
<div id="wenku" class="white_content">
    <a href="javascript:void(0)"
       onclick="document.getElementById('wenku').style.display='none';document.getElementById('fade').style.display='none'">Click here to close this window</a>
    <h3>Only plain-text documents are supported for now; tables and PDFs are not supported yet.</h3>
    <input type="text" name="url" id="wkurl" style="width:500px" placeholder="Enter the URL"/><br>
    <input type="button" value="Submit" id="wksub"/>
    <div>
    <textarea id="wktittle"></textarea>
    </div>
    <textarea id="down" style="margin: 0px; width: 100%; height: 100%;"></textarea>
</div>

jQuery:

$('#wpsub').click(function () {
    var wpurl = $("#wpurl").val();
    var wppass = $("#wppassword").val();

    $.ajax({
        url: 'baiduwangpan',
        type: 'POST',
        data: {
            wpurl: wpurl,
            wppass: wppass
        },
        //headers:{"X-CSRFToken":$.cookie("csrftoken")},
        success: function (data) {
            // data is the src of the login QR code image
            $('#baidu').attr('src', data);
        }
    });
});

$('#login').click(function () {
    $.ajax({
        url: 'login',
        type: 'POST',
        data: {
            wpurl: '1'
        },
        //headers:{"X-CSRFToken":$.cookie("csrftoken")},
        success: function (data) {
            // #code is a <div>, so set its text; .val() only works on form fields
            $('#code').text(data);
        }
    });
});

$('#wksub').click(function () {
    var wkurl = $("#wkurl").val();
    $.ajax({
        url: 'baiduwenku',
        type: 'POST',
        data: {
            wkurl: wkurl
        },
        //headers:{"X-CSRFToken":$.cookie("csrftoken")},
        success: function (data) {
            // data is [title, text]
            $('#wktittle').val(data[0]);
            $('#down').val(data[1]);
        }
    });
});
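The handlers above read plain strings and a two-item array from the responses. On the server side, each view passes `json_dumps_params={'ensure_ascii': False}` to `JsonResponse`, which forwards those keyword arguments to `json.dumps`. The short sketch below uses only the standard `json` module, with a made-up payload, to show what that flag changes: the decoded data is the same either way, but non-ASCII characters stay readable in the response body.

```python
import json

# Made-up payload in the same [title, text] shape the Wenku view returns.
payload = ["Django 笔记", "第一页的文字"]

escaped = json.dumps(payload)                        # default: non-ASCII becomes \uXXXX escapes
readable = json.dumps(payload, ensure_ascii=False)   # characters are kept as written

print(escaped)
print(readable)

# Both forms decode back to the same list on the client.
assert json.loads(escaped) == json.loads(readable) == payload
```

Since jQuery parses either form identically, the flag mainly helps when inspecting responses by hand.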

views:

import re
import time

from django.http import JsonResponse
from selenium import webdriver
from selenium.webdriver.common.by import By

# The module-level setup is not shown in the original post; the two lines
# below are an assumed minimal version. The Netdisk views share one browser.
driverOptions = webdriver.ChromeOptions()
driver = webdriver.Chrome(options=driverOptions)


def baiduwangpan(request):
    """Open a Netdisk share link and return the src of the login QR code."""
    url = request.POST.get("wpurl")
    password = request.POST.get("wppass")
    print(url, password)
    try:
        driver.get(url)
        # Enter the extraction code and submit it.
        driver.find_element(By.XPATH, "//input[@class='QKKaIE LxgeIt']").send_keys(password)
        driver.find_element(By.XPATH, "//span[@class='g-button-right']").click()
        time.sleep(10)
        # Click the save-to-disk icon; for a logged-out browser this opens the login dialog.
        driver.find_element(By.XPATH, "//em[@class='icon icon-save-disk']").click()
        time.sleep(5)
        img = driver.find_element(By.XPATH, "//img[@class='tang-pass-qrcode-img']").get_attribute('src')
    except Exception:
        # Fall back to a static placeholder image if any step fails.
        img = '/static/wp.JPG'
    return JsonResponse(img, json_dumps_params={'ensure_ascii': False}, safe=False)


def login(request):
    """After the user has scanned the QR code, confirm the save in the shared browser."""
    confirm_xpath = "//a[@class ='g-button g-button-blue-large']"
    try:
        driver.find_element(By.XPATH, confirm_xpath).click()
        print('first')
        code = 'success'
    except Exception:
        try:
            # Fallback path: click through the other buttons, then confirm again.
            driver.find_element(By.XPATH, "//span[@class='zbyDdwb']").click()
            time.sleep(3)
            driver.find_element(By.XPATH, "//span[@class='g-button-right']").click()
            time.sleep(3)
            driver.find_element(By.XPATH, confirm_xpath).click()
            print('second')
            code = 'success'
        except Exception:
            print('fail')
            # The original returned 'success' here as well; 'fail' is assumed to be the intent.
            code = 'fail'
    if code == 'success':
        try:
            driver.close()
        except Exception:
            pass
    print('over')
    return JsonResponse(code, json_dumps_params={'ensure_ascii': False}, safe=False)


def baiduwenku(request):
    """Extract the title and the text of a Baidu Wenku document."""
    url = request.POST.get("wkurl")
    print(url)
    driver = webdriver.Chrome(options=driverOptions)  # a separate browser for each request
    try:
        driver.get(url)
        # The fifth span of the summary holds the page count followed by '页'.
        page = driver.find_element(By.XPATH, "//div[@class='doc-summary-wrap']/span[5]")
        page = int(page.text.replace('页', ''))
        first_source = driver.page_source
        # Scroll down so that the "read all" button is reachable, then expand the document.
        driver.execute_script("document.documentElement.scrollTop=4000")
        time.sleep(2)
        driver.find_element(By.XPATH, "//span[@class='read-all']").click()

        text = ''
        for i in range(page):
            # Scroll to each page in turn before reading its text from the page source.
            driver.execute_script("document.documentElement.scrollTop=" + str(1375 * i))
            time.sleep(1)
            pattern = re.compile(r'<p class="reader-word-layer reader-word-s' + str(i + 1) + '-.*?>(.*?)</p>')
            text += ''.join(pattern.findall(driver.page_source))
        title = re.findall(r'<h3 class="doc-title">(.+?)</h3>', first_source)
        print(title)
        # The original built this cleaned title but returned the raw match list;
        # returning the cleaned string is assumed to be the intent.
        clean_title = re.sub(r'[+ /?%#&=]', '', title[0])
        result = [clean_title, text.replace('&nbsp;', '')]
        print(text)
    finally:
        # Always shut the browser down, even if extraction fails.
        driver.quit()
    return JsonResponse(result, json_dumps_params={'ensure_ascii': False}, safe=False)
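The per-page regex in `baiduwenku` can be tried without a browser by moving it into a pure function. The sketch below is one way to do that; the function name `extract_page_text` and the sample markup are made up for illustration. It also swaps the manual `&nbsp;` replacement for `html.unescape`, which is a different technique from the view above: it decodes every entity and then drops the non-breaking spaces.

```python
import html
import re


def extract_page_text(page_source, page_number):
    """Join the text of every reader-word-layer <p> that belongs to one page."""
    pattern = re.compile(
        r'<p class="reader-word-layer reader-word-s%d-.*?>(.*?)</p>' % page_number
    )
    pieces = pattern.findall(page_source)
    # html.unescape turns entities such as &nbsp; and &amp; into characters;
    # the non-breaking spaces (\xa0) are then removed, as the view does.
    return html.unescape("".join(pieces)).replace("\xa0", "")


# Made-up markup that imitates the class names matched by the view.
sample = (
    '<p class="reader-word-layer reader-word-s1-0" style="left:1px">Chapter 1&nbsp;</p>'
    '<p class="reader-word-layer reader-word-s1-1">: A &amp; B</p>'
    '<p class="reader-word-layer reader-word-s2-0">page two</p>'
)
print(extract_page_text(sample, 1))
print(extract_page_text(sample, 2))
```

The trailing hyphen after the page number in the pattern keeps page 1 from also matching the class names of pages 10, 11, and so on.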

urls:

path('baiduwangpan', views.baiduwangpan, name='baiduwangpan'),
path('baiduwenku', views.baiduwenku, name='baiduwenku'),
path('login', views.login, name='login'),

That covers most of the site's functionality. Front-end page design is genuinely hard, though, and I am still slowly working it out.
