Defeating Maoyan's font-based anti-scraping

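Link: https://maoyan.com/board/1
The page looks perfectly normal when you open it, but in the browser's element inspector the numbers show up as empty boxes. That is font-based anti-scraping: the digits are drawn with a custom font. Let's work around Maoyan's version of it.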

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# @software   : PyCharm
# @file       : 猫眼.py
# @Time       : 2021/6/24 11:04

import requests
import re

from fontTools.ttLib import TTFont

url = 'https://maoyan.com/board/1'
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36"
}

response = requests.get(url=url,headers=headers).text

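Find the obfuscated font file:
Extract its URL from the page source with a regex instead of hard-coding it, since the font link cannot be assumed to stay the same. The URL in the page is protocol-relative, so prepend https: to it, then download the file and save it locally.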

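# Escape the dot before "woff" and stop at quotes/parentheses so the match cannot start at an earlier "//"
font_link = re.findall(r"//[^\s'\"()]+?\.woff", response)[0]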
p_link = 'https:'+font_link
font_data = requests.get(url=p_link,headers=headers).content
with open('1.woff','wb') as f:
    f.write(font_data)
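
To see what the regex step is doing without hitting the site, here is a minimal sketch that runs against a hand-written snippet. The HTML below, the example.com host, the file name, and the helper name extract_woff_url are all made up for illustration; the real page source may be laid out differently.

```python
import re

# Hypothetical page fragment shaped like an @font-face rule.
# The host and file names are invented for this example.
SAMPLE_HTML = """
<link rel="stylesheet" href="https://example.com/static/main.css">
<style>
@font-face {
  font-family: stonefont;
  src: url('//example.com/fonts/abc123.eot');
  src: url('//example.com/fonts/abc123.woff') format('woff');
}
</style>
"""

# A protocol-relative URL ending in ".woff" that does not cross
# whitespace, quotes, or parentheses. The dot is escaped on purpose.
WOFF_RE = re.compile(r"//[^\s'\"()]+?\.woff")


def extract_woff_url(html):
    """Return the first .woff URL with https: prepended, or None."""
    match = WOFF_RE.search(html)
    if match is None:
        return None
    return "https:" + match.group(0)


print(extract_woff_url(SAMPLE_HTML))
```

Returning None when nothing matches avoids the IndexError that indexing [0] on an empty findall result would raise, which is handy if the site answers with a verification page instead of the ranking.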

As usual, we parse the font with TTFont, save it as an XML file that is easier to read, and start building the mapping.

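font = TTFont("1.woff")
# Dump the font tables to XML so the cmap can be inspected by eye
font.saveXML('1.xml')
# getBestCmap() returns {code point (int): glyph name (str)}
font_map = font['cmap'].getBestCmap()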

Look at the cmap section of the XML and write out the mapping from glyph name to real character by hand:

font_dict = {"x":".","uniE137":8,"uniE343":1,"uniE5E2":4,"uniE7A1":9,"uniE8CD":5,"uniF19B":2,"uniF489":0,"uniF4EF":6,"uniF848":3,"uniF88A":7}
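
If scrolling through 1.xml by hand gets tedious, the cmap entries can be listed with the standard library. This is only a sketch: the XML fragment below is hand-written to resemble the map elements in a saveXML dump, the pairing of each code with the same-numbered glyph name is an assumption, and list_cmap_entries is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment modeled on the cmap part of a saveXML dump.
# The entries are assumptions, not copied from the real 1.xml.
SAMPLE_TTX = """
<ttFont>
  <cmap>
    <tableVersion version="0"/>
    <cmap_format_4 platformID="0" platEncID="3" language="0">
      <map code="0x78" name="x"/>
      <map code="0xe137" name="uniE137"/>
      <map code="0xe343" name="uniE343"/>
    </cmap_format_4>
  </cmap>
</ttFont>
"""


def list_cmap_entries(ttx_text):
    """Collect {code point: glyph name} from every <map> element."""
    root = ET.fromstring(ttx_text)
    entries = {}
    for node in root.iter("map"):
        # The code attribute is a hex string such as "0xe137"
        entries[int(node.get("code"), 16)] = node.get("name")
    return entries


for code_point, glyph_name in list_cmap_entries(SAMPLE_TTX).items():
    # Print the entity as it appears in the HTML next to its glyph name
    print(f"&#x{code_point:x};  ->  {glyph_name}")
```

The digit each glyph actually draws still has to be read off by eye, for example by opening the font in a font viewer; the listing only saves you from hunting through the XML.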

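I went with the clumsiest approach here, because my attempt at building the dictionary directly kept overwriting entries and I could not work out why.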

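num_list = []
key_list = []
# Values of the hand-written mapping, in the order they were typed
for k, v in font_dict.items():
    num_list.append(v)
# cmap keys are code points; turn each into the entity used in the HTML, e.g. &#xe137;
for d, b in font_map.items():
    t = hex(d).replace('0x', '&#x') + ';'
    key_list.append(t)
# zip pairs by position, so font_dict must list its entries in the same order as font_map
new_dict = dict(zip(key_list, num_list))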

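We define two empty lists, one for the keys and one for the values, and let zip build a dictionary from them. The keys are the code points rewritten as HTML entities; in the code we then replace each entity in the page source with its corresponding value.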
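
Pairing two lists by position works only while both dictionaries happen to be in the same order. A variant that looks each glyph up by name does not depend on ordering. This is a sketch, not the method used in this post: sample_cmap is a made-up stand-in for the getBestCmap() result (deliberately shuffled), and build_entity_map is a hypothetical helper.

```python
# Stand-in for font['cmap'].getBestCmap(): {code point: glyph name}.
# The entries are assumptions, listed out of order on purpose.
sample_cmap = {
    0xF489: "uniF489",
    0x78: "x",
    0xE343: "uniE343",
    0xE137: "uniE137",
}

# Part of the hand-written mapping from the post: glyph name -> real character
glyph_to_char = {"x": ".", "uniE137": 8, "uniE343": 1, "uniF489": 0}


def build_entity_map(cmap, glyph_to_char):
    """Map each HTML entity (e.g. '&#xe137;') to the character it draws."""
    entity_map = {}
    for code_point, glyph_name in cmap.items():
        if glyph_name in glyph_to_char:
            entity = f"&#x{code_point:x};"
            entity_map[entity] = str(glyph_to_char[glyph_name])
    return entity_map


entity_map = build_entity_map(sample_cmap, glyph_to_char)
print(entity_map)
```

Glyphs that are missing from the hand-written mapping are skipped rather than shifting every later pair by one, which is the failure mode of positional pairing.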

for i,r in new_dict.items():
    response = response.replace(str(i),str(r))
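
The loop above scans the whole page once per entity. As an alternative sketch, a single re.sub pass with a callback does the same substitution in one go. The sample markup (including the class name) and the helper decode_entities are invented for illustration; the entity values reuse digits from the mapping in this post.

```python
import re

# Entity -> character, using three pairs from the mapping in this post
entity_map = {"&#xe137;": "8", "&#xe343;": "1", "&#xf489;": "0"}

# Made-up fragment; the real page markup may differ
sample = '<span class="stonefont">&#xe343;&#xf489;.&#xe137;</span>'

# Any hexadecimal numeric character reference, e.g. &#xe137;
ENTITY_RE = re.compile(r"&#x[0-9a-fA-F]+;")


def decode_entities(text, entity_map):
    """Replace known entities in one pass; leave unknown ones untouched."""
    def swap(match):
        entity = match.group(0)
        return entity_map.get(entity.lower(), entity)
    return ENTITY_RE.sub(swap, text)


print(decode_entities(sample, entity_map))
```

Entities that are not in the mapping pass through unchanged, so ordinary character references elsewhere in the page are not damaged.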

Full code

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# @software   : PyCharm
# @file       : 猫眼.py
# @Time       : 2021/6/24 11:04

import requests
import re

from fontTools.ttLib import TTFont

url = 'https://maoyan.com/board/1'
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36"
}

response = requests.get(url=url,headers=headers).text
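# Escape the dot before "woff" and stop at quotes/parentheses so the match cannot start at an earlier "//"
font_link = re.findall(r"//[^\s'\"()]+?\.woff", response)[0]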
p_link = 'https:'+font_link
font_data = requests.get(url=p_link,headers=headers).content
with open('1.woff','wb') as f:
    f.write(font_data)


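font = TTFont("1.woff")
# Dump the font tables to XML so the cmap can be inspected by eye
font.saveXML('1.xml')
# getBestCmap() returns {code point (int): glyph name (str)}
font_map = font['cmap'].getBestCmap()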
print(font_map)
font_dict = {"x":".","uniE137":8,"uniE343":1,"uniE5E2":4,"uniE7A1":9,"uniE8CD":5,"uniF19B":2,"uniF489":0,"uniF4EF":6,"uniF848":3,"uniF88A":7}
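num_list = []
key_list = []
# Values of the hand-written mapping, in the order they were typed
for k, v in font_dict.items():
    num_list.append(v)
# cmap keys are code points; turn each into the entity used in the HTML, e.g. &#xe137;
for d, b in font_map.items():
    t = hex(d).replace('0x', '&#x') + ';'
    key_list.append(t)
# zip pairs by position, so font_dict must list its entries in the same order as font_map
new_dict = dict(zip(key_list, num_list))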
print(new_dict)
for i,r in new_dict.items():
    response = response.replace(str(i),str(r))
print(response)


Result

The printed page source now shows ordinary digits where the obfuscated entities used to be.
