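Scraping Xunjie Speech-to-Text with Python 3 (Including Persistent Login and Chunked File Upload)

Preface

I won't walk through every step in detail here, because I already explained all of that thoroughly when I scraped Toutiao data last time. In this post I only cover the key points of how each part works. If you want the step-by-step details, read my Toutiao article first; it explains at length how to locate the JS code that does the encryption and the API endpoints.

Scraping Toutiao article and video data with Python 3: a complete solution for the as, cp, and _signature encryption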

QQ group chat

855262907

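Analyzing the Xunjie speech-to-text site

The whole speech-to-text process:

1. Log in (non-VIP accounts are limited to 2 minutes of audio, so I borrowed a phone number with VIP; the screenshots still show my own number)
2. Upload the audio file in chunks (why chunks? That is explained later)
3. Convert the audio to text (and that's it)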

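Logging in

After we enter a phone number and click send, the page makes a POST request. Its Form Data contains quite a few parameters, so let's reverse them one by one.

Searching for the key parameter phone shows that the SMS-sending code is right here, which makes things easy.

Without further ado, set a breakpoint and see how the request is built.

In data, only uuid is built by Uuid.get(); the other parameters are obvious at a glance, so I won't dwell on them. data is then passed through basicParams before the POST is sent, so let's go through it step by step.

Solving uuid and basicParams

Debugging shows that uuid comes from Uuid.get(). Jumping into that function, the useful part is create(): get() only checks whether a uuid already exists in localStorage, returns it if so, and otherwise calls create() to make a new one.

JS code: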

function create() {
    var s = [];
    var hexDigits = "0123456789abcdef";
    for (var i = 0; i < 36; i++) {
        s[i] = hexDigits.substr(Math.floor(Math.random() * 0x10), 1);
    }
    s[14] = "4";
    s[19] = hexDigits.substr((s[19] & 0x3) | 0x8, 1);
    s[8] = s[13] = s[18] = s[23] = "";
    var uuid = s.join("");
    return uuid;
}

Now let's port it to Python.

Python code:

import math
import random

def get_uuid():
    s = ['' for i in range(36)]
    hexDigits = "0123456789abcdef"
    for i in range(36):
        s[i] = hexDigits[math.floor(random.random() * 0x10)]
    s[14] = "4"
    # In JS a non-digit hex character coerces to NaN, which bitwise AND treats as 0
    s[19] = hexDigits[((int(s[19]) if s[19].isdecimal() else 0) & 0x3) | 0x8]
    s[8] = s[13] = s[18] = s[23] = ""
    uuid = ''.join(s)
    return uuid

if __name__ == '__main__':
    print(get_uuid())
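As a side note, the string produced by create() has the same layout as a version 4 UUID without dashes: 32 hex digits with "4" at index 12 and one of 8, 9, a, b at index 16. The sketch below uses the standard library for that, and also mimics what Uuid.get() does with localStorage by caching the value in a local file. The helper names and the cache file path are hypothetical, and whether the server checks anything beyond this layout is an assumption.

```python
import os
import uuid


def new_device_id() -> str:
    """Return 32 lowercase hex digits laid out like the site's create() output."""
    # uuid4().hex has '4' at index 12 and one of 8/9/a/b at index 16, which
    # matches s[14] = "4" and s[19] = (x & 0x3) | 0x8 once the four empty
    # separator slots (8, 13, 18, 23) are joined away.
    return uuid.uuid4().hex


def load_or_create_device_id(path: str) -> str:
    """Reuse a saved id if one exists, otherwise create and save a new one.

    Hypothetical helper: a local file stands in for the browser's localStorage.
    """
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            saved = f.read().strip()
        if saved:
            return saved
    device_id = new_device_id()
    with open(path, "w", encoding="utf-8") as f:
        f.write(device_id)
    return device_id
```

Calling load_or_create_device_id("device_id.txt") twice returns the same value, so a later run can present the same device_id that was used at login.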

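Solving basicParams

basicParams doesn't transform anything; it only fills in the remaining fields of our data parameter. There is nothing to reverse, so we can simply hard-code those fields.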
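A minimal sketch of what basicParams appears to do is shown below: it merges request-specific fields into a fixed set. The function name and the merge order are assumptions; the fixed values are copied from the captured Form Data used in this article and may differ for other versions of the site.

```python
def basic_params(params: dict, device_id: str) -> dict:
    """Return a new dict containing the fixed fields plus the given params.

    Hypothetical Python stand-in for the site's basicParams; request-specific
    values win if a key appears in both places (an assumption).
    """
    fixed = {
        "client": "web",
        "source": "335",
        "soft_version": "v3.0.1.1",
        "device_id": device_id,
    }
    return {**fixed, **params}


# Example: fields for the SMS request, using a placeholder phone number
sms_data = basic_params(
    {"version": "v1.0.0", "phone": "13800000000", "uuid": "abc", "code": ""},
    device_id="abc",
)
```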

Sending the SMS

Now that every parameter is sorted out, here is the code to send the SMS.

Python code:

import math
import random
import requests

class xunjie():
    def __init__(self):
        self.session = requests.Session()
        self.get_uuid()
        self.send_message()

    def get_uuid(self):
        s = ['' for i in range(36)]
        hexDigits = "0123456789abcdef"
        for i in range(36):
            s[i] = hexDigits[math.floor(random.random() * 0x10)]
        s[14] = "4"
        # In JS a non-digit hex character coerces to NaN, which bitwise AND treats as 0
        s[19] = hexDigits[((int(s[19]) if s[19].isdecimal() else 0) & 0x3) | 0x8]
        s[8] = s[13] = s[18] = s[23] = ""
        self.uuid = ''.join(s)

    def send_message(self):
        self.phone = int(input('Enter your phone number: '))
        url = "https://user.api.hudunsoft.com/v1/sms"
        headers = {
            'authority': 'user.api.hudunsoft.com',
            'method': 'POST',
            'path': '/v1/sms',
            'scheme': 'https',
            'accept': 'application/json, text/javascript, */*; q=0.01',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'content-length': '163',
            'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
            'origin': 'http://voice.xunjiepdf.com',
            'referer': 'http://voice.xunjiepdf.com/voice2text.html',
            'sec-fetch-dest': 'empty',
            'sec-fetch-mode': 'cors',
            'sec-fetch-site': 'cross-site',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        data = {
            'client': 'web',
            'source': '335',
            'soft_version': 'v3.0.1.1',
            'device_id': self.uuid,
            'version': 'v1.0.0',
            'phone': self.phone,
            'uuid': self.uuid,
            'code': ''
        }
        while True:
            response = self.session.post(url,headers=headers,data=data)
            message = response.json().get('message')
            if "ok" in message:
                print("SMS sent successfully", message)
                break
            else:
                print("Failed to send SMS", message)

if __name__ == '__main__':
    xunjie()

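Handling the image captcha in high-risk mode

When the site flags you as high risk, it asks for an image captcha. This is simple too, but my account hasn't been flagged yet, so I can't see the captcha or take a screenshot of it; I'll just give you the code. The idea is to download the image and type the captcha in by hand. You could also recognize it with the pytesseract library, but here I use the simplest approach.

Python code: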

import math
import random
import requests
import time

class xunjie():
    def __init__(self):
        self.session = requests.Session()
        self.get_uuid()
        self.send_message()

    # Generate the uuid
    def get_uuid(self):
        s = ['' for i in range(36)]
        hexDigits = "0123456789abcdef"
        for i in range(36):
            s[i] = hexDigits[math.floor(random.random() * 0x10)]
        s[14] = "4"
        # In JS a non-digit hex character coerces to NaN, which bitwise AND treats as 0
        s[19] = hexDigits[((int(s[19]) if s[19].isdecimal() else 0) & 0x3) | 0x8]
        s[8] = s[13] = s[18] = s[23] = ""
        self.uuid = ''.join(s)

    # Send the SMS
    def send_message(self):
        self.phone = int(input('Enter your phone number: '))
        url = "https://user.api.hudunsoft.com/v1/sms"
        headers = {
            'authority': 'user.api.hudunsoft.com',
            'method': 'POST',
            'path': '/v1/sms',
            'scheme': 'https',
            'accept': 'application/json, text/javascript, */*; q=0.01',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'content-length': '163',
            'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
            'origin': 'http://voice.xunjiepdf.com',
            'referer': 'http://voice.xunjiepdf.com/voice2text.html',
            'sec-fetch-dest': 'empty',
            'sec-fetch-mode': 'cors',
            'sec-fetch-site': 'cross-site',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        data = {
            'client': 'web',
            'source': '335',
            'soft_version': 'v3.0.1.1',
            'device_id': self.uuid,
            'version': 'v1.0.0',
            'phone': self.phone,
            'uuid': self.uuid,
            'code': ''
        }
        while True:
            response = self.session.post(url,headers=headers,data=data)
            message = response.json().get('message')
            if "ok" in message:
                print("SMS sent successfully", message)
                break
            else:
                print("Failed to send SMS", message)
                data['code'] = self.recognition_image()

    # Download the image captcha and read it in by hand
    def recognition_image(self):
        url = 'https://user.api.hudunsoft.com/v1/captcha?uuid={uuid}&time={time}&client=web&source=335'.format(uuid=self.uuid,time=str(time.time()).replace('.','')[:13])
        headers = {
            'authority': 'user.api.hudunsoft.com',
            'method': 'GET',
            'path': '/v1/captcha?uuid=e884673549de432f8487c6078bc38685&time=1597927148527&client=web&source=335',
            'scheme': 'https',
            'accept': 'image/webp,image/apng,image/*,*/*;q=0.8',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'referer': 'http://voice.xunjiepdf.com/voice2text.html',
            'sec-fetch-dest': 'image',
            'sec-fetch-mode': 'no-cors',
            'sec-fetch-site': 'cross-site',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        response = self.session.get(url,headers=headers)
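        with open('captcha.jpg','wb') as f:
            f.write(response.content)
        # Keep the input as a string: a captcha may contain letters or leading zeros
        code = input("Open captcha.jpg and enter the captcha: ").strip()
        return code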

if __name__ == '__main__':
    xunjie()
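
The time value in the captured captcha URL (1597927148527) has 13 digits, which looks like a Unix timestamp in milliseconds. Slicing str(time.time()) can come up short when the float happens to print with few decimals, so a sketch like the following may be more robust. The helper name is hypothetical, and that the server expects milliseconds is an assumption.

```python
import time


def millis_timestamp(now=None) -> str:
    """Return the Unix time in whole milliseconds as a string."""
    if now is None:
        now = time.time()
    return str(int(now * 1000))


# The slicing approach depends on how many decimals the float prints with:
short = str(1597927148.5).replace('.', '')[:13]   # only 11 digits here
stable = millis_timestamp(1597927148.5)           # always whole milliseconds
```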
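
Persistent login

The request captured on a successful login shows that its parameters are all known by now: device_id is the uuid you generated the first time, phone is your phone number, and code is the SMS verification code.

I added persistent login to the code. Everything Xunjie does is token-based, so we only need to save the token returned after login.

Python code: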

import math
import random
import requests
import json
import os
import time

class xunjie():
    def __init__(self):
        self.session = requests.Session()
        self.get_uuid()
        # Persistent login: reuse the saved cookies and token if they exist
        if 'cookie.txt' in os.listdir('.'):
            with open('cookie.txt', 'r') as f:
                cookie_data = f.read()
                if cookie_data:
                    self.session.cookies = requests.utils.cookiejar_from_dict(json.loads(cookie_data))
                else:
                    print('cookie.txt is empty; delete it and run again')
                    # __init__ must return None; "return True" raises TypeError
                    return
            with open('token.txt', 'r') as f:
                token_data = f.read()
                if token_data:
                    self.token = token_data
                else:
                    print('token.txt is empty; delete it and run again')
                    return
        else:
            self.send_message()
            self.login()

    # Generate the uuid
    def get_uuid(self):
        s = ['' for i in range(36)]
        hexDigits = "0123456789abcdef"
        for i in range(36):
            s[i] = hexDigits[math.floor(random.random() * 0x10)]
        s[14] = "4"
        # In JS a non-digit hex character coerces to NaN, which bitwise AND treats as 0
        s[19] = hexDigits[((int(s[19]) if s[19].isdecimal() else 0) & 0x3) | 0x8]
        s[8] = s[13] = s[18] = s[23] = ""
        self.uuid = ''.join(s)

    # Send the SMS
    def send_message(self):
        self.phone = int(input('Enter your phone number: '))
        url = "https://user.api.hudunsoft.com/v1/sms"
        headers = {
            'authority': 'user.api.hudunsoft.com',
            'method': 'POST',
            'path': '/v1/sms',
            'scheme': 'https',
            'accept': 'application/json, text/javascript, */*; q=0.01',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'content-length': '163',
            'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
            'origin': 'http://voice.xunjiepdf.com',
            'referer': 'http://voice.xunjiepdf.com/voice2text.html',
            'sec-fetch-dest': 'empty',
            'sec-fetch-mode': 'cors',
            'sec-fetch-site': 'cross-site',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        data = {
            'client': 'web',
            'source': '335',
            'soft_version': 'v3.0.1.1',
            'device_id': self.uuid,
            'version': 'v1.0.0',
            'phone': self.phone,
            'uuid': self.uuid,
            'code': ''
        }
        while True:
            response = self.session.post(url,headers=headers,data=data)
            message = response.json().get('message')
            if "ok" in message:
                print("SMS sent successfully", message)
                break
            else:
                print("Failed to send SMS", message)
                data['code'] = self.recognition_image()

    # Download the image captcha and read it in by hand
    def recognition_image(self):
        url = 'https://user.api.hudunsoft.com/v1/captcha?uuid={uuid}&time={time}&client=web&source=335'.format(uuid=self.uuid,time=str(time.time()).replace('.','')[:13])
        headers = {
            'authority': 'user.api.hudunsoft.com',
            'method': 'GET',
            'path': '/v1/captcha?uuid=e884673549de432f8487c6078bc38685&time=1597927148527&client=web&source=335',
            'scheme': 'https',
            'accept': 'image/webp,image/apng,image/*,*/*;q=0.8',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'referer': 'http://voice.xunjiepdf.com/voice2text.html',
            'sec-fetch-dest': 'image',
            'sec-fetch-mode': 'no-cors',
            'sec-fetch-site': 'cross-site',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        response = self.session.get(url,headers=headers)
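        with open('captcha.jpg','wb') as f:
            f.write(response.content)
        # Keep the input as a string: a captcha may contain letters or leading zeros
        code = input("Open captcha.jpg and enter the captcha: ").strip()
        return code

    # Log in
    def login(self):
        # Keep the SMS code as a string so leading zeros survive
        self.code = input('Enter your SMS verification code: ').strip()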
        url = "https://user.api.hudunsoft.com/v1/user/auto_sign_in"
        headers = {
             'authority': 'user.api.hudunsoft.com',
             'method': 'POST',
             'path': '/v1/user/auto_sign_in',
             'scheme': 'https',
             'accept': 'application/json, text/javascript, */*; q=0.01',
             'accept-encoding': 'gzip, deflate, br',
             'accept-language': 'zh-CN,zh;q=0.9',
             'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
             'origin': 'http://voice.xunjiepdf.com',
             'referer': 'http://voice.xunjiepdf.com/voice2text.html',
             'sec-fetch-dest': 'empty',
             'sec-fetch-mode': 'cors',
             'sec-fetch-site': 'cross-site',
             'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        data = {
             'client': 'web',
             'source': '335',
             'soft_version': 'v3.0.1.1',
             'device_id': self.uuid,
             'phone': self.phone,
             'code': self.code
        }
        response = self.session.post(url,headers=headers,data=data)
        json_data = response.json()
        if "ok" in json_data.get('message'):
            print("Login succeeded")
            print(json_data)
            self.token = json_data.get('data').get('token')
            with open('cookie.txt','w') as f:
                f.write(json.dumps(requests.utils.dict_from_cookiejar(response.cookies)))
            with open('token.txt','w') as f:
                f.write(self.token)
        else:
            print("Login failed")

if __name__ == '__main__':
    xunjie()
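
The persistence idea can also be sketched on its own, separate from the class. The version below keeps the token and cookies together in one JSON file and treats a missing, empty, or corrupt file as "not logged in" instead of asking the user to delete it. The function names and the single-file layout are assumptions for illustration, not what the code above does.

```python
import json
import os


def save_login_state(path, token, cookies):
    """Write the token and a plain dict of cookies to one JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"token": token, "cookies": cookies}, f)


def load_login_state(path):
    """Return the saved state dict, or None if it is missing, empty, or invalid."""
    if not os.path.exists(path):
        return None
    try:
        with open(path, "r", encoding="utf-8") as f:
            state = json.load(f)
    except (json.JSONDecodeError, OSError):
        # An empty or truncated file is treated as "no saved login"
        return None
    if not isinstance(state, dict) or not state.get("token"):
        return None
    return state
```

A caller would fall back to the SMS login flow whenever load_login_state returns None.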

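Uploading the audio file in chunks

What does chunked upload mean? A large file is split into several small pieces, each piece is uploaded, and the pieces are then merged back into one large file.

When we upload, three new POST requests appear; these are the chunked-upload requests. Their Form Data makes the roles clear: the first POST starts the upload (it only tells the server that an upload is coming so it can record it), the second POST actually uploads the file chunks, and the third POST ends the upload (it only tells the server that the upload is finished).

Solving the POST request parameters

The first POST request:

The first and third POST requests differ only in action; nothing else changes. The md5 and fileName parameters are not fixed, so let's solve those two.

Search for the keyword fileName. The md5 and fileName parameters both show up just below it, so start debugging right there.

Looking a little further up there is a pleasant surprise: the chunk size is 2M per request. Files larger than 2M are split into several 2M pieces; for example, a 3M file is split into a 2M piece and a 1M piece before being uploaded.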
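
The splitting rule can be sketched as a small generator, assuming "2M" means 2 * 1024 * 1024 bytes. The function name is hypothetical, and the offset it yields is only for illustration; it is not confirmed to be what the server expects in its pos field.

```python
import io

CHUNK_SIZE = 2 * 1024 * 1024  # 2 MiB, the chunk size seen in the page's JS


def iter_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield (byte_offset, chunk_bytes) pairs until the stream is exhausted."""
    offset = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield offset, chunk
        offset += len(chunk)


# A 3 MiB payload splits into one 2 MiB piece and one 1 MiB piece
sizes = [len(chunk) for _, chunk in iter_chunks(io.BytesIO(b"\0" * (3 * 1024 * 1024)))]
```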

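Another important finding: webUploader has an implementation class. Stepping into it shows that all of the processing happens inside.

The file object seen in the debugger has type FileInfo.

So we look up its implementation: ID = Guid.NewGuid().ToString("N"), and MD5 is the MD5 hash of the entire file.

Guid JS code: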

function Guid(g) {
    var arr = new Array();
    if (typeof (g) == "string") {
        InitByString(arr, g)
    } else {
        InitByOther(arr)
    }
    ;this.Equals = function(o) {
        if (o && o.IsGuid) {
            return this.ToString() == o.ToString()
        } else {
            return false
        }
    }
    ;
    this.IsGuid = function() {}
    ;
    this.ToString = function(format) {
        if (typeof (format) == "string") {
            if (format == "N" || format == "D" || format == "B" || format == "P") {
                return ToStringWithFormat(arr, format)
            } else {
                return ToStringWithFormat(arr, "D")
            }
        } else {
            return ToStringWithFormat(arr, "D")
        }
    }
    ;
    function InitByString(arr, g) {
        g = g.replace(/\{|\(|\)|\}|-/g, "");
        g = g.toLowerCase();
        if (g.length != 32 || g.search(/[^0-9,a-f]/i) != -1) {
            InitByOther(arr)
        } else {
            for (var i = 0; i < g.length; i++) {
                arr.push(g[i])
            }
        }
    }
    ;function InitByOther(arr) {
        var i = 32;
        while (i--) {
            arr.push("0")
        }
    }
    ;function ToStringWithFormat(arr, format) {
        switch (format) {
        case "N":
            return arr.toString().replace(/,/g, "");
        case "D":
            var str = arr.slice(0, 8) + "-" + arr.slice(8, 12) + "-" + arr.slice(12, 16) + "-" + arr.slice(16, 20) + "-" + arr.slice(20, 32);
            str = str.replace(/,/g, "");
            return str;
        case "B":
            var str = ToStringWithFormat(arr, "D");
            str = "{" + str + "}";
            return str;
        case "P":
            var str = ToStringWithFormat(arr, "D");
            str = "(" + str + ")";
            return str;
        default:
            return new Guid()
        }
    }
}
;Guid.Empty = new Guid();
Guid.NewGuid = function() {
    var g = "";
    var i = 32;
    while (i--) {
        g += Math.floor(Math.random() * 16.0).toString(16)
    }
    return new Guid(g)
}
;
//These two lines were added by me
var id = Guid.NewGuid().ToString("N");
console.log(id);

Save the JS code above as guid.js.

Python code:

import os

def get_guid():
    guid = os.popen('node guid.js').read().replace('\n', '')
    return guid

if __name__ == '__main__':
    print(get_guid())
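
If you would rather not depend on Node, note that Guid.NewGuid().ToString("N") in the JS above simply concatenates 32 random hex digits. A pure-Python sketch of the same format follows; the function name is hypothetical, and it uses the secrets module where the JS uses Math.random, which changes the randomness source but not the output format.

```python
import secrets


def new_guid_n() -> str:
    """Return 32 random lowercase hex digits, like Guid.NewGuid().ToString("N")."""
    # 16 random bytes rendered as hex give exactly 32 characters
    return secrets.token_hex(16)
```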

Computing the file's MD5 hash:

Python code:

import hashlib

def get_md5():
    md5 = hashlib.md5()
    with open('1.mp3', 'rb') as f:
        md5.update(f.read())
        md5_file = md5.hexdigest()
        print(md5_file)

if __name__ == '__main__':
    get_md5()
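
For large audio files, reading the whole file into memory at once may be wasteful. The sketch below computes the same whole-file MD5 by feeding the hash in blocks; the function name and block size are illustrative choices.

```python
import hashlib


def file_md5(path, block_size=1024 * 1024):
    """Return the hex MD5 of the whole file, reading it block by block."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b""
        for block in iter(lambda: f.read(block_size), b""):
            md5.update(block)
    return md5.hexdigest()
```

The result is identical to hashing f.read() in one call, because MD5 updates are cumulative.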

This snippet (from inside webUploader) is our first POST request. Its parameters are data: { action: 'Begin', fileName: currentFile.ID + "_" + currentFile.Name, md5: currentFile.MD5 }, where currentFile is the FileInfo object we saw earlier.

That's enough analysis; time for the code.

Python code:

import math
import random
import requests
import json
import os
import hashlib
import time

class xunjie():
    def __init__(self):
        self.session = requests.Session()
        self.get_uuid()
        # Persistent login: reuse the saved cookies and token if they exist
        if 'cookie.txt' in os.listdir('.'):
            with open('cookie.txt', 'r') as f:
                cookie_data = f.read()
                if cookie_data:
                    self.session.cookies = requests.utils.cookiejar_from_dict(json.loads(cookie_data))
                else:
                    print('cookie.txt is empty; delete it and run again')
                    # __init__ must return None; "return True" raises TypeError
                    return
            with open('token.txt', 'r') as f:
                token_data = f.read()
                if token_data:
                    self.token = token_data
                else:
                    print('token.txt is empty; delete it and run again')
                    return
        else:
            self.send_message()
            self.login()
        self.start_upload_file()

    # Generate the uuid
    def get_uuid(self):
        s = ['' for i in range(36)]
        hexDigits = "0123456789abcdef"
        for i in range(36):
            s[i] = hexDigits[math.floor(random.random() * 0x10)]
        s[14] = "4"
        # In JS a non-digit hex character coerces to NaN, which bitwise AND treats as 0
        s[19] = hexDigits[((int(s[19]) if s[19].isdecimal() else 0) & 0x3) | 0x8]
        s[8] = s[13] = s[18] = s[23] = ""
        self.uuid = ''.join(s)

    # Send the SMS
    def send_message(self):
        self.phone = int(input('Enter your phone number: '))
        url = "https://user.api.hudunsoft.com/v1/sms"
        headers = {
            'authority': 'user.api.hudunsoft.com',
            'method': 'POST',
            'path': '/v1/sms',
            'scheme': 'https',
            'accept': 'application/json, text/javascript, */*; q=0.01',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'content-length': '163',
            'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
            'origin': 'http://voice.xunjiepdf.com',
            'referer': 'http://voice.xunjiepdf.com/voice2text.html',
            'sec-fetch-dest': 'empty',
            'sec-fetch-mode': 'cors',
            'sec-fetch-site': 'cross-site',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        data = {
            'client': 'web',
            'source': '335',
            'soft_version': 'v3.0.1.1',
            'device_id': self.uuid,
            'version': 'v1.0.0',
            'phone': self.phone,
            'uuid': self.uuid,
            'code': ''
        }
        while True:
            response = self.session.post(url,headers=headers,data=data)
            message = response.json().get('message')
            if "ok" in message:
                print("SMS sent successfully", message)
                break
            else:
                print("Failed to send SMS", message)
                data['code'] = self.recognition_image()

    # Download the image captcha and read it in by hand
    def recognition_image(self):
        url = 'https://user.api.hudunsoft.com/v1/captcha?uuid={uuid}&time={time}&client=web&source=335'.format(uuid=self.uuid,time=str(time.time()).replace('.','')[:13])
        headers = {
            'authority': 'user.api.hudunsoft.com',
            'method': 'GET',
            'path': '/v1/captcha?uuid=e884673549de432f8487c6078bc38685&time=1597927148527&client=web&source=335',
            'scheme': 'https',
            'accept': 'image/webp,image/apng,image/*,*/*;q=0.8',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'referer': 'http://voice.xunjiepdf.com/voice2text.html',
            'sec-fetch-dest': 'image',
            'sec-fetch-mode': 'no-cors',
            'sec-fetch-site': 'cross-site',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        response = self.session.get(url,headers=headers)
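        with open('captcha.jpg','wb') as f:
            f.write(response.content)
        # Keep the input as a string: a captcha may contain letters or leading zeros
        code = input("Open captcha.jpg and enter the captcha: ").strip()
        return code

    # Log in
    def login(self):
        # Keep the SMS code as a string so leading zeros survive
        self.code = input('Enter your SMS verification code: ').strip()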
        url = "https://user.api.hudunsoft.com/v1/user/auto_sign_in"
        headers = {
             'authority': 'user.api.hudunsoft.com',
             'method': 'POST',
             'path': '/v1/user/auto_sign_in',
             'scheme': 'https',
             'accept': 'application/json, text/javascript, */*; q=0.01',
             'accept-encoding': 'gzip, deflate, br',
             'accept-language': 'zh-CN,zh;q=0.9',
             'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
             'origin': 'http://voice.xunjiepdf.com',
             'referer': 'http://voice.xunjiepdf.com/voice2text.html',
             'sec-fetch-dest': 'empty',
             'sec-fetch-mode': 'cors',
             'sec-fetch-site': 'cross-site',
             'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        data = {
             'client': 'web',
             'source': '335',
             'soft_version': 'v3.0.1.1',
             'device_id': self.uuid,
             'phone': self.phone,
             'code': self.code
        }
        response = self.session.post(url,headers=headers,data=data)
        json_data = response.json()
        if "ok" in json_data.get('message'):
            print("Login succeeded")
            print(json_data)
            self.token = json_data.get('data').get('token')
            with open('cookie.txt','w') as f:
                f.write(json.dumps(requests.utils.dict_from_cookiejar(response.cookies)))
            with open('token.txt','w') as f:
                f.write(self.token)
        else:
            print("Login failed")

    # Generate the GUID by running guid.js with Node
    def get_guid(self):
        guid = os.popen('node guid.js').read().replace('\n', '')
        return guid

    # Compute the MD5 hash of the file
    def get_md5(self):
        md5 = hashlib.md5()
        with open(self.file, 'rb') as f:
            md5.update(f.read())
            self.md5_file = md5.hexdigest()

    # Start the upload (first POST, action=Begin)
    def start_upload_file(self):
        path = '/v1/alivoice/uploadaudiofile?r=' + str(random.random())
        url = 'https://user.api.hudunsoft.com' + path
        headers = {
            'authority': 'user.api.hudunsoft.com',
            'method': 'POST',
            'path': path,
            'scheme': 'https',
            'accept': 'application/json, text/javascript, */*; q=0.01',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
            'origin': 'http://voice.xunjiepdf.com',
            'referer': 'http://voice.xunjiepdf.com/voice2text.html',
            'sec-fetch-dest': 'empty',
            'sec-fetch-mode': 'cors',
            'sec-fetch-site': 'cross-site',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        self.file = '1.mp3'
        self.get_md5()
        self.file_name = self.get_guid() + '_' + self.file
        data = {
            'action': 'Begin',
            'fileName': self.file_name,
            'md5': self.md5_file
        }
        response = self.session.post(url,headers=headers,data=data)
        print(response.text)

if __name__ == '__main__':
    xunjie()

Uploading the same file again returns {"pos":"-1"}; a new upload returns {"pos":"0"}.
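
Based on those two observed responses, a script could decide whether the chunk upload is still needed. The sketch below only encodes the two cases seen here; the function name is hypothetical, and treating every other pos value as "not yet uploaded" is an assumption.

```python
import json


def already_uploaded(response_text: str) -> bool:
    """Return True when the Begin response reports pos == -1 (file seen before)."""
    pos = json.loads(response_text).get("pos")
    # The observed responses carry pos as a string; str() also tolerates a number
    return str(pos) == "-1"
```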

The second POST request:

I won't walk you through this request; here is the code directly, since the earlier sections already cover everything needed.

Python code:

import math
import random
import requests
import json
import os
import hashlib
import time
from urllib3 import encode_multipart_formdata

class xunjie():
    def __init__(self):
        self.session = requests.Session()
        self.get_uuid()
        # Persistent login: reuse the saved cookies and token if they exist
        if 'cookie.txt' in os.listdir('.'):
            with open('cookie.txt', 'r') as f:
                cookie_data = f.read()
                if cookie_data:
                    self.session.cookies = requests.utils.cookiejar_from_dict(json.loads(cookie_data))
                else:
                    print('cookie.txt is empty; delete it and run again')
                    # __init__ must return None; "return True" raises TypeError
                    return
            with open('token.txt', 'r') as f:
                token_data = f.read()
                if token_data:
                    self.token = token_data
                else:
                    print('token.txt is empty; delete it and run again')
                    return
        else:
            self.send_message()
            self.login()
        self.start_upload_file()
        self.store_upload_file()

    # Generate the uuid
    def get_uuid(self):
        s = ['' for i in range(36)]
        hexDigits = "0123456789abcdef"
        for i in range(36):
            s[i] = hexDigits[math.floor(random.random() * 0x10)]
        s[14] = "4"
        # In JS a non-digit hex character coerces to NaN, which bitwise AND treats as 0
        s[19] = hexDigits[((int(s[19]) if s[19].isdecimal() else 0) & 0x3) | 0x8]
        s[8] = s[13] = s[18] = s[23] = ""
        self.uuid = ''.join(s)

    # Send the SMS
    def send_message(self):
        self.phone = int(input('Enter your phone number: '))
        url = "https://user.api.hudunsoft.com/v1/sms"
        headers = {
            'authority': 'user.api.hudunsoft.com',
            'method': 'POST',
            'path': '/v1/sms',
            'scheme': 'https',
            'accept': 'application/json, text/javascript, */*; q=0.01',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'content-length': '163',
            'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
            'origin': 'http://voice.xunjiepdf.com',
            'referer': 'http://voice.xunjiepdf.com/voice2text.html',
            'sec-fetch-dest': 'empty',
            'sec-fetch-mode': 'cors',
            'sec-fetch-site': 'cross-site',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        data = {
            'client': 'web',
            'source': '335',
            'soft_version': 'v3.0.1.1',
            'device_id': self.uuid,
            'version': 'v1.0.0',
            'phone': self.phone,
            'uuid': self.uuid,
            'code': ''
        }
        while True:
            response = self.session.post(url,headers=headers,data=data)
            message = response.json().get('message')
            if "ok" in message:
                print("SMS sent successfully", message)
                break
            else:
                print("Failed to send SMS", message)
                data['code'] = self.recognition_image()

    # Download the image captcha and read it in by hand
    def recognition_image(self):
        url = 'https://user.api.hudunsoft.com/v1/captcha?uuid={uuid}&time={time}&client=web&source=335'.format(uuid=self.uuid,time=str(time.time()).replace('.','')[:13])
        headers = {
            'authority': 'user.api.hudunsoft.com',
            'method': 'GET',
            'path': '/v1/captcha?uuid=e884673549de432f8487c6078bc38685&time=1597927148527&client=web&source=335',
            'scheme': 'https',
            'accept': 'image/webp,image/apng,image/*,*/*;q=0.8',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'referer': 'http://voice.xunjiepdf.com/voice2text.html',
            'sec-fetch-dest': 'image',
            'sec-fetch-mode': 'no-cors',
            'sec-fetch-site': 'cross-site',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        response = self.session.get(url,headers=headers)
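        with open('captcha.jpg','wb') as f:
            f.write(response.content)
        # Keep the input as a string: a captcha may contain letters or leading zeros
        code = input("Open captcha.jpg and enter the captcha: ").strip()
        return code

    # Log in
    def login(self):
        # Keep the SMS code as a string so leading zeros survive
        self.code = input('Enter your SMS verification code: ').strip()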
        url = "https://user.api.hudunsoft.com/v1/user/auto_sign_in"
        headers = {
             'authority': 'user.api.hudunsoft.com',
             'method': 'POST',
             'path': '/v1/user/auto_sign_in',
             'scheme': 'https',
             'accept': 'application/json, text/javascript, */*; q=0.01',
             'accept-encoding': 'gzip, deflate, br',
             'accept-language': 'zh-CN,zh;q=0.9',
             'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
             'origin': 'http://voice.xunjiepdf.com',
             'referer': 'http://voice.xunjiepdf.com/voice2text.html',
             'sec-fetch-dest': 'empty',
             'sec-fetch-mode': 'cors',
             'sec-fetch-site': 'cross-site',
             'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        data = {
             'client': 'web',
             'source': '335',
             'soft_version': 'v3.0.1.1',
             'device_id': self.uuid,
             'phone': self.phone,
             'code': self.code
        }
        response = self.session.post(url,headers=headers,data=data)
        json_data = response.json()
        if "ok" in json_data.get('message'):
            print("Login succeeded")
            print(json_data)
            self.token = json_data.get('data').get('token')
            with open('cookie.txt','w') as f:
                f.write(json.dumps(requests.utils.dict_from_cookiejar(response.cookies)))
            with open('token.txt','w') as f:
                f.write(self.token)
        else:
            print("Login failed")

    # Generate the GUID by running guid.js with Node
    def get_guid(self):
        guid = os.popen('node guid.js').read().replace('\n', '')
        return guid

    # Compute the MD5 hash of the file
    def get_md5(self):
        md5 = hashlib.md5()
        with open(self.file, 'rb') as f:
            md5.update(f.read())
            self.md5_file = md5.hexdigest()

    # Start the upload (first POST, action=Begin)
    def start_upload_file(self):
        path = '/v1/alivoice/uploadaudiofile?r=' + str(random.random())
        url = 'https://user.api.hudunsoft.com' + path
        headers = {
            'authority': 'user.api.hudunsoft.com',
            'method': 'POST',
            'path': path,
            'scheme': 'https',
            'accept': 'application/json, text/javascript, */*; q=0.01',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
            'origin': 'http://voice.xunjiepdf.com',
            'referer': 'http://voice.xunjiepdf.com/voice2text.html',
            'sec-fetch-dest': 'empty',
            'sec-fetch-mode': 'cors',
            'sec-fetch-site': 'cross-site',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        self.file = '1.mp3'
        self.get_md5()
        self.file_name = self.get_guid() + '_' + self.file
        data = {
            'action': 'Begin',
            'fileName': self.file_name,
            'md5': self.md5_file
        }
        response = self.session.post(url,headers=headers,data=data)
        print(response.text)

    # Upload the file content in chunks
    def store_upload_file(self):
        path = "/v1/alivoice/uploadaudiofile?r=" + str(random.random())
        url = "https://user.api.hudunsoft.com" + path
        headers = {
            'authority': 'user.api.hudunsoft.com',
            'method': 'POST',
            'path': path,
            'scheme': 'https',
            'accept': 'application/json, text/javascript, */*; q=0.01',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'content-length': '2097152',
            'content-type': 'multipart/form-data;',
            'origin': 'http://voice.xunjiepdf.com',
            'referer': 'http://voice.xunjiepdf.com/voice2text.html',
            'sec-fetch-dest': 'empty',
            'sec-fetch-mode': 'cors',
            'sec-fetch-site': 'cross-site',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        data = {
            'action': 'Store',
            'pos': '0',
            'size': '2097152',
            'md5': self.md5_file
        }
        with open(self.file, 'rb') as f:
            while True:
                files = f.read(2 * 1024 * 1024)
                if files:
                    data['size'] = len(files)
                    data['file'] = (self.file, files)
                    encode_data = encode_multipart_formdata(data)
                    data1 = encode_data[0]
                    headers['Content-Type'] = encode_data[1]
                    headers['content-length'] = str(len(files))
                    response = self.session.post(url,data=data1,headers=headers)
                    print(response.text)
                    # f.tell() is already the offset of the next chunk
                    data['pos'] = f.tell()
                else:
                    print('Upload finished')
                    break

if __name__ == '__main__':
    xunjie()

[Screenshot]
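The chunking logic inside store_upload_file can be looked at on its own. The following is only a minimal sketch, not the site's or the author's exact method: iter_chunks is a hypothetical helper name, and the 2 MiB default simply mirrors the size used in the listing above. It yields each chunk together with the byte offset that goes into the pos field.

```python
import io

CHUNK_SIZE = 2 * 1024 * 1024  # 2 MiB, the chunk size used in store_upload_file


def iter_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield (pos, chunk) pairs; pos is the byte offset where the chunk starts."""
    pos = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # An empty read means the end of the stream was reached
            break
        yield pos, chunk
        pos += len(chunk)


# Small in-memory demo so the offsets are easy to follow
demo = io.BytesIO(b'x' * 10)
for pos, chunk in iter_chunks(demo, chunk_size=4):
    print(pos, len(chunk))
```

Each (pos, chunk) pair would then become the pos, size and file fields of one Store request.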

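The third POST request:

I won't walk through this request step by step. The earlier sections already explained the approach, so here is the code directly.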
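All three upload requests go to the same uploadaudiofile endpoint and differ only in their form fields, so the payloads can be compared side by side. This is a rough sketch based on the fields used in the listings in this article; the function names are hypothetical, and the file part of the Store request is attached separately when the multipart body is encoded.

```python
def begin_or_end_payload(action, file_name, md5_hex):
    """Form fields for the first ('Begin') and third ('End') request."""
    if action not in ('Begin', 'End'):
        raise ValueError('action must be "Begin" or "End"')
    return {'action': action, 'fileName': file_name, 'md5': md5_hex}


def store_payload(pos, chunk, md5_hex):
    """Form fields for one 'Store' request (the chunk itself is sent as the file part)."""
    return {'action': 'Store', 'pos': pos, 'size': len(chunk), 'md5': md5_hex}


# Placeholder values, for illustration only
print(begin_or_end_payload('Begin', 'guid_1.mp3', 'abc123'))
print(store_payload(0, b'\x00' * 16, 'abc123'))
print(begin_or_end_payload('End', 'guid_1.mp3', 'abc123'))
```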

[Screenshot]
Python code:

import math
import random
import requests
import json
import os
import hashlib
import time
from urllib3 import encode_multipart_formdata

class xunjie():
    def __init__(self):
        self.session = requests.Session()
        self.get_uuid()
        # Persistent login: reuse the saved cookie and token if present
        if 'cookie.txt' in os.listdir('.'):
            with open('cookie.txt', 'r') as f:
                cookie_data = f.read()
                if cookie_data:
                    self.session.cookies = requests.utils.cookiejar_from_dict(json.loads(cookie_data))
                else:
                    print('cookie.txt is empty; delete it and run again')
                    # __init__ must return None; returning True raises TypeError
                    return
            with open('token.txt', 'r') as f:
                token_data = f.read()
                if token_data:
                    self.token = token_data
                else:
                    print('token.txt is empty; delete it and run again')
                    return
        else:
            self.send_message()
            self.login()
        self.start_upload_file()
        self.store_upload_file()
        self.end_upload_file()

    # Generate the uuid
    def get_uuid(self):
        s = ['' for i in range(36)]
        hexDigits = "0123456789abcdef"
        for i in range(36):
            s[i] = hexDigits[math.floor(random.random() * 0x10)]
        s[14] = "4"
        # Parenthesize the conditional so "& 0x3" applies to the digit, as in the JS original
        s[19] = hexDigits[((int(s[19]) if s[19].isdecimal() else 0) & 0x3) | 0x8]
        s[8] = s[13] = s[18] = s[23] = ""
        self.uuid = ''.join(s)

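    # Send the SMS code
    def send_message(self):
        # Keep the phone number as a string; it is sent as a text form field anyway
        self.phone = input('Enter your phone number: ').strip()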
        url = "https://user.api.hudunsoft.com/v1/sms"
        headers = {
            'authority': 'user.api.hudunsoft.com',
            'method': 'POST',
            'path': '/v1/sms',
            'scheme': 'https',
            'accept': 'application/json, text/javascript, */*; q=0.01',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'content-length': '163',
            'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
            'origin': 'http://voice.xunjiepdf.com',
            'referer': 'http://voice.xunjiepdf.com/voice2text.html',
            'sec-fetch-dest': 'empty',
            'sec-fetch-mode': 'cors',
            'sec-fetch-site': 'cross-site',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        data = {
            'client': 'web',
            'source': '335',
            'soft_version': 'v3.0.1.1',
            'device_id': self.uuid,
            'version': 'v1.0.0',
            'phone': self.phone,
            'uuid': self.uuid,
            'code': ''
        }
        while True:
            response = self.session.post(url,headers=headers,data=data)
            message = response.json().get('message')
            if "ok" in message:
                print("SMS sent", message)
                break
            else:
                print("Failed to send SMS", u'%s' % message)
                data['code'] = self.recognition_image()

    # Download the image captcha and ask the user to read it
    def recognition_image(self):
        url = 'https://user.api.hudunsoft.com/v1/captcha?uuid={uuid}&time={time}&client=web&source=335'.format(uuid=self.uuid,time=str(time.time()).replace('.','')[:13])
        headers = {
            'authority': 'user.api.hudunsoft.com',
            'method': 'GET',
            'path': '/v1/captcha?uuid=e884673549de432f8487c6078bc38685&time=1597927148527&client=web&source=335',
            'scheme': 'https',
            'accept': 'image/webp,image/apng,image/*,*/*;q=0.8',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'referer': 'http://voice.xunjiepdf.com/voice2text.html',
            'sec-fetch-dest': 'image',
            'sec-fetch-mode': 'no-cors',
            'sec-fetch-site': 'cross-site',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        response = self.session.get(url,headers=headers)
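        with open('captcha.jpg','wb') as f:
            f.write(response.content)
        # Read the captcha as a string: int() would drop leading zeros and fail on letters
        code = input("Open captcha.jpg and enter the captcha: ").strip()
        return code

    # Log in
    def login(self):
        # Keep the SMS code as a string so a leading zero is not dropped
        self.code = input('Enter the SMS verification code: ').strip()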
        url = "https://user.api.hudunsoft.com/v1/user/auto_sign_in"
        headers = {
             'authority': 'user.api.hudunsoft.com',
             'method': 'POST',
             'path': '/v1/user/auto_sign_in',
             'scheme': 'https',
             'accept': 'application/json, text/javascript, */*; q=0.01',
             'accept-encoding': 'gzip, deflate, br',
             'accept-language': 'zh-CN,zh;q=0.9',
             'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
             'origin': 'http://voice.xunjiepdf.com',
             'referer': 'http://voice.xunjiepdf.com/voice2text.html',
             'sec-fetch-dest': 'empty',
             'sec-fetch-mode': 'cors',
             'sec-fetch-site': 'cross-site',
             'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        data = {
             'client': 'web',
             'source': '335',
             'soft_version': 'v3.0.1.1',
             'device_id': self.uuid,
             'phone': self.phone,
             'code': self.code
        }
        response = self.session.post(url,headers=headers,data=data)
        json_data = response.json()
        if "ok" in json_data.get('message'):
            print("Login succeeded")
            print(json_data)
            self.token = json_data.get('data').get('token')
            with open('cookie.txt','w') as f:
                f.write(json.dumps(requests.utils.dict_from_cookiejar(response.cookies)))
            with open('token.txt','w') as f:
                f.write(self.token)
        else:
            print("Login failed")

    # Get a GUID by running guid.js with Node.js
    def get_guid(self):
        guid = os.popen('node guid.js').read().replace('\n', '')
        return guid

    # Compute the file's MD5 digest
    def get_md5(self):
        md5 = hashlib.md5()
        with open(self.file, 'rb') as f:
            md5.update(f.read())
            self.md5_file = md5.hexdigest()

    # Begin the upload
    def start_upload_file(self):
        path = '/v1/alivoice/uploadaudiofile?r=' + str(random.random())
        url = 'https://user.api.hudunsoft.com' + path
        headers = {
            'authority': 'user.api.hudunsoft.com',
            'method': 'POST',
            'path': path,
            'scheme': 'https',
            'accept': 'application/json, text/javascript, */*; q=0.01',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
            'origin': 'http://voice.xunjiepdf.com',
            'referer': 'http://voice.xunjiepdf.com/voice2text.html',
            'sec-fetch-dest': 'empty',
            'sec-fetch-mode': 'cors',
            'sec-fetch-site': 'cross-site',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        self.file = '1.mp3'
        self.get_md5()
        self.file_name = self.get_guid() + '_' + self.file
        data = {
            'action': 'Begin',
            'fileName': self.file_name,
            'md5': self.md5_file
        }
        response = self.session.post(url,headers=headers,data=data)
        print(response.text)

    # Upload the file content in chunks
    def store_upload_file(self):
        path = "/v1/alivoice/uploadaudiofile?r=" + str(random.random())
        url = "https://user.api.hudunsoft.com" + path
        headers = {
            'authority': 'user.api.hudunsoft.com',
            'method': 'POST',
            'path': path,
            'scheme': 'https',
            'accept': 'application/json, text/javascript, */*; q=0.01',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'content-length': '2097152',
            'content-type': 'multipart/form-data;',
            'origin': 'http://voice.xunjiepdf.com',
            'referer': 'http://voice.xunjiepdf.com/voice2text.html',
            'sec-fetch-dest': 'empty',
            'sec-fetch-mode': 'cors',
            'sec-fetch-site': 'cross-site',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        data = {
            'action': 'Store',
            'pos': '0',
            'size': '2097152',
            'md5': self.md5_file
        }
        with open(self.file, 'rb') as f:
            while True:
                files = f.read(2 * 1024 * 1024)
                if files:
                    data['size'] = len(files)
                    data['file'] = (self.file, files)
                    encode_data = encode_multipart_formdata(data)
                    data1 = encode_data[0]
                    headers['Content-Type'] = encode_data[1]
                    headers['content-length'] = str(len(files))
                    response = self.session.post(url,data=data1,headers=headers)
                    print(response.text)
                    # f.tell() is already the offset of the next chunk
                    data['pos'] = f.tell()
                else:
                    print('Upload finished')
                    break

    # Finish the upload
    def end_upload_file(self):
        path = '/v1/alivoice/uploadaudiofile?r=' + str(random.random())
        url = 'https://user.api.hudunsoft.com' + path
        headers = {
            'authority': 'user.api.hudunsoft.com',
            'method': 'POST',
            'path': path,
            'scheme': 'https',
            'accept': 'application/json, text/javascript, */*; q=0.01',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
            'origin': 'http://voice.xunjiepdf.com',
            'referer': 'http://voice.xunjiepdf.com/voice2text.html',
            'sec-fetch-dest': 'empty',
            'sec-fetch-mode': 'cors',
            'sec-fetch-site': 'cross-site',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        data = {
            'action': 'End',
            'fileName': self.file_name,
            'md5': self.md5_file
        }
        response = self.session.post(url,headers=headers,data=data)
        print(response.text)

if __name__ == '__main__':
    xunjie()

[Screenshot]
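The persistent-login part of the listing boils down to "save the cookie dict and the token after a successful login, then load them on the next run". Below is a minimal standalone sketch of that idea using only the standard library. The file names cookie.txt and token.txt come from the listing; save_state and load_state are hypothetical helper names, and in the real script the cookie dict would come from requests.utils.dict_from_cookiejar.

```python
import json
from pathlib import Path


def save_state(directory, cookies, token):
    """Write the cookie dict (as JSON) and the token next to each other."""
    directory = Path(directory)
    (directory / 'cookie.txt').write_text(json.dumps(cookies), encoding='utf-8')
    (directory / 'token.txt').write_text(token, encoding='utf-8')


def load_state(directory):
    """Return (cookies, token), or None if either file is missing or empty."""
    directory = Path(directory)
    cookie_file = directory / 'cookie.txt'
    token_file = directory / 'token.txt'
    if not (cookie_file.is_file() and token_file.is_file()):
        return None
    cookie_text = cookie_file.read_text(encoding='utf-8').strip()
    token = token_file.read_text(encoding='utf-8').strip()
    if not cookie_text or not token:
        return None
    return json.loads(cookie_text), token
```

When load_state returns None, the caller would fall back to the SMS login flow instead of exiting, which avoids the "delete the file and run again" step.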

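Audio to text

Clicking the convert button triggers a POST request. Why are there several of them? Because the conversion had not finished when the earlier requests were made, so the page keeps requesting until it succeeds.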
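This "keep requesting until it succeeds" behaviour is ordinary polling. The sketch below shows one way to write it with a pause between attempts and an upper bound; it is an illustration under assumptions, not the site's own logic. poll_until_done and its parameters are hypothetical names, and fetch stands for any callable that returns a non-empty result once the task is finished.

```python
import time


def poll_until_done(fetch, interval=1.0, max_attempts=60, sleep=time.sleep):
    """Call fetch() until it returns a truthy result, or give up after max_attempts."""
    for _ in range(max_attempts):
        result = fetch()
        if result:
            return result
        # Not ready yet: wait before asking again instead of hammering the server
        sleep(interval)
    raise TimeoutError('task did not finish after %d attempts' % max_attempts)


# Demo with a fake task that is ready on the third call
responses = iter(['', '', 'recognized text'])
print(poll_until_done(lambda: next(responses), interval=0))
```

The sleep argument is injectable so that the loop can be exercised without real waiting.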

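Audio-to-text request that has not succeeded yet:

[Screenshot]
[Screenshot]
[Screenshot]

Audio-to-text request that succeeded:

[Screenshot]
We can send this request directly from Python. Its parameters do not change either, so no reverse engineering is needed.

Python code: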

import math
import random
import requests
import json
import os
import hashlib
import time
from urllib3 import encode_multipart_formdata

class xunjie():
    def __init__(self,file):
        self.file = file
        self.session = requests.Session()
        self.get_uuid()
        # Persistent login: reuse the saved cookie and token if present
        if 'cookie.txt' in os.listdir('.'):
            with open('cookie.txt', 'r') as f:
                cookie_data = f.read()
                if cookie_data:
                    self.session.cookies = requests.utils.cookiejar_from_dict(json.loads(cookie_data))
                else:
                    print('cookie.txt is empty; delete it and run again')
                    # __init__ must return None; returning True raises TypeError
                    return
            with open('token.txt', 'r') as f:
                token_data = f.read()
                if token_data:
                    self.token = token_data
                else:
                    print('token.txt is empty; delete it and run again')
                    return
        else:
            self.send_message()
            self.login()
        self.start_upload_file()
        self.store_upload_file()
        self.end_upload_file()
        self.md5_to_text()

    # Generate the uuid
    def get_uuid(self):
        s = ['' for i in range(36)]
        hexDigits = "0123456789abcdef"
        for i in range(36):
            s[i] = hexDigits[math.floor(random.random() * 0x10)]
        s[14] = "4"
        # Parenthesize the conditional so "& 0x3" applies to the digit, as in the JS original
        s[19] = hexDigits[((int(s[19]) if s[19].isdecimal() else 0) & 0x3) | 0x8]
        s[8] = s[13] = s[18] = s[23] = ""
        self.uuid = ''.join(s)

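    # Send the SMS code
    def send_message(self):
        # Keep the phone number as a string; it is sent as a text form field anyway
        self.phone = input('Enter your phone number: ').strip()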
        url = "https://user.api.hudunsoft.com/v1/sms"
        headers = {
            'authority': 'user.api.hudunsoft.com',
            'method': 'POST',
            'path': '/v1/sms',
            'scheme': 'https',
            'accept': 'application/json, text/javascript, */*; q=0.01',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'content-length': '163',
            'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
            'origin': 'http://voice.xunjiepdf.com',
            'referer': 'http://voice.xunjiepdf.com/voice2text.html',
            'sec-fetch-dest': 'empty',
            'sec-fetch-mode': 'cors',
            'sec-fetch-site': 'cross-site',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        data = {
            'client': 'web',
            'source': '335',
            'soft_version': 'v3.0.1.1',
            'device_id': self.uuid,
            'version': 'v1.0.0',
            'phone': self.phone,
            'uuid': self.uuid,
            'code': ''
        }
        while True:
            response = self.session.post(url,headers=headers,data=data)
            message = response.json().get('message')
            if "ok" in message:
                print("SMS sent", message)
                break
            else:
                print("Failed to send SMS", u'%s' % message)
                data['code'] = self.recognition_image()

    # Download the image captcha and ask the user to read it
    def recognition_image(self):
        url = 'https://user.api.hudunsoft.com/v1/captcha?uuid={uuid}&time={time}&client=web&source=335'.format(uuid=self.uuid,time=str(time.time()).replace('.','')[:13])
        headers = {
            'authority': 'user.api.hudunsoft.com',
            'method': 'GET',
            'path': '/v1/captcha?uuid=e884673549de432f8487c6078bc38685&time=1597927148527&client=web&source=335',
            'scheme': 'https',
            'accept': 'image/webp,image/apng,image/*,*/*;q=0.8',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'referer': 'http://voice.xunjiepdf.com/voice2text.html',
            'sec-fetch-dest': 'image',
            'sec-fetch-mode': 'no-cors',
            'sec-fetch-site': 'cross-site',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        response = self.session.get(url,headers=headers)
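        with open('captcha.jpg','wb') as f:
            f.write(response.content)
        # Read the captcha as a string: int() would drop leading zeros and fail on letters
        code = input("Open captcha.jpg and enter the captcha: ").strip()
        return code

    # Log in
    def login(self):
        # Keep the SMS code as a string so a leading zero is not dropped
        self.code = input('Enter the SMS verification code: ').strip()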
        url = "https://user.api.hudunsoft.com/v1/user/auto_sign_in"
        headers = {
             'authority': 'user.api.hudunsoft.com',
             'method': 'POST',
             'path': '/v1/user/auto_sign_in',
             'scheme': 'https',
             'accept': 'application/json, text/javascript, */*; q=0.01',
             'accept-encoding': 'gzip, deflate, br',
             'accept-language': 'zh-CN,zh;q=0.9',
             'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
             'origin': 'http://voice.xunjiepdf.com',
             'referer': 'http://voice.xunjiepdf.com/voice2text.html',
             'sec-fetch-dest': 'empty',
             'sec-fetch-mode': 'cors',
             'sec-fetch-site': 'cross-site',
             'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        data = {
             'client': 'web',
             'source': '335',
             'soft_version': 'v3.0.1.1',
             'device_id': self.uuid,
             'phone': self.phone,
             'code': self.code
        }
        response = self.session.post(url,headers=headers,data=data)
        json_data = response.json()
        if "ok" in json_data.get('message'):
            print("Login succeeded")
            print(json_data)
            self.token = json_data.get('data').get('token')
            with open('cookie.txt','w') as f:
                f.write(json.dumps(requests.utils.dict_from_cookiejar(response.cookies)))
            with open('token.txt','w') as f:
                f.write(self.token)
        else:
            print("Login failed")

    # Get a GUID by running guid.js with Node.js
    def get_guid(self):
        guid = os.popen('node guid.js').read().replace('\n', '')
        return guid

    # Compute the file's MD5 digest
    def get_md5(self):
        md5 = hashlib.md5()
        with open(self.file, 'rb') as f:
            md5.update(f.read())
            self.md5_file = md5.hexdigest()

    # Begin the upload
    def start_upload_file(self):
        path = '/v1/alivoice/uploadaudiofile?r=' + str(random.random())
        url = 'https://user.api.hudunsoft.com' + path
        headers = {
            'authority': 'user.api.hudunsoft.com',
            'method': 'POST',
            'path': path,
            'scheme': 'https',
            'accept': 'application/json, text/javascript, */*; q=0.01',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
            'origin': 'http://voice.xunjiepdf.com',
            'referer': 'http://voice.xunjiepdf.com/voice2text.html',
            'sec-fetch-dest': 'empty',
            'sec-fetch-mode': 'cors',
            'sec-fetch-site': 'cross-site',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        self.get_md5()
        self.file_name = self.get_guid() + '_' + self.file
        data = {
            'action': 'Begin',
            'fileName': self.file_name,
            'md5': self.md5_file
        }
        response = self.session.post(url,headers=headers,data=data)
        print(response.text)

    # Upload the file content in chunks
    def store_upload_file(self):
        path = "/v1/alivoice/uploadaudiofile?r=" + str(random.random())
        url = "https://user.api.hudunsoft.com" + path
        headers = {
            'authority': 'user.api.hudunsoft.com',
            'method': 'POST',
            'path': path,
            'scheme': 'https',
            'accept': 'application/json, text/javascript, */*; q=0.01',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'content-length': '2097152',
            'content-type': 'multipart/form-data;',
            'origin': 'http://voice.xunjiepdf.com',
            'referer': 'http://voice.xunjiepdf.com/voice2text.html',
            'sec-fetch-dest': 'empty',
            'sec-fetch-mode': 'cors',
            'sec-fetch-site': 'cross-site',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        data = {
            'action': 'Store',
            'pos': '0',
            'size': '2097152',
            'md5': self.md5_file
        }
        with open(self.file, 'rb') as f:
            while True:
                files = f.read(2 * 1024 * 1024)
                if files:
                    data['size'] = len(files)
                    data['file'] = (self.file, files)
                    encode_data = encode_multipart_formdata(data)
                    data1 = encode_data[0]
                    headers['Content-Type'] = encode_data[1]
                    headers['content-length'] = str(len(files))
                    response = self.session.post(url,data=data1,headers=headers)
                    print(response.text)
                    # f.tell() is already the offset of the next chunk
                    data['pos'] = f.tell()
                else:
                    print('Upload finished')
                    break

    # Finish the upload
    def end_upload_file(self):
        path = '/v1/alivoice/uploadaudiofile?r=' + str(random.random())
        url = 'https://user.api.hudunsoft.com' + path
        headers = {
            'authority': 'user.api.hudunsoft.com',
            'method': 'POST',
            'path': path,
            'scheme': 'https',
            'accept': 'application/json, text/javascript, */*; q=0.01',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
            'origin': 'http://voice.xunjiepdf.com',
            'referer': 'http://voice.xunjiepdf.com/voice2text.html',
            'sec-fetch-dest': 'empty',
            'sec-fetch-mode': 'cors',
            'sec-fetch-site': 'cross-site',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        data = {
            'action': 'End',
            'fileName': self.file_name,
            'md5': self.md5_file
        }
        response = self.session.post(url,headers=headers,data=data)
        print(response.text)

    # Call md5Totext to start converting the audio to text
    def md5_to_text(self):
        url = "https://user.api.hudunsoft.com/v1/alivoice/md5Totext"
        headers = {
            'authority': 'user.api.hudunsoft.com',
            'method': 'POST',
            'path': '/v1/alivoice/md5Totext',
            'scheme': 'https',
            'accept': 'application/json, text/javascript, */*; q=0.01',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
            'origin': 'http://voice.xunjiepdf.com',
            'referer': 'http://voice.xunjiepdf.com/voice2text.html',
            'sec-fetch-dest': 'empty',
            'sec-fetch-mode': 'cors',
            'sec-fetch-site': 'cross-site',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        data = {
            'client': 'web',
            'source': '335',
            'soft_version': 'v3.0.1.1',
            'device_id': self.uuid,
            'md5': self.md5_file,
            'fileName': self.file,
            'title': self.file,
            'token': self.token
        }
        response = self.session.post(url,headers=headers,data=data)
        json_data = response.json()
        message = json_data['message']
        if message:
            print(message)
            print(json_data)
        else:
            self.task_id = json_data['data']['task_id']
            self.get_task_info()

    # Keep polling the task until recognition finishes
    def get_task_info(self):
        url = 'https://user.api.hudunsoft.com/v1/alivoice/getTaskInfo'
        headers = {
            'authority': 'user.api.hudunsoft.com',
            'method': 'POST',
            'path': '/v1/alivoice/getTaskInfo',
            'scheme': 'https',
            'accept': 'application/json, text/javascript, */*; q=0.01',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
            'origin': 'http://voice.xunjiepdf.com',
            'referer': 'http://voice.xunjiepdf.com/voice2text.html',
            'sec-fetch-dest': 'empty',
            'sec-fetch-mode': 'cors',
            'sec-fetch-site': 'cross-site',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
        }
        data = {
            'client': 'web',
            'source': '335',
            'soft_version': 'v3.0.1.1',
            'device_id': self.uuid,
            'taskId': self.task_id
        }
        while True:
            response = self.session.post(url,headers=headers,data=data)
            json_data = response.json()
            message = json_data['message']
            if message:
                print(message)
                break
            # Not finished yet: wait briefly before polling again
            time.sleep(1)

if __name__ == '__main__':
    xunjie('1.mp3')

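[Screenshot]
That is all of the code.

Easter egg

Note: this is a small Easter egg. Take a careful look; this is as far as I can help.

[Screenshot]

[Screenshot]

Disclaimer: this article is for learning and exchange only. Do not use it for commercial purposes; violators bear the consequences themselves.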
