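Python crawler: scraping 中国结算 (China Securities Depository and Clearing, ChinaClear) statistics such as the number of newly opened stock accounts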

1. Data source: http://www.chinaclear.cn/zdjs/tjyb1/center_tjbg.shtml
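
The page above is only the entry point. The reference code in this post sends a form POST to the `monthview.action` endpoint, with a `riqi` field holding the month label. As a minimal sketch of what that request carries, the snippet below builds (but does not send) the same form body using only the standard library. The helper name `build_month_request` is hypothetical; the URL and the three form fields are copied from the reference code, and whether the server still accepts them is an assumption.

```python
from urllib.parse import urlencode, parse_qs
from urllib.request import Request

URL = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'


def build_month_request(month_label):
    """Build (without sending) the form POST for one label such as '2009年04月'."""
    payload = {
        'riqi': month_label,
        'channelFidStr': 'e990411f19544e46be84333c25b63de6',
        'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0',
    }
    # urlencode percent-encodes the Chinese characters as UTF-8, so the result is ASCII.
    body = urlencode(payload).encode('ascii')
    return Request(URL, data=body, method='POST',
                   headers={'Content-Type': 'application/x-www-form-urlencoded'})


req = build_month_request('2009年04月')
print(req.get_method())                                  # POST
print(parse_qs(req.data.decode('ascii'))['riqi'][0])     # the original month label
print(len(req.data))                                     # 122
```

The encoded body is 122 bytes for any label of this shape, which matches the `Content-Length` value hard-coded in the reference code's headers.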

2. Data collected: the key-indicator overview figures from 2009 (the code starts at April 2009) to the present
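
The same eleven indicators sit at different positions in the scraped cell list depending on the month, which is why the reference code in this post branches on the date. A minimal sketch of that idea as a lookup table follows. The names `FIELDS`, `LAYOUTS`, `pick_indices`, and `to_record` are hypothetical; the field names and index positions are copied from the reference code's branches, only some periods are listed, and the per-period regular expressions are left out.

```python
FIELDS = (
    'month', 'new_investors', 'end_investors',
    'registered_securities_number', 'registered_securities_totalparvalue',
    'registered_securities_totalmarketvalue', 'non_restricted_market_value',
    'total_number_of_transfers', 'total_amount_of_transfer',
    'total_settlement', 'net_settlement',
)

# (first month, last month, position of each FIELDS entry in the scraped cell list).
# Positions are taken from the reference code. 201501-201502 is omitted because that
# branch reads the month from a second pattern, and the extra <span> cleanup that
# some branches apply is not reproduced here.
LAYOUTS = [
    (200904, 201409, (0, 3, 2, 4, 5, 6, 7, 9, 10, 11, 12)),
    (201410, 201412, (0, 4, 2, 5, 6, 7, 8, 10, 11, 12, 13)),
    (201503, 201503, (0, 3, 1, 4, 5, 6, 7, 9, 10, 11, 12)),
    (201504, 201506, (0, 1, 2, 5, 6, 7, 8, 10, 11, 12, 13)),
    (201507, 201509, (0, 1, 2, 3, 4, 5, 6, 8, 9, 10, 11)),
]


def pick_indices(date):
    """Return the index tuple for an integer month such as 201411, or None."""
    for first, last, indices in LAYOUTS:
        if first <= date <= last:
            return indices
    return None


def to_record(cells, indices):
    """Map a list of scraped cell strings to a dict keyed by FIELDS."""
    if max(indices) >= len(cells):
        raise ValueError('expected at least %d cells, got %d'
                         % (max(indices) + 1, len(cells)))
    return {name: cells[i].strip() for name, i in zip(FIELDS, indices)}


cells = ['cell%d' % n for n in range(14)]   # stand-in for pattern.findall(html)
record = to_record(cells, pick_indices(201411))
print(record['new_investors'])              # cell4
print(record['net_settlement'])             # cell13
```

Checking the cell count before indexing turns a changed page layout into a clear error message instead of a bare `IndexError`.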

3. Reference code, usable as-is:

import requests
import re
import datetime

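def get_month_range(start_day, end_day):
    """Return 'YYYY年MM月' labels for every month from start_day to end_day, inclusive."""
    months = (end_day.year - start_day.year) * 12 + end_day.month - start_day.month
    # mon counts months from January of start_day.year, so mon // 12 is the year offset.
    month_range = ['%s年%s月' % (start_day.year + mon // 12, str(mon % 12 + 1).zfill(2))
                   for mon in range(start_day.month - 1, start_day.month + months)]
    return month_range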



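def spider(date_list):
    """Fetch and print the monthly indicators for each 'YYYY年MM月' label in date_list."""
    for i in date_list:
        # '2009年04月' -> 200904; this integer selects the page layout handled below.
        date = int(i.replace('年', '').replace('月', ''))
        if 200904 <= date < 201001:  # April to December 2009 layout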
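            # Headers copied from a browser session. The Cookie is a captured JSESSIONID and
            # may need refreshing; requests computes Content-Length from the form body itself.
            Headers = {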
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'

            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            get_data = r'<tr style=.*?>.*?<td width="277" .*?><font .*?>.*?</font>.*?<p .*?><span .*?>(.*?)</span></p>.*?</font></td>.*?</tr>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            # print(data)
            month = data[0]
            new_investors = data[3]
            end_investors = data[2]
            registered_securities_number = data[4]
            registered_securities_totalparvalue = data[5]
            registered_securities_totalmarketvalue = data[6]
            non_restricted_market_value = data[7]
            total_number_of_transfers = data[9]
            total_amount_of_transfer = data[10]
            total_settlement = data[11]
            net_settlement = data[12]
            print(month, new_investors, end_investors, registered_securities_number,registered_securities_totalparvalue, registered_securities_totalmarketvalue,non_restricted_market_value, total_number_of_transfers, total_amount_of_transfer, total_settlement,net_settlement)
        elif date == 201001:
            Headers = {
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'

            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            get_data = r'<tr style=.*?>.*?<td .*?>.*?<p align="right" .*?><span .*?>(.*?)</span></p>.*?</td>.*?</tr>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            month = data[0]
            new_investors = data[3]
            end_investors = data[2]
            registered_securities_number = data[4]
            registered_securities_totalparvalue = data[5]
            registered_securities_totalmarketvalue = data[6]
            non_restricted_market_value = data[7]
            total_number_of_transfers = data[9]
            total_amount_of_transfer = data[10]
            total_settlement = data[11]
            net_settlement = data[12]
            print(month, new_investors, end_investors, registered_securities_number,registered_securities_totalparvalue, registered_securities_totalmarketvalue,non_restricted_market_value, total_number_of_transfers, total_amount_of_transfer, total_settlement,net_settlement)
        elif 201001 < date <= 201311:  # February 2010 to November 2013 layout
            Headers = {
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'

            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            get_data = r'<tr style="height:13.5pt">.*?<td .*?>.*?<p .*?><span .*?>.*?</span></p>.*?</td>.*?<td .*?>.*?<p .*? align="right"><span .*?>(.*?)</span></p>.*?</td>.*?</tr>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            # print(data)
            month = data[0]
            new_investors = data[3]
            end_investors = data[2]
            registered_securities_number = data[4]
            registered_securities_totalparvalue = data[5]
            registered_securities_totalmarketvalue = data[6]
            non_restricted_market_value = data[7]
            total_number_of_transfers = data[9]
            total_amount_of_transfer = data[10]
            total_settlement = data[11]
            net_settlement = data[12]
            print(month, new_investors, end_investors, registered_securities_number,registered_securities_totalparvalue, registered_securities_totalmarketvalue,non_restricted_market_value, total_number_of_transfers, total_amount_of_transfer, total_settlement,net_settlement)
        elif date == 201312:
            Headers = {
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'

            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            # print(html)
            get_data = r'<tr style="height: 13.5pt;">.*?<td width="19%" .*?>.*?<p align="right" .*?><span .*?>(.*?)</span></p>.*?</td>.*?</tr>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            # print(data)
            month = data[0]
            new_investors = data[3]
            end_investors = data[2]
            registered_securities_number = data[4]
            registered_securities_totalparvalue = data[5].replace('<span>&nbsp; </span>','')
            registered_securities_totalmarketvalue = data[6]
            non_restricted_market_value = data[7].replace('<span>&nbsp;</span>','')
            total_number_of_transfers = data[9]
            total_amount_of_transfer = data[10]
            total_settlement = data[11]
            net_settlement = data[12]
            print(month, new_investors, end_investors, registered_securities_number,registered_securities_totalparvalue, registered_securities_totalmarketvalue,non_restricted_market_value, total_number_of_transfers, total_amount_of_transfer, total_settlement,net_settlement)
        elif date == 201401:
            Headers = {
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'

            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            # print(html)
            get_data = r'<tr style=.*?>.*?<td width="19%" .*?>.*?<p .*? align="right"><span .*?>(.*?)</span></p>.*?</td>.*?</tr>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            # print(data)
            month = data[0]
            new_investors = data[3]
            end_investors = data[2]
            registered_securities_number = data[4]
            registered_securities_totalparvalue = data[5].replace('<span>&nbsp; </span>','')
            registered_securities_totalmarketvalue = data[6]
            non_restricted_market_value = data[7].replace('<span>&nbsp;</span>','')
            total_number_of_transfers = data[9]
            total_amount_of_transfer = data[10]
            total_settlement = data[11]
            net_settlement = data[12]
            print(month, new_investors, end_investors, registered_securities_number,registered_securities_totalparvalue, registered_securities_totalmarketvalue,non_restricted_market_value, total_number_of_transfers, total_amount_of_transfer, total_settlement,net_settlement)
        elif 201401 < date < 201410:  # February to September 2014 layout
            Headers = {
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'

            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            # print(html)
            get_data = r'<tr style=.*?>.*?<td width="20%" .*?>.*?<p .*?><span .*?>(.*?)</span></p>.*?</td>.*?</tr>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            # print(data)
            month = data[0]
            new_investors = data[3]
            end_investors = data[2]
            registered_securities_number = data[4]
            registered_securities_totalparvalue = data[5].replace('<span>&nbsp; </span>', '')
            registered_securities_totalmarketvalue = data[6]
            non_restricted_market_value = data[7].replace('<span>&nbsp;</span>', '')
            total_number_of_transfers = data[9]
            total_amount_of_transfer = data[10]
            total_settlement = data[11]
            net_settlement = data[12]
            print(month, new_investors, end_investors, registered_securities_number,registered_securities_totalparvalue, registered_securities_totalmarketvalue,non_restricted_market_value, total_number_of_transfers, total_amount_of_transfer, total_settlement,net_settlement)
        elif 201410 <= date <= 201412:  # October to December 2014 layout
            Headers = {
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'

            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            # print(html)
            get_data = r'<tr style=.*?>.*?<td width="20%" .*?>.*?<p .*?><span .*?>(.*?)</span></p>.*?</td>.*?</tr>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            # print(data)
            month = data[0]
            new_investors = data[4]
            end_investors = data[2]
            registered_securities_number = data[5]
            registered_securities_totalparvalue = data[6]
            registered_securities_totalmarketvalue = data[7]
            non_restricted_market_value = data[8]
            total_number_of_transfers = data[10]
            total_amount_of_transfer = data[11]
            total_settlement = data[12]
            net_settlement = data[13]
            print(month, new_investors, end_investors, registered_securities_number, registered_securities_totalparvalue,registered_securities_totalmarketvalue, non_restricted_market_value, total_number_of_transfers,total_amount_of_transfer, total_settlement, net_settlement)
        elif 201501 <= date < 201503:  # January and February 2015 layout
            Headers = {
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'

            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            # print(html)
            get_data1 = r'<tr style=.*?>.*?<td .*?>.*?<p .*? align=.*?><span .*?>.*?</span></p>.*?</td>.*?<td .*?>.*?<p .*? align=.*?><span .*?>(.*?)</span></p>.*?</td>.*?<td .*?>.*?<p .*? align=.*?><span .*?>.*?</span></p>.*?</td>.*?</tr>'
            pattern1 = re.compile(get_data1, re.I | re.S | re.M)
            data1 = pattern1.findall(html)
            get_data = r'<tr style=.*?>.*?<td .*?>.*?<p .*? align="left"><span .*?>.*?</span></p>.*?</td>.*?.*?<td .*?>.*?<p .*? align="right"><span .*?>(.*?)</span></p>.*?</td>.*?.*?<td .*?>.*?<p .*? align="right"><span .*?>.*?</span></p>.*?</td>.*?</tr>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            month = data1[0]
            new_investors = data[3]
            end_investors = data[1]
            registered_securities_number = data[4]
            registered_securities_totalparvalue = data[5]
            registered_securities_totalmarketvalue = data[6]
            non_restricted_market_value = data[7]
            total_number_of_transfers = data[9]
            total_amount_of_transfer = data[10]
            total_settlement = data[11]
            net_settlement = data[12]
            print(month, new_investors, end_investors, registered_securities_number, registered_securities_totalparvalue,registered_securities_totalmarketvalue, non_restricted_market_value, total_number_of_transfers,total_amount_of_transfer, total_settlement, net_settlement)
        elif date == 201503:
            Headers = {
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'

            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            # print(html)
            get_data = r'<tr style=.*?>.*?<td .*?>.*?<p .*?><span .*?>.*?</span></p>.*?</td>.*?<td .*?>.*?<p .*? align=.*?><span .*?>(.*?)</span></p>.*?</td>.*?<td .*?>.*?<p .*? align=.*?><span .*?>.*?</span></p>.*?</td>.*?</tr>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            month = data[0]
            new_investors = data[3]
            end_investors = data[1]
            registered_securities_number = data[4]
            registered_securities_totalparvalue = data[5]
            registered_securities_totalmarketvalue = data[6]
            non_restricted_market_value = data[7]
            total_number_of_transfers = data[9]
            total_amount_of_transfer = data[10]
            total_settlement = data[11]
            net_settlement = data[12]
            print(month, new_investors, end_investors, registered_securities_number, registered_securities_totalparvalue,registered_securities_totalmarketvalue, non_restricted_market_value, total_number_of_transfers,total_amount_of_transfer, total_settlement, net_settlement)
        elif 201503 < date <= 201506:  # April to June 2015 layout
            Headers = {
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'

            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            # print(html)
            get_data = r'<tr style=.*?>.*?<td width="100" .*?>.*?<p .*?><span .*?>(.*?)</span></p>.*?</td>.*?</tr>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            # print(data)
            month = data[0]
            new_investors = data[1]
            end_investors = data[2]
            registered_securities_number = data[5]
            registered_securities_totalparvalue = data[6]
            registered_securities_totalmarketvalue = data[7]
            non_restricted_market_value = data[8]
            total_number_of_transfers = data[10]
            total_amount_of_transfer = data[11]
            total_settlement = data[12]
            net_settlement = data[13]
            print(month, new_investors, end_investors, registered_securities_number,registered_securities_totalparvalue, registered_securities_totalmarketvalue,non_restricted_market_value, total_number_of_transfers, total_amount_of_transfer, total_settlement,net_settlement)
        elif date == 201507:
            Headers = {
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'

            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            # print(html)
            get_data = r'<tr style=.*?>.*?<td width="158" .*?>.*?<p .*?><span .*?>(.*?)</span></p>.*?</td>.*?</tr>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            # print(data)
            month = data[0]
            new_investors = data[1]
            end_investors = data[2]
            registered_securities_number = data[3]
            registered_securities_totalparvalue = data[4]
            registered_securities_totalmarketvalue = data[5]
            non_restricted_market_value = data[6]
            total_number_of_transfers = data[8]
            total_amount_of_transfer = data[9]
            total_settlement = data[10]
            net_settlement = data[11]
            print(month, new_investors, end_investors, registered_securities_number,registered_securities_totalparvalue, registered_securities_totalmarketvalue,non_restricted_market_value, total_number_of_transfers, total_amount_of_transfer, total_settlement,net_settlement)
        elif date == 201508:
            Headers = {
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'

            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            # print(html)
            get_data = r'<tr style=.*?>.*?<td width="100" .*?>.*?<p .*?><span .*?>(.*?)</span></p>.*?</td>.*?</tr>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            # print(data)
            month = data[0]
            new_investors = data[1]
            end_investors = data[2]
            registered_securities_number = data[3]
            registered_securities_totalparvalue = data[4]
            registered_securities_totalmarketvalue = data[5]
            non_restricted_market_value = data[6]
            total_number_of_transfers = data[8]
            total_amount_of_transfer = data[9]
            total_settlement = data[10]
            net_settlement = data[11]
            print(month, new_investors, end_investors, registered_securities_number,registered_securities_totalparvalue, registered_securities_totalmarketvalue,non_restricted_market_value, total_number_of_transfers, total_amount_of_transfer, total_settlement,net_settlement)
        elif date == 201509:
            Headers = {
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'

            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            # print(html)
            # September 2015: cells are matched with width="158".
            get_data = r'<tr style=.*?>.*?<td width="158" .*?>.*?<p .*?><span .*?>(.*?)</span></p>.*?</td>.*?</tr>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            # print(data)
            month = data[0]
            new_investors = data[1]
            end_investors = data[2]
            registered_securities_number = data[3]
            registered_securities_totalparvalue = data[4]
            registered_securities_totalmarketvalue = data[5]
            non_restricted_market_value = data[6]
            total_number_of_transfers = data[8]
            total_amount_of_transfer = data[9]
            total_settlement = data[10]
            net_settlement = data[11]
            print(month, new_investors, end_investors, registered_securities_number,registered_securities_totalparvalue, registered_securities_totalmarketvalue,non_restricted_market_value, total_number_of_transfers, total_amount_of_transfer, total_settlement,net_settlement)
        elif date > 201509 and date <= 201511:
            Headers = {
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'

            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            # print(html)
            # October and November 2015: cells are matched with width="26%".
            get_data = r'<tr style=.*?>.*?<td width="26%" .*?>.*?<p .*?><span .*?>(.*?)</span></p>.*?</td>.*?</tr>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            # print(data)
            month = data[0]
            new_investors = data[1]
            end_investors = data[2]
            registered_securities_number = data[3]
            registered_securities_totalparvalue = data[4]
            registered_securities_totalmarketvalue = data[5]
            non_restricted_market_value = data[6]
            total_number_of_transfers = data[8]
            total_amount_of_transfer = data[9]
            total_settlement = data[10]
            net_settlement = data[11]
            print(month, new_investors, end_investors, registered_securities_number,registered_securities_totalparvalue, registered_securities_totalmarketvalue,non_restricted_market_value, total_number_of_transfers, total_amount_of_transfer, total_settlement,net_settlement)
        elif date == 201512:
            Headers = {
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'

            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            # print(html)
            # December 2015: cells are matched with width="27%".
            get_data = r'<tr style=.*?>.*?<td width="27%" .*?>.*?<p .*?><span .*?>(.*?)</span></p>.*?</td>.*?</tr>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            # print(data)
            month = data[0]
            new_investors = data[1]
            end_investors = data[2]
            registered_securities_number = data[3]
            registered_securities_totalparvalue = data[4]
            registered_securities_totalmarketvalue = data[5]
            non_restricted_market_value = data[6]
            total_number_of_transfers = data[8]
            total_amount_of_transfer = data[9]
            total_settlement = data[10]
            net_settlement = data[11]
            print(month, new_investors, end_investors, registered_securities_number,registered_securities_totalparvalue, registered_securities_totalmarketvalue,non_restricted_market_value, total_number_of_transfers, total_amount_of_transfer, total_settlement,net_settlement)
        elif date == 201601:
            Headers = {
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'

            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            # print(html)
            # January 2016: cells are matched with width="29%".
            get_data = r'<tr style=.*?>.*?<td width="29%" .*?>.*?<p .*?><span .*?>(.*?)</span></p>.*?</td>.*?</tr>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            # print(data)
            month = data[0]
            new_investors = data[1]
            end_investors = data[2]
            registered_securities_number = data[3]
            registered_securities_totalparvalue = data[4]
            registered_securities_totalmarketvalue = data[5]
            non_restricted_market_value = data[6]
            total_number_of_transfers = data[8]
            total_amount_of_transfer = data[9]
            total_settlement = data[10]
            net_settlement = data[11]
            print(month, new_investors, end_investors, registered_securities_number,registered_securities_totalparvalue, registered_securities_totalmarketvalue,non_restricted_market_value, total_number_of_transfers, total_amount_of_transfer, total_settlement,net_settlement)
        elif date > 201601 and date <= 201607:
            Headers = {
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'

            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            # print(html)
            # February to July 2016: cells are matched with width="26%".
            get_data = r'<tr style=.*?>.*?<td width="26%" .*?>.*?<p .*?><span .*?>(.*?)</span></p>.*?</td>.*?</tr>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            # print(data)
            month = data[0]
            new_investors = data[1]
            end_investors = data[2]
            registered_securities_number = data[3]
            registered_securities_totalparvalue = data[4]
            registered_securities_totalmarketvalue = data[5]
            non_restricted_market_value = data[6]
            total_number_of_transfers = data[8]
            total_amount_of_transfer = data[9]
            total_settlement = data[10]
            net_settlement = data[11]
            print(month, new_investors, end_investors, registered_securities_number,registered_securities_totalparvalue, registered_securities_totalmarketvalue,non_restricted_market_value, total_number_of_transfers, total_amount_of_transfer, total_settlement,net_settlement)
        elif date == 201608:
            Headers = {
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'

            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            # print(html)
            # August 2016: cells are matched by their trailing width="142" noWrap="" attributes.
            get_data = r'<tr style=.*?>.*?<td .*? width="142" noWrap="">.*?<p .*?><span .*?>(.*?)</span></p>.*?</td>.*?</tr>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            # print(data)
            month = data[0]
            new_investors = data[1]
            end_investors = data[2]
            # These values carry a padding <span> of non-breaking spaces; remove it before stripping whitespace.
            registered_securities_number = data[3].replace('<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>','').strip()
            registered_securities_totalparvalue = data[4].replace('<span>&nbsp;&nbsp;&nbsp; </span>','').strip()
            registered_securities_totalmarketvalue = data[5].replace('<span>&nbsp;&nbsp;&nbsp; </span>','').strip()
            non_restricted_market_value = data[6].replace('<span>&nbsp;&nbsp;&nbsp; </span>','').strip()
            total_number_of_transfers = data[8]
            total_amount_of_transfer = data[9]
            total_settlement = data[10]
            net_settlement = data[11]
            print(month, new_investors, end_investors, registered_securities_number,registered_securities_totalparvalue, registered_securities_totalmarketvalue,non_restricted_market_value, total_number_of_transfers, total_amount_of_transfer, total_settlement,net_settlement)
        elif date >= 201609 and date <= 201610:
            Headers = {
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'

            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            # print(html)
            # September and October 2016: cells are matched with width="26%".
            get_data = r'<tr style=.*?>.*?<td width="26%" .*?>.*?<p .*?><span .*?>(.*?)</span></p>.*?</td>.*?</tr>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            # print(data)
            month = data[0]
            new_investors = data[1]
            end_investors = data[2]
            # Remove the &nbsp; entities and stray <span> tags left inside the captured values.
            registered_securities_number = data[3].replace('&nbsp;','').replace('<span>','').replace('</span>','').strip()
            registered_securities_totalparvalue = data[4].replace('&nbsp;','').replace('<span>','').replace('</span>','').strip()
            registered_securities_totalmarketvalue = data[5].replace('&nbsp;','').replace('<span>','').replace('</span>','').strip()
            non_restricted_market_value = data[6].replace('&nbsp;','').replace('<span>','').replace('</span>','').strip()
            total_number_of_transfers = data[8]
            total_amount_of_transfer = data[9]
            total_settlement = data[10]
            net_settlement = data[11]
            print(month, new_investors, end_investors, registered_securities_number,registered_securities_totalparvalue, registered_securities_totalmarketvalue,non_restricted_market_value, total_number_of_transfers, total_amount_of_transfer, total_settlement,net_settlement)
        elif date == 201611:
            Headers = {
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'

            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            # print(html)
            # November 2016: the value is wrapped in <font> inside a right-aligned <p>.
            get_data = r'<tr style=.*?>.*?<td width="26%" .*?><span style=.*?><font .*?>.*?</font></span>.*?<p align="right" .*?><span style=.*?><font .*?>(.*?)</font></span></p>.*?<span style=.*?><font .*?>.*?</font></span></td>.*?</tr>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            # print(data)
            month = data[0]
            new_investors = data[1]
            end_investors = data[2]
            registered_securities_number = data[3].replace('&nbsp;','').replace('<span style="font-size: 10.5pt;">','').replace('</span>','').strip()
            registered_securities_totalparvalue = data[4].replace('&nbsp;','').replace('<span style="font-size: 10.5pt;">','').replace('</span>','').strip()
            registered_securities_totalmarketvalue = data[5].replace('&nbsp;','').replace('<span style="font-size: 10.5pt;">','').replace('</span>','').strip()
            non_restricted_market_value = data[6]
            total_number_of_transfers = data[8]
            total_amount_of_transfer = data[9]
            total_settlement = data[10]
            net_settlement = data[11]
            print(month, new_investors, end_investors, registered_securities_number,registered_securities_totalparvalue, registered_securities_totalmarketvalue,non_restricted_market_value, total_number_of_transfers, total_amount_of_transfer, total_settlement,net_settlement)
        elif date == 201612:
            Headers = {
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'

            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            # print(html)
            # December 2016: the value sits in a <span> inside a right-aligned <p>.
            get_data = r'<tr style=.*?>.*?<td width="26%" .*?><span style=.*?>.*?</span>.*?<p align="right" .*?><span .*?>(.*?)</span></p>.*?<span style=.*?>.*?</span></td>.*?</tr>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            # print(data)
            month = data[0]
            new_investors = data[1]
            end_investors = data[2].replace('<span>','').strip()
            registered_securities_number = data[3].replace('<span>','').replace('&nbsp;','').replace('<span style="font-size: 10.5pt;">','').replace('</span>','').strip()
            registered_securities_totalparvalue = data[4].replace('<span>','').replace('&nbsp;','').replace('<span style="font-size: 10.5pt;">','').replace('</span>','').strip()
            registered_securities_totalmarketvalue = data[5].replace('<span>','').replace('&nbsp;','').replace('<span style="font-size: 10.5pt;">','').replace('</span>','').strip()
            non_restricted_market_value = data[6]
            total_number_of_transfers = data[8]
            total_amount_of_transfer = data[9]
            total_settlement = data[10]
            net_settlement = data[11]
            print(month, new_investors, end_investors, registered_securities_number,registered_securities_totalparvalue, registered_securities_totalmarketvalue,non_restricted_market_value, total_number_of_transfers, total_amount_of_transfer, total_settlement,net_settlement)
        elif date == 201701:
            Headers = {
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'
            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            # print(html)
            # January 2017: two capture groups, so findall returns a list of (month, new_investors) tuples.
            get_data = r'<tbody>.*?<tr style=.*?>.*?<td .*?>.*?</td>.*?<td .*?>.*?<p .*?><span style=.*?>(.*?)</span></p>.*?</td>.*?<td .*?>.*?</td>.*?</tr>.*?<tr style=.*?>.*?<td .*?>.*?</td>.*?<td .*?>.*?<p .*?><span style=.*?>(.*?)</span></p>.*?</td>.*?<td .*?>.*?</td>.*?</tr>.*?</tbody>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            month = data[0][0]
            new_investors = data[0][1].replace('<span>', '').replace('</span>', '').replace('&nbsp;', '').strip()
            # Rows whose value follows a nested <span>...</span>.
            get_data_one = r'<tr .*?>.*?<td .*?>.*?<p .*?>.*?<span .*?>.*?</span>.*?</p>.*?</td>.*?<td .*?>.*?<p .*?>.*?<span style=.*?>.*?<span>.*?</span>(.*?)</span>.*?</p>.*?</td>.*?</tr>'
            pattern_one = re.compile(get_data_one, re.I | re.S | re.M)
            # Drop the first and last match and keep only the rows in between.
            data_one = pattern_one.findall(html)[1:-1]
            # print(data_one)
            end_investors = data_one[0]
            registered_securities_number = data_one[1]
            registered_securities_totalparvalue = data_one[2]
            registered_securities_totalmarketvalue = data_one[3]
            non_restricted_market_value = data_one[4]
            total_number_of_transfers = data_one[5]
            total_amount_of_transfer = data_one[6]
            # Assumes the settlement total is the match right after the transfer amount.
            total_settlement = data_one[7]
            # Net settlement is taken as the second-to-last match of this cell-level pattern.
            get_data_second = r'<td .*?>.*?<p .*?>.*?<span .*?>.*?<span>.*?</span>.*?<span>.*?</span>(.*?)</span>.*?</p>.*?</td>'
            pattern_second = re.compile(get_data_second, re.I | re.S | re.M)
            net_settlement = pattern_second.findall(html)[-2]
            print(month,new_investors,end_investors,registered_securities_number,registered_securities_totalparvalue,registered_securities_totalmarketvalue,non_restricted_market_value,total_number_of_transfers,total_amount_of_transfer,total_settlement,net_settlement)
        elif date > 201701 and date <= 201705:
            Headers = {
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'
            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            # print(html)
            # February to May 2017: only the month is read from this two-group pattern.
            get_data = r'<tbody>.*?<tr style=.*?>.*?<td .*?>.*?</td>.*?<td .*?>.*?<p .*?><span style=.*?>(.*?)</span></p>.*?</td>.*?<td .*?>.*?</td>.*?</tr>.*?<tr style=.*?>.*?<td .*?>.*?</td>.*?<td .*?>.*?<p .*?><span style=.*?>(.*?)</span></p>.*?</td>.*?<td .*?>.*?</td>.*?</tr>.*?</tbody>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            month = data[0][0]
            # Every other field comes from rows whose value follows a nested <span>...</span>.
            get_data1 = r'<tr .*?>.*?<td .*?>.*?<p .*?>.*?<span .*?>.*?</span>.*?</p>.*?</td>.*?<td .*?>.*?<p .*?>.*?<span style=.*?>.*?<span>.*?</span>(.*?)</span>.*?</p>.*?</td>.*?</tr>'
            pattern1 = re.compile(get_data1, re.I | re.S | re.M)
            # Skip the first match.
            data1 = pattern1.findall(html)[1:]
            new_investors = data1[0]
            end_investors = data1[1]
            registered_securities_number = data1[2]
            registered_securities_totalparvalue = data1[3]
            registered_securities_totalmarketvalue = data1[4]
            non_restricted_market_value = data1[5]
            total_number_of_transfers = data1[6]
            total_amount_of_transfer = data1[7]
            total_settlement = data1[8]
            net_settlement = data1[9]
            print(month,new_investors,end_investors,registered_securities_number,registered_securities_totalparvalue,registered_securities_totalmarketvalue,non_restricted_market_value,total_number_of_transfers,total_amount_of_transfer,total_settlement,net_settlement)
        elif date >= 201706 and date < 201709:
            Headers = {
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'

            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            # June to August 2017: three-cell rows; the group captures the middle cell's <span> text.
            get_data = r'<tr style=.*?>.*?<td .*?>.*?<p .*?><span style=.*?>.*?</span></p>.*?</td>.*?<td .*?>.*?<p .*?><span style=.*?>(.*?)</span></p>.*?</td>.*?<td .*?>.*?<p .*?><span style=.*?>.*?</span></p>.*?</td>.*?</tr>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            month = data[0]
            new_investors = data[1]
            end_investors = data[2]
            registered_securities_number = data[3].replace('<span>', '').replace('</span>', '').replace('&nbsp;', '').strip()
            registered_securities_totalparvalue = data[4].replace('<span>', '').replace('</span>', '').replace('&nbsp;', '').strip()
            registered_securities_totalmarketvalue = data[5].replace('<span>', '').replace('</span>', '').replace('&nbsp;', '').strip()
            non_restricted_market_value = data[6].replace('<span>', '').replace('</span>', '').replace('&nbsp;', '').strip()
            # No index is skipped in this layout: the remaining fields run from 7 to 10.
            total_number_of_transfers = data[7]
            total_amount_of_transfer = data[8]
            total_settlement = data[9]
            net_settlement = data[10]
            print(month, new_investors, end_investors, registered_securities_number,registered_securities_totalparvalue, registered_securities_totalmarketvalue,non_restricted_market_value, total_number_of_transfers, total_amount_of_transfer, total_settlement,net_settlement)
        elif date >= 201709:
            Headers = {
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
                'Accept-Encoding': 'gzip, deflate',
                'Accept-Language': 'zh-CN,zh;q=0.9',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Content-Length': '122',
                'Content-Type': 'application/x-www-form-urlencoded',
                'Cookie': 'JSESSIONID=00005q0oN93pCb5mAK5eZQGAa7t:1amj63rte',
                'Host': 'www.chinaclear.cn',
                'Origin': 'http://www.chinaclear.cn',
                'Referer': 'http://www.chinaclear.cn/cms-search/monthview.action?action=china',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36'

            }
            data = {
                'riqi': '{0}'.format(i),
                'channelFidStr': 'e990411f19544e46be84333c25b63de6',
                'channelIdStr': 'bd095cc08f744c089b159a3bb744b9d0'
            }
            url = 'http://www.chinaclear.cn/cms-search/monthview.action?action=china'
            response = requests.post(url, headers=Headers, data=data)
            response.encoding = 'utf-8'
            html = response.text
            # print(html)
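            # From 2017-09 on, each row has two <td> cells; capture the second (value) cell.
            get_data = r'<tr .*?>.*?<td .*?>.*?<p .*?>.*?<span .*?>.*?</span>.*?</p>.*?</td>.*?<td .*?>.*?<p .*?>.*?<span .*?>(.*?)</span>.*?</p>.*?</td>.*?</tr>'
            pattern = re.compile(get_data, re.I | re.S | re.M)
            data = pattern.findall(html)
            # Skip months whose table does not yield all 11 values instead of raising IndexError.
            if len(data) < 11:
                print(i, 'unexpected table layout, skipped')
                continue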
            # print(data)
            month = data[0]
            new_investors = data[1]
            end_investors = data[2]
            registered_securities_number = data[3].replace('<span>&ensp;','')
            registered_securities_totalparvalue = data[4].strip().replace('<span style="font-size:10.5pt;font-family:宋体;color:#424242;">','').replace('<span style="font-size: 10.5pt; font-family: 宋体; color: rgb(66, 66, 66);">','').replace('<span style="font-size:10.5pt;font-family:            宋体;color:#424242;">','').replace('            <span style="font-size:            9.0pt;font-family:宋体;color:#424242;">','')
            registered_securities_totalmarketvalue = data[5]
            non_restricted_market_value = data[6]
            total_number_of_transfers = data[7]
            total_amount_of_transfer = data[8]
            total_settlement = data[9]
            net_settlement = data[10]
            print(month, new_investors, end_investors, registered_securities_number,registered_securities_totalparvalue, registered_securities_totalmarketvalue,non_restricted_market_value, total_number_of_transfers, total_amount_of_transfer, total_settlement,net_settlement)
        else:
            # Months outside the handled date ranges are skipped.
            pass


if __name__ == '__main__':
    date_list =  get_month_range(datetime.date(2005, 1, 31),datetime.date(2020,9,1))
    spider(date_list)
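
The chained `.replace()` calls in `spider` only strip the specific `<span>` variants seen so far, so any new style string on the site would leak into the output. A minimal sketch of a more general cleanup, assuming the captured cells contain only inline tags, HTML entities and whitespace. `clean_cell`, `month_key` and `unpack_fields` are hypothetical helper names and are not part of the original script:

```python
import html
import re

# Field names in the order spider() reads them from the matched cells.
FIELD_NAMES = [
    'month', 'new_investors', 'end_investors',
    'registered_securities_number',
    'registered_securities_totalparvalue',
    'registered_securities_totalmarketvalue',
    'non_restricted_market_value',
    'total_number_of_transfers', 'total_amount_of_transfer',
    'total_settlement', 'net_settlement',
]

TAG_RE = re.compile(r'<[^>]+>')


def clean_cell(raw):
    """Drop inline tags, decode entities such as &nbsp;, and remove all whitespace."""
    text = TAG_RE.sub('', raw)
    text = html.unescape(text)
    # str.split() with no arguments also splits on non-breaking and en spaces.
    return ''.join(text.split())


def month_key(label):
    """Turn a label like '2009年04月' into the integer 200904, as spider() does inline."""
    return int(label.replace('年', '').replace('月', ''))


def unpack_fields(cells):
    """Map the captured cells to named fields; return None if fewer than 11 were found."""
    if len(cells) < len(FIELD_NAMES):
        return None
    return {name: clean_cell(cell) for name, cell in zip(FIELD_NAMES, cells)}
```

Inside `spider`, `unpack_fields(pattern.findall(html))` could then replace the eleven indexed assignments in each branch, with a `None` result meaning the month should be skipped.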
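### Answer 1:

Python is a powerful programming language with many uses, one of which is web scraping. tushare is a stock data interface that provides market data for stocks, indices, funds, futures and more.

Fetching stock data with Python and tushare is straightforward. First install the tushare library, then import it into your Python environment.

tushare exposes several functions, such as get_h_data() for historical stock data, get_today_ticks() for today's trade details, and get_tick_data() for tick-level data.

For example, daily historical data for a stock can be fetched with get_hist_data():

```python
import tushare as ts

# Stock code and date range (the legacy tushare interface expects YYYY-MM-DD dates)
code = '601318'
start_date = '2021-01-01'
end_date = '2021-06-30'

# Call the tushare function
df = ts.get_hist_data(code, start=start_date, end=end_date)

# Inspect the first rows
print(df.head())
```

This fetches historical data for Ping An Insurance (stock code 601318) from January 1, 2021 to June 30, 2021. The result is a pandas DataFrame, so the usual data processing and analysis tools apply. For example, you can compute a stock's mean, maximum and minimum price over a period, or plot trend charts and other figures.

In short, tushare makes it easy to fetch stock data, and processing that data with Python's analysis tools is a useful basis for quantitative investing and financial data analysis.

### Answer 2:

Python is a widely used programming language suited to many kinds of projects. Web scraping is one of its important applications and helps us collect and analyze information from the web. Tushare is a Python stock data API that helps us fetch data from the stock market.

With Python and Tushare we can write a simple program that retrieves many kinds of market data, such as real-time quotes, historical prices and company fundamentals. Concretely, we can fetch historical prices with Tushare and then analyze and visualize them in Python to better understand market trends and changes.

Using Python and Tushare for stock data has several advantages. First, Python is easy to learn and use, and is productive and flexible. Second, Tushare is a rich stock data API that gives quick access to many types of data. In addition, both Python and Tushare are open source and free to use, which keeps the cost of collecting stock data very low.

In summary, Python combined with Tushare offers a flexible, efficient and low-cost way to obtain stock market data. That data may be industry trends and fundamentals that support investment decisions, or real-time and historical prices that support trading.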
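
The summary statistics mentioned in Answer 1 (mean, maximum, minimum over a period) can be sketched without any network access. The following is a minimal plain-Python illustration; `summarize` is a hypothetical helper, and the prices are made-up values, not real market data. With a DataFrame returned by tushare, the same numbers would come from the relevant price column, whose name depends on what the interface returns.

```python
from statistics import mean


def summarize(prices):
    """Return the mean, maximum and minimum of a sequence of prices."""
    return {
        'mean': mean(prices),
        'max': max(prices),
        'min': min(prices),
    }


# Illustrative closing prices only; these are not real market data.
closes = [10.0, 10.5, 9.5, 11.0]
summary = summarize(closes)
print(summary)
```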
