Counting Subject Citation Frequencies from a Subject Citation Table in MySQL


棒棒糖one


Categories: python mysql

Article link: https://blog.csdn.net/weixin_43332500/article/details/95623404

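This post takes the data in a subject cross-citation table, counts the citations, and turns the result into a subject citation-count matrix.
It is similar to the earlier post on pivoting columns into rows to build a co-occurrence matrix. That post used MySQL as the processing tool, whereas this one does the processing in Python.
The format of table2 is shown below. The field re_sub is the subject of the cited reference and ar_sb is the subject of the citing article; these two fields are all that is needed.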
[Image: format of table2]
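To make the target matrix concrete, here is a minimal in-memory sketch of the counting step. The sample rows and subject names are hypothetical and only mimic the (ar_sb, re_sub) shape of table2; no database is involved.

```python
from collections import Counter

# Hypothetical sample rows shaped like table2: (ar_sb, re_sub)
rows = [
    ("Physics", "Mathematics"),
    ("Physics", "Mathematics"),
    ("Physics", "Physics"),
    ("Chemistry", "Physics"),
]
# Hypothetical subject list; its order fixes the column order of the matrix
subjects = ["Physics", "Mathematics", "Chemistry"]


def citation_matrix(rows, subjects):
    """Return {citing subject: [count per cited subject, in `subjects` order]}."""
    pair_counts = Counter(rows)  # counts each (ar_sb, re_sub) pair
    return {a: [pair_counts[(a, b)] for b in subjects] for a in subjects}


matrix = citation_matrix(rows, subjects)
for subject, counts in matrix.items():
    print(subject, counts)
```

Each key is a citing subject (ar_sb) and each list position is a cited subject (re_sub), which is the same row layout the Excel output uses.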
The script below counts the citations between each pair of subjects and writes the counts to an Excel sheet in matrix form.

import pymysql.cursors
import logging
logging.basicConfig(filename='log.log',
                    format='%(asctime)s -%(name)s-%(levelname)s:%(message)s',
                    datefmt='%Y-%m-%d %H:%M:%S %p',
                    level=logging.DEBUG)


def getsubjectcount(connection,subjectA,subjectB) :
	with connection.cursor() as cursor:
		# Count the rows where an article in subjectA cites a reference in subjectB
		sql = "select count(*) as cnt from `table2` where ar_sb=%s and re_sub=%s;"
		cursor.execute(sql, (subjectA,subjectB))
		result = cursor.fetchone()
		# DictCursor returns a dict keyed by column name, so read the alias
		return result['cnt']


# Read the subject names from column A of the Excel sheet and return them as a list
def readDataByExcle(inPutFile,Sheet):
	from openpyxl import load_workbook
	wb = load_workbook(inPutFile)
	sheet = wb[Sheet]
	dataArray = []
	for i in range(1,sheet.max_row+1):
		subject = sheet["A"+str(i)].value
		# Skip empty cells
		if subject is None :
			continue
		dataArray.append(subject)
	print('Finished reading data!')
	return dataArray


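'''
inputData: dict mapping each subject name to its list of citation counts (one matrix row per key)
outPutFile: output file name, e.g. 'data.xlsx'
'''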
def writeDataToExcleFile(inputData,outPutFile):
	from openpyxl import Workbook
	wb = Workbook()
	sheet = wb.active
	sheet.title = "Sheet1"
	i= 1
	for key in inputData.keys():
		sheet.cell(i,1).value = key
		j = 2
		for item in inputData[key]:
			sheet.cell(i,j).value =item
			j = j+1
		i = i+1
	wb.save(outPutFile)
	print('Finished writing data!')


def doJob():
	connection = pymysql.connect(host='localhost',
                             user='root',
                             password='root',
                             db='db2',
                             charset='utf8mb4',
                             cursorclass=pymysql.cursors.DictCursor)
	try:
		subjectdata = readDataByExcle('F:/GEV/lda_dir/subject.xlsx','Sheet1')
		data = {}
		for subjectA in subjectdata:
			print(subjectA)
			data[subjectA] = []
			for subjectB in subjectdata:
				count = getsubjectcount(connection,subjectA,subjectB)
				data[subjectA].append(count)
			logging.info('data=>'+subjectA+"=>"+str(data[subjectA]))
		writeDataToExcleFile(data,'F:/GEV/lda_dir/subjectout11.xlsx')   
	finally:
		connection.close()


if __name__ == '__main__':
	doJob()
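The script issues one query per subject pair, so N subjects cost N×N round trips. As a possible alternative (not the author's method), the sketch below fetches every pair count with a single GROUP BY query and fills the matrix in Python. The function names fetch_pair_counts and build_matrix are hypothetical, and the connection is assumed to use pymysql.cursors.DictCursor as in the script.

```python
def fetch_pair_counts(connection):
    """Fetch the counts of all (ar_sb, re_sub) pairs in a single query.

    Assumes `connection` was opened with pymysql.cursors.DictCursor,
    so each fetched row is a dict keyed by column name.
    """
    sql = ("select ar_sb, re_sub, count(*) as cnt "
           "from `table2` group by ar_sb, re_sub;")
    with connection.cursor() as cursor:
        cursor.execute(sql)
        rows = cursor.fetchall()
    return {(row['ar_sb'], row['re_sub']): row['cnt'] for row in rows}


def build_matrix(pair_counts, subjects):
    """Return {subjectA: [count of A citing B for each B in subjects]}.

    Pairs that never occur in the table are filled with 0.
    """
    return {a: [pair_counts.get((a, b), 0) for b in subjects]
            for a in subjects}
```

The dict returned by build_matrix has the same shape as the data dict built in doJob, so it could be passed to writeDataToExcleFile unchanged.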

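The subject names are stored in an Excel file. The result is shown below; the header row was added manually.
[Image: resulting citation-count matrix in Excel]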
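The header row does not have to be added by hand. A minimal sketch, assuming the data dict has the shape built in doJob (subject name mapped to a list of counts in subject order); the helper names matrix_with_header and write_rows_to_excel are hypothetical.

```python
def matrix_with_header(data, subjects):
    """Return the matrix as a list of rows, preceded by a header row.

    The top-left cell is left empty (None); the remaining header cells
    are the subject names in the same order as the count columns.
    """
    rows = [[None] + list(subjects)]
    for subject in subjects:
        rows.append([subject] + list(data[subject]))
    return rows


def write_rows_to_excel(rows, out_path):
    """Write the rows to a new workbook, one list per sheet row."""
    from openpyxl import Workbook  # imported lazily, as in the script above
    wb = Workbook()
    sheet = wb.active
    for row in rows:
        sheet.append(row)
    wb.save(out_path)
```

Usage would be along the lines of write_rows_to_excel(matrix_with_header(data, subjectdata), 'data.xlsx'), where 'data.xlsx' is a placeholder file name.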
