前面两篇文章《python调用openpyxl包操作excel文件》和《python调用pymssql包操作SqlServer数据库》学习了Python操作excel和SqlServer的基本方法,同时在《python统计csdn个人博客文章清单及字数》中已经将抓取的文章数据保存到了excel文件中,本文基于前面的学习成果将excel文件中的数据保存到SqlServer数据库中。
首先先创建数据库表,基于下列语句创建数据库保存博客文章的类型、发布时间、名称、url和字数信息。
CREATE TABLE [dbo].[BlogArticleInfo](
[ID] [int] IDENTITY(1,1) NOT NULL,
[Type] [nvarchar](50) NULL,
[ReleaseDate] [datetime] NULL,
[ArticleName] [nvarchar](500) NOT NULL,
[ArticleUrl] [nvarchar](500) NOT NULL,
[ArticleWordCount] [int] NOT NULL,
CONSTRAINT [PK_BlogArticle] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
本文用到的python操作主要包括excel文件的读取、SqlServer数据库的连接和关闭、SqlServer数据库的插入操作,这些操作在前面文章中都已经学习并测试过,在此不再赘述。代码不多,直接将代码及执行结果列到下面。唯一需要说明的是没有找到直接读取excel文件每行数据的方法,就用了中比较简陋的方式将每行数据转成元组。
# coding=gbk
from openpyxl import load_workbook
import pymssql
path = r"E:\MyPrograms\Python\csdn\result.xlsx"
wb = load_workbook(path)
sheet=wb.active
articledata=[]
for row in sheet.rows:
record=[]
for cell in row:
record.append(cell.value)
articledata.append(tuple(record))
server = 'XXX.XXX.XXX.XXX'
user = 'XXXXXXX'
password = 'XXXXXXX'
db='BlogData'
conn = pymssql.connect(server, user, password, db)
cursor = conn.cursor()
cursor.executemany(
"INSERT INTO BlogArticleInfo(Type,ReleaseDate,ArticleName,ArticleUrl,ArticleWordCount)VALUES(%s,%s,%s,%s,%d)",articledata)
conn.commit()
conn.close()
参考文献:
[1]https://openpyxl.readthedocs.io/en/stable/