Inserting non-duplicate rows from Python into a table with an auto-increment id (rows identical in every column except id count as duplicates)

Latest recommended article published 2024-07-15 02:48:00

晚来舟Mango

Category column: 实习小助手 | Tags: database, sql, mysql

Article link: https://blog.csdn.net/wanlaizhou/article/details/127318819

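Problem:
I have a table whose primary key and only index is id, an auto-increment int. My code scans for data every ten minutes and adds the rows created in that window to the database. Because the scan itself takes time, scanning exactly ten minutes of data misses some rows, so for now each run scans eleven minutes of data, which means a small number of duplicate rows reach the database. There are two ways to handle this: (1) check for an identical row at insert time, or (2) insert everything and, after each scan, clean the table by deleting duplicates. I chose option 1 for now, because option 2 consumes more resources and adds load, and importing duplicates in the first place is logically unsound. Checking for duplicates at insert time has its own difficulty, though: since the primary key is the auto-increment id, the only way to detect a duplicate is to compare every field other than id, and a row counts as a duplicate only when all of them match. The approach I settled on: for each row, first use exists to check whether an identical row is already present; if it is not, insert the row, otherwise skip it. This also has some cost, but overall it is the better option. The code follows.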

Python code:

# Write the SQL below as a Python string (an f-string works; driver placeholders are safer because they avoid SQL injection)
# id is left out of both the insert and the comparison, so auto-increment assigns it
insert ignore into `table`(type,ip,time,content,name,metric,`value`,unit)
select '1','1','1','1','1','1','1','1'
FROM dual
where not exists
(select type,ip,time,content,name,metric,`value`,unit
from `table`
where type='1' and ip='1' and time='1' and content='1' and name='1' and metric='1' and `value`='1' and unit='1')
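
To make "write it as a Python string" concrete, here is a minimal sketch that builds the statement with `%s` placeholders instead of pasting values into an f-string. The helper names `build_insert_if_absent` and `params_for` are hypothetical, and the column list is taken from the statement above. It assumes the table and column names come from your own code, never from user input, since identifiers cannot be passed as parameters.

```python
# Hypothetical helper: builds "insert ... select ... where not exists" for MySQL.
# Only the identifiers are formatted into the string; values stay as %s placeholders.
COLUMNS = ["type", "ip", "time", "content", "name", "metric", "value", "unit"]


def build_insert_if_absent(table, columns):
    """Return an INSERT statement that skips rows already present (id ignored)."""
    quoted = [f"`{c}`" for c in columns]
    col_list = ",".join(quoted)
    placeholders = ",".join(["%s"] * len(columns))
    conditions = " and ".join(f"{c}=%s" for c in quoted)
    return (
        f"insert into `{table}`({col_list}) "
        f"select {placeholders} from dual "
        f"where not exists (select 1 from `{table}` where {conditions})"
    )


def params_for(row):
    """Each value is needed twice: once for the SELECT list, once for the WHERE clause."""
    values = tuple(row[c] for c in COLUMNS)
    return values + values


SQL = build_insert_if_absent("table", COLUMNS)
```

With a DB-API driver that uses the `%s` parameter style (PyMySQL is one example), the call would look like `cursor.execute(SQL, params_for(row))`. One caveat: a column holding NULL never compares equal with `=`, so such rows would be inserted again; MySQL's NULL-safe operator `<=>` is the usual replacement if your columns are nullable.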

SQL code:

-- id is left out of both the insert and the comparison, so auto-increment assigns it
insert ignore into `table`(type,ip,time,content,name,metric,`value`,unit)
select '1','1','1','1','1','1','1','1'
FROM dual
where not exists
(select type,ip,time,content,name,metric,`value`,unit
from `table`
where type='1' and ip='1' and time='1' and content='1' and name='1' and metric='1' and `value`='1' and unit='1')
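
The behaviour with overlapping scan windows can be checked end to end with a small script. This is only a sketch under stated assumptions: it uses Python's built-in `sqlite3` as a stand-in for MySQL so it runs without a server (SQLite has no `dual`, so the `FROM dual` clause is dropped, and placeholders are `?` rather than `%s`). The sample rows and the helper names `insert_if_absent` and `make_row` are made up for illustration.

```python
import sqlite3

COLUMNS = ("type", "ip", "time", "content", "name", "metric", "value", "unit")


def insert_if_absent(conn, row):
    """Insert row unless an identical row (ignoring id) exists; return True if inserted."""
    cols = ",".join(f"`{c}`" for c in COLUMNS)
    marks = ",".join(["?"] * len(COLUMNS))
    conds = " and ".join(f"`{c}`=?" for c in COLUMNS)
    sql = (
        f"insert into `table`({cols}) select {marks} "
        f"where not exists (select 1 from `table` where {conds})"
    )
    values = tuple(row[c] for c in COLUMNS)
    before = conn.total_changes
    conn.execute(sql, values + values)  # values for SELECT, then for WHERE
    return conn.total_changes > before


def make_row(content):
    # Illustrative sample data; only `content` differs between rows.
    return {"type": "1", "ip": "10.0.0.1", "time": "2022-10-14 10:00:00",
            "content": content, "name": "cpu", "metric": "usage",
            "value": "0.5", "unit": "%"}


conn = sqlite3.connect(":memory:")
conn.execute(
    "create table `table` (id integer primary key autoincrement, "
    "type text, ip text, time text, content text, name text, "
    "metric text, `value` text, unit text)"
)

row_a, row_b, row_c = make_row("a"), make_row("b"), make_row("c")
scan_1 = [row_a, row_b]   # first scan window
scan_2 = [row_b, row_c]   # next window overlaps by one minute and sees row_b again

for row in scan_1 + scan_2:
    insert_if_absent(conn, row)

total = conn.execute("select count(*) from `table`").fetchone()[0]
print(total)
```

The second scan resubmits `row_b`, but the `not exists` check skips it, so the table ends up with one copy of each distinct row. Note that without an index on the compared columns, each check scans the table, which is the resource cost mentioned in the problem description.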