介绍
今天突发奇想实现这样一个业务,作为练习,是这样的,从数据库的一张大表里(100万条数据)迁移数据到新表里,由于原来数据表的数据量太大,所以插入新表时需要进行分表
原数据
数据库User表数据有一百万条数据,对它进行分表
迁移
创建一个SpringBoot项目整合MyBatis,我们用MyBatis辅助迁移
POM
主要导入MySQL,MyBatis,FastJSON,Commons包
<!--mybatis-->
<dependency>
<groupId>org.mybatis.spring.boot</groupId>
<artifactId>mybatis-spring-boot-starter</artifactId>
<version>2.2.2</version>
</dependency>
<!--mysql-->
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<scope>runtime</scope>
</dependency>
<!--common-->
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-collections4</artifactId>
<version>4.2</version>
</dependency>
<!--fastjson-->
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
<version>1.2.69</version>
</dependency>
YML
注意开启MyBatis的驼峰映射
server:
port: 80
spring:
datasource:
driver-class-name: com.mysql.cj.jdbc.Driver
url: jdbc:mysql://127.0.0.1:3306/mysql_test?characterEncoding=UTF-8&useUnicode=true&useSSL=false&serverTimezone=Asia/Shanghai&allowMultiQueries=true
username: root
password: root
mybatis:
mapperLocations: classpath:mapper/*.xml
configuration:
map-underscore-to-camel-case: true
Mapper XML
接口:
1.从数据库查一共有多少条数据
2.分批次从数据库表中查数据
3.根据要求创建分表
4.批量插入数据
<!--sep-->
<select id="selectTotal" resultType="int">
select count(1)
from user
</select>
<select id="selectSub" parameterType="int" resultType="com.example.mysql.entity.User">
select *
from user limit #{start}, #{end}
</select>
<insert id="createTable" parameterType="string">
SET NAMES utf8mb4;
SET
FOREIGN_KEY_CHECKS = 0;
CREATE TABLE IF NOT EXISTS user_${tableName}
(
`id` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
`username` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
`password` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
`address` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
`email` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
`profile` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
`birthday` date NOT NULL,
`register_day` date NOT NULL,
`login_day` date NOT NULL,
`status` int NOT NULL,
`account` decimal(10, 2) NOT NULL,
`balance` decimal(10, 2) NOT NULL,
`age` int NOT NULL,
`sex` int NOT NULL,
`avatar` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
`level` int NOT NULL
) ENGINE = InnoDB CHARACTER SET = utf8mb4 COLLATE = utf8mb4_bin ROW_FORMAT = Dynamic;
SET
FOREIGN_KEY_CHECKS = 1;
</insert>
<insert id="insertSep" parameterType="com.example.mysql.entity.User">
replace into user_${tableName} (
id,username,password,address,email,profile,
birthday,register_day,login_day,status,account,
balance,age,sex,avatar,`level`
)
values
<foreach collection="users" item="item" open="" close="" separator=",">
(#{item.id},#{item.username},#{item.password},
#{item.address},#{item.email},#{item.profile},
#{item.birthday},#{item.registerDay},#{item.loginDay},
#{item.status},#{item.account},#{item.balance},
#{item.age},#{item.sex},#{item.avatar},#{item.level})
</foreach>
</insert>
User实体
表中存放的实体
package com.example.mysql.entity;
import lombok.Data;
import java.math.BigDecimal;
import java.util.Date;
@Data
public class User {
private String id;
private String username;
private String password;
private String address;
private String email;
private String profile;
private Date birthday;
private Date registerDay;
private Date loginDay;
private Integer status;
private BigDecimal account;
private BigDecimal balance;
private Integer age;
private Integer sex;
private String avatar;
private Integer level;
}
UserMapper接口
主要声明与数据库交互的接口
package com.example.mysql.mapper;
import com.example.mysql.entity.SearchDto;
import com.example.mysql.entity.User;
import org.apache.ibatis.annotations.Mapper;
import org.apache.ibatis.annotations.Param;
import java.util.List;
@Mapper
public interface UserMapper {
void insertBatch(@Param("users") List<User> users);
List<User> selectBatch(@Param("searchDto") SearchDto searchDto);
int selectTotal();
List<User> selectSub(@Param("start") int start, @Param("end") int end);
void createTable(@Param("tableName") String tableName);
void insertSep(@Param("tableName") String tableName, @Param("users") List<User> users);
}
Controller
100万条数据分10批,每批10万条数据,获取到数据按字段(生日月份)进行分组保存进Map中,然后创建每组的数据库表,使用List并行流批量插入该组的数据到对应的数据库表中
注:下一批次如果插入数据到相同的表,就不创建表,直接插入数据即可
package com.example.mysql.controller;
import com.alibaba.fastjson.JSON;
import com.example.mysql.config.R;
import com.example.mysql.entity.User;
import com.example.mysql.mapper.UserMapper;
import com.example.mysql.util.DateUtil;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.collections4.ListUtils;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import javax.annotation.Resource;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
/**
* @description: 分库分表
* @author: zj
* @date: 2022-07-14 10:57
*/
@RestController
@Slf4j
public class MySqlSepTableController {
@Resource
private UserMapper userMapper;
@GetMapping("/mysql/sep")
public String sep() {
//获取数据
int total = userMapper.selectTotal();
for (int i = 0; i < total / 100000; i++) {
log.info("正在分表, 第 {}/{} 轮", (i+1), total / 100000);
int start = i * 100000;
int end = start + 100000 > total ? total : start + 100000;
List<User> users = userMapper.selectSub(start, end);
Map<String, List<User>> monthMap = new HashMap<>();
for (User user : users) {
String key = DateUtil.formatDate(user.getBirthday()).substring(0,6);
if (monthMap.containsKey(key)) {
monthMap.get(key).add(user);
} else {
List<User> userList = new ArrayList<>();
userList.add(user);
monthMap.put(key, userList);
}
}
for (String key : monthMap.keySet()) {
List<User> monthUserList = monthMap.get(key);
//创建表
userMapper.createTable(key);
log.info("创建 {} 表完成", "user_"+key);
log.info("开始迁移 {} 表数据, 共 {} 条", "user_"+key, monthUserList.size());
//插入数据
List<List<User>> partition = ListUtils.partition(monthUserList, 500);
partition.parallelStream().forEach(list -> {
userMapper.insertSep(key, list);
});
log.info("迁移完成 {} 表数据, 共 {} 条", "user_"+key, monthUserList.size());
}
log.info("分表完成, 第 {}/{} 轮", (i+1), total / 100000);
}
Map<String, String> res = R.ok("sep table success");
res.put("status", "200");
return JSON.toJSONString(res);
}
}
效果
会自动生成各月份的数据表,每个表中存放当前月份出生的User实体数据