Spring Batch 入门之 CSV-to-DB

学习使用Spring batch从CSV文件读取记录,并使用 StaxEventItemWriter 输出经过处理的记录转换为 XML 的数据。

spring-batch-reference-model.png

  1. JobLauncher: 顾名思义,该领域对象就是Job的启动器,其作用就是绑定一组JobParameters到Job上,然后运行该Job。

  2. Job: 定义,配置批处理任务的领域对象,该对象的作用,第一是做Step的容器,配置该批处理任务需要的Step,以及他们之间的逻辑关系。第二是配置该批处理任务的特征,比方说名字,是否可重启,是否对JobParameters进行验证以及验证规则等。

  3. Step: 定义批处理任务中一个对立的逻辑任务处理单元。基本上的业务逻辑处理代码都是封装在Step中的。Step有2种实现形式,一种是Tasklet形式的,这种形式非常自由,开发人员只需要实现Tasklet接口,其中的逻辑完全有自己决定,另一种是Chunk-Oriented形式的,这种形式定义了一个Step的流程必须是“读-处理(可选)-写”,当然Spring Batch也对每一个步骤提供了接口ItemReader, ItemProcessor,ItemWriter还有很多常用的默认实现(读文件,读数据库,写文件,写数据库等等)。 每一个Step只能由一个Tasklet或者一个Chunk构成。

  4. JobRepository: 该领域对象会为Spring Batch的运维数据提供一种持久化机制。其为所有的运维数据的提供CRUD的操作接口,并为所有的操作提供事务支持。

项目概述

在这个应用程序中,我们将执行以下任务:

  1. 使用 FlatFileItemReader 从CSV文件读取交易记录
  2. 使用 CustomItemProcessor 进行项目的业务处理。当 ItemReader 读取一个项目,而 ItemWriter 写入它们时,
    ItemProcessor 提供一个转换或应用其他业务处理的访问点。
  3. 使用 StaxEventItemWriter 获取 CustomItemProcessor 的处理结果,并将它转换成 XML 类型数据作为最终输出。
  4. 使用 MyBatisBatchItemWriter 获取 CustomItemProcessor 的处理结果,并将它转换成 XML 类型数据作为最终输出。
  5. 查看MySQL

工程结构

在这里插入图片描述

Maven 依赖

sqlite-jdbcmysql-connector-java 可以选择其中一个。
当选择其中一种时,同时也要在 applicationContext.xml 文件中做出相应的改动。

改动:

  • 依赖的版本由 platform-bom 来统一管理
  • 添加 mybatis, mybatis-spring
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.littlefxc.example</groupId>
    <artifactId>Spring-CSV-to-DB</artifactId>
    <version>1.0-snapshot</version>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <spring.version>5.0.9.RELEASE</spring.version>
        <spring.batch.version>4.0.1.RELEASE</spring.batch.version>
        <sqlite.version>3.8.11.2</sqlite.version>
        <mysql.version>5.1.47</mysql.version>
    </properties>

    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>io.spring.platform</groupId>
                <artifactId>platform-bom</artifactId>
                <version>Cairo-RELEASE</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <dependencies>
        <!-- MySQL database driver -->
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
        </dependency>
        <!-- spring的XML文件处理依赖 -->
        <dependency>
            <groupId>org.springframework</groupId>
            <artifactId>spring-oxm</artifactId>
        </dependency>
        <dependency>
            <groupId>org.mybatis</groupId>
            <artifactId>mybatis</artifactId>
            <version>3.5.0</version>
        </dependency>
        <dependency>
            <groupId>org.mybatis</groupId>
            <artifactId>mybatis-spring</artifactId>
            <version>2.0.0</version>
        </dependency>
        <dependency>
            <groupId>org.springframework</groupId>
            <artifactId>spring-jdbc</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.batch</groupId>
            <artifactId>spring-batch-core</artifactId>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
        </dependency>
    </dependencies>

</project>

applicationContext.xml

我们将使用 FlatFileItemReader 读取 CSV 文件。
我们将使用它的标准配置,包括 DefaultLineMapperDelimitedLineTokenizerBeanWrapperFieldSetMapper 类。
为了在XML文件中输出记录,我们将使用 StaxEventItemWriter 作为标准编写器。

改动:

  • 将输出XML变为输出到mysql
  • Spring Batch 持久层框架由 spring-jdbc 改为 mybatis, mybatis-spring

当然,原来的输出 itemWriter 去掉注释后,仍然起作用

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:jdbc="http://www.springframework.org/schema/jdbc"
       xmlns:batch="http://www.springframework.org/schema/batch"
       xmlns:p="http://www.springframework.org/schema/p"
       xmlns:context="http://www.springframework.org/schema/context"
       xmlns:tx="http://www.springframework.org/schema/tx"
       xsi:schemaLocation="
       http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
       http://www.springframework.org/schema/jdbc http://www.springframework.org/schema/jdbc/spring-jdbc.xsd
       http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd
       http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd
       http://www.springframework.org/schema/tx http://www.springframework.org/schema/tx/spring-tx.xsd">

    <context:property-placeholder location="classpath:application.properties"/>
    <context:component-scan base-package="com.littlefxc.examples.batch"/>

    <!-- 1. 数据库脚本:这里是为了方便起见保证每次重启程序数据库都是最新的(生产环境中不要这么做!!!) -->
    <jdbc:initialize-database>
        <jdbc:script location="${batch.schema-drop}"/>
        <jdbc:script location="${batch.schema-create}"/>
        <jdbc:script location="${project.schema-drop}"/>
        <jdbc:script location="${project.schema-create}"/>
    </jdbc:initialize-database>

    <!-- 2. 连接数据库:可以选择其它的数据源实现 -->
    <bean id="dataSource" class="org.springframework.jdbc.datasource.DriverManagerDataSource"
          p:driverClassName="${jdbc.driver-class-name}"
          p:url="${jdbc.url}"
          p:username="${jdbc.username}"
          p:password="${jdbc.password}"/>

    <!-- 3. 事务管理 -->
    <!--<bean id="transactionManager" class="org.springframework.batch.support.transactionRecord.ResourcelessTransactionManager"/>-->
    <bean id="transactionManager" class="org.springframework.jdbc.datasource.DataSourceTransactionManager"
          p:dataSource-ref="dataSource"/>

    <!-- 3.1 申明式事务 -->
    <tx:annotation-driven/>

    <!-- 3.2 mybatis 配置 -->
    <bean id="sqlSessionFactory" class="org.mybatis.spring.SqlSessionFactoryBean"
          p:dataSource-ref="dataSource"
          p:typeAliasesPackage="${mybatis.type-aliases-package}"
          p:configLocation="${mybatis.configuration}"/>

    <!-- 3.3 mybatis dao 配置 -->
    <bean class="org.mybatis.spring.mapper.MapperScannerConfigurer"
          p:basePackage="com.littlefxc.examples.batch.dao"
          p:sqlSessionFactoryBeanName="sqlSessionFactory"/>

    <!-- 4. 为JobLauncher,Job和Step实现提供CRUD操作 -->
    <bean id="jobRepository" class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean"
          p:dataSource-ref="dataSource"
          p:transactionManager-ref="transactionManager"
          p:databaseType="mysql"/>

    <!-- 5. JobLauncher表示一个简单的接口,用于使用给定的 JobParameter 启动作业 -->
    <bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher"
          p:jobRepository-ref="jobRepository"/>

    <!-- 6. batch 输入 -->
    <bean id="itemReader" class="org.springframework.batch.item.file.FlatFileItemReader">
        <!-- 输入资源 -->
        <property name="resource" value="input/record.csv"/>
        <!-- 不读取第一行数据 -->
        <property name="linesToSkip" value="1"/>
        <!-- 将输入资源转化为对象 -->
        <property name="lineMapper">
            <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
                <property name="lineTokenizer">
                    <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
                        <property name="names" value="username,user_id,transaction_date,transaction_amount"/>
                    </bean>
                </property>
                <property name="fieldSetMapper" ref="recordFieldSetMapper"/>
            </bean>
        </property>
    </bean>

    <!-- 7. batch的处理器 -->
    <bean id="itemProcessor" class="com.littlefxc.examples.batch.service.CustomItemProcessor"/>

    <!-- 8. batch 输出:输出为xml -->
    <!--<bean id="itemWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
        <property name="resource" value="file:xml/output.xml"/>
        <property name="marshaller">
            &lt;!&ndash; xml 与 对象的转换器 &ndash;&gt;
            <bean class="org.springframework.oxm.jaxb.Jaxb2Marshaller"
                  p:packagesToScan="com.littlefxc.examples.batch.model"/>
        </property>
        <property name="rootTagName" value="transactionRecord"/>
    </bean>-->
    <!-- 8. batch 输出:输出到mysql -->
    <!--<bean id="itemWriter" class="org.springframework.batch.item.database.JdbcBatchItemWriter"
          p:dataSource-ref="dataSource" p:sql="${batch.sql}">
        <property name="itemSqlParameterSourceProvider">
            <bean class="org.springframework.batch.item.database.BeanPropertyItemSqlParameterSourceProvider"/>
        </property>
    </bean>-->
    <!-- 8. batch 输出:使用 mybatis 输出到mysql -->
    <bean id="itemWriter" class="org.mybatis.spring.batch.MyBatisBatchItemWriter">
        <property name="sqlSessionFactory" ref="sqlSessionFactory"/>
        <!-- 与 mybatis 关联的 dao 层的接口名:com.littlefxc.examples.batch.dao.TransactionRecordDao.insertTransactionRecord -->
        <property name="statementId" value="insertTransactionRecord"/>
    </bean>

    <!-- 9. 配置batch的输入(6)、处理器(7)、输出(8) -->
    <!-- commit-interval:提交事务之前将处理的项目数。 -->
    <batch:job id="firstBatchJob">
        <batch:step id="step1">
            <batch:tasklet>
                <batch:chunk reader="itemReader" processor="itemProcessor" writer="itemWriter" commit-interval="2"/>
            </batch:tasklet>
        </batch:step>
    </batch:job>

</beans>

RecordFieldSetMapper

ItemReader 的属性,作用是将 FieldSet 转换为对象

package com.littlefxc.examples.batch.service;

import com.littlefxc.examples.batch.model.TransactionRecord;
import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.validation.BindException;

import java.text.ParseException;
import java.text.SimpleDateFormat;

/**
 * 将 FieldSet 转换为对象
 * @author fengxuechao
 * @date 2019/1/4
 **/
public class RecordFieldSetMapper implements FieldSetMapper<Transaction> {

    public Transaction mapFieldSet(FieldSet fieldSet) throws BindException {
        SimpleDateFormat dateFormat = new SimpleDateFormat("dd/MM/yyyy");
        Transaction transactionRecord = new Transaction();

        transactionRecord.setUsername(fieldSet.readString("username"));
        transactionRecord.setUserId(fieldSet.readInt("user_id"));
        transactionRecord.setAmount(fieldSet.readDouble("transaction_amount"));
        String dateString = fieldSet.readString("transaction_date");
        try {
            transactionRecord.setTransactionDate(dateFormat.parse(dateString));
        } catch (ParseException e) {
            e.printStackTrace();
        }
        return transactionRecord;
    }
}

CustomItemProcessor

自定义实现接口 ItemProcessor, 作为 ItemReaderItemWriter 的转换点。

package com.littlefxc.examples.batch.service;

import com.littlefxc.examples.batch.model.TransactionRecord;
import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.validation.BindException;

import java.text.ParseException;
import java.text.SimpleDateFormat;

/**
 * 将读取到的数据集合转换为对象
 * @author fengxuechao
 * @date 2019/1/4
 **/
public class RecordFieldSetMapper implements FieldSetMapper<Transaction> {

    public Transaction mapFieldSet(FieldSet fieldSet) throws BindException {
        SimpleDateFormat dateFormat = new SimpleDateFormat("dd/MM/yyyy");
        Transaction transactionRecord = new Transaction();

        transactionRecord.setUsername(fieldSet.readString("username"));
        transactionRecord.setUserId(fieldSet.readInt("user_id"));
        transactionRecord.setAmount(fieldSet.readDouble("transaction_amount"));
        String dateString = fieldSet.readString("transaction_date");
        try {
            transactionRecord.setTransactionDate(dateFormat.parse(dateString));
        } catch (ParseException e) {
            e.printStackTrace();
        }
        return transactionRecord;
    }
}

模型

package com.littlefxc.examples.batch.model;

import lombok.Data;

import javax.xml.bind.annotation.XmlRootElement;
import java.util.Date;

/**
 * @author fengxuechao
 */
@Data
@XmlRootElement(name = "transactionRecord")
public class Transaction {

    private String username;

    private int userId;

    private Date transactionDate;

    private double amount;
}

record.csv

devendra, 1234, 31/10/2015, 10000
john    , 2134, 3/12/2015 , 12321
robin   , 2134, 2/02/2015 , 23411

启动程序

package com.littlefxc.examples.batch;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.support.ClassPathXmlApplicationContext;

public class App {
    public static void main(String[] args) {
        ClassPathXmlApplicationContext context = new ClassPathXmlApplicationContext();
        context.setConfigLocations("classpath:spring-context.xml");
        context.refresh();

        JobLauncher jobLauncher = (JobLauncher) context.getBean("jobLauncher");
        Job job = (Job) context.getBean("firstBatchJob");
        System.out.println("Starting the batch job");
        try {
            JobExecution execution = jobLauncher.run(job, new JobParameters());
            System.out.println("Job Status : " + execution.getStatus());
            System.out.println("Job completed");
        } catch (Exception e) {
            e.printStackTrace();
            System.out.println("Job failed");
        }
    }
}

验证

<?xml version="1.0" encoding="UTF-8"?>
<transactionRecord>
    <transactionRecord>
        <amount>10000.0</amount>
        <transactionDate>2015-10-31T00:00:00+08:00</transactionDate>
        <userId>1234</userId>
        <username>devendra</username>
    </transactionRecord>
    <transactionRecord>
        <amount>12321.0</amount>
        <transactionDate>2015-12-03T00:00:00+08:00</transactionDate>
        <userId>2134</userId>
        <username>john</username>
    </transactionRecord>
    <transactionRecord>
        <amount>23411.0</amount>
        <transactionDate>2015-02-02T00:00:00+08:00</transactionDate>
        <userId>2134</userId>
        <username>robin</username>
    </transactionRecord>
</transactionRecord>

附录:

application.properties

batch.schema-drop=org/springframework/batch/core/schema-drop-mysql.sql
batch.schema-create=org/springframework/batch/core/schema-mysql.sql
batch.sql=INSERT INTO transaction_record (user_id, username, transaction_date, amount) VALUES (:userId, :username, :transactionDate, :amount)

jdbc.url=jdbc:mysql://192.168.120.63:3306/batch?useSSL=false
jdbc.username=root
jdbc.password=123456
jdbc.driver-class-name=com.mysql.jdbc.Driver

# 自定义数据库删除脚本
project.schema-drop=classpath:schema-drop.sql
# 自定义数据库创建脚本
project.schema-create=classpath:schema.sql

# Mybatis Config
mybatis.configuration=classpath:mybatis-config.xml
mybatis.type-aliases-package=com.littlefxc.examples.batch.model
mybatis.mapper.base-package=com.littlefxc.examples.batch.dao

mybatis-config.xml

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE configuration PUBLIC "-//mybatis.org//DTD Config 3.0//EN" "http://mybatis.org/dtd/mybatis-3-config.dtd">

<configuration>
    <settings>
        <setting name="cacheEnabled" value="true"/>
        <setting name="lazyLoadingEnabled" value="true"/>
        <setting name="multipleResultSetsEnabled" value="true"/>
        <setting name="useColumnLabel" value="true"/>
        <setting name="useGeneratedKeys" value="false"/>
        <setting name="autoMappingBehavior" value="PARTIAL"/>
        <setting name="autoMappingUnknownColumnBehavior" value="WARNING"/>
        <setting name="defaultExecutorType" value="SIMPLE"/>
        <setting name="defaultStatementTimeout" value="25"/>
        <setting name="defaultFetchSize" value="100"/>
        <setting name="safeRowBoundsEnabled" value="false"/>
        <setting name="mapUnderscoreToCamelCase" value="false"/>
        <setting name="localCacheScope" value="SESSION"/>
        <setting name="jdbcTypeForNull" value="OTHER"/>
        <setting name="lazyLoadTriggerMethods" value="equals,clone,hashCode,toString"/>
    </settings>
</configuration>

schema.sql

/*
 Navicat Premium Data Transfer

 Source Server         : localhost
 Source Server Type    : MySQL
 Source Server Version : 50722
 Source Host           : localhost:3306
 Source Schema         : batch

 Target Server Type    : MySQL
 Target Server Version : 50722
 File Encoding         : 65001

 Date: 31/01/2019 10:27:20
*/

SET NAMES utf8mb4;
SET FOREIGN_KEY_CHECKS = 0;

-- ----------------------------
-- Table structure for transaction_record
-- ----------------------------
# DROP TABLE IF EXISTS `transaction_record`;
CREATE TABLE `transaction_record`  (
  `username` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
  `user_id` bigint(20) NOT NULL,
  `transaction_date` datetime(6) NOT NULL,
  `amount` double(11, 0) NOT NULL,
  PRIMARY KEY (`username`) USING BTREE
) ENGINE = InnoDB CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Dynamic;

SET FOREIGN_KEY_CHECKS = 1;

schema-drop.sql

DROP TABLE IF EXISTS `transaction_record`;
  • 0
    点赞
  • 2
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值