Parsing CSV Files with Spring Batch

Background:

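I recently used POI to parse Excel files in production. Above 50,000 rows it became very slow, and sometimes the process hung after running out of memory. So I decided to read the data in batches with Spring Batch. Spring Batch cannot read Excel files directly, however, so I first convert the Excel file to CSV (measured conversion speed: 80,000 rows in 40 s). Spring Batch then reads the CSV in chunks of 5,000 rows, and each 5,000-row chunk is processed with multiple threads (ForkJoin).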
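The chunk-then-fork/join idea described above can be sketched with the JDK alone, independent of Spring Batch. This is only an illustration under stated assumptions: the class `ChunkedCsvSketch`, the `RowTask` task, and the `processInChunks` method are hypothetical names, the leaf threshold of 500 rows is an arbitrary choice, the comma split is naive (no quoted fields), and counting rows with the expected number of columns stands in for the real business logic.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical helper: reads lines in fixed-size chunks and hands each chunk to fork/join.
final class ChunkedCsvSketch {

    // Splits a chunk in half until it is small enough, then handles the rows sequentially.
    static final class RowTask extends RecursiveTask<Integer> {
        private static final int THRESHOLD = 500; // assumed leaf size, tune for real workloads
        private final List<String> rows;
        private final int expectedColumns;

        RowTask(List<String> rows, int expectedColumns) {
            this.rows = rows;
            this.expectedColumns = expectedColumns;
        }

        @Override
        protected Integer compute() {
            if (rows.size() <= THRESHOLD) {
                int valid = 0;
                for (String row : rows) {
                    // Naive split: does not handle quoted fields containing commas.
                    if (row.split(",", -1).length == expectedColumns) {
                        valid++;
                    }
                }
                return valid;
            }
            int mid = rows.size() / 2;
            RowTask left = new RowTask(rows.subList(0, mid), expectedColumns);
            RowTask right = new RowTask(rows.subList(mid, rows.size()), expectedColumns);
            left.fork();                           // run the left half asynchronously
            return right.compute() + left.join();  // compute the right half here, then combine
        }
    }

    // Skips header lines, then processes the rest chunk by chunk; returns the number of valid rows.
    static int processInChunks(BufferedReader reader, int linesToSkip, int chunkSize, int expectedColumns) {
        try {
            for (int i = 0; i < linesToSkip; i++) {
                if (reader.readLine() == null) {
                    return 0;
                }
            }
            ForkJoinPool pool = ForkJoinPool.commonPool();
            List<String> chunk = new ArrayList<>(chunkSize);
            int total = 0;
            String line;
            while ((line = reader.readLine()) != null) {
                chunk.add(line);
                if (chunk.size() == chunkSize) {
                    total += pool.invoke(new RowTask(chunk, expectedColumns));
                    chunk = new ArrayList<>(chunkSize); // only one chunk is held in memory at a time
                }
            }
            if (!chunk.isEmpty()) {
                total += pool.invoke(new RowTask(chunk, expectedColumns));
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String csv = "title\nheader\n1,2,3\n4,5,6\nbroken\n";
        int valid = processInChunks(new BufferedReader(new StringReader(csv)), 2, 2, 3);
        System.out.println("valid rows: " + valid);
    }
}
```

In the setup described in this post, the chunk size would be 5000 and the number of skipped lines would match the header rows of the exported CSV. Spring Batch takes over the reading and chunking part, so only the fork/join step would remain hand-written.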

 

      ============   The project demo is recorded below, for my own reference only  ===========

1: Spring Batch configuration ----- pom:

<dependency>
      <groupId>org.springframework.batch</groupId>
      <artifactId>spring-batch-core</artifactId>
      <version>3.0.8.RELEASE</version>
</dependency>

2: Spring Batch configuration ----- batch-content.xml:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:p="http://www.springframework.org/schema/p"
       xmlns:tx="http://www.springframework.org/schema/tx"
       xmlns:aop="http://www.springframework.org/schema/aop"
       xmlns:context="http://www.springframework.org/schema/context"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
	http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
	http://www.springframework.org/schema/tx
	http://www.springframework.org/schema/tx/spring-tx-3.0.xsd
	http://www.springframework.org/schema/aop
	http://www.springframework.org/schema/aop/spring-aop-3.0.xsd
	http://www.springframework.org/schema/context
	http://www.springframework.org/schema/context/spring-context-2.5.xsd"
       default-autowire="byName">

    <bean id="jobRepository"
          class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean">
        <property name="transactionManager" ref="transactionManagerBatch"/>
    </bean>

    <bean id="jobLauncher"
          class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
        <property name="jobRepository" ref="jobRepository"/>
    </bean>

    <!-- This bean must not share a name with Spring's transactionManager; otherwise Spring transactions stop taking effect -->
    <bean id="transactionManagerBatch"
          class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
</beans>

 

3: Spring Batch configuration ----- batch-job.xml:

<?xml version="1.0" encoding="UTF-8"?>
<bean:beans xmlns="http://www.springframework.org/schema/batch"
            xmlns:bean="http://www.springframework.org/schema/beans"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xmlns:p="http://www.springframework.org/schema/p"
            xmlns:tx="http://www.springframework.org/schema/tx"
            xmlns:aop="http://www.springframework.org/schema/aop"
            xmlns:context="http://www.springframework.org/schema/context"
            xsi:schemaLocation="http://www.springframework.org/schema/beans
    http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
    http://www.springframework.org/schema/tx
    http://www.springframework.org/schema/tx/spring-tx-3.0.xsd
    http://www.springframework.org/schema/aop
    http://www.springframework.org/schema/aop/spring-aop-3.0.xsd
    http://www.springframework.org/schema/context
    http://www.springframework.org/schema/context/spring-context-2.5.xsd
    http://www.springframework.org/schema/batch
    http://www.springframework.org/schema/batch/spring-batch-2.2.xsd">
    <bean:import resource="classpath:META-INF/batch/fee-batch-context.xml"/>

    <job id="analysisExcelJob">
        <step id="listStep">
            <tasklet transaction-manager="transactionManager">
                <chunk reader="redeemDataReader" writer="redeemDataWriter" processor="redeemDataProcessor"
                       commit-interval="5000"/>
            </tasklet>
        </step>
        <listeners>
            <listener ref="analysisExcelInterceptor"/>
        </listeners>
    </job>

    <!-- Read the report file (CSV format) -->
    <bean:bean id="redeemDataReader"
               class="org.springframework.batch.item.file.FlatFileItemReader"
               scope="step">
        <bean:property name="resource"
                       value="file:#{jobParameters['file.data']}"/>
        <bean:property name="linesToSkip" value="4"/>
        <bean:property name="lineMapper">
            <bean:bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
                <bean:property name="lineTokenizer">
                    <!-- Mapped fields are taken from the names property below; it must cover every header column, separated by commas -->
                    <bean:bean
                            class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
                        <bean:property name="names" value="计划回款时间,商户名称,父产品名称,标的名称,合同编号,进件编码,
                        标的募集金额,投资人利率,当前期数,总期数,应还金额,应还利息,罚息金额,还款总额,代扣实际到账,未到账金额,
                        商户分润金额,产品起息日,虚户时间,滞销天数,首次回款日,运营滞销贴息,商户线下应还,
                        当期是否提前回款,回款模式,是否是转非标,是否提现成功"/>
                    </bean:bean>
                </bean:property>

                <!-- If