Java大数据量、多线程、批量导入数据问题的解

1 篇文章 0 订阅
1 篇文章 0 订阅
  • 背景:
  1. 项目整合业务数据汇总录入数据库,数据规模为200~300w每月,按要求需要在月初导入上月全部数据。
  2. 要做到项目隔离。
  3. 要保证数据完整性。

  • 数据导出方案:采用多线程读写配合文件服务器,可以保证数据导出效率。

  • 数据下载导入方案:
    1. 下载文件到机器目录。
    2. 使用多线程同时分段读取文件到内存。
    3. 使用多线程执行批量入库操作(批量insert量为300每次)。
    4. 使用countDownLatch计数器来确保每一天数据的完整性;
    5. 数据库链接池使用c3p0,初始大小为50,最大链接数为100。

  • 方案执行配置1.0:按每月天数初始化线程池,每天开启20个线程批量入库数据,最大线程数600
    • 问题:​​​​​​
      • 任务很慢,跑完全部数据用时16h...
      • 线程争抢链接池资源会造成阻塞,导致执行完毕的work不能及时释放连接归还给链接池。
      • 每天数据量10W/300=334work*31=10354work,抢占数据库链接的work过多,会造成c3p0链接池死锁(APPARENT DEADLOCK)。附上资料两篇:
        1. https://stackoverflow.com/questions/3730844/c3p0-apparent-deadlock-when-the-threads-are-all-empty
        2. http://blog.sina.com.cn/s/blog_7b9948fd0102vyuh.html
      • AsynchronousRunner$DeadlockDetector@7803cef7 -- APPARENT DEADLOCK!!! Complete Status:
        	Managed Threads: 3
        	Active Threads: 3
        	Active Tasks:
        		com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@791efed9
        			on thread: C3P0PooledConnectionPoolManager[identityToken->2zozfma31t2nvwwikd9tw|5d553603]-HelperThread-#2
        		com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@7d572454
        			on thread: C3P0PooledConnectionPoolManager[identityToken->2zozfma31t2nvwwikd9tw|5d553603]-HelperThread-#0
        		com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@6a0b3b13
        			on thread: C3P0PooledConnectionPoolManager[identityToken->2zozfma31t2nvwwikd9tw|5d553603]-HelperThread-#1
        	Pending Tasks:
        		com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@77085bb7
        		com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@79ee8bb9
        		com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@3ddf7554
        		com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@ebf025f
        		com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@3b5f25dc
        Pool thread stack traces:
        	Thread[C3P0PooledConnectionPoolManager[identityToken->2zozfma31t2nvwwikd9tw|5d553603]-HelperThread-#2,5,main]
        		java.util.Arrays.copyOfRange(Arrays.java:3664)
        		java.lang.String.<init>(String.java:207)
        		java.nio.HeapCharBuffer.toString(HeapCharBuffer.java:567)
        		java.nio.CharBuffer.toString(CharBuffer.java:1241)
        		com.mysql.jdbc.StringUtils.toString(StringUtils.java:1871)
        		com.mysql.jdbc.ResultSetRow.getString(ResultSetRow.java:821)
        		com.mysql.jdbc.ByteArrayRow.getString(ByteArrayRow.java:70)
        		com.mysql.jdbc.ResultSetImpl.getStringInternal(ResultSetImpl.java:5815)
        		com.mysql.jdbc.ResultSetImpl.getString(ResultSetImpl.java:5692)
        		com.mysql.jdbc.ResultSetImpl.getObject(ResultSetImpl.java:4989)
        		com.mysql.jdbc.Util.resultSetToMap(Util.java:508)
        		com.mysql.jdbc.ConnectionImpl.buildCollationMapping(ConnectionImpl.java:982)
        		com.mysql.jdbc.ConnectionImpl.initializePropsFromServer(ConnectionImpl.java:3481)
        		com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2445)
        		com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2215)
        		com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:813)
        		com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)
        		sun.reflect.GeneratedConstructorAccessor112.newInstance(Unknown Source)
        		sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        		java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        		com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
        		com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:399)
        		com.mysql.jdbc.NonRegisteringDriver.connect$dkimi$impl$1046(NonRegisteringDriver.java:334)
        		com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java)
        		com.mchange.v2.c3p0.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:175)
        		com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:220)
        		com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:206)
        		com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.acquireResource(C3P0PooledConnectionPool.java:203)
        		com.mchange.v2.resourcepool.BasicResourcePool.doAcquire(BasicResourcePool.java:1138)
        		com.mchange.v2.resourcepool.BasicResourcePool.doAcquireAndDecrementPendingAcquiresWithinLockOnSuccess(BasicResourcePool.java:1125)
        		com.mchange.v2.resourcepool.BasicResourcePool.access$700(BasicResourcePool.java:44)
        		com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask.run(BasicResourcePool.java:1870)
        		com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:696)
        	Thread[C3P0PooledConnectionPoolManager[identityToken->2zozfma31t2nvwwikd9tw|5d553603]-HelperThread-#0,5,main]
        		com.mysql.jdbc.MysqlIO.nextRowFast(MysqlIO.java:2076)
        		com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:1900)
        		com.mysql.jdbc.MysqlIO.readSingleRowSet(MysqlIO.java:3401)
        		com.mysql.jdbc.MysqlIO.getResultSet(MysqlIO.java:483)
        		com.mysql.jdbc.MysqlIO.readResultsForQueryOrUpdate(MysqlIO.java:3096)
        		com.mysql.jdbc.MysqlIO.readAllResults(MysqlIO.java:2266)
        		com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2687)
        		com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2713)
        		com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2663)
        		com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1599)
        		com.mysql.jdbc.ConnectionImpl.buildCollationMapping(ConnectionImpl.java:980)
        		com.mysql.jdbc.ConnectionImpl.initializePropsFromServer(ConnectionImpl.java:3481)
        		com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2445)
        		com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2215)
        		com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:813)
        		com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)
        		sun.reflect.GeneratedConstructorAccessor112.newInstance(Unknown Source)
        		sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        		java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        		com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
        		com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:399)
        		com.mysql.jdbc.NonRegisteringDriver.connect$dkimi$impl$1046(NonRegisteringDriver.java:334)
        		com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java)
        		com.mchange.v2.c3p0.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:175)
        		com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:220)
        		com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:206)
        		com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.acquireResource(C3P0PooledConnectionPool.java:203)
        		com.mchange.v2.resourcepool.BasicResourcePool.doAcquire(BasicResourcePool.java:1138)
        		com.mchange.v2.resourcepool.BasicResourcePool.doAcquireAndDecrementPendingAcquiresWithinLockOnSuccess(BasicResourcePool.java:1125)
        		com.mchange.v2.resourcepool.BasicResourcePool.access$700(BasicResourcePool.java:44)
        		com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask.run(BasicResourcePool.java:1870)
        		com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:696)
        	Thread[C3P0PooledConnectionPoolManager[identityToken->2zozfma31t2nvwwikd9tw|5d553603]-HelperThread-#1,5,main]
        		java.nio.HeapCharBuffer.<init>(HeapCharBuffer.java:57)
        		java.nio.CharBuffer.allocate(CharBuffer.java:335)
        		java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:795)
        		java.nio.charset.Charset.decode(Charset.java:807)
        		com.mysql.jdbc.StringUtils.toString(StringUtils.java:1871)
        		com.mysql.jdbc.ResultSetRow.getString(ResultSetRow.java:821)
        		com.mysql.jdbc.ByteArrayRow.getString(ByteArrayRow.java:70)
        		com.mysql.jdbc.ResultSetImpl.getStringInternal(ResultSetImpl.java:5815)
        		com.mysql.jdbc.ResultSetImpl.getString(ResultSetImpl.java:5692)
        		com.mysql.jdbc.ConnectionImpl.loadServerVariables(ConnectionImpl.java:3931)
        		com.mysql.jdbc.ConnectionImpl.initializePropsFromServer(ConnectionImpl.java:3473)
        		com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2445)
        		com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2215)
        		com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:813)
        		com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)
        		sun.reflect.GeneratedConstructorAccessor112.newInstance(Unknown Source)
        		sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        		java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        		com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
        		com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:399)
        
        The last packet successfully received from the server was 1,288,063 milliseconds ago.  
        The last packet sent successfully to the server was 18,302 milliseconds ago.; 
        nested exception is com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
        

         

  • 方案执行配置2.0:修改初始化线程池 4(读取文件线程) * 6(写入线程),最大并发线程数24
    • 经生产环境验证100w数据用时10min,并可确保数据完整性以及重推数据的灵活性。

  • 生成环境跑批效果
  • 数据量:2699973
    31 * 3
    线上跑数据,花费:61056631
    execute download and batchInsert spend:61056631
    
    
    数据量:1034158
    4 * 6线程
    execute download and batchInsert spend:667602

     

  • 代码片段

    public void downLoadAndInsert(List<DepositTaskType> depositTaskTypes) {
            if (null == depositTaskTypes || depositTaskTypes.size() == 0) {
                log.info("no file need to downLoad!");
                return;
            }
            CountDownLatch countDownLatch = new CountDownLatch(depositTaskTypes.size());
            long startTime = System.currentTimeMillis();
            for (DepositTaskType depositTaskType : depositTaskTypes) {
                log.info("start task filename=" + depositTaskType.getTaskName());
                fixedThreadPool.submit(() -> activeDownLoadAndInsert(countDownLatch, depositTaskType, tempDir + depositTaskType.getTaskName()));
            }
            try {
                countDownLatch.await();
            } catch (InterruptedException e) {
                e.printStackTrace();
                log.error("countDownLatch.await error. e:" + e.getMessage());
            }
            log.info("execute download and batchInsert spend:" + (System.currentTimeMillis() - startTime));
        }
    
    public void fileReader(Dto dto, String filePath) {
            try {
                Integer fileLine = this.getFileLine(filePath);
                if (fileLine == 0) {
                    ...
                }
                int total = fileLine / 300;
                CountDownLatch countDownLatch = new CountDownLatch(total + 1);
                log.info("countDownLatch total:" + countDownLatch.getCount());
                List<Dto> batchList = Collections.synchronizedList(Lists.newArrayList());
                FileInputStream fileInputStream = new FileInputStream(filePath);
                InputStreamReader inputStreamReader = new InputStreamReader(fileInputStream, "UTF-8");
                BufferedReader bufferedReader = new BufferedReader(inputStreamReader);
                log.info("开始读取文件。。。");
                Long start = System.currentTimeMillis();
                String str;
                while (null != (str = bufferedReader.readLine())) {
                    batchList.add(fullDetail);
                    if (batchList.size() >= 300) {
                        log.info(">>>>>>>>>>批量插入List的size为:" + batchList.size());
                        List<Dto> copy = Lists.newArrayList();
                        synchronized (FileReaderService.class) {
                            copy.addAll(batchList);
                            batchList.clear();
                        }
    //                    insertThreadPool.submit(() -> mapper.batchInsertDetails(copy));
                        insertThreadPool.submit(() -> this.batchInsertDetail(copy, countDownLatch));
                    }
                }
                if (batchList.size() > 0) {
                    log.info(">>>>>>>>>>剩余插入List的size为:" + batchList.size());
                    List<Dto> copyList = Lists.newArrayList();
                    synchronized (FileReaderService.class) {
                        copyList.addAll(batchList);
                        batchList.clear();
                    }
    //                insertThreadPool.submit(() -> mapper.batchInsertDetails(copyList));
                    insertThreadPool.submit(() -> this.batchInsertDetail(copyList, countDownLatch));
                }
                try {
                    log.info("execute countDownLatch wait. countnumber" + countDownLatch.getCount());
                    countDownLatch.await();
                    log.info("--------批量插入文件" + filePath + ",共耗时" + (System.currentTimeMillis() - start));
                    //数据准备完毕处理业务逻辑
                } catch (InterruptedException e) {
                    e.printStackTrace();
                    log.error("countDownLatch.await exception:" + e.getMessage());
                } catch (Exception e) {
                    e.printStackTrace();
                    log.error("update depositTaskType exception:" + e.getMessage());
                }
                log.info("downLoad file success, fileName:" + filePath);
            } catch (Exception e) {
                ...
            } 
        }

     

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值