Concurrency and Parallelism in Python

Concurrency vs. parallelism

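  • Concurrency: a single CPU plus multiprogramming (switching between tasks) is enough to achieve concurrency
  • Parallelism: tasks run at the same instant, which is only possible with multiple CPUs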
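As a rough illustration of the concurrency side, the sketch below overlaps four simulated I/O waits with threads. The waits interleave inside one interpreter process, so the total is close to one wait rather than four. CPU-bound pure-Python work would not speed up this way in a standard CPython build such as the 3.7 used here, because of the global interpreter lock, which is why the tests further down rely on processes. The `fake_io` helper and the 0.2 s delay are made up for the example.

```python
import threading
import time


def fake_io(delay, results, index):
    """Stand-in for a blocking I/O call (hypothetical helper)."""
    time.sleep(delay)  # sleeping releases the GIL, so other threads can run
    results[index] = index


def run_concurrently(n_tasks=4, delay=0.2):
    """Run n_tasks simulated waits in threads; return (results, elapsed)."""
    results = [None] * n_tasks
    threads = [
        threading.Thread(target=fake_io, args=(delay, results, i))
        for i in range(n_tasks)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


results, elapsed = run_concurrently()
print(results, round(elapsed, 2))
```

The elapsed time printed here should be near the single delay, not the sum of the four delays.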

Use cases

Benchmarks of several approaches

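Important notes
  • The results below vary from run to run, but they are still meaningful at the order-of-magnitude level
  • Test environment
    • OS: 64-bit Windows 10
    • Anaconda3: 1915 64 bit
    • Python: 3.7.3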
  • Summary of results

    Method                                    Elapsed time (s)
    for & map                                 0.9460015296936035
    [] & map                                  0.8980069160461426
    numba.jit & for & map                     1.0460188388824463
    numba.jit & [] & map                      0.9310059547424316
    concurrent.futures.ProcessPoolExecutor    1.4520056247711182
    multiprocessing.Pool                      1.6059844493865967
    multiprocessing.Process                   0.6759865283966064
    pp (single machine)                       0.004994630813598633
    joblib                                    1.2500150203704834

  • Base functions under test

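    import pandas as pd

    def read_csv_pd(file):
        # Drop rows with missing values, then keep rows whose status is below 5
        df = pd.read_csv(file)
        df = df.dropna()
        df = df[df['status'] < 5]
        return df['OC NO'].tolist()

    def read_csv_open(file):
        """
        Read a file and return the IDs whose status is non-empty and less than 5.
        :param file: path to the CSV file
        :return: set of matching IDs
        """
        set_id = set()
        with open(file, encoding='utf8') as f:
            lines = f.readlines()
            for num, line in enumerate(lines):
                if num == 0:  # skip the header row
                    continue
                fields = line.split(',')
                status = fields[1].strip()  # tolerate a trailing newline
                if status and int(status) < 5:
                    set_id.add(fields[0])
        return set_id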
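To make the expected input concrete, the sketch below writes a small CSV with the two columns the functions above appear to rely on (`OC NO` first, `status` second) and runs an `open`-based reader over it. The sample rows and the `write_sample` helper are invented for illustration; the real data files are not shown in the post.

```python
import os
import tempfile


def read_csv_open(file):
    """Return the IDs whose status is non-empty and less than 5."""
    set_id = set()
    with open(file, encoding='utf8') as f:
        next(f, None)  # skip the header row
        for line in f:
            fields = line.rstrip('\n').split(',')
            if len(fields) > 1 and fields[1] and int(fields[1]) < 5:
                set_id.add(fields[0])
    return set_id


def write_sample(path):
    """Write a tiny file in the assumed `OC NO,status` layout."""
    rows = ['OC NO,status', 'A001,1', 'A002,7', 'A003,', 'A004,4']
    with open(path, 'w', encoding='utf8') as f:
        f.write('\n'.join(rows) + '\n')


with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, 'sample.csv')
    write_sample(sample)
    ids = read_csv_open(sample)

print(sorted(ids))  # ['A001', 'A004']
```

Rows with an empty status (`A003`) or a status of 5 or more (`A002`) are skipped.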
    
  • Test 1: pd.read_csv vs. open inside a for loop

    # func is either read_csv_pd or read_csv_open
    for file in files:
        mid_set = func(file=file)
        set_id_2.update(mid_set)
    
    • Results

      Method           Elapsed time (s)
      read_csv_pd      2.1082966327667236
      read_csv_open    0.9119999408721924
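The post does not show how the timings were taken. One plausible way to collect numbers like these is a small wrapper around `time.perf_counter`; the `timed` helper and the stand-in `fake_reader` below are assumptions, not the original harness.

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def merge_with_loop(reader, files):
    """Same shape as the loop in Test 1: call reader per file, merge into a set."""
    merged = set()
    for file in files:
        merged.update(reader(file=file))
    return merged


# Stand-in reader so the sketch runs without real CSV files.
def fake_reader(file):
    return {file.upper()}


merged, seconds = timed(merge_with_loop, fake_reader, ['a.csv', 'b.csv'])
print(sorted(merged), f'{seconds:.6f}s')
```

Swapping `fake_reader` for `read_csv_pd` or `read_csv_open` and passing real paths would give a comparison in the same style as the table.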
  • Test 2: a for loop over map vs. a list comprehension over map

    for mid_set in map(read_csv_open, files):
        set_id_1.update(mid_set)

    [set_id_2.update(mid_set) for mid_set in map(read_csv_open, files)]
    
    • Results

      Method    Elapsed time (s)
      for       0.9460015296936035
      []        0.8980069160461426
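A list comprehension used only for its side effect, as in the second variant, also builds a throwaway list of `None` values. If the goal is just the merged set, `set().union(*iterables)` is one alternative. The sketch uses a hypothetical `fake_reader` in place of `read_csv_open` so that it runs without data files.

```python
def fake_reader(file):
    """Hypothetical stand-in for read_csv_open: returns a set per file."""
    return {f'{file}-1', f'{file}-2'}


files = ['a', 'b', 'c']

# Variant from the test: comprehension run for its side effect.
set_id = set()
leftover = [set_id.update(mid_set) for mid_set in map(fake_reader, files)]

# Alternative: let set.union consume the map directly.
merged = set().union(*map(fake_reader, files))

print(leftover)          # [None, None, None]
print(merged == set_id)  # True
```

Whether this is measurably faster on the real files is not something the post tests; the point is only that the result is the same without the discarded list.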
  • Test 3: concurrent.futures.ProcessPoolExecutor

    with ProcessPoolExecutor(3) as pool:
        for mid_set in pool.map(read_csv_open, files):
            set_id.update(mid_set)
    
    • Elapsed time: 1.4520056247711182 s
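For reference, a self-contained version of this pattern might look like the sketch below. With a process pool the worker function has to be defined at module level so it can be pickled, and on Windows (the test platform here) the pool should be created under an `if __name__ == '__main__':` guard. The `small_values` worker and the generated chunks are placeholders for `read_csv_open` and `files`.

```python
from concurrent.futures import ProcessPoolExecutor


def small_values(chunk):
    """Placeholder worker: return the values below 5 as a set."""
    return {value for value in chunk if value < 5}


def merge_parallel(chunks, workers=3):
    """Map the worker over chunks in a process pool and merge the sets."""
    merged = set()
    with ProcessPoolExecutor(workers) as pool:
        for mid_set in pool.map(small_values, chunks):
            merged.update(mid_set)
    return merged


if __name__ == '__main__':
    chunks = [[1, 9, 3], [4, 4, 8], [0, 12]]
    print(sorted(merge_parallel(chunks)))
```

With inputs this small, process startup and pickling overhead will usually dominate, which may be why the pooled runs in this post come out slower than the plain loop.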
  • Test 4: multiprocessing.Pool

    with multiprocessing.Pool(cores) as pool:
        rs = pool.map(read_csv_open, files)
    
    • Elapsed time: 1.6059844493865967 s
  • Test 5: multiprocessing.Process

    for file in files:
        logger.info(file)
        t = multiprocessing.Process(target=read_csv_open_test,
                                    kwargs={'file': file, 'q': q})
        process_arr.append(t)
        t.start()
    
    • Elapsed time: 0.6759865283966064 s
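The snippet above only starts the processes. A fuller sketch of the same pattern is shown below, including collecting the results from a queue and joining the workers; an elapsed time only covers the whole job if it is measured after that point. The `worker` function, its inputs, and the one-result-per-process queue protocol are assumptions, since `read_csv_open_test` is not shown in the post.

```python
import multiprocessing


def worker(chunk, q):
    """Hypothetical worker: put the set of values below 5 on the queue."""
    q.put({value for value in chunk if value < 5})


def run_processes(chunks):
    """Start one process per chunk, collect one result each, then join."""
    q = multiprocessing.Queue()
    process_arr = []
    for chunk in chunks:
        p = multiprocessing.Process(target=worker,
                                    kwargs={'chunk': chunk, 'q': q})
        process_arr.append(p)
        p.start()

    merged = set()
    # Drain the queue before joining: a child that still has queued data
    # may not exit until that data has been consumed.
    for _ in process_arr:
        merged.update(q.get())
    for p in process_arr:
        p.join()
    return merged


if __name__ == '__main__':
    print(sorted(run_processes([[1, 9, 3], [4, 4, 8], [0, 12]])))
```

Starting one process per file is simple but does not cap the number of workers, so a pool is usually the safer choice when there are many files.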
  • Test 6: pp on a single machine

    job = job_server.submit(pp_test, (files,), (read_csv_open,), ())
    
    • Elapsed time: 0.004994630813598633 s
  • Test 7: joblib

    rs = joblib.Parallel(4)(joblib.delayed(read_csv_open)(file) for file in files)
    
    • Elapsed time: 1.2500150203704834 s

Full code
