Python Notes Recorded in July 2023

weixin_46428702

Category: Python. Tags: python, experience sharing, paddlepaddle

Article link: https://blog.csdn.net/weixin_46428702/article/details/132502931

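Multiprocessing and lambda

Python's multiprocessing module uses pickle to pass objects between processes, and pickle cannot serialize lambda (anonymous) functions. For example: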

from collections import defaultdict

get_default = lambda: vocab[UNK]  # a lambda cannot be pickled
vocab = defaultdict(get_default)

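This raises the following error:

_pickle.PicklingError: Can't pickle <function <lambda> at 0x00000238950AEC20>: attribute lookup <lambda> on __main__ failed

So the lambda has to be replaced with an ordinary named function: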

def get_default():
    return vocab[UNK]
vocab = defaultdict(get_default)
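
The difference can be checked without starting any worker process, because the failure comes from pickle itself. The following is a minimal sketch; the helper `can_pickle` and the factory `zero` are hypothetical names introduced only for this illustration and are not part of the original code.

```python
import pickle
from collections import defaultdict


def can_pickle(obj) -> bool:
    """Return True if pickle.dumps succeeds on obj, False if pickling fails."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


def zero():
    """Named, module-level default factory; pickle stores it by reference to its name."""
    return 0


if __name__ == "__main__":
    print("lambda factory:", can_pickle(defaultdict(lambda: 0)))
    print("named factory: ", can_pickle(defaultdict(zero)))
    print("built-in int:  ", can_pickle(defaultdict(int)))
```

Run as a script, this should report that only the lambda-based defaultdict fails to pickle. A built-in such as int is another picklable default factory when a constant default is enough.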

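Multiprocessing and tqdm

When combining multiprocessing with tqdm, put the pool call on the outside and tqdm on the inside, wrapping the input iterable. If tqdm wraps the starmap call instead, the progress bar does not update in time, because starmap only returns once every task has finished.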

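import multiprocessing as mp
from tqdm import tqdm
pool = mp.Pool(30)
# tqdm must wrap the input, not the starmap call, otherwise the bar does not update in time
for _log_len_counter, _trace_len_counter in pool.starmap(origin_to_vocab, tqdm(id_s)):
    ...  # process each pair of counters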
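
With tqdm around the input, the bar advances as the pool consumes the tasks, which is not quite the same as counting finished results. One alternative, which is not from the original notes, is to wrap `pool.imap` in tqdm, since imap yields each result as it becomes available. The sketch below assumes a hypothetical worker `count_chars` that takes a single tuple (imap has no starmap-style variant), and falls back to a no-op `tqdm` if the package is not installed.

```python
import multiprocessing as mp

try:
    from tqdm import tqdm
except ImportError:  # fallback so the sketch still runs without tqdm installed
    def tqdm(iterable, **kwargs):
        return iterable


def count_chars(args):
    """Hypothetical worker: unpacks one (id, text) tuple and returns (id, length)."""
    item_id, text = args
    return item_id, len(text)


def run_with_progress(pool, tasks):
    """Collect results through imap so the bar advances as each result arrives."""
    results = []
    for result in tqdm(pool.imap(count_chars, tasks), total=len(tasks)):
        results.append(result)
    return results


if __name__ == "__main__":
    tasks = [(i, "x" * i) for i in range(100)]
    with mp.Pool(4) as pool:
        results = run_with_progress(pool, tasks)
    print(results[:3])
```

imap keeps results in input order; `imap_unordered` can be swapped in when order does not matter. `total` is passed explicitly because the iterator returned by imap has no length.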

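Multiprocessing and subclasses that override __new__

For a subclass that overrides __new__, pickle serialization during multiprocessing can produce unexpected results.
For example: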

import os
from pathlib import WindowsPath, PosixPath
from typing import ClassVar

_cls = WindowsPath if os.name == 'nt' else PosixPath

class _PrefixPath(_cls):
    prefix: ClassVar[str] = ""

    def __new__(cls, *args, **kwargs):
        return super().__new__(cls, cls.prefix, *args, **kwargs)

class EncodedFilePath(_PrefixPath):
    prefix = "prefix"

if __name__ == '__main__':
    import pickle

    origin = EncodedFilePath("123", "456")
    pickled = pickle.dumps(origin)
    unpickled = pickle.loads(pickled)
    print(origin, unpickled)
    # prefix\123\456 prefix\prefix\123\456

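The output of this code is

prefix\123\456 prefix\prefix\123\456

As you can see, after the instance is pickled and unpickled, it has gained an extra prefix.
The reason is that the PurePath class in the standard library's pathlib implements the __reduce__ method (shown here as implemented in Python 3.11 and earlier):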

    def __reduce__(self):
        # Using the parts tuple helps share interned path parts
        # when pickling related paths.
        return (self.__class__, tuple(self._parts))

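pickle.dumps calls __reduce__, and at that point self._parts already contains the prefix. pickle.loads then calls the custom _PrefixPath.__new__ with those parts, which adds the prefix a second time.

After overriding __reduce__: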

import os
from pathlib import Path, WindowsPath, PosixPath
from typing import ClassVar

_cls = WindowsPath if os.name == 'nt' else PosixPath

class _PrefixPath(_cls):
    prefix: ClassVar[str] = ""

    def __new__(cls, *args, **kwargs):
        return super().__new__(cls, cls.prefix, *args, **kwargs)

    def __reduce__(self):  # prevent a duplicated prefix after pickling
        return type(self), self.get_relative_path().parts

    def get_relative_path(self):
        return Path(self.relative_to(self.prefix))

class EncodedFilePath(_PrefixPath):
    prefix = "prefix"

if __name__ == '__main__':
    import pickle

    origin = EncodedFilePath("123", "456")
    pickled = pickle.dumps(origin)
    unpickled = pickle.loads(pickled)
    print(origin, unpickled)
    # prefix\123\456 prefix\123\456

The output is now

prefix\123\456 prefix\123\456

Instances of the class can now be pickled and unpickled correctly.
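
The same interaction between __new__ and __reduce__ can be reproduced without pathlib, which may help when checking the behavior on a different Python version. The sketch below uses toy classes (`PrefixedName`, `FixedName`) and a helper `rebuild`, all hypothetical names for this illustration. `rebuild` replays the reconstruction step by calling the callable returned by __reduce__ with its arguments, which is what unpickling does with those two items.

```python
class PrefixedName:
    """Toy stand-in for the path class: __new__ always prepends the class prefix."""

    prefix = "prefix"

    def __new__(cls, *parts):
        self = super().__new__(cls)
        self.parts = (cls.prefix, *parts)
        return self

    def __reduce__(self):
        # Same shape as PurePath.__reduce__: the stored parts already include the prefix.
        return type(self), self.parts


class FixedName(PrefixedName):
    def __reduce__(self):
        # Drop the leading prefix so that __new__ adds it back exactly once.
        return type(self), self.parts[1:]


def rebuild(obj):
    """Replay the reconstruction step of unpickling: call func(*args) from __reduce__."""
    func, args = obj.__reduce__()
    return func(*args)


if __name__ == "__main__":
    print(rebuild(PrefixedName("123", "456")).parts)
    print(rebuild(FixedName("123", "456")).parts)
```

The first rebuilt object carries the prefix twice, the second only once. The rule of thumb is that the arguments returned by __reduce__ must be the ones __new__ expects as input, not the already-processed state.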

with can open multiple files at once

with open(id_trace_float_path, "w", newline="", encoding="utf-8") as csv_float, \
        open(id_trace_int_path, "w", newline="", encoding="utf-8") as csv_int:
    ...  # both files are open here and are closed together on exit
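
A complete, runnable version of this pattern might look like the sketch below. The function name `write_float_and_int`, the sample rows, and the temporary file names are assumptions made for the illustration; only the two-file with statement comes from the original snippet.

```python
import csv
import tempfile
from pathlib import Path


def write_float_and_int(rows, float_path, int_path):
    """Write each row's float column to one CSV and its int column to another."""
    with open(float_path, "w", newline="", encoding="utf-8") as csv_float, \
            open(int_path, "w", newline="", encoding="utf-8") as csv_int:
        float_writer = csv.writer(csv_float)
        int_writer = csv.writer(csv_int)
        for row_id, float_value, int_value in rows:
            float_writer.writerow([row_id, float_value])
            int_writer.writerow([row_id, int_value])


with tempfile.TemporaryDirectory() as tmp_dir:
    float_path = Path(tmp_dir) / "float.csv"
    int_path = Path(tmp_dir) / "int.csv"
    write_float_and_int([("a", 1.5, 1), ("b", 2.5, 2)], float_path, int_path)
    float_lines = float_path.read_text(encoding="utf-8").splitlines()
    int_lines = int_path.read_text(encoding="utf-8").splitlines()

print(float_lines, int_lines)
```

Both files are closed when the block exits, even if writing raises an exception. From Python 3.10 onward the context managers can also be grouped in parentheses instead of using a backslash continuation.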

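Training becomes very slow after adding an embedding layer to a neural network; I do not yet know how to fix it

The framework and version used is PaddlePaddle-gpu 2.4.2.post117. I tried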

@paddle.jit.to_static

with paddle.static.device_guard("cpu"):

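but training was still slow. Following
https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/performance_improving/profiling_model.html
I inspected the profiling trace in chrome://tracing
[Figure: model performance analysis]
and found a very long GPU-to-CPU data synchronization during gradient backpropagation.
The profiler's summary output also shows that GPU-to-CPU data synchronization accounts for most of the time (87.81% in the Ratio column):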

---------------------------------------------------Memory Manipulation Summary----------------------------------------------------
Time unit: ms
---------------------------------  ------  ---------------------------------------------  ----------------------------------------  
Name                               Calls   CPU Total / Avg / Max / Min / Ratio(%)         GPU Total / Avg / Max / Min / Ratio(%)    
---------------------------------  ------  ---------------------------------------------  ----------------------------------------  
GpuMemcpyAsync(same_gpu):GPU->GPU  284     5.00 / 0.02 / 1.00 / 0.00 / 0.00               0.00 / 0.00 / 0.00 / 0.00 / 0.00          
GpuMemcpyAsync:CPU->GPU            343     95.00 / 0.28 / 5.00 / 0.00 / 0.07              0.00 / 0.00 / 0.00 / 0.00 / 0.00          
GpuMemcpyAsync:GPU->CPU            77      121325.00 / 1575.65 / 16825.00 / 0.00 / 87.81  0.00 / 0.00 / 0.00 / 0.00 / 0.00          
BufferedReader:MemoryCopy          14      0.00 / 0.00 / 0.00 / 0.00 / 0.00               0.00 / 0.00 / 0.00 / 0.00 / 0.00          
---------------------------------  ------  ---------------------------------------------  ----------------------------------------
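
To read the summary table more concretely, the per-call averages and each row's share of the memory-copy CPU time can be recomputed from the Calls and CPU Total columns. This is only a small arithmetic sketch over the numbers copied from the table; the variable names are chosen for the illustration.

```python
# Rows copied from the Memory Manipulation Summary: (name, calls, CPU total in ms).
rows = [
    ("GpuMemcpyAsync(same_gpu):GPU->GPU", 284, 5.00),
    ("GpuMemcpyAsync:CPU->GPU", 343, 95.00),
    ("GpuMemcpyAsync:GPU->CPU", 77, 121325.00),
    ("BufferedReader:MemoryCopy", 14, 0.00),
]

# Total CPU time spent in all memory-copy operations listed in the table.
memcpy_total_ms = sum(cpu_total for _, _, cpu_total in rows)

for name, calls, cpu_total in rows:
    avg_ms = cpu_total / calls
    share = 100 * cpu_total / memcpy_total_ms
    print(f"{name:<36} avg {avg_ms:9.2f} ms/call, {share:6.2f}% of memory-copy CPU time")

# Average milliseconds per call, keyed by operation name.
avg_by_name = {name: cpu_total / calls for name, calls, cpu_total in rows}
```

The GPU->CPU row works out to roughly 1.58 seconds per call over only 77 calls, which agrees with the Avg column and shows that a small number of very slow copies dominates.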

After removing the embedding layer the problem disappeared, but I have not found a solution that keeps the embedding layer.