blueblood7's Column

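[Original] In a container, cat displays text correctly but vi shows garbled characters

1. Check with locale; if LANG is not zh_CN.UTF-8, run the following commands. 2. If vi still shows garbled text, run the following commands.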

2025-05-28 15:23:49 157

[Original] Installing pytorch 2.5.1 + cuda 12.1

pip install torch==2.5.1 torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu121

2025-03-24 14:23:39 1056

[Original] A pitfall when creating 2-D arrays in Python

A pitfall when creating 2-D arrays in Python

2025-02-18 19:45:16 146
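
The summary of the entry above does not say which pitfall it means. The best-known one is row aliasing with list multiplication, so here is a minimal sketch under that assumption (the variable names are illustrative, not taken from the post):

```python
rows, cols = 2, 3

# Multiplying the outer list copies references: every row is the same list object.
shared = [[0] * cols] * rows
shared[0][0] = 1          # also changes shared[1][0]

# A comprehension builds a fresh inner list for each row.
independent = [[0] * cols for _ in range(rows)]
independent[0][0] = 1     # only row 0 changes

print(shared)
print(independent)
```

The comprehension costs nothing extra and avoids the shared rows, so it is usually the safer default for nested lists.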

[Original] Installing fairseq fails

Installing fairseq fails

2025-01-16 11:07:01 375

[Original] Installing kaldifeat

Installing kaldifeat.

2024-12-17 17:02:47 552

[Original] Installing jieba

How to install jieba

2024-12-16 10:59:51 455

[Original] Turning a script inside a docker image into a Linux service that starts at boot

Turning a docker script into a service that starts at boot

2024-12-12 20:29:33 315

[Original] Segmentation fault in torchaudio.load

A segmentation fault occurs when calling torchaudio.load.

2024-12-11 13:40:13 689

[Original] pip install pynini fails

pip install pynini fails; the main error occurs while building the wheel.

2024-11-07 19:07:33 2598

[Original] Getting files out of docker

1. Use docker cp to copy to the local machine. 2. Use scp to copy to a remote machine.

2024-11-05 17:26:47 166

[Original] ImportError: cannot import name 'Qwen2VLForConditionalGeneration' from 'transformers'

Version 4.44.2 of transformers raises the error in the title. After reinstalling, the version is 4.45.0.dev0.

2024-09-10 16:28:38 3291 1

[Original] putText displays Chinese as ???

[Code] putText displays Chinese as ???

2024-08-12 17:29:46 484

[Original] Be careful with np.vstack and np.hstack

To join two arrays end to end, 1-D arrays need np.hstack and 2-D arrays need np.vstack. np.concatenate(, axis=0) handles both cases uniformly.

2024-06-21 17:06:13 282 1
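
A small sketch of the point in the entry above; the arrays are made-up examples. It shows np.concatenate with axis=0 giving the same result as np.hstack for 1-D input and as np.vstack for 2-D input, and what happens when np.vstack is applied to 1-D input by mistake.

```python
import numpy as np

a1, b1 = np.array([1, 2]), np.array([3, 4, 5])   # 1-D arrays
a2, b2 = np.ones((2, 3)), np.zeros((1, 3))       # 2-D arrays

# axis=0 appends along the first axis in both cases.
flat = np.concatenate((a1, b1), axis=0)          # shape (5,)
stacked = np.concatenate((a2, b2), axis=0)       # shape (3, 3)

# np.vstack on 1-D input promotes each array to a row first,
# so two length-2 arrays become a 2x2 matrix instead of a length-4 vector.
surprise = np.vstack((a1, a1))

print(flat.shape, stacked.shape, surprise.shape)
```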

[Original] Initial random seeds in multiple processes

The initial random seeds of torch, random and np.random in multiple processes.

2024-04-01 16:30:57 335 1
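
The summary above does not say what the post found, so the following is only a sketch of one common precaution: giving every worker its own explicitly derived seed rather than relying on whatever state it inherits. BASE_SEED, worker_seed and worker_samples are hypothetical names, and only the standard random module is shown; torch and np.random would need their own seeding calls.

```python
import random

BASE_SEED = 1234  # hypothetical base seed, not taken from the post


def worker_seed(worker_id, base_seed=BASE_SEED):
    """Derive a distinct, reproducible seed for each worker."""
    return base_seed + worker_id


def worker_samples(worker_id, n=3):
    """Draw n numbers from a generator owned by this worker alone."""
    rng = random.Random(worker_seed(worker_id))
    return [rng.random() for _ in range(n)]


# Each worker id maps to its own sequence; rerunning gives the same sequences.
samples = {wid: worker_samples(wid) for wid in range(4)}
for wid, values in samples.items():
    print(wid, values)
```

In a real multi-process program, worker_samples would run inside each process with that process's id.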

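[Original] Remote connection in vscode fails with "The remote host may not meet VS Code Server's prerequisites for glibc and libstdc++"

The vscode version is 1.86, and the glibc and libstdc++ versions on the server do not meet its requirements. Download 1.85.2 instead: https://update.code.visualstudio.com/1.85.2/win32-x64-archive/stable.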

2024-02-22 15:52:26 3115

[Original] Installing GLEE in docker

The docker container has no git; you can share a host folder and run git clone on the host.

2023-12-27 11:42:16 737 1

[Original] detectron2 installed through conda

The prebuilt packages turned out to support at most pytorch 1.10 and cuda 11.3.

2023-12-26 17:37:03 1301

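[Original] Installing sam in docker

1. Enter the docker container, sharing a directory from the host. 2. Run git clone in that host directory. 3. Go into the segment-anything directory and install with pip install .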

2023-12-15 14:51:12 814 1

[Original] "RuntimeError: Unable to find a valid cuDNN algorithm to run convolution"

Reduce the batch size.

2023-10-26 10:27:01 157

[Original] Comparing the efficiency of reshape and view

If the tensor is contiguous, reshape returns a view, the same as view does.

2023-08-31 16:33:34 360

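[Original] The array returned by np.argwhere

In the array returned by np.argwhere, the first dimension is the number of elements that satisfy the condition, and the second dimension holds the index of each such element.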

2023-06-14 10:29:33 303
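
To make the shape described in the entry above concrete, here is a minimal sketch with a made-up 2x3 array: three elements satisfy the condition, and each needs two coordinates, so the result has shape (3, 2).

```python
import numpy as np

x = np.array([[0, 5, 0],
              [7, 0, 9]])

idx = np.argwhere(x > 0)

# One row per matching element, one column per dimension of x.
print(idx.shape)
for row, col in idx:
    print(row, col, x[row, col])
```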

[Original] Using onnxruntime inside a pytorch dataset

Setting num_workers to 0 in the DataLoader fixes it.

2022-12-13 19:11:11 1606

[Original] Extracting the cifar10 archive into images

Extracting the cifar10 archive into images

2022-12-06 16:09:32 368

[Original] Remote debugging code inside docker with vscode

Remote debugging code inside docker with vscode

2022-12-02 11:42:21 990

[Original] resnet50 in timm and torchvision

Loading the pretrained resnet50 model from timm and from torchvision and inspecting the ONNX exports shows that the weights are the same.

2022-11-14 16:50:28 1966 1

[Original] Putting torch.autocast inside forward

When using DDP without one process per GPU, torch.autocast should be placed inside forward.

2022-10-09 15:55:13 633

[Original] The ONNX graph of SE

The ONNX graph of SE

2022-09-13 16:26:14 157

[Original] The difference between shape and size()

shape is more widely applicable than size()

2022-07-25 11:29:29 2274

[Original] cross entropy loss = log softmax + nll loss

cross entropy loss = log softmax + nll loss

2022-06-07 16:58:43 452
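
The identity in the entry above can be checked numerically without any deep learning library. This is a plain-Python sketch for a single sample; the function names mirror the usual terminology but are written here for illustration, and the logits are made up.

```python
import math


def log_softmax(logits):
    """log(softmax(x)), computed stably by subtracting the maximum."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def nll_loss(log_probs, target):
    """Negative log-likelihood of the target class."""
    return -log_probs[target]


def cross_entropy(logits, target):
    """Cross entropy written directly as logsumexp(x) - x[target]."""
    m = max(logits)
    return m + math.log(sum(math.exp(x - m) for x in logits)) - logits[target]


logits, target = [2.0, 1.0, 0.1], 0
composed = nll_loss(log_softmax(logits), target)
direct = cross_entropy(logits, target)
print(composed, direct)
```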

[Original] IndexError: shape mismatch: indexing tensors could not be broadcast together with shapes [2], [3]

import torch
a = torch.randn(3,5)
print(a)
# The next line raises IndexError: shape mismatch: indexing tensors could not be broadcast together with shapes [2], [3]
#b = a[[0,2],[1,3,4]]
# Changed to:
b = a[[0,2],:][:,[1,3,4]]
print(b)
The output is: tensor([[ 0.3627, -0.7073, -0.39.

2022-05-09 15:05:27 2157
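
The post above uses torch. NumPy applies the same broadcasting rule to integer-array indices, so this sketch uses NumPy as a stand-in (an assumption for illustration, not the post's code). It reproduces the error, applies the post's two-step slicing, and shows np.ix_ as another way to select a rows-by-columns block.

```python
import numpy as np

a = np.arange(15).reshape(3, 5)

# Two index lists are broadcast against each other, so lengths 2 and 3 clash.
error = None
try:
    a[[0, 2], [1, 3, 4]]
except IndexError as exc:
    error = str(exc)

# The post's fix: pick the rows first, then the columns.
b = a[[0, 2], :][:, [1, 3, 4]]

# np.ix_ builds an open mesh so the same block is selected in one step.
c = a[np.ix_([0, 2], [1, 3, 4])]

print(error)
print(b)
```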

[Original] [8] Assertion failed: dims.nbDims == 4 || dims.nbDims == 5

Converting onnx to trt fails with this error:
[04/22/2022-15:45:13] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/22/2022-15:45:13] [E] [TRT] (Unnamed

2022-04-22 16:18:55 1855

[Original] enforce fail at inline_container.cc:222

Running torch.load fails with the error "RuntimeError: [enforce fail at inline_container.cc:222] . file not found: archive/data/94479765723472" or "RuntimeError: [enforce fail at inline_container.cc:145] . PytorchStreamReader failed reading file data/94100453134480: inval

2022-04-14 15:54:40 4021 3

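[Original] int8 quantization and inference of an onnx model with trt

Contents: Generating the trt model (1. Code used, 2. onnx model and images, 3. Code changes, 4. Results); Running inference with the trt model
Generating the trt model
1. Code used: https://github.com/rmccorm4/tensorrt-utils.git
2. onnx model and images
Model: dynamic batch input (assume mob_w160_h160.onnx with input [batchsize, 3, 160, 160]).
Images: a set of images (assume 1024 of them); no other description files are needed.
In tensorrt-u...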

2022-03-27 22:55:41 8761 2

[Original] Plotting the margin curve in ArcFace

The result looks like this:
The code is as follows:
from math import cos, sin, pi
import numpy as np
import matplotlib.pyplot as plt
'''
# https://github.com/deepinsight/insightface/blob/master/recognition/arcface_torch/losses.py
class ArcFace(torch.nn.Module):
    """ ArcFace (https:/..

2022-03-17 20:12:22 558
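
The post's plotting code is cut off above, so the following is not a reconstruction of it. It is a small sketch of the quantity such a curve shows: the target-class logit cos(theta + m) next to the plain cos(theta). The margin M = 0.5 is an assumed value, and arcface_target_logit is a hypothetical helper name.

```python
from math import cos, sin, pi

M = 0.5  # assumed additive angular margin in radians, not taken from the post


def arcface_target_logit(theta, m=M):
    """cos(theta + m), expanded with the angle-addition formula."""
    return cos(theta) * cos(m) - sin(theta) * sin(m)


# Sample the angle from 0 to 180 degrees in 15-degree steps.
thetas = [i * pi / 180 for i in range(0, 181, 15)]
curve = [(t, cos(t), arcface_target_logit(t)) for t in thetas]

for t, plain, with_margin in curve:
    print(f"{t * 180 / pi:5.0f} deg  cos={plain:+.3f}  cos(theta+m)={with_margin:+.3f}")
```

Passing the second and third columns to a plotting library would draw the two curves; while theta + m stays below pi, the margin curve lies under the plain cosine.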

[Original] Unable to determine the device handle for GPU 0000:02:00.0: GPU is lost.

On a TITAN X (Pascal) card, the GPU-lost error appears when an overly large batch size exhausts GPU memory.

2022-02-17 18:49:51 1254

[Original] unhandled system error, NCCL version 2.7.8

A DDP-based pytorch training program runs fine on the host, but inside docker it fails with "unhandled system error, NCCL version 2.7.8".
How to resolve it: put NCCL_DEBUG=INFO in front of python -m torch.distributed.launch --nproc_per_node=4 ... and the log shows:
s215:623:649 [3] include/shm.h:48 NCCL WARN Error while cr

2022-02-16 17:58:14 6146 1

[Original] Installing K8S on two ubuntu machines

References:
1. Installing k8s on ubuntu
2. Error: The connection to the server localhost:8080 was refused - did you specify the right host or port?
3. Fix for "Connecting to raw.githubusercontent.com failed: Connection refused."
When installing flannel, use:
(python38) ai200@ubuntu16:/$ kubectl

2022-02-15 16:27:35 961

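[Original] Transferring large files between two ubuntu machines

Method 1: scp -c aes128-gcm@openssh.com a.tar.gz usrname@ip:dir
Adding -c aes128-gcm@openssh.com can speed up the transfer.
Method 2: rsync -avP a.tar.gz usrname@ip:dir
References:
1. Why scp transfers are slow on linux
2. Why is scp so slow, and how can it be made faster?...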

2022-01-25 10:38:36 1687

[Original] Upgrading ubuntu14.04 to ubuntu16.04

Reference: Upgrading Ubuntu 16.04 LTS to 18.04 LTS | with a summary of problems
# Before the upgrade
(base) root@s215:~# cat /proc/version
Linux version 3.13.0-147-generic (buildd@lcy01-amd64-024) (gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.4) ) #196-Ubuntu SMP Wed May 2 15:51:34 UTC 2018
# After the upgrade
(bas

2022-01-24 16:21:20 1218

[Original] Errors in multi-node, multi-GPU training

Error 1: "NCCL WARN Connect to failed : Network is unreachable". Fix: set the environment variable NCCL_SOCKET_IFNAME=enp (it may be eno instead; check with ifconfig first).

2021-12-23 19:51:50 2305 1
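
The fix in the entry above can also be applied from inside the training script, as long as the variable is set before the process group is initialized. This is a sketch under that assumption: it lists the local interfaces with the standard socket module in place of ifconfig, and pick_interfaces and list_interfaces are hypothetical helpers. The enp and eno prefixes come from the post.

```python
import os
import socket


def pick_interfaces(names, prefixes=("enp", "eno")):
    """Keep the interface names that start with one of the given prefixes."""
    return [name for name in names if name.startswith(tuple(prefixes))]


def list_interfaces():
    """Return local interface names, or an empty list if they cannot be read."""
    try:
        return [name for _, name in socket.if_nameindex()]
    except (AttributeError, OSError):
        return []


matches = pick_interfaces(list_interfaces())
if matches:
    # Must run before the distributed backend is initialized.
    os.environ["NCCL_SOCKET_IFNAME"] = matches[0]
print(matches)
```

Exporting the variable in the shell before launching, as the post describes, has the same effect.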
