June 2020: 城俊BLOG

Translated: PyTorch train/validate loss is too large or NaN

2020-06-30 11:25:26 3728

Original: PyTorch model loading error: Unexpected key(s) in state_dict: module.conv1.weight, module.bn1

Contents: Code, Error, Cause, Solution
Code:
import models
arch = 'resnet50'
model = models.__dict__[arch]()
checkpoint = torch.load(ckptFile)
model.load_state_dict(checkpoint['state_dict'])
model = torch.nn.DataParallel(model).cuda()
Error:
Traceback (most recent call last

2020-06-28 14:58:57 17040 10
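
An "Unexpected key(s)" error with a "module." prefix typically appears when a checkpoint saved from a model wrapped in torch.nn.DataParallel is loaded into an unwrapped model, as the excerpt's code does. The post's own solution is cut off in this listing, so the following is only a minimal sketch of one common workaround and not necessarily the author's fix. The helper name strip_module_prefix is hypothetical, and plain strings stand in for tensors so that the snippet runs without PyTorch.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading `prefix` removed from each key.

    Hypothetical helper: it works on any mapping, so it could be applied to
    checkpoint['state_dict'] before calling model.load_state_dict(...).
    """
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        # Only strip the prefix when it is actually present at the start.
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


# Stand-in for a checkpoint saved from a DataParallel-wrapped model.
saved = OrderedDict([("module.conv1.weight", "w0"), ("module.bn1.weight", "w1")])
cleaned = strip_module_prefix(saved)
print(list(cleaned.keys()))
```

Under that assumption, the cleaned dictionary would be passed to load_state_dict before the model is wrapped in DataParallel. The other common option is to wrap the model first and load afterwards, so that the key names already match.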

Original: [Geek starter roundup] Building a deep learning workstation from scratch (dual GPU, personal edition)

Reference: https://zhuanlan.zhihu.com/p/56075793

2020-06-26 10:51:44 2071

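Translated: The effect of batch size

Does a larger batch size give higher accuracy? Does a larger batch size let each iteration find the best gradient update direction over a wider range?
Answer: No. You need to choose a suitable batch size; both too large and too small have drawbacks.
Extreme cases:
batchsize = # samples: Full Batch Learning
batchsize = 1: online learning
An explanation of this question that I personally find fairly good: "Does a larger batch size let each iteration find the best gradient update direction over a wider range?"
Other: https://www.zhihu.com/question/32673260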

2020-06-24 16:44:19 2442
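
The two extreme cases named above can be made concrete by counting how many gradient updates one pass over the data produces. This is a small sketch for illustration only: the dataset size of 50,000 and the function name updates_per_epoch are assumptions, and the numbers say nothing about final accuracy.

```python
import math


def updates_per_epoch(num_samples, batch_size):
    """Number of gradient updates in one pass over the data (last batch may be partial)."""
    return math.ceil(num_samples / batch_size)


num_samples = 50_000  # assumed dataset size, for illustration only
for batch_size in (1, 64, 256, num_samples):
    # batch_size == 1 is online learning, batch_size == num_samples is full batch learning
    print(batch_size, updates_per_epoch(num_samples, batch_size))
```

With full batch learning there is a single update per epoch, while online learning makes one update per sample; practical batch sizes sit between the two.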

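Original: Summary of key-press detection methods in Python

Contents:
signal, pynput
termios, signal
tty, termios
pygame
opencv
Forwarding key presses to Python multiprocessing over SSH

Based on hands-on testing, the main options are listed below; click through for implementation details.
signal, pynput: Python multiprocessing key-press detection, gracefully terminating multiple processes (signal, pynput, and other methods)
termios, signal: Python keyboard key detection, stop the program at any time
tty, termios: Python keyboard input detection ter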

2020-06-24 13:48:17 10359

Original: bcolz installation error: ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for

ArcFace requires bcolz to be installed; I did not know what it was.
$ sudo pip3.7 install bcolz
WARNING: The directory '/home/user1/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directo

2020-06-24 10:51:30 4248 1

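Translated: Loading model parameters and transfer learning: PyTorch fine-tuning and MXNet fine-tuning

MXNet Finetune:
Every layer has an lr_mult attribute, the learning-rate multiplier, which can be set to a value other than 1 to scale the learning rate up or down.
Reference code:
_weight = mx.symbol.Variable("fc7_weight", shape=(args.num_classes, args.emb_size), lr_mult=1.0)
Pytorch Finetune:
Set which layers stay fixed
model_ft = models.resnet50(pretrained=True) # this automatically downloads the official pretr.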

2020-06-24 10:35:07 1040

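Original: Vectors created from lists in PyTorch and NumPy

In both PyTorch and NumPy a matrix can be created from a one-dimensional list, but the two behave differently.
# Note: in PyTorch, a vector created from a list is treated as a column vector when indexed, but it cannot take part in multiplication directly
>>> a = [1,2,3,4,5]
>>> a = torch.tensor(a)
>>> a.shape
torch.Size([5])
# Index directly to get a value
>>> a[0]
tensor(1)
# In PyTorch a column vector can be indexed directly like this. Note that the result of indexing is a tensor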

2020-06-23 15:21:19 1672
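
For the NumPy side of this comparison, which is cut off in the excerpt, the sketch below shows what a flat list turns into and how an explicit row or column vector can be obtained. It is an illustration under the assumption that the same list [1, 2, 3, 4, 5] is used; the variable names row and col are not from the post.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])   # built from a flat Python list
print(a.shape)                  # (5,): one-dimensional, neither row nor column

row = a.reshape(1, -1)          # explicit row vector, shape (1, 5)
col = a.reshape(-1, 1)          # explicit column vector, shape (5, 1)

print((row @ col).shape)        # (1, 1): inner product as a 1x1 matrix
print((col @ row).shape)        # (5, 5): outer product
```

Reshaping makes the orientation explicit, so the result of a matrix product no longer depends on how a one-dimensional array happens to be interpreted.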

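Original: Multiplication in PyTorch and NumPy [all in one place!]

Contents:
PyTorch
  Element-wise / pointwise / entry-by-entry multiplication
  Matrix multiplication
NumPy
  Element-wise multiplication
  Matrix multiplication

PyTorch
Element-wise / pointwise / entry-by-entry multiplication
API: mul_ and mul
Requirement: the two tensors must have the same shape (more precisely, shapes that can be broadcast together)
>>> import torch
>>> a = torch.tensor([[1,2,3,4,5]])
>>> b = torch.tensor([[1,2,3,4,5]])
>>> a.mul_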

2020-06-23 14:06:30 1684
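
The NumPy half of the table of contents does not survive in this excerpt. As a hedged illustration of the two operations it names, the sketch below reuses the excerpt's values in NumPy arrays; the variable names elementwise and matrix are assumptions, and the snippet is not taken from the post.

```python
import numpy as np

a = np.array([[1, 2, 3, 4, 5]])   # shape (1, 5)
b = np.array([[1, 2, 3, 4, 5]])   # shape (1, 5)

elementwise = a * b               # same as np.multiply(a, b); values 1, 4, 9, 16, 25
matrix = a @ b.T                  # same as np.matmul(a, b.T); (1, 5) @ (5, 1) -> (1, 1)

print(elementwise.shape, matrix.shape)
```

The `*` operator multiplies matching entries and keeps the shape, whereas `@` contracts the inner dimension, which is why b has to be transposed first.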

Original: PyTorch IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)

PyTorch raises an error when computing accuracy:
Traceback (most recent call last):
  File "main.py", line 547, in <module>
  File "main.py", line 270, in main
    validate(test_loader_lfwa, model, criterion)
  File "main.py", line 480, in validate
    top1[j].update(prec1[j][

2020-06-22 11:57:40 6761

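Original: Linux Ubuntu 18.04 Bluetooth has no sound: org.bluez.Error.Failed, name org.PulseAudio1 already taken

Contents:
Turning Bluetooth on/off from the Ubuntu 18.04 terminal
How to fix Bluetooth having no sound?
Collected errors
Solution

Turning Bluetooth on/off from the Ubuntu 18.04 terminal
The Bluetooth toggle in the settings panel stopped working: once switched off, it could not be switched back on. How do you turn it on manually? See here.
How to fix Bluetooth having no sound?
My experience, for reference: Bluetooth connected but there was no sound. I flailed around and kept running into walls. None of the suggestions found online worked: a2db.py, repeatedly fiddling with PulseAudio, Alsamixer. The pulseaudio process could not be killed; it kept coming back to life. The errors encountered along the way are as follows (in no particular order,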

2020-06-20 13:51:17 5373 8

Original: Sharing code to GitHub with git on Linux Ubuntu

# Installation and basic configuration
$ sudo apt-get install git
$ git config --global user.name 'your_name'
$ git config --global user.email 'your_email@qq.com'
$ cd local_repo/
$ sudo git init
# Set an ss proxy to avoid slow network speed during git pull
$ git config --global http.https://github.com.proxy socks5:

2020-06-19 23:36:30 296

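Original: Ubuntu 18.04 geek starter: commands for suspend / turning on Bluetooth / Bluetooth no sound

systemctl suspend
Press the power button to wake the machine.
Thanks: https://ubuntuqa.com/article/100.html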

2020-06-19 10:55:15 1717

Original: PyTorch one-hot multi-label learning error: RuntimeError: multi-target not supported at /opt/conda/conda-bld/pytorch_1556

Error:
Traceback (most recent call last):
  File "/home/user1/main.py", line 544, in <module>
    main()
  File "/home/user1/main.py", line 273, in main
    validate(test_loader_lfwa, model, criterion)
  File "/home/user1/main.py", line 476, in validate

2020-06-18 16:30:15 4373

Original: pytorch RuntimeError: size mismatch, m1: [16 x 86016], m2: [25088 x 512] at /opt/conda/conda-bld/pyt

Error message:
Traceback (most recent call last):
  File "/home/user1/main_arc_face.py", line 534, in <module>
    main()
  File "/home/user1/main_arc_face.py", line 315, in main
    val_loss, prec1 = validate(val_loader, model, criterion)
  File "/home/user1

2020-06-18 14:08:43 4010
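
The shapes in the title can be read directly: a matrix product m1 @ m2 is only defined when the number of columns of m1 equals the number of rows of m2, and here 86016 is not 25088. The sketch below reproduces that check with plain tuples. The function name matmul_shape is hypothetical, and which layer or input size produced these numbers in the author's network is not stated in the excerpt, so no cause is assumed here.

```python
def matmul_shape(m1, m2):
    """Return the result shape of m1 @ m2, or raise ValueError if the inner sizes differ."""
    rows1, cols1 = m1
    rows2, cols2 = m2
    if cols1 != rows2:
        raise ValueError(f"size mismatch, m1: [{rows1} x {cols1}], m2: [{rows2} x {cols2}]")
    return (rows1, cols2)


# Shapes taken from the error message in the title.
try:
    matmul_shape((16, 86016), (25088, 512))
except ValueError as err:
    print(err)

# The product is defined once the flattened feature size matches the layer's input size.
print(matmul_shape((16, 25088), (25088, 512)))
```

In practice this means either the flattened features entering the layer or the layer's declared input size has to change so that the two inner numbers agree.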

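Original: PyTorch ArcFace face attribute prediction error: RuntimeError: Dimension out of range (expected to be in range of [-1, 0], bu

Code:
loss.append(criterion(output, target, ))
Error: RuntimeError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
Cause: this problem is not necessarily what is described here, namely that the first dimension is given 0/1 values rather than probabilities. It can also be caused by the shapes of the first and second dimensions not matching, that is, having different lengths. In other words, the size of the network output does not match the size of the ground truth target. If you forcibly modify the network structure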

2020-06-18 11:07:28 2321

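Original: pytorch AttributeError: face_learner object has no attribute parameters

The object you initialized is an instance of the face_learner class, not the model class you actually need to run. Most likely face_learner.model is the model you need.
output = face_learner.model(input)
rather than
output = face_learner(input)
Look carefully: does the face_learner class have a self.model attribute? A momentary lapse. ...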

2020-06-17 19:02:07 1317
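
The situation described above can be reproduced without PyTorch. In this minimal sketch both classes are hypothetical stand-ins: TinyModel plays the role of the real network and FaceLearner plays the role of a wrapper that stores the network in self.model.

```python
class TinyModel:
    """Hypothetical stand-in for the real network (e.g. an nn.Module)."""

    def parameters(self):
        return [0.1, 0.2]

    def __call__(self, x):
        return x * 2


class FaceLearner:
    """Hypothetical wrapper: it owns a model but is not itself a model."""

    def __init__(self):
        self.model = TinyModel()


learner = FaceLearner()

try:
    learner.parameters()            # the wrapper has no such method
except AttributeError as err:
    print(err)

print(learner.model.parameters())   # works: go through the wrapped model
print(learner.model(21))            # forward call on the wrapped model
```

The wrapper only forwards nothing on its own, so anything that expects a model (an optimizer, a forward call) has to be given learner.model.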

Original: PyTorch multi-label loss function MultiLabelSoftMarginLoss error: RuntimeError: The size of tensor a (128) must match the size

Code:
import torch.nn as nn
criterion = nn.MultiLabelSoftMarginLoss(weight=w)
# train on celeba train set and lfwa train set
train_dataset = CelebA(args.data, 'train_40_att_list.txt', transforms.Compose([ transforms.Resize((250, 250)), transforms.Ran

2020-06-16 08:50:01 7569

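Original: TypeError: unsupported operand type(s) for @: numpy.ndarray and Tensor

An error occurs when using @ for matrix multiplication in PyTorch:
TypeError: unsupported operand type(s) for @: 'numpy.ndarray' and 'Tensor'
Code:
import math
import torch
weights = torch.randn(784, 10) / math.sqrt(784)
weights.requires_grad_()
x @ weights
Cause:
You need to convert x on the left to a tensor first; only then can it be multiplied by the weights on the right.
Solution:
x, = map ( t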

2020-06-13 18:36:01 8803

Translated: PyTorch training error: TypeError: batch must contain tensors, numbers, dicts or lists found class PIL.Image.I

Traceback (most recent call last):
  File "/home/user1/main.py", line 1153, in <module>
    main()
  File "/home/user1/main.py", line 546, in main
    count_train)
  File "/home/user1/main.py", line 618, in train
    for i, (input, target) in enumera

2020-06-13 12:18:58 7332 5

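Translated: Errors were encountered while processing: snapd

Error when using apt:
Errors were encountered while processing: snapd
Solution:
$ sudo vi /var/lib/dpkg/info/snapd.prerm
Add exit 0 as the second line, which tricks the shell script into exiting successfully. Save, then rerun the command.
Reference: https://acejoy.com/2017/03/06/116/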

2020-06-11 18:45:00 2195

Translated: Deep learning model conversion

Reference: https://github.com/microsoft/MMdnn
Example:
# mxnet -> ir (IR is the specified name of the ir file)
$ mmtoir -f mxnet -n W_F_Up_Model/W_F_UP-symbol.json -w W_F_Up_Model/W_F_UP-0002.params -d IR --inputShape 3,112,112
# ir -> py, npy
$ mmtocode -f pytor

2020-06-10 17:23:14 626

Translated: Setting up a firewall on Linux Ubuntu

1. Install
$ sudo apt install ufw
2. Configure
$ sudo ufw enable
$ sudo ufw default deny
3. Check status
$ sudo ufw status
Thanks: https://www.cnblogs.com/sweet521/p/5733466.html

2020-06-07 12:57:05 353

Original: Python plt: dual axes with colored ticks, two legends, and rotated labels

Code:
import matplotlib.pyplot as plt
import numpy as np
np.random.seed(4)
fig = plt.figure()
ax1 = fig.add_subplot(111)
x = [1,2,3,4,5]
y1 = np.random.rand(5)
y2 = np.random.rand(5)
plot1 = ax1.plot(range(0, len(x)), y1, '-*', color='r', label='train')
ax2

2020-06-05 17:42:43 10273

Original: NumPy matrix sum: TypeError: sum() received an invalid combination of arguments - got (axis=int, )

Traceback (most recent call last):
  File "/home/user1/pjs/0Fea/multiTaskCNN/face-attribute-prediction/celeba.py", line 101, in <module>
    t = target.sum(axis=0).reshape((1, 40))
TypeError: sum() received an invalid combination of arguments - got (

2020-06-05 13:35:17 7691

Translated: Opening EPS files on Linux Ubuntu

$ xdg-open log.eps
https://unix.stackexchange.com/questions/111668/what-is-the-command-to-display-eps-files

2020-06-04 13:57:45 2641

Translated: lsb_release conda failed ValueError: could not convert string to float: 16.04 LTS

Error when checking the Ubuntu version number:
$ sudo lsb_release -a
Traceback (most recent call last):
  File "/usr/bin/lsb_release", line 95, in <module>
    main()
  File "/usr/bin/lsb_release", line 59, in main
    distinfo = lsb_release.get_distro_information()
  File "/usr/li

2020-06-03 18:06:16 1192 1
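
The ValueError in the title is ordinary Python behavior: float() rejects a string that has trailing text after the number. The sketch below only reproduces that behavior with the string from the error message. It is not a patch for lsb_release, and the post's actual fix is cut off in this listing; the variable names are assumptions.

```python
version_field = "16.04 LTS"   # the string quoted in the error message

try:
    float(version_field)
except ValueError as err:
    print(err)                # could not convert string to float: '16.04 LTS'

# Only the leading numeric token converts cleanly.
release = float(version_field.split()[0])
print(release)                # 16.04
```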

Translated: Reinstalling/upgrading the NVIDIA graphics driver on Ubuntu 16.04: NVIDIA-SMI has failed because it couldnt communicate with the NVIDIA d

System: Ubuntu 16.04 LTS
Original driver version: 384.98
$ ls /usr/src
...
nvidia-384.98
...
CUDA version: release 9.0, V9.0.176
CUDA does not need to be touched during the reinstall/upgrade. cuDNN does not need to be touched either.
Error:
$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA dr

2020-06-03 15:55:05 2265

Translated: pytorch IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to

Error with PyTorch 1.1.0:
$ sh experiments/webface/res50-bs64-sz224-ep35/train.sh
Creating model on [4] gpus: 0,1,2,3
Origin Size: 493456 Aligned Size: 493568
/home/user1/miniconda3/lib/python3.7/site-packages/torch/nn/parallel/_functions.py:61: UserWarning: W

2020-06-01 15:28:46 1009

Translated: pytorch too many open files

PyTorch single-machine multi-GPU training:
Error:
$ sh experiments/webface/res50-bs64-sz224-ep35/train.sh
Creating model on [4] gpus: 0,1,2,3
Origin Size: 493456 Aligned Size: 493568
Traceback (most recent call last):
Traceback (most recent call last):
  File "/home/user1/miniconda

2020-06-01 15:07:12 1567
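
The post's fix is cut off in this listing. One thing that is commonly checked for a "too many open files" error is the per-process limit on open file descriptors, the same value that `ulimit -n` reports. The sketch below uses Python's standard resource module (Unix only) to read that limit; the helper raise_soft_limit is hypothetical and is offered as an assumption about what might help, not as the author's solution.

```python
import resource

# Current per-process limit on open file descriptors (Unix only).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft limit:", soft, "hard limit:", hard)


def raise_soft_limit(target):
    """Raise the soft limit towards `target` without exceeding the hard limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft == resource.RLIM_INFINITY or target <= soft:
        return soft  # nothing to raise
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target
```

Calling raise_soft_limit(4096) at the start of a training script would lift the soft limit for that process only. In PyTorch data-loading setups, torch.multiprocessing.set_sharing_strategy('file_system') is another option that is often tried; whether either applies to the author's case cannot be told from the excerpt.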

Hello World!