Note 13 - OSError: [Errno 24] Too many open files

References

What is the maximum number of file descriptors Linux can open?
How to fix the OSError: [Errno 24] Too many open files error.
Checking thread counts, handles, and the maximum number of file connections per process

Failed attempts

The error

Running on class bagel
Extracting training-features for class bagel:   0%|     | 0/244 [00:00<?, ?it/s]Traceback (most recent call last):
File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/multiprocessing/queues.py", line 234, in _feed
File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 329, in reduce_storage
File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/multiprocessing/reduction.py", line 191, in DupFd
File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/multiprocessing/resource_sharer.py", line 48, in __init__
OSError: [Errno 24] Too many open files
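
Errno 24 is EMFILE: the process has reached its limit on open file descriptors. In the traceback it appears to be raised inside DupFd while a DataLoader worker passes tensor storage back through the queue. As a minimal sketch that is independent of PyTorch (the function name provoke_emfile and the temporary limit of 64 are illustrative assumptions, not part of the original project), the same error can be reproduced by lowering the soft limit and opening pipes until the OS refuses:

```python
import errno
import os
import resource


def provoke_emfile(temporary_soft_limit=64):
    """Lower the soft fd limit, open pipes until the OS refuses, then restore.

    Returns the errno of the OSError that stopped the loop.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Only ever lower the soft limit, so the setrlimit call stays unprivileged.
    if soft == resource.RLIM_INFINITY:
        lowered = temporary_soft_limit
    else:
        lowered = min(temporary_soft_limit, soft)
    resource.setrlimit(resource.RLIMIT_NOFILE, (lowered, hard))
    opened = []
    try:
        while True:
            opened.extend(os.pipe())  # each pipe consumes two descriptors
    except OSError as exc:
        return exc.errno
    finally:
        # Release everything and put the original soft limit back.
        for fd in opened:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))


if __name__ == "__main__":
    code = provoke_emfile()
    print(code == errno.EMFILE, os.strerror(code))
```

The point of the sketch is that the limit is counted per process, so what matters is how many descriptors the failing process itself holds, not how many are open on the machine.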

Stopping the program

Extracting training-features for class bagel:   0%|     | 0/244 [00:00<?, ?it/s]Traceback (most recent call last):
File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/multiprocessing/queues.py", line 234, in _feed
File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 329, in reduce_storage
File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/multiprocessing/reduction.py", line 191, in DupFd
File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/multiprocessing/resource_sharer.py", line 48, in __init__
OSError: [Errno 24] Too many open files
Extracting training-features for class bagel:   0%|     | 0/244 [01:16<?, ?it/s]
Traceback (most recent call last):
File "/home/cszx/zgp/Shape-Guided-main/Shape-Guided-main/main.py", line 92, in <module>
patchcore.fit()
File "/home/cszx/zgp/Shape-Guided-main/Shape-Guided-main/core/shape_guide_core.py", line 99, in fit
for train_data_id, (sample, _) in enumerate(tqdm(data_loader, desc=f'Extracting training-features for class {self.class_name}')):
File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/site-packages/tqdm/std.py", line 1195, in __iter__
for obj in iterable:
File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
data = self._next_data()
File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1186, in _next_data
idx, data = self._get_data()
File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1142, in _get_data
success, data = self._try_get_data()
File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 990, in _try_get_data
data = self._data_queue.get(timeout=timeout)
File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/queue.py", line 173, in get
self.not_empty.wait(remaining)
File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/threading.py", line 299, in wait
gotit = waiter.acquire(True, timeout)
KeyboardInterrupt
Process finished with exit code 1

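On inspection, the cause appears to be the limit on open file descriptors (ulimit -n controls file descriptors, not the number of threads)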

(zgp_shape) ~/.cache/torch/hub/checkpoints ulimit -n
1024
(zgp_shape) ~/.cache/torch/hub/checkpoints ulimit -n 2048
Raising the limit to 2048 did not help, so I raised it further
ulimit -n 4096
Still failing; the next increase was rejected
ulimit -n 9192
ulimit: value exceeds hard limit
So I checked the kernel-wide ceiling
(zgp_shape) ~/.cache/torch/hub/checkpoints cat /proc/sys/fs/nr_open
1048576
(zgp_shape) ~/.cache/torch/hub/checkpoints ulimit -Hn 9192
ulimit: can't raise hard limits
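
The two error messages above follow from how the limits nest: an unprivileged process may move its soft limit anywhere up to the hard limit, raising the hard limit needs privilege, and /proc/sys/fs/nr_open is the ceiling for the hard limit. A minimal sketch for reading the same three values from inside Python, which is useful when the interpreter is not launched from the shell where ulimit was run (the function name describe_nofile_limits is an illustrative assumption):

```python
import resource


def describe_nofile_limits(nr_open_path="/proc/sys/fs/nr_open"):
    """Return the soft and hard fd limits of this process and the kernel ceiling."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        with open(nr_open_path) as handle:
            nr_open = int(handle.read().strip())
    except (OSError, ValueError):
        nr_open = None  # not Linux, or /proc is unavailable
    return {"soft": soft, "hard": hard, "nr_open": nr_open}


if __name__ == "__main__":
    for name, value in describe_nofile_limits().items():
        print(f"{name}: {value}")
```

Running this at the top of the failing script shows the limits the script actually inherited, which may differ from what ulimit -n prints in another terminal.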

Modifying the configuration

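Reference:
How to fix the OSError: [Errno 24] Too many open files error.
vi keys used while editing:
o: open a new line below the current line and enter insert mode.
O: open a new line above the current line and enter insert mode.
esc: leave insert mode and return to normal mode.
h: move left one character.
j: move down one line.
k: move up one line.
l: move right one character.
i: enter insert mode.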
The limit still could not be raised any further, and the error persisted
Running on class bagel

Extracting training-features for class bagel: 0%| | 0/244 [00:00<?, ?it/s]Traceback (most recent call last):
File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/multiprocessing/queues.py", line 234, in _feed
File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 329, in reduce_storage
File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/multiprocessing/reduction.py", line 191, in DupFd
File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/multiprocessing/resource_sharer.py", line 48, in __init__
OSError: [Errno 24] Too many open files

Via sudo
sudo sh -c "ulimit -n 65535 && exec su $LOGNAME"

(base) ~ sudo sh -c "ulimit -n 65535 && exec su $csdx"   
[sudo] password for cszx: 
sudo: 3 incorrect password attempts
(base) ~ 
(base) ~ sudo sh -c "ulimit -n 65535 && exec su $cszx"
[sudo] password for cszx: 
[root@localhost]/home/cszx#   
[root@localhost]/home/cszx# ulimit -Hn 65535
[root@localhost]/home/cszx# ulimit -n       
65535
[root@localhost]/

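Still the same. Note that $cszx is a shell variable (presumably unset), not the user name, so su received no argument and opened a root shell. Switching back to the user explicitly: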
[root@localhost]/home/cszx# exec su cszx
(base) ~ ulimit -n
65535

The error persisted
Running on class bagel

Extracting training-features for class bagel:   0%|     | 0/244 [00:00<?, ?it/s]Traceback (most recent call last):
  File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/multiprocessing/queues.py", line 234, in _feed
  File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
  File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 329, in reduce_storage
  File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/multiprocessing/reduction.py", line 191, in DupFd
  File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/multiprocessing/resource_sharer.py", line 48, in __init__
OSError: [Errno 24] Too many open files

Raising it further

(base) ~ sudo sh -c "ulimit -n 165535 && exec su $cszx"
[sudo] password for cszx: 
[root@localhost]/home/cszx# exec su $cszx
[root@localhost]/home/cszx# 
[root@localhost]/home/cszx# exec su cszx
(base) ~ ulimit -Hn
165535
(base) ~ ulimit -n                                     
165535

Still failing

Checking the current limits first

(base) ~ ulimit -a
-t: cpu time (seconds)              unlimited
-f: file size (blocks)              unlimited
-d: data seg size (kbytes)          unlimited
-s: stack size (kbytes)             8192
-c: core file size (blocks)         0
-m: resident set size (kbytes)      unlimited
-u: processes                       4096
-n: file descriptors                165535
-l: locked-in-memory size (kbytes)  64
-v: address space (kbytes)          unlimited
-x: file locks                      unlimited
-i: pending signals                 766851
-q: bytes in POSIX msg queues       819200
-e: max nice                        0
-r: max rt priority                 0
-N 15:  

(base) / pgrep -f "python"
1549
2054
94876
95091
(base) / sudo ls /proc/94876/fd | wc -l
[sudo] password for cszx:
52
(base) / sudo ls /proc/2054/fd | wc -l
14
(base) / sudo ls /proc/1549/fd | wc -l
11
(base) / sudo ls /proc/95091/fd | wc -l
315
None of these processes has many files open, so I did not dare raise the limit any further.
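
The same per-process count can be taken from Python, including from inside the process that is about to fail. This is a minimal Linux-only sketch; the function name count_open_fds is an illustrative assumption, and reading another user's /proc/<pid>/fd still needs the privileges that sudo provided above.

```python
import os


def count_open_fds(pid="self"):
    """Return how many descriptors /proc/<pid>/fd lists (Linux only).

    For pid="self" the listing may include the short-lived descriptor that
    os.listdir itself opens on the directory, so treat the value as a close
    upper bound rather than an exact count.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


if __name__ == "__main__":
    print(count_open_fds())             # this interpreter
    print(count_open_fds(os.getpid()))  # same process, addressed by PID
```

Logging this value periodically from the training loop would show whether the count climbs toward the soft limit before the error appears.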
Could the progress bar be to blame? I decided to remove it from the code (it turned out not to be the cause).
The error changed

Traceback (most recent call last):
  File "/home/cszx/zgp/Shape-Guided-main/Shape-Guided-main/core/shape_guide_core.py", line 4, in <module>
    from core.rgb_sdf_feature import RGBSDFFeatures, SDFFeature
ModuleNotFoundError: No module named 'core'

Process finished with exit code 1

After right-clicking and marking as Excluded

Running on class bagel

Traceback (most recent call last):
  File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/multiprocessing/queues.py", line 234, in _feed
  File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
  File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 329, in reduce_storage
  File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/multiprocessing/reduction.py", line 191, in DupFd
  File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/multiprocessing/resource_sharer.py", line 48, in __init__
OSError: [Errno 24] Too many open files
Sigh, the original error is back
Traceback (most recent call last):
  File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/multiprocessing/queues.py", line 234, in _feed
  File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
  File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 329, in reduce_storage
  File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/multiprocessing/reduction.py", line 191, in DupFd
  File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/multiprocessing/resource_sharer.py", line 48, in __init__
OSError: [Errno 24] Too many open files
^CTraceback (most recent call last):
  File "/home/cszx/zgp/Shape-Guided-main/Shape-Guided-main/main.py", line 92, in <module>
    patchcore.fit()
  File "/home/cszx/zgp/Shape-Guided-main/Shape-Guided-main/core/shape_guide_core.py", line 100, in fit
    for train_data_id, (sample, _) in enumerate(data_loader):
  File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1186, in _next_data
    idx, data = self._get_data()
  File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1142, in _get_data
    success, data = self._try_get_data()
  File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 990, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/queue.py", line 173, in get
    self.not_empty.wait(remaining)
  File "/home/cszx/miniconda3/envs/zgp_shape/lib/python3.6/threading.py", line 299, in wait
    gotit = waiter.acquire(True, timeout)
KeyboardInterrupt

At this point

(base) / sudo ls /proc/*/fd | wc -l
2322
(base) / sudo ls /proc/*/fd | wc -l
[sudo] password for cszx: 
1943
(base) / sudo ls /proc/*/fd | wc -l
2302
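
When ls is given several directories it also prints a header and a blank line for each one, so the figures above overstate the real total. For a system-wide number the kernel reports its own counters in /proc/sys/fs/file-nr as three integers: allocated file handles, allocated but unused handles, and the system maximum. A minimal sketch for reading it (the function names parse_file_nr and read_file_nr are illustrative assumptions):

```python
def parse_file_nr(text):
    """Parse the three whitespace-separated integers of /proc/sys/fs/file-nr."""
    allocated, unused, maximum = (int(field) for field in text.split())
    return {"allocated": allocated, "unused": unused, "max": maximum}


def read_file_nr(path="/proc/sys/fs/file-nr"):
    """Read and parse the kernel's system-wide file handle counters (Linux only)."""
    with open(path) as handle:
        return parse_file_nr(handle.read())


if __name__ == "__main__":
    print(read_file_nr())
```

These counters are system-wide, whereas Errno 24 is about the per-process limit, so a low value here does not rule out a single process hitting its own ceiling.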

Reboot

Raise the open-file limit, then run from the command line (success)

(base) ~ conda activate zgp_shape         
(zgp_shape) ~ ulimit -n 2048
(zgp_shape) ~ ulimit -n 4096

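This still had no effect when running from PyCharm,
but running the same command from an SSH command line with the virtual environment activated worked.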
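
A likely reason is that ulimit only affects the shell it is run in and the processes started from that shell, so an interpreter launched by PyCharm does not inherit the new value. One way to make the script independent of its launcher is to raise the soft limit from inside Python before the DataLoader workers are created. This is an untested sketch for this project: the function name raise_nofile_soft_limit is an illustrative assumption, and 4096 is simply the value that worked from the shell above.

```python
import resource


def raise_nofile_soft_limit(target=4096):
    """Raise this process's soft fd limit toward target; return the resulting value."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft  # already high enough, leave it alone
    # An unprivileged process cannot go above its hard limit.
    new_soft = target if hard == resource.RLIM_INFINITY else min(target, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return new_soft


if __name__ == "__main__":
    print(raise_nofile_soft_limit())
```

Child processes inherit the limits in force when they are started, so the call belongs near the top of main.py, before the data loader is iterated.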
