Submitting a RayJob with KubeRay
0. Background
This post is based on kuberay-operator version 0.4.0.
1. Problem
Submit the job:
kubectl apply -f ray_v1alpha1_rayjob.yaml
The head node fails to start with the following error:
2023-05-10 02:20:00,617 INFO usage_lib.py:517 -- Usage stats collection is enabled by default without user confirmation because this terminal is detected to be non-interactive. To disable this, add `--disable-usage-stats` to the command that starts the cluster, or run the following command: `ray disable-usage-stats` before starting the cluster. See https://docs.ray.io/en/master/cluster/usage-stats.html for more details.
2023-05-10 02:20:00,617 INFO scripts.py:702 -- Local node IP: 10.0.1.32
Traceback (most recent call last):
  File "/home/ray/anaconda3/bin/ray", line 8, in <module>
    sys.exit(main())
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/scripts/scripts.py", line 2386, in main
    return cli()
  File "/home/ray/anaconda3/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ray/anaconda3/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/cli_logger.py", line 852, in wrapper
    return f(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/scripts/scripts.py", line 730, in start
    ray_params, head=True, shutdown_at_exit=block, spawn_reaper=block
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/_private/node.py", line 198, in __init__
    self._init_temp()
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/_private/node.py", line 408, in _init_temp
    try_to_create_directory(self._session_dir)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/_private/utils.py", line 997, in try_to_create_directory
    os.makedirs(directory_path, exist_ok=True)
  File "/home/ray/anaconda3/lib/python3.7/os.py", line 223, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/tmp/ray/session_2023-05-10_02-20-00_619912_9'
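2. Analysis
The RayJob uses a raydp image, and that image has a permission problem: as the traceback shows, ray start fails with PermissionError when it tries to create its session directory under /tmp/ray.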
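To confirm this kind of failure before submitting a job, one option is to probe the temp directory from inside the container. The following is a minimal sketch, not Ray's own code: it mirrors the os.makedirs(..., exist_ok=True) call seen in the traceback, and the function name and probe directory name are hypothetical.

```python
import os


def can_create_session_dir(temp_root: str) -> bool:
    """Return True if the current user can create a directory under temp_root.

    Mirrors the os.makedirs(..., exist_ok=True) call from the traceback.
    The probe directory name is hypothetical and is removed afterwards.
    """
    probe = os.path.join(temp_root, f"session_probe_{os.getpid()}")
    try:
        # Like Ray's call, this also creates temp_root itself if it is missing.
        os.makedirs(probe, exist_ok=True)
    except PermissionError:
        return False
    os.rmdir(probe)
    return True
```

Running can_create_session_dir("/tmp/ray") as the container's default user should return False in an image that has this problem, assuming the cause is the directory's ownership or mode.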
3. Solution
Switching the image to rayproject/ray:2.4.0 fixed the problem; resubmitting the job then succeeded:
root@DESKTOP-3813A3M:/mnt/d/all/app/Ray/rayjob# kubectl apply -f ray_v1alpha1_rayjob.yaml
rayjob.ray.io/rayjob-sample created
configmap/ray-job-code-sample created