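After the Cinder block node was rebooted, instances with attached cloud disks (volumes) could no longer be started from the Dashboard. The error was: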
Unexpected error while running command. Command: sudo nova-rootwrap /etc/nova/rootwrap.conf
iscsiadm -m node -T iqn.2010-10.org.openstack:volume-369865bb-0714-4ab2-a96c-7a91b7483e78 -p block_node_IP:3260 --rescan
Exit code: 21 Stdout: u'' Stderr: u'iscsiadm: No session found.
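Exit code 21 means ISCSI_ERR_NO_OBJS_FOUND (no records/targets/sessions/portals found to execute operation on): the target object could not be found. On top of that, the attached volume cannot be detached while the instance is shut off.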
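As a quick reference while reading these logs, the helper below maps an iscsiadm exit code to its symbolic name. It is only a sketch: the function name `iscsi_err_name` is made up for illustration, and it covers just success plus the two codes that appear in this article (21 and 24), not the full open-iscsi list.

```shell
# Map an iscsiadm exit code to its symbolic name.
# Hypothetical helper; only success and the two codes seen in this article are listed.
iscsi_err_name() {
    case "$1" in
        0)  echo "ISCSI_SUCCESS" ;;
        21) echo "ISCSI_ERR_NO_OBJS_FOUND" ;;      # no records/targets/sessions/portals found
        24) echo "ISCSI_ERR_LOGIN_AUTH_FAILED" ;;  # login failed due to authorization failure
        *)  echo "UNKNOWN($1)" ;;
    esac
}
```

For example, `iscsi_err_name 21` prints the name for the code reported in the Nova error above.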
I. Compute node
# tail /var/log/nova/nova-compute.log -n 500 | grep iscsi /* no session found */
Unexpected error while running command.
Command: sudo nova-rootwrap /etc/nova/rootwrap.conf
iscsiadm -m node -T iqn.2010-10.org.openstack:volume-369865bb-0714-4ab2-a96c-7a91b7483e78 -p block_node_IP:3260 --rescan
Exit code: 21
Stdout: u''
Stderr: u'iscsiadm: No session found.
......
# tail -n 50 /var/log/messages | grep auth /* login rejected */
Jan 25 14:53:43 compute3 iscsid: conn 0 login rejected: initiator failed authorization with target
Jan 25 14:57:29 compute3 iscsid: conn 0 login rejected: initiator failed authorization with target
Jan 25 16:06:40 compute3 iscsid: conn 0 login rejected: initiator failed authorization with target
Jan 25 17:10:51 compute3 iscsid: conn 0 login rejected: initiator failed authorization with target
Jan 25 17:17:37 compute3 iscsid: conn 0 login rejected: initiator failed authorization with target
Jan 25 17:41:02 compute3 iscsiadm: iscsiadm: initiator reported error (24 - iSCSI login failed due to authorization failure)
Jan 25 17:41:02 compute3 iscsiadm: iscsiadm: initiator reported error (24 - iSCSI login failed due to authorization failure)
Jan 25 17:41:03 compute3 iscsid: conn 0 login rejected: initiator failed authorization with target
Jan 25 17:41:03 compute3 iscsid: conn 0 login rejected: initiator failed authorization with target
# systemctl status iscsid -l /* authorization failed */
1月 25 17:17:37 compute3 iscsid[1730]: conn 0 login rejected: initiator failed authorization with target
1月 25 17:17:37 compute3 iscsid[1730]: Connection63:0 to
[target: iqn.2010-10.org.openstack:volume-369865bb-0714-4ab2-a96c-7a91b7483e78, portal: block_node_IP,3260]
through [iface: default] is shutdown.
......
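Where:
1. iqn.2010-10.org.openstack is the iSCSI target identifier (IQN prefix) used by the block node
2. volume-369865bb-0714-4ab2-a96c-7a91b7483e78 is the name of the volume created for the instance on the block node (visible with lvdisplay)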
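The two parts of the target name can be separated with plain parameter expansion. This is a minimal sketch: the function names are invented for illustration, and the logical volume path assumes the `cinder-volumes` volume group that shows up later in `saveconfig.json`.

```shell
# Split a Cinder iSCSI target name of the form "<iqn-prefix>:<volume-name>".
# Function names are hypothetical helpers, not part of any tool.
iqn_prefix() { printf '%s\n' "${1%%:*}"; }   # everything before the first colon
iqn_volume() { printf '%s\n' "${1#*:}"; }    # everything after the first colon

# Path of the backing logical volume, ASSUMING the volume group is "cinder-volumes".
lv_path() { printf '/dev/cinder-volumes/%s\n' "$(iqn_volume "$1")"; }
```

Running `lv_path` on the target name from the log gives a path that can be passed to lvdisplay on the block node.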
# iscsiadm -m node -T iqn.2010-10.org.openstack:volume-369865bb-0714-4ab2-a96c-7a91b7483e78 -p block_node_IP:3260 --login /* verify with a manual login */
Logging in to [iface: default, target: iqn.2010-10.org.openstack:volume-369865bb-0714-4ab2-a96c-7a91b7483e78,
portal: block_node_IP,3260] (multiple)
iscsiadm: Could not login to [iface: default,
target: iqn.2010-10.org.openstack:volume-369865bb-0714-4ab2-a96c-7a91b7483e78, portal: block_node_IP,3260].
iscsiadm: initiator reported error (24 - iSCSI login failed due to authorization failure)
iscsiadm: Could not log into all portals
II. Block node
# tail -n 1000 /var/log/messages | grep auth /* system log */
Jan 25 17:10:50 block1 kernel: iSCSI Initiator Node: iqn.1994-05.com.redhat:a5fd80c5a912 is not authorized to access iSCSI target portal group: 1.
Jan 25 17:17:37 block1 kernel: iSCSI Initiator Node: iqn.1994-05.com.redhat:a5fd80c5a912 is not authorized to access iSCSI target portal group: 1.
Jan 25 17:41:02 block1 kernel: iSCSI Initiator Node: iqn.1994-05.com.redhat:a5fd80c5a912 is not authorized to access iSCSI target portal group: 1.
Jan 25 17:41:02 block1 kernel: iSCSI Initiator Node: iqn.1994-05.com.redhat:a5fd80c5a912 is not authorized to access iSCSI target portal group: 1.
Where:
1. iqn.1994-05.com.redhat:a5fd80c5a912 is the iSCSI initiator name of the compute node hosting the instance; it can be checked on the compute node
$ cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.1994-05.com.redhat:a5fd80c5a912
This shows that the compute node hosting the instance fails authentication when it tries to connect to the volume on the block node.
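The comparison made above (initiator name rejected in the block node's kernel log versus the name configured on the compute node) can be scripted roughly as follows. Both function names are hypothetical, and the sed pattern assumes the kernel message has exactly the wording shown in the log excerpt.

```shell
# Print the unique initiator IQNs the target rejected; reads log text on stdin.
# ASSUMES the kernel message wording matches the excerpt above.
rejected_initiators() {
    sed -n 's/.*iSCSI Initiator Node: \([^ ]*\) is not authorized.*/\1/p' | sort -u
}

# Print the initiator name from an initiatorname.iscsi style file ($1 = file path).
local_initiator() {
    sed -n 's/^InitiatorName=//p' "$1"
}
```

On the block node this would be used as `tail -n 1000 /var/log/messages | rejected_initiators`, and on the compute node as `local_initiator /etc/iscsi/initiatorname.iscsi`; if the two outputs match, the rejected host is the one running the instance.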
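$ targetcli ls /* list the targets on the block node; I have two volumes here, 500G each */
To check what state these volumes were in when they were first created and attached to an instance, I created a new volume and attached it to another instance on a different compute node (iqn.1994-05.com.redhat:11f29647866a) (screenshot)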
/> cd iscsi/
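/iscsi> ls /* list only the iscsi subtree, which is easier to read */
The new volume differs in two places:
1. It has acls (access control lists), which specify who may access the volume; the listing shows the compute node hosting the other instance used for the test above
2. Below that there is a mapped_lun0; mapping specifies what may be accessed, i.e. which LUN the hosts in the acls above can reach (its counterpart, masking, specifies what may not be accessed)
Now make the same change to the two older volumes: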
# targetcli
/> cd iscsi/iqn.2010-10.org.openstack:volume-369865bb-0714-4ab2-a96c-7a91b7483e78/tpg1/acls/ /* go to the acls of the corresponding volume */
/iscsi/iqn.20...e78/tpg1/acls> create iqn.1994-05.com.redhat:a5fd80c5a912 /* allow the compute node hosting the instance to access it */
Created Node ACL for iqn.1994-05.com.redhat:a5fd80c5a912
Created mapped LUN 0.
/iscsi/iqn.20...e78/tpg1/acls> ls
o- acls .................................................................................................................. [ACLs: 1]
o- iqn.1994-05.com.redhat:a5fd80c5a912 .............................................................. [1-way auth, Mapped LUNs: 1]
o- mapped_lun0 ......................... [lun0 block/iqn.2010-10.org.openstack:volume-369865bb-0714-4ab2-a96c-7a91b7483e78 (rw)]
OK, both items mentioned above are now present. Do the same for the other volume.
o- acls .................................................................................................................. [ACLs: 1]
o- iqn.1994-05.com.redhat:a5fd80c5a912 .............................................................. [1-way auth, Mapped LUNs: 1]
o- mapped_lun0 ......................... [lun0 block/iqn.2010-10.org.openstack:volume-446d70fc-c3f8-43cd-a0b9-dfd5eee934b9 (rw)]
/iscsi/iqn.20...4b9/tpg1/acls> exit
Global pref auto_save_on_exit=true
Last 10 configs saved in /etc/target/backup.
Configuration saved to /etc/target/saveconfig.json
# targetcli saveconfig /* save once more */
Last 10 configs saved in /etc/target/backup.
Configuration saved to /etc/target/saveconfig.json
# systemctl restart target /* restart the target service */
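The interactive session above can also be expressed as one-shot targetcli invocations, since targetcli accepts a path and a command on its command line. The sketch below only prints the commands (a dry run) so they can be reviewed before being run as root on the block node; the function name `acl_create_cmds` is an invention for this example, and `tpg1` is assumed to be the only portal group, as in the listing.

```shell
# Print the targetcli commands that add a node ACL for one initiator to each target.
# $1 = initiator IQN, remaining arguments = target IQNs.
# Hypothetical dry-run helper: nothing is executed, commands are only printed.
acl_create_cmds() {
    initiator="$1"; shift
    for target in "$@"; do
        printf 'targetcli /iscsi/%s/tpg1/acls create %s\n' "$target" "$initiator"
    done
    printf 'targetcli saveconfig\n'
}
```

Called with the compute node's initiator name and the two volume targets, it prints three lines: one `create` per volume and a final `saveconfig`.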
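Hard-rebooting the instance (see the controller node steps at the end of this article) still produced the same error, and the block node log showed: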
# tail /var/log/messages
Jan 26 11:06:00 block1 cinder-volume: 2016-01-26 11:06:00.356 2785 INFO cinder.volume.manager [-] Updating volume replication status.
Jan 26 11:06:08 block1 kernel: CHAP user or password not set for Initiator ACL
Jan 26 11:06:08 block1 kernel: Security negotiation failed.
Jan 26 11:06:08 block1 kernel: iSCSI Login negotiation failed.
# vim /etc/iscsi/iscsid.conf /* check the configuration on the compute node; by default no username or password is set */
# *************
# CHAP Settings
# *************
# To enable CHAP authentication set node.session.auth.authmethod
# to CHAP. The default is None.
#node.session.auth.authmethod = CHAP
# To set a CHAP username and password for initiator
# authentication by the target(s), uncomment the following lines:
#node.session.auth.username = username
#node.session.auth.password = password
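# vim /etc/target/saveconfig.json /* back on the block node: the newly created test volume has a username and password, but the older volumes have none after the block node host rebooted */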
"dev": "/dev/cinder-volumes/volume-01519b87-3036-4b6e-8174-c4a86030b370",
"name": "iqn.2010-10.org.openstack:volume-01519b87-3036-4b6e-8174-c4a86030b370",
"plugin": "block",
"readonly": false,
......
"login_timeout": 15,
"netif_timeout": 2,
"prod_mode_write_protect": 0,
"t10_pi": 0
},
"enable": true,
"luns": [
{
"index": 0,
"storage_object": "/backstores/block/iqn.2010-10.org.openstack:volume-01519b87-3036-4b6e-8174-c4a86030b370"
}
],
"node_acls": [
{
"attributes": {
"dataout_timeout": 3,
"dataout_timeout_retries": 5,
"default_erl": 0,
"nopin_response_timeout": 30,
"nopin_timeout": 15,
"random_datain_pdu_offsets": 0,
"random_datain_seq_offsets": 0,
"random_r2t_offsets": 0
},
"chap_password": "5k4DnHHcJd3SyvaF",
"chap_userid": "xZrcAF8GH5P6smJmYceN",
"mapped_luns": [
{
"index": 0,
"tpg_lun": 0,
"write_protect": false
}
],
"node_wwn": "iqn.1994-05.com.redhat:11f29647866a"
}
],
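# reboot /* check: after rebooting the block node, the new volume is unchanged and still recognized and usable; the acls are there and the default username and password are still in the configuration file, so the problem was not reproduced */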
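A rough way to spot ACLs that lost their CHAP credentials is to compare how many ACL entries and how many `chap_userid` keys the saved configuration contains. This is a crude text-level sketch, not a JSON parser: the function name is hypothetical, and it assumes each ACL is saved with a `node_wwn` key and that `chap_userid` is only written when a user ID is set, as the excerpt above suggests.

```shell
# Count node ACL entries and CHAP user IDs in a saveconfig.json style file ($1 = path).
# ASSUMPTION: one "node_wwn" key per ACL, and "chap_userid" present only when set.
acl_chap_counts() {
    acls=$(grep -c '"node_wwn"' "$1")
    chap=$(grep -c '"chap_userid"' "$1")
    echo "acls=$acls with_chap_userid=$chap"
}
```

If the two numbers differ after running `acl_chap_counts /etc/target/saveconfig.json`, at least one ACL has no CHAP user configured and is worth inspecting in targetcli.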
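Finally, since this is an authentication failure, I set a username and password for these two volumes on the block node and then configured the matching account on the corresponding compute node.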
/> cd iscsi/iqn.2010-10.org.openstack:volume-369865bb-0714-4ab2-a96c-7a91b7483e78/tpg1/acls/iqn.1994-05.com.redhat:a5fd80c5a912/
/iscsi/iqn.20...:a5fd80c5a912> set auth userid=username
Parameter userid is now 'username'.
/iscsi/iqn.20...:a5fd80c5a912> set auth password=password
Parameter password is now 'password'.
/iscsi/iqn.20...:a5fd80c5a912> exit /* set the other volume the same way */
Global pref auto_save_on_exit=true
Last 10 configs saved in /etc/target/backup.
Configuration saved to /etc/target/saveconfig.json
# systemctl restart target
# ss -napt | grep 3260
LISTEN 0 256 *:3260 *:*
# vim /etc/iscsi/iscsid.conf /* then on the compute node, edit iscsid.conf and uncomment these lines to enable CHAP */
# *************
# CHAP Settings
# *************
# To enable CHAP authentication set node.session.auth.authmethod
# to CHAP. The default is None.
node.session.auth.authmethod = CHAP
# To set a CHAP username and password for initiator
# authentication by the target(s), uncomment the following lines:
node.session.auth.username = username
node.session.auth.password = password
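The same edit can be done with sed instead of an editor. The sketch below writes the modified file to standard output rather than editing in place, so the result can be diffed first; `enable_chap` is a made-up name, and the patterns assume the commented-out lines look like the stock excerpt above. Credentials containing `/`, `&` or backslashes would break the sed expressions and are not handled.

```shell
# Print an iscsid.conf with the three initiator CHAP lines uncommented and filled in.
# $1 = config file, $2 = username, $3 = password. Output goes to stdout; the file is not modified.
# Hypothetical helper; does NOT escape sed metacharacters in the credentials.
enable_chap() {
    sed -e 's/^#[[:space:]]*node\.session\.auth\.authmethod[[:space:]]*=.*/node.session.auth.authmethod = CHAP/' \
        -e "s/^#[[:space:]]*node\.session\.auth\.username[[:space:]]*=.*/node.session.auth.username = $2/" \
        -e "s/^#[[:space:]]*node\.session\.auth\.password[[:space:]]*=.*/node.session.auth.password = $3/" \
        "$1"
}
```

The patterns require `=` right after `username` or `password`, so the separate `username_in` and `password_in` settings for mutual CHAP stay commented out.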
# iscsiadm -m discovery -t sendtargets -p block_node_IP /* connect manually from the compute node again to verify */
# iscsiadm -m node -l
Login to [iface: default, target: iqn.2010-10.org.openstack:volume-446d70fc-c3f8-43cd-a0b9-dfd5eee934b9, portal: block_node_IP,3260] successful.
Login to [iface: default, target: iqn.2010-10.org.openstack:volume-369865bb-0714-4ab2-a96c-7a91b7483e78, portal: block_node_IP,3260] successful.
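Finally, hard-rebooting the instance again brought it up. There is a potential problem, though: from now on, whenever other instances on this compute node need to attach a volume, that volume may need the same username and password set. And allowing anonymous access without authentication would not be very secure.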
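One possible way to limit that side effect, which the steps above did not try, is to store the CHAP settings on the individual node record with `iscsiadm --op update` instead of globally in iscsid.conf. The sketch below only prints the commands for one target; the function name is hypothetical, and the printed lines are unquoted, so credentials with spaces or shell metacharacters would need quoting before being run.

```shell
# Print iscsiadm commands that set CHAP on a single node record (dry run, nothing executed).
# $1 = target IQN, $2 = portal (host:port), $3 = username, $4 = password.
# Hypothetical helper; values are printed unquoted.
chap_update_cmds() {
    for kv in "authmethod=CHAP" "username=$3" "password=$4"; do
        printf 'iscsiadm -m node -T %s -p %s --op update -n node.session.auth.%s -v %s\n' \
            "$1" "$2" "${kv%%=*}" "${kv#*=}"
    done
}
```

Note that the password ends up on the terminal and possibly in shell history, so this is suited to a one-off repair rather than routine use.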
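III. Controller node
An instance that failed to start goes straight into the error state when hard-rebooted from the Dashboard; its state has to be reset with reset-state before a hard reboot.
# nova list /* get the instance ID */
# nova reset-state 2d6fc5be-a95e-4959-a16a-45f126b0217a --active /* reset to the active state */
Now hard-reboot the instance from the Dashboard.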
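The hard reboot can also be issued from the command line instead of the Dashboard. The sketch below prints the two commands for a given instance ID so they can be checked before running; `recover_cmds` is a hypothetical name, and it assumes the legacy `nova` client used above is installed and credentials are already sourced.

```shell
# Print the commands that reset an errored instance and hard-reboot it ($1 = instance ID).
# Hypothetical dry-run helper; nothing is executed.
recover_cmds() {
    printf 'nova reset-state --active %s\n' "$1"
    printf 'nova reboot --hard %s\n' "$1"
}
```

For the instance above, `recover_cmds 2d6fc5be-a95e-4959-a16a-45f126b0217a` prints the reset followed by the hard reboot.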