NetApp 7-Mode Initialization in Detail (Rebuilding the Original AGGR) and Caveats

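This article describes how to initialize a system, delete and rebuild an AGGR, and zero disks in a 7-Mode ONTAP environment. Along the way it covers configuring controller high availability (HA), resolving a disk ownership problem, fixing a Web access problem, and handling Shelf ID conflicts across multiple disk shelves. It gives the concrete steps, the problems you may run into, and their solutions.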

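Background

Preface

I never expected to write another 7-mode article. It has been close to ten years since I last touched it, and this opportunity came up by chance. I spent a full day learning and taking notes as I went and ran into quite a few pitfalls, which I record here.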

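Project Background

User requirement:
The FAS2552 currently runs 7-mode ONTAP and is used mainly for testing. The hardware is out of warranty. The user plans to redistribute the current data AGGR so that enough hot spare disks are left over once the test capacity requirement is met.
Environment check:
On its own this requirement is not complicated: delete the current data AGGR and recreate it as needed, at worst waiting some time for disk zeroing. What makes this project special is that, for reasons unknown, the data disks were added directly to the system (root) AGGR when the system first went live. Rebuilding the AGGR therefore means deleting the system AGGR, which amounts to a complete re-initialization!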

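Related Concepts

  1. Differences between 7-mode and C-mode
    C-mode, i.e. cluster mode, has been available for a long time, and the differences are well documented in both Chinese and English. The English passage quoted in the last two bullets below is the most concise summary I have found. For me personally, the most striking differences are:
  • The C-mode CLI is very friendly in both its design logic and its command completion; see my article on ONTAP initialization for details
  • The biggest design difference between C-mode and 7-mode is that C-mode supports scaling out across controllers
  • Some new features are only being added as ONTAP versions advance
  • 7-Mode - either single controller, or two controllers clustered for HA; think of it as “traditional” ONTAP with all bells & whistles (dedupe, compression, cloning, etc.)
  • Cluster-Mode - either single controller pair, or multiple pairs connected (clustered) via back-end 10GbE network; it originates from ONTAP GX & basically the key thing is, you can have a single storage system with more than two controllers; there are some features specific for C-Mode (single namespace, Infinite Volumes, etc.), but some other features are still missing compared to 7-Mode (e.g. SnapVault)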
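  2. Disk zeroing
    Before ONTAP 9.4, disks had to be zeroed before any data-level operation on them (including adding disks pulled from another system, rebuilding an AGGR, and so on). Zeroing is similar to a low-level format: the sectors and data blocks are rewritten, which takes a long time. The table below lists zeroing times (in hours) for some SAS disks; for other disks, refer to the official KB. Starting with 9.4, fast zeroing was introduced and this wait is no longer needed.
    [Image: table of zeroing times in hours for selected SAS disks]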

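Preparation Before Deleting the Root AGGR / Initializing

I did not record this part, so here is a brief description from memory:

  • Reboot controller A and enter maintenance mode
  • Take the root AGGR offline and destroy it (aggr offline / aggr destroy)
  • Remove the ownership information from all disks (disk remove_ownership)
  • Repeat the same steps on controller B
  • Reboot again when done and select option 4 to wipe all configuration and disk data
  • After the reboot the controller automatically selects 3 disks for the root aggr and starts zeroing them
  • Once zeroing finishes, the initialization flow begins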

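Before initialization, the following also needs to be prepared:

  • Confirm the basic management information to configure, including the hostname, DNS, and the management IPs of both nodes
  • Confirm the service-related settings to configure and plan them in advance
  • Have the licenses ready!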

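Initialization Procedure

The initialization below mainly configures the hostname and the management IP on port e0M. If anything in the initial configuration is wrong, you can run setup to reconfigure it.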

Please enter the new hostname [11]: XXXXX-KS-C1-A
Do you want to enable IPv6? [n]: 
Do you want to configure interface groups? [n]: 
Please enter the IP address for Network Interface e0a []: 
Please enter the IP address for Network Interface e0b []: 
Please enter the IP address for Network Interface e0e []: 
Please enter the IP address for Network Interface e0f []:
Please enter the IP address for Network Interface e0M []: xx.xx.88.18
Please enter the netmask for Network Interface e0M [255.255.0.0]: 255.255.255.0
Please enter flow control for e0M {none, receive, send, full} [full]: 
Please enter the name or IP address of the IPv4 default gateway: xx.xx.88.1 
	The administration host is given root access to the filer's
	/etc files for system administration.  To allow /etc root access
	to all NFS clients enter RETURN below.
Please enter the name or IP address of the administration host: 
Please enter timezone [GMT]: 
Where is the filer located? []: KS
Enter the root directory for HTTP files [/home/http]: 
Do you want to run DNS resolver? [n]: 
Do you want to run NIS client? [n]: 

	Press the return key to continue.


	The Service Processor (SP) provides remote management capabilities
	including console redirection, logging and power control.
	It also extends autosupport by sending
	additional system event alerts. Your autosupport settings are used
	for sending these alerts via email over the SP LAN interface.
Would you like to configure the SP LAN interface [y]: n
Setting the administrative (root) password for  ...
New password:
Retype new password:

System initialization has completed successfully.

Disable AutoSupport and automatic disk assignment

XXXXX-KS-C1-A> options autosupport.support.enable off
XXXXX-KS-C1-A> options disk.auto_assign off

Following the plan, assign every disk to its designated controller. After assignment, the disks need to be zeroed (disk zero spares).

XXXXX-KS-C1-A> disk assign 0b.SHFFG1631000094.0
XXXXX-KS-C1-A> sysconfig -r
RAID Disk       Device                  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------       ------                  ------------- ---- ---- ---- ----- --------------    --------------
Spare disks for block checksum
spare           0b.SHFFG1631000094.0    0b    0   0   SA:B   0   SAS 10000 857000/1755136000 858483/1758174768 (not zeroed)
spare           0b.SHFFG1631000094.1    0b    0   1   SA:B   0   SAS 10000 857000/1755136000 858483/1758174768 (not zeroed)
spare           0b.SHFFG1631000094.2    0b    0   2   SA:B   0   SAS 10000 857000/1755136000 858483/1758174768 (not zeroed)
spare           0b.SHFFG1631000094.5    0b    0   5   SA:B   0   SAS 10000 857000/1755136000 858483/1758174768 (not zeroed)
spare           0b.SHFFG1631000094.6    0b    0   6   SA:B   0   SAS 10000 857000/1755136000 858483/1758174768 (not zeroed)
spare           0b.SHFFG1631000094.8    0b    0   8   SA:B   0   SAS 10000 857000/1755136000 858483/1758174768 (not zeroed)
spare           0b.SHFFG1631000094.9    0b    0   9   SA:B   0   SAS 10000 857000/1755136000 858483/1758174768 (not zeroed)
spare           0b.SHFFG1631000094.10   0b    0   10  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768 (not zeroed)
spare           0b.SHFFG1631000094.11   0b    0   11  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768 (not zeroed)
spare           0b.SHFFG1631000094.12   0b    0   12  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768 (not zeroed)
spare           0b.SHFFG1631000094.13   0b    0   13  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768 (not zeroed)
spare           0b.SHFFG1631000094.14   0b    0   14  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768 (not zeroed)
spare           0b.SHFFG1631000094.17   0b    0   17  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768 (not zeroed)
spare           0b.SHFFG1631000094.18   0b    0   18  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768 (not zeroed)
spare           0b.SHFFG1631000094.19   0b    0   19  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768 (not zeroed)
spare           0b.SHFFG1631000094.21   0b    0   21  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768 (not zeroed)
spare           0b.SHFFG1631000094.23   0b    0   23  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768 (not zeroed)

XXXXX-KS-C1-A> disk zero spares
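With this many spares it is easy to lose track of which ones still need zeroing. The following is a minimal Python sketch, not a NetApp tool, that parses saved `sysconfig -r` text in the layout shown above. The names `Spare`, `parse_spares` and `pending_zero` are my own, and treating any spare row that ends in a parenthesized status as "not yet zeroed" is an assumption based only on the `(not zeroed)` suffix in this transcript.

```python
import re
from typing import List, NamedTuple


class Spare(NamedTuple):
    device: str
    shelf: str
    bay: int
    zeroed: bool


# Matches spare rows as printed in the transcript above:
# "spare  <device>  <HA>  <shelf>  <bay>  ..."
SPARE_ROW = re.compile(r"^spare\s+(\S+)\s+\S+\s+(\S+)\s+(\d+)\s")


def parse_spares(text: str) -> List[Spare]:
    """Extract spare disks from saved `sysconfig -r` output."""
    spares = []
    for raw in text.splitlines():
        line = raw.strip()
        match = SPARE_ROW.match(line)
        if match is None:
            continue  # header, separator, or non-spare row
        device, shelf, bay = match.groups()
        # Assumption: a trailing "(...)" status such as "(not zeroed)"
        # means the disk is not fully zeroed yet.
        zeroed = not line.endswith(")")
        spares.append(Spare(device, shelf, int(bay), zeroed))
    return spares


def pending_zero(spares: List[Spare]) -> List[str]:
    """Return the device names that still carry a status suffix."""
    return [s.device for s in spares if not s.zeroed]
```

Feeding it the output captured before and after `disk zero spares` gives a quick count of how many disks are still waiting.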

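One thing was unusual here. Possibly because I did not configure the interfaces and addresses used between the two controllers during initialization, HA mode was not enabled (according to the official documentation HA is enabled by default). cf.mode therefore has to be set to ha manually, followed by a reboot.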

XXXXX-KS-C1-A> cf status
Non-HA mode.
XXXXX-KS-C1-A> options cf.mode
cf.mode                      non_ha
XXXXX-KS-C1-A> options cf.mode ha
Mode set to HA.  Reboot node to activate HA.
XXXXX-KS-C1-A> reboot

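After the reboot, run setup and reconfigure the e0a and e0b addresses. Logically, e0a and e0b on the two controllers should be able to reach each other; configure each interface and specify the partner address it takes over on failover.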

Please enter the IP address for Network Interface e0a []: 192.168.10.1
Please enter the netmask for Network Interface e0a []: 255.255.255.0
Should interface e0a take over a partner IP address during failover? [n]: y
Please enter the IPv4 address or interface name to be taken over by e0a []: 192.168.10.2
Please enter media type for e0a {100tx-fd, tp-fd, 100tx, tp, auto (10/100/1000)} [auto]: 
Please enter flow control for e0a {none, receive, send, full} [full]: 
Do you want e0a to support jumbo frames? [n]: 
Please enter the IP address for Network Interface e0b []: 192.168.20.1
Please enter the netmask for Network Interface e0b []: 255.255.255.0
Should interface e0b take over a partner IP address during failover? [n]: y
Please enter the IPv4 address or interface name to be taken over by e0b []: 192.168.20.2
Please enter media type for e0b {100tx-fd, tp-fd, 100tx, tp, auto (10/100/1000)} [auto]: 
Please enter flow control for e0b {none, receive, send, full} [full]: 
Do you want e0b to support jumbo frames? [n]: 
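Before typing the addresses into setup, the plan can be sanity-checked offline. The sketch below uses Python's standard `ipaddress` module; the function name `same_subnet` is my own, and interpreting "able to reach each other" as "local and partner address sit in the same subnet" is my assumption, not something the setup dialog enforces.

```python
import ipaddress


def same_subnet(local_ip: str, partner_ip: str, netmask: str) -> bool:
    """Check that the partner address lies in the local interface's subnet.

    Assumption: the takeover address entered in setup should share a
    subnet with the local address and must differ from it.
    """
    local = ipaddress.ip_interface(f"{local_ip}/{netmask}")
    partner = ipaddress.ip_address(partner_ip)
    return partner in local.network and partner != local.ip


# Address plan taken from the setup transcript above:
# interface -> (local address, partner address, netmask)
PLAN = {
    "e0a": ("192.168.10.1", "192.168.10.2", "255.255.255.0"),
    "e0b": ("192.168.20.1", "192.168.20.2", "255.255.255.0"),
}

if __name__ == "__main__":
    for iface, (local, partner, mask) in PLAN.items():
        print(iface, "ok" if same_subnet(local, partner, mask) else "check plan")
```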

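Reboot after both nodes are done. After running cf enable, confirm that the HA state is normal. At this point the Web interface automatically detects both controllers and manages them together.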

XXXXX-KS-C1-A> cf status
Controller Failover disabled.
VIA Interconnect is down (link down).
XXXXX-KS-C1-A> cf enable
XXXXX-KS-C1-A> cf status
Controller Failover enabled, XXXXX-KS-C1-B is up.
VIA Interconnect is up (link up).
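The `cf status` outputs seen so far fall into three cases, which can be told apart mechanically. This is a small sketch that only recognizes the exact strings that appear in this article; the function name `classify_cf_status` and the state labels are hypothetical, and any other output is reported as "unknown" rather than guessed at.

```python
import re


def classify_cf_status(output: str) -> str:
    """Map saved `cf status` text to a coarse state label.

    Only the messages quoted in this article are recognized.
    """
    text = output.strip()
    if text.startswith("Non-HA mode"):
        return "non_ha"      # cf.mode still needs to be set to ha
    if "Controller Failover disabled" in text:
        return "disabled"    # HA mode is on, but cf enable has not been run
    partner_up = re.search(r"Controller Failover enabled, \S+ is up", text)
    if partner_up and "Interconnect is up" in text:
        return "healthy"
    return "unknown"
```

In this walkthrough the states appear in the order non_ha, disabled, healthy, so the label also tells you which step comes next.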

Finally, log in to the WebUI as usual to configure the AGGRs and services.

Other Troubleshooting and Caveats

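  1. A disk cannot be cleanly removed from ownership
  • Problem:
    A disk had already had its ownership removed, but it still appeared in disk show -o. Running the remove again reported that the disk was already removed, and assigning the disk produced a "currently owns the persistent reservation" warning: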
Ownership for disk 3a.31.16 (S/N xxxxxxxxx000B34002AD) cannot be changed because system??(ID xxxxxxxx89) currently owns the persistent reservation.
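  • Solution:
    The official explanation is below; the disk ownership is abnormal for an unknown reason:

Existing device reservation due to unknown disk ownership. The reservations on the disk may not have properly removed before moving/assigning the disks to this system.

The fix is simple: enter maintenance mode and run the following command to release the disk reservations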

filer>storage release disks
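If several disks are affected, it helps to pull the disk name and the owning system ID out of the console log before heading into maintenance mode. A minimal sketch, assuming the message always has the shape of the single example above; `parse_reservation_error` is a hypothetical helper name, and the pattern has only been matched against that one line.

```python
import re
from typing import Dict, Optional

# Built from the single error line quoted above; other ONTAP versions
# may word the message differently.
RESERVATION_ERROR = re.compile(
    r"Ownership for disk (?P<disk>\S+) \(S/N (?P<serial>[^)\s]+)\) "
    r"cannot be changed because .*?\(ID (?P<owner_id>[^)\s]+)\) "
    r"currently owns the persistent reservation"
)


def parse_reservation_error(line: str) -> Optional[Dict[str, str]]:
    """Return disk, serial and owner_id, or None if the line does not match."""
    match = RESERVATION_ERROR.search(line)
    return match.groupdict() if match else None
```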
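  2. 500 Connection Error on Web access
  • Problem:
    After a normal initialization, the node could be added in OnCommand System Manager, but logging in to it failed
  • Solution:
    Make sure the following options are turned on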
filer>options tls.enable on
filer>options httpd.admin.enable on
filer>options httpd.admin.ssl.enable on
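To check both controllers quickly, the three options can be compared against saved `options` output. The sketch below assumes each line has the `name value` layout seen earlier in the `options cf.mode` output; `parse_options` and `fix_commands` are names I made up, and the script only prints the commands to run, it does not talk to the filer.

```python
from typing import Dict, List

# The three options listed above that must be on for Web access.
REQUIRED = ("tls.enable", "httpd.admin.enable", "httpd.admin.ssl.enable")


def parse_options(output: str) -> Dict[str, str]:
    """Parse 'name   value' lines from saved `options` output."""
    opts = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            opts[parts[0]] = parts[1]
    return opts


def fix_commands(output: str) -> List[str]:
    """List the `options ... on` commands still needed.

    An option that is missing from the output is treated as not enabled.
    """
    opts = parse_options(output)
    return [f"options {name} on" for name in REQUIRED if opts.get(name) != "on"]
```

An empty list means all three options are already on, and the cause of the error lies elsewhere.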
  3. With multiple disk shelves, remember to change the Shelf ID!
  • Problem:
    If several shelves share the same Shelf ID, the system reports errors at boot
Mar 08 02:53:24 [localhost:sas.shelf.conflict:warning]: At least two SAS disk shelves have the same disk shelf ID. 
Mar 08 02:53:24 [localhost:sas.shelf.conflict:warning]: At least two SAS disk shelves have the same disk shelf ID. 

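Shelf IDs normally shown as numbers such as 00 and 01 are automatically replaced with a name made of letters plus the serial number, which makes troubleshooting and configuration very inconvenient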

RAID Disk       Device                  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------       ------                  ------------- ---- ---- ---- ----- --------------    --------------
Spare disks for block checksum
spare           0a.SHFFG1634000181.0    0a    159 0   SA:A   0   SAS 10000 857000/1755136000 858483/1758174768 (not zeroed)
spare           0a.SHFFG1634000181.2    0a    159 2   SA:A   0   SAS 10000 857000/1755136000 858483/1758174768 (not zeroed)
spare           0a.SHFFG1634000181.3    0a    159 3   SA:A   0   SAS 10000 857000/1755136000 858483/1758174768 (not zeroed)
spare           0b.SHFFG1631000094.0    0b    0   0   SA:B   0   SAS 10000 857000/1755136000 858483/1758174768 (not zeroed)
spare           0b.SHFFG1631000094.1    0b    0   1   SA:B   0   SAS 10000 857000/1755136000 858483/1758174768 (not zeroed)
spare           0b.SHFFG1631000094.2    0b    0   2   SA:B   0   SAS 10000 857000/1755136000 858483/1758174768 (not zeroed)
spare           0b.SHFFG1631000094.5    0b    0   5   SA:B   0   SAS 10000 857000/1755136000 858483/1758174768 (not zeroed)
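  • Solution:
    Open the cover on the left side of the shelf; a small button there sets the Shelf ID. Note:
    1. Change the ID first, then connect the SAS cables and power on
    2. The ID has two digits; the first digit is typically used to distinguish shelf types or racks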
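A planned set of Shelf IDs can be checked for duplicates before anyone touches the hardware, and saved disk names can be scanned for the serial-number style that indicates a conflict. This is a rough sketch: the helper names and the shelf labels in the example are invented for illustration, and the rule "a non-numeric middle field in the disk name means serial-based naming" is inferred only from the two name styles shown in this article.

```python
from collections import Counter
from typing import Dict, List


def shelf_id_problems(planned: Dict[str, str]) -> List[str]:
    """Validate a plan of shelf label -> two-digit Shelf ID."""
    problems = []
    for name, shelf_id in planned.items():
        if not (len(shelf_id) == 2 and shelf_id.isascii() and shelf_id.isdigit()):
            problems.append(f"{name}: shelf ID {shelf_id!r} is not two digits")
    counts = Counter(planned.values())
    for shelf_id, n in sorted(counts.items()):
        if n > 1:
            names = sorted(k for k, v in planned.items() if v == shelf_id)
            problems.append(f"shelf ID {shelf_id} is used by {', '.join(names)}")
    return problems


def uses_serial_naming(device: str) -> bool:
    """True for names like 0b.SHFFG1631000094.0, False for 3a.31.16.

    Assumption: the middle field is the shelf, and a non-numeric value
    there means the serial number replaced the Shelf ID.
    """
    parts = device.split(".")
    return len(parts) == 3 and not parts[1].isdigit()
```

For example, `shelf_id_problems({"rack1-top": "10", "rack1-bottom": "10"})` reports that ID 10 is used twice, while distinct IDs such as 10 and 11 return an empty list.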