RVC Usage Guide (Part 2) - Cluster Management
https://mp.weixin.qq.com/s/R7e09yZrPaCaXJYnG_cF0w
This article covers the commands related to vSAN cluster management. These commands gather information about ESXi hosts and clusters, and provide essential information when you maintain a vSAN cluster or configure a stretched cluster:
· vsan.host_info
· vsan.cluster_info
· vsan.check_limits
· vsan.whatif_host_failures
· vsan.enter_maintenance_mode
· vsan.resync_dashboard
· vsan.proactive_rebalance
· vsan.proactive_rebalance_info
· vsan.host_evacuate_data
· vsan.host_exit_evacuation
· vsan.ondisk_upgrade
· vsan.v2_ondisk_upgrade
· vsan.upgrade_status
· vsan.stretchedcluster.config_witness
· vsan.stretchedcluster.remove_witness
· vsan.stretchedcluster.witness_info
To shorten the commands, I use marks for the cluster, the virtual machine, and the ESXi hosts in my environment. This allows ~cluster, ~vm, and ~esx to be used in the examples instead:
/localhost/DC> mark cluster ~/computers/VSAN-Cluster/
/localhost/DC> mark vm ~/vms/vma.virten.lab
/localhost/DC> mark esx ~/computers/VSAN-Cluster/hosts/esx1.virten.lab/
Cluster Management
vsan.host_info ~host
Displays information about a vSAN host, including:
- Cluster role (master, backup, or agent)
- Cluster UUID
- Node UUID
- Member UUIDs
- Auto claim (yes or no)
- Disk Mappings: disks that are claimed by VSAN
- FaultDomainInfo: information about the fault domain
- NetworkInfo: vmk adapters with VSAN traffic enabled
Example 1 - Display vSAN host information:
/localhost/DC> vsan.host_info ~esx
Fetching host info from vesx1.virten.lab (may take a moment) ...
Product: VMware ESXi 6.5.0 build-5310538
VSAN enabled: yes
Cluster info:
  Cluster role: master
  Cluster UUID: 52bcd891-92ce-2de3-1dfd-2a41a96dc99e
  Node UUID: 57c31851-3589-813e-71ca-005056bb0438
  Member UUIDs: ["57c31851-3589-813e-71ca-005056bb0438", "57c31b5a-3501-74e0-d719-005056bbaf1d", "57c31aee-2b9b-789e-ff4f-005056bbefe7"] (3)
Node evacuated: no
Storage info:
  Auto claim: no
  Disk Mappings:
    SSD: Local VMware Disk (mpx.vmhba1:C0:T1:L0) - 10 GB, v3
    MD: Local VMware Disk (mpx.vmhba1:C0:T2:L0) - 25 GB, v3
    MD: Local VMware Disk (mpx.vmhba1:C0:T3:L0) - 25 GB, v3
FaultDomainInfo:
  Not configured
NetworkInfo:
  Adapter: vmk1 (10.0.222.121)
vsan.cluster_info ~cluster
Displays information about all hosts in the vSAN cluster. This command provides the same information as vsan.host_info, for every host:
/localhost/DC> vsan.cluster_info ~cluster/
Fetching host info from vesx2.virten.lab (may take a moment) ...
Fetching host info from vesx3.virten.lab (may take a moment) ...
Fetching host info from vesx1.virten.lab (may take a moment) ...
Host: vesx2.virten.lab
  Product: VMware ESXi 6.5.0 build-5310538
  VSAN enabled: yes
  Cluster info:
    Cluster role: agent
    [...]
Host: vesx3.virten.lab
  Product: VMware ESXi 6.5.0 build-5310538
  VSAN enabled: yes
  Cluster info:
    Cluster role: backup
    [...]
Host: vesx1.virten.lab
  Product: VMware ESXi 6.5.0 build-5310538
  VSAN enabled: yes
  Cluster info:
    Cluster role: master
    [...]
No Fault Domains configured in this cluster
vsan.check_limits ~cluster|~host
Gathers and checks whether various VSAN-related counters (such as components or disk utilization) exceed their limits. The command can be used against a single ESXi host or a whole cluster.
Example 1 - Check VSAN thresholds on all hosts in a VSAN-enabled cluster:
/localhost/DC> vsan.check_limits ~cluster
Gathering stats from all hosts ...
Gathering disks info ...
+-------------------+-------------------+------------------------+
| Host              | RDT               | Disks                  |
+-------------------+-------------------+------------------------+
| esx1.virten.local | Assocs: 51/20000  | Components: 45/750     |
|                   | Sockets: 26/10000 | WDC_WD3000HLFS: 44%    |
|                   | Clients: 4        | WDC_WD3000HLFS: 32%    |
|                   | Owners: 11        | WDC_WD3000HLFS: 28%    |
|                   |                   | SanDisk_SDSSDP064G: 0% |
| esx2.virten.local | Assocs: 72/20000  | Components: 45/750     |
|                   | Sockets: 24/10000 | WDC_WD3000HLFS: 29%    |
|                   | Clients: 5        | WDC_WD3000HLFS: 31%    |
|                   | Owners: 12        | WDC_WD3000HLFS: 43%    |
|                   |                   | SanDisk_SDSSDP064G: 0% |
| esx3.virten.local | Assocs: 88/20000  | Components: 45/750     |
|                   | Sockets: 31/10000 | WDC_WD3000HLFS: 42%    |
|                   | Clients: 6        | WDC_WD3000HLFS: 44%    |
|                   | Owners: 9         | WDC_WD3000HLFS: 38%    |
|                   |                   | SanDisk_SDSSDP064G: 0% |
+-------------------+-------------------+------------------------+
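The kind of comparison this command performs can be sketched in a few lines of Python. This is a hedged illustration, not RVC's implementation: the counters and limits mirror the example output above (750 components per host, 20000 RDT associations), while the 80% warning ratio and all names are my own assumptions.

```python
# Sketch of a vsan.check_limits-style threshold check (illustration only).
# Limits mirror the example above (750 components, 20000 RDT associations);
# the 80% warning ratio is an assumed value, not something RVC defines.

def check_limits(hosts, warn_ratio=0.8):
    """Return a list of 'host: counter used/limit' strings above the ratio."""
    alerts = []
    for host, counters in hosts.items():
        for name, (used, limit) in counters.items():
            if used / limit > warn_ratio:
                alerts.append(f"{host}: {name} {used}/{limit}")
    return alerts

# esx2 is deliberately close to its component limit to trigger an alert.
hosts = {
    "esx1.virten.local": {"components": (45, 750), "assocs": (51, 20000)},
    "esx2.virten.local": {"components": (700, 750), "assocs": (72, 20000)},
}
print(check_limits(hosts))  # → ['esx2.virten.local: components 700/750']
```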
vsan.whatif_host_failures [-n|-s] ~cluster
Simulates how host failures would affect VSAN resource usage. The command shows the current VSAN disk usage as well as the calculated disk usage after a host failure. The simulation assumes that all objects are restored to full storage-policy compliance by building new mirrors of the existing data.
- -n, --num-host-failures-to-simulate=: Number of host failures to simulate (default: 1)
- -s, --show-current-usage-per-host: Show current resources used per host
Example 1 - Simulate 1 host failure:
/localhost/DC> vsan.whatif_host_failures ~cluster
Simulating 1 host failures:
+-----------------+-----------------------------+-----------------------------------+
| Resource        | Usage right now             | Usage after failure/re-protection |
+-----------------+-----------------------------+-----------------------------------+
| HDD capacity    | 7% used (1128.55 GB free)   | 15% used (477.05 GB free)         |
| Components      | 2% used (2025 available)    | 3% used (1275 available)          |
| RC reservations | 0% used (90.47 GB free)     | 0% used (48.73 GB free)           |
+-----------------+-----------------------------+-----------------------------------+
Example 2 - Show current utilization per host and simulate 1 host failure:
/localhost/DC> vsan.whatif_host_failures -s ~cluster
Current utilization of hosts:
+------------+---------+--------------+------+----------+----------------+--------------+
|            |         | HDD Capacity |      |          | Components     | SSD Capacity |
| Host       | NumHDDs | Total        | Used | Reserved | Used           | Reserved     |
+------------+---------+--------------+------+----------+----------------+--------------+
| 10.0.0.1   | 2       | 299.50 GB    | 6 %  | 5 %      | 4/562 (1 %)    | 0 %          |
| 10.0.0.2   | 2       | 299.50 GB    | 10 % | 9 %      | 11/562 (2 %)   | 0 %          |
| 10.0.0.3   | 2       | 299.50 GB    | 10 % | 9 %      | 6/562 (1 %)    | 0 %          |
| 10.0.0.4   | 2       | 299.50 GB    | 14 % | 13 %     | 7/562 (1 %)    | 0 %          |
+------------+---------+--------------+------+----------+----------------+--------------+
Simulating 1 host failures:
+-----------------+-----------------------------+-----------------------------------+
| Resource        | Usage right now             | Usage after failure/re-protection |
+-----------------+-----------------------------+-----------------------------------+
| HDD capacity    | 10% used (1079.73 GB free)  | 13% used (780.23 GB free)         |
| Components      | 1% used (2220 available)    | 2% used (1658 available)          |
| RC reservations | 0% used (55.99 GB free)     | 0% used (41.99 GB free)           |
+-----------------+-----------------------------+-----------------------------------+
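The capacity arithmetic behind the simulation can be sketched as follows. This is a rough, hedged model and not RVC's actual algorithm: it simply assumes the failed host's raw capacity disappears while the stored data (including replicas) is rebuilt on the survivors, and all function and variable names are invented for illustration. With four equal 299.50 GB hosts and roughly 120 GB stored, it reproduces the 10% to 13% jump seen in Example 2.

```python
# Hedged sketch of vsan.whatif_host_failures' capacity math (not RVC's
# exact algorithm): after `failures` hosts fail, the same amount of stored
# data must fit on the remaining raw capacity to restore policy compliance.

def whatif_usage(host_capacities_gb, used_gb, failures=1):
    total = sum(host_capacities_gb)
    # Worst case: assume the largest hosts fail and their capacity is lost.
    lost = sum(sorted(host_capacities_gb, reverse=True)[:failures])
    usage_now = used_gb / total
    usage_after = used_gb / (total - lost)  # same data, less raw capacity
    return usage_now, usage_after

# Four equal hosts as in Example 2 above; used_gb chosen to give 10% usage.
now, after = whatif_usage([299.50] * 4, used_gb=119.8, failures=1)
print(f"now: {now:.0%}, after re-protection: {after:.0%}")
# prints: now: 10%, after re-protection: 13%
```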
vsan.enter_maintenance_mode [-t|-e|-n|-v] ~host
Puts a host into maintenance mode. This command is VSAN-aware and can migrate VSAN data to other hosts, just like the vSphere Web Client. With DRS enabled, running virtual machines are migrated as well.
- -t, --timeout=: Set a timeout for the process to complete. When the host cannot enter maintenance mode within X seconds, the process is canceled. (default: 0)
- -e, --evacuate-powered-off-vms: Move powered-off virtual machines to other hosts in the cluster.
- -n, --no-wait: Return immediately without waiting for the task to complete.
- -v, --vsan-mode=: Action to take for VSAN components. Options:
  - ensureObjectAccessibility (default)
  - evacuateAllData
  - noAction
Example 1 - Put the host into maintenance mode without evacuating any VSAN components (fast, but reduces redundancy):
/localhost/DC> vsan.enter_maintenance_mode ~esx
EnterMaintenanceMode esx1.virten.local: success
Example 2 - Put the host into maintenance mode and evacuate all VSAN components to other hosts in the cluster:
/localhost/DC> vsan.enter_maintenance_mode ~esx -v evacuateAllData
EnterMaintenanceMode esx1.virten.local: success
Example 3 - Put the host into maintenance mode and evacuate all VSAN components to other hosts in the cluster. Cancel the process if it takes longer than 10 minutes:
/localhost/DC> vsan.enter_maintenance_mode ~esx -v evacuateAllData -t 600
EnterMaintenanceMode esx1.virten.local: success
Example 4 - Put the host into maintenance mode without tracking the task (batch mode):
/localhost/DC> vsan.enter_maintenance_mode ~esx -n
/localhost/DC>
vsan.resync_dashboard [-r] ~cluster
If a host fails or is put into maintenance mode, check the resynchronization status here. The command can be run once, or with a refresh interval.
- -r, --refresh-rate=: Refresh interval (in seconds). Default is no refresh
Example 1 - Resync dashboard:
/localhost/DC> vsan.resync_dashboard ~cluster
Querying all VMs on VSAN ...
Querying all objects in the system from esx1.virten.lab ...
Got all the info, computing table ...
+-----------+-----------------+---------------+
| VM/Object | Syncing objects | Bytes to sync |
+-----------+-----------------+---------------+
+-----------+-----------------+---------------+
| Total     | 0               | 0.00 GB       |
+-----------+-----------------+---------------+
Example 2 - Resync dashboard after a host was put into maintenance mode, refreshing every 10 seconds:
/localhost/DC> vsan.resync_dashboard ~cluster --refresh-rate 10
Querying all VMs on VSAN ...
Querying all objects in the system from esx1.virten.local ...
Got all the info, computing table ...
+-----------+-----------------+---------------+
| VM/Object | Syncing objects | Bytes to sync |
+-----------+-----------------+---------------+
+-----------+-----------------+---------------+
| Total     | 0               | 0.00 GB       |
+-----------+-----------------+---------------+
Querying all objects in the system from esx1.virten.local ...
Got all the info, computing table ...
+-----------------------------------------------------------------+-----------------+---------------+
| VM/Object                                                       | Syncing objects | Bytes to sync |
+-----------------------------------------------------------------+-----------------+---------------+
| vm1                                                             | 1               |               |
|    [vsanDatastore] 5078bd52-2977-8cf9-107c-00505687439c/vm1.vmx |                 | 0.17 GB       |
+-----------------------------------------------------------------+-----------------+---------------+
| Total                                                           | 1               | 0.17 GB       |
+-----------------------------------------------------------------+-----------------+---------------+
Querying all objects in the system from esx1.virten.local ...
Got all the info, computing table ...
+--------------------------------------------------------------------+-----------------+---------------+
| VM/Object                                                          | Syncing objects | Bytes to sync |
+--------------------------------------------------------------------+-----------------+---------------+
| vm1                                                                | 1               |               |
|    [vsanDatastore] 5078bd52-2977-8cf9-107c-00505687439c/vm1.vmx    |                 | 0.34 GB       |
| debian                                                             | 1               |               |
|    [vsanDatastore] 6978bd52-4d92-05ed-dad2-005056871792/debian.vmx |                 | 0.35 GB       |
+--------------------------------------------------------------------+-----------------+---------------+
| Total                                                              | 2               | 0.69 GB       |
+--------------------------------------------------------------------+-----------------+---------------+
[...]
vsan.proactive_rebalance [-s|-t|-v|-i|-r|-o] ~cluster
Starts proactive rebalancing: looks at the distribution of components in the cluster and proactively begins to balance the component distribution across ESXi hosts.
- -s, --start: Start proactive rebalance
- -t, --time-span=: How long the proactive rebalance lasts, in seconds. Only valid together with --start
- -v, --variance-threshold=: A disk qualifies for proactive rebalance only when its used_capacity/disk_capacity exceeds that of the least-full disk in the cluster by more than this threshold. Only valid together with --start
- -i, --time-threshold=: A disk is involved in proactive rebalance only when the variance threshold is continuously exceeded for this many seconds. Only valid together with --start
- -r, --rate-threshold=: How many MB of data may be moved per hour for each node. Only valid together with --start
- -o, --stop: Stop proactive rebalance
Example 1 - Start proactive rebalancing:
/localhost/DC> vsan.proactive_rebalance -s ~cluster/
Processing Virtual SAN proactive rebalance on host vesx2.virten.lab ...
Processing Virtual SAN proactive rebalance on host vesx3.virten.lab ...
Processing Virtual SAN proactive rebalance on host vesx1.virten.lab ...
Proactive rebalance has been started!
Example 2 - Stop proactive rebalancing:
/localhost/DC> vsan.proactive_rebalance -o ~cluster/
Processing Virtual SAN proactive rebalance on host vesx2.virten.lab ...
Processing Virtual SAN proactive rebalance on host vesx1.virten.lab ...
Processing Virtual SAN proactive rebalance on host vesx3.virten.lab ...
Proactive rebalance has been stopped!
Example 3 - Start proactive rebalancing, but limit data movement to 100 MB per hour per node:
/localhost/DC> vsan.proactive_rebalance -s -r 100 ~cluster/
Processing Virtual SAN proactive rebalance on host vesx2.virten.lab ...
Processing Virtual SAN proactive rebalance on host vesx1.virten.lab ...
Processing Virtual SAN proactive rebalance on host vesx3.virten.lab ...
Proactive rebalance has been started!
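The --variance-threshold rule can be illustrated with a short sketch. This is my own hedged reading of the option's help text, not RVC's code: a disk qualifies for rebalancing only when its fullness (used_capacity/disk_capacity) exceeds that of the least-full disk in the cluster by more than the threshold (30% by default). Disk names and values below are invented.

```python
# Hedged sketch of the proactive-rebalance qualification rule: a disk is a
# candidate only if its fullness exceeds the cluster's least-full disk by
# more than the variance threshold (default 30%). Illustration only.

def rebalance_candidates(disk_fullness, variance_threshold=0.30):
    """disk_fullness maps disk name -> used_capacity / disk_capacity."""
    least_full = min(disk_fullness.values())
    return [disk for disk, fullness in disk_fullness.items()
            if fullness - least_full > variance_threshold]

# Only naa.01 is more than 30 points fuller than the least-full disk.
disks = {"naa.01": 0.82, "naa.02": 0.45, "naa.03": 0.40}
print(rebalance_candidates(disks))  # → ['naa.01']
```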
vsan.proactive_rebalance_info ~cluster
Displays information about proactive rebalancing, including disk usage statistics and whether a proactive rebalance is currently running.
Example 1 - Show proactive rebalance status:
/localhost/DC> vsan.proactive_rebalance_info ~cluster
Retrieving proactive rebalance information from host vesx2.virten.lab ...
Retrieving proactive rebalance information from host vesx3.virten.lab ...
Retrieving proactive rebalance information from host vesx1.virten.lab ...
Proactive rebalance start: 2016-11-03 11:12:04 UTC
Proactive rebalance stop: 2016-11-04 11:12:06 UTC
Max usage difference triggering rebalancing: 30.00%
Average disk usage: 2.00%
Maximum disk usage: 3.00% (3.00% above minimum disk usage)
Imbalance index: 2.00%
No disk detected to be rebalanced
vsan.host_evacuate_data [-a|-n|-t] ~host
This command performs the data-evacuation part of entering maintenance mode, but without any vMotion tasks. It evacuates data from the host and makes sure VM objects are rebuilt elsewhere in the cluster to maintain full redundancy.
- -a, --allow-reduced-redundancy: Removes the need for nodes' worth of free space by allowing reduced redundancy
- -n, --no-action: Do not evacuate data during host evacuation
- -t, --time-out=: Timeout for single-node evacuation (default: 0)
Example 1 - Evacuate data, allowing reduced redundancy:
/localhost/DC> vsan.host_evacuate_data -a ~esx/
Data evacuation mode ensureObjectAccessibility
Data evacuation time out 0
Start to evacuate data for host vesx1.virten.lab
EvacuateVsanNode vesx1.virten.lab: success
Done evacuate data for host vesx1.virten.lab
Hosts remain evacuation state until explicily exit evacuation through command vsan.host_exit_evacuation
vsan.host_exit_evacuation ~host
This command exits the host evacuation state and allows the disks on the host to be reused for virtual machine objects.
Example 1 - Exit the host evacuation state:
/localhost/DC> vsan.host_exit_evacuation ~esx/
Start to exit evacuation for host vesx1.virten.lab
RecommissionVsanNode vesx1.virten.lab: success
Done exit evacuation for host vesx1.virten.lab
vsan.ondisk_upgrade [-a|-f] ~cluster
This command performs a rolling upgrade: it runs precondition checks on each ESXi host in the cluster in turn and upgrades the on-disk format to the latest version, running several validation checks before components are evacuated from each disk group.
When the cluster does not have enough resources to accommodate the disk evacuation, the upgrade can be performed with reduced redundancy allowed.
- -a, --allow-reduced-redundancy: Removes the need for one disk group's worth of free space by allowing reduced redundancy during the disk upgrade
- -f, --force: Automatically answer all confirmation questions with 'proceed'
Example 1 - Upgrade VSAN to the latest on-disk format:
/localhost/DC> vsan.ondisk_upgrade ~cluster
+------------------+-----------+-------------+----------------+----------------+------------------+----------------+----------------+
| Host             | State     | ESX version | v1 Disk groups | v2 Disk groups | v2.5 Disk groups | v3 Disk groups | v5 Disk groups |
+------------------+-----------+-------------+----------------+----------------+------------------+----------------+----------------+
| vesx1.virten.lab | connected | 6.5.0       | 0              | 0              | 0                | 1              | 0              |
| vesx2.virten.lab | connected | 6.5.0       | 0              | 0              | 0                | 1              | 0              |
| vesx3.virten.lab | connected | 6.5.0       | 0              | 0              | 0                | 1              | 0              |
+------------------+-----------+-------------+----------------+----------------+------------------+----------------+----------------+
Running precondition checks ...
Passed precondition checks
Target file system version: v5
Disk mapping decommission mode: evacuateAllData
Check cluster status for disk format conversion.
Update vSAN system settings.
No disk conversion performed, all mounted disk groups on host are compliant
Check cluster status for disk format conversion.
Update vSAN system settings.
No disk conversion performed, all mounted disk groups on host are compliant
Check cluster status for disk format conversion.
Update vSAN system settings.
No disk conversion performed, all mounted disk groups on host are compliant
Disk format conversion is done.
Check existing objects on vSAN.
Object conversion is done.
Waiting for upgrade task to finish
Done vSAN upgrade
vsan.upgrade_status [-r] ~cluster
Displays the number of objects upgraded during the upgrade process.
- -r, --refresh-rate=: Refresh interval in seconds
Example 1 - Display the upgrade status with a 60-second refresh rate:
/localhost/DC> vsan.upgrade_status -r 60 ~cluster
Showing upgrade status every 60 seconds. Ctrl + c to stop.
No upgrade in progress
0 objects in which will need realignment process
0 objects with new alignment
0 objects ready for v3 features
5 objects ready for v5 features
vsan.stretchedcluster.config_witness ~cluster ~witness_host ~preferred_fault_domain
Configures a witness host to form a vSAN stretched cluster. The cluster, the witness host (path to the host object in RVC), and the preferred fault domain (a label) are required. Note that this command neither creates the ESXi hosts nor assigns them to fault domains. Fault domains can be set from RVC with the esxcli vsan faultdomain set command.
- cluster: A cluster with vSAN enabled
- witness_host: Witness host for the stretched cluster
- preferred_fault_domain: Preferred fault domain for the witness host
Example 1 - Configure the witness host and assign fault domains:
/localhost/DC> vsan.stretchedcluster.config_witness ~cluster computers/wit.virten.lab/host/ Hamburg
Configuring witness host for the cluster...
Task: Add witness host
New progress: 1%
Task result: success
/localhost/DC> esxcli ~esx vsan faultdomain set -f "Hamburg"
/localhost/DC> esxcli ~esx vsan faultdomain get
faultDomainId: "35d7df6e-d3d9-3be2-927d-14acc5f1fc9a"
vsan.stretchedcluster.witness_info ~cluster
Displays witness host information for a vSAN stretched cluster.
Example 1 - Display witness host information:
/localhost/DC> vsan.stretchedcluster.witness_info ~cluster
Found witness host for vSAN stretched cluster.
+------------------------+--------------------------------------+
| Stretched Cluster      | vSAN65                               |
+------------------------+--------------------------------------+
| Witness Host Name      | wit.virten.lab                       |
| Witness Host UUID      | 58ffd0d6-4edd-3b92-636e-005056b98a68 |
| Preferred Fault Domain | Hamburg                              |
| Unicast Agent Address  | 10.100.0.4                           |
+------------------------+--------------------------------------+
vsan.stretchedcluster.remove_witness ~cluster
Removes the witness host from a vSAN stretched cluster.
/localhost/DC> vsan.stretchedcluster.remove_witness ~cluster/
Found witness host for vSAN stretched cluster.
+------------------------+--------------------------------------+
| Stretched Cluster      | vSAN65                               |
+------------------------+--------------------------------------+
| Witness Host Name      | wit.virten.lab                       |
| Witness Host UUID      | 58ffd0d6-4edd-3b92-636e-005056b98a68 |
| Preferred Fault Domain | Hamburg                              |
| Unicast Agent Address  | 10.100.0.4                           |
+------------------------+--------------------------------------+
Removing witness host from the cluster...
Task: Remove witness host
New progress: 1%
New progress: 30%
Task result: success