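Background:
When replacing an OSD in a Ceph cluster, the physical location of the failed disk is often unknown. How do you find the serial number of the failed disk and then the disk itself? In the industry this is commonly called "lighting the LED".
Is there a way to pin down its location?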
Methods:
Method 1: the StorCLI tool. Download:
Method 2: the simplest option is to light the LED with ledctl:
ledctl locate=/dev/xxx
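The ledctl call above turns the locate LED on; it also needs to be turned off once the disk has been swapped. A minimal sketch, assuming the ledmon package provides `ledctl` and that the backplane supports it. `locate_led` is a hypothetical helper name, and the `LEDCTL` variable is only there so the binary can be overridden:

```shell
#!/bin/bash
# Path to ledctl; override LEDCTL to point at a different binary (assumption).
LEDCTL=${LEDCTL:-ledctl}

# locate_led on|off /dev/sdX
# Hypothetical wrapper: turn the locate LED of one block device on or off.
locate_led() {
    local state=$1 dev=$2
    case $state in
        on)  "$LEDCTL" "locate=$dev" ;;
        off) "$LEDCTL" "locate_off=$dev" ;;
        *)   echo "usage: locate_led on|off /dev/sdX" >&2; return 2 ;;
    esac
}
```

Typical use would be `locate_led on /dev/sdj` before pulling the disk and `locate_led off /dev/sdj` afterwards.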
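If the simple ledctl approach fails, fall back to Method 1.
The commands below find the SN and the slot:
Step 1: find the serial number. Step 2: use the SN to find the slot. Step 3: light the LED on that slot with storcli, or simply count the drive bays.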
[ Linux]# /opt/MegaRAID/storcli/storcli64 /c0/eall/sall show all | grep -5 S477NW0K605180Y
S.M.A.R.T alert flagged by drive = No
Drive /c0/e65/s14 Device attributes :
===================================
SN = S477NW0K605180Y
Manufacturer Id = ATA
Model Number = Samsung SSD 860 DCT 960GB
NAND Vendor = NA
WWN = 5002538e7004c956
Firmware Revision = HXT70B6Q
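Reading the slot out of the `grep -5` context by eye works for one disk. For scripting, the same lookup can be sketched with awk, assuming the output layout shown above (a `Drive /c0/e65/s14 ...` header followed by an `SN = ...` line). `slot_for_sn` is a hypothetical helper name, not part of storcli:

```shell
#!/bin/bash
# slot_for_sn SERIAL
# Reads "storcli ... show all" text on stdin and prints the controller path
# (for example /c0/e65/s14) of the drive whose "SN = ..." line equals SERIAL.
slot_for_sn() {
    awk -v sn="$1" '
        # Remember the most recent "Drive /cX/..." header line.
        $1 == "Drive" && $2 ~ /^\/c[0-9]+\// { slot = $2 }
        # An exact serial match belongs to the last header seen.
        $1 == "SN" && $2 == "=" && $3 == sn { print slot; exit }
    '
}
```

It would be used as `/opt/MegaRAID/storcli/storcli64 /c0/eall/sall show all | slot_for_sn S477NW0K605180Y`. Matching the serial exactly avoids a false hit when one serial number is a prefix of another.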
[root@gz-ceph-52-235 Linux]# smartctl -a /dev/sdj | grep -5 Serial
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-3.10.0-327.el7.x86_64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: Samsung SSD 860 DCT 960GB
Serial Number: S477NW0K605180Y
LU WWN Device Id: 5 002538 e7004c956
Firmware Version: HXT70B6Q
User Capacity: 960,197,124,096 bytes [960 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
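Only the `Serial Number:` line of this output is needed for the lookup. A small sketch that extracts it, assuming the `Label: value` layout shown above. `serial_from_smartctl` is a hypothetical helper name:

```shell
#!/bin/bash
# serial_from_smartctl
# Reads smartctl information output on stdin and prints the serial number.
# The label is compared case-insensitively and the value is trimmed.
serial_from_smartctl() {
    awk -F: 'tolower($1) == "serial number" {
        gsub(/^[ \t]+|[ \t]+$/, "", $2)   # strip padding around the value
        print $2
        exit
    }'
}
```

It would be used as `smartctl -i /dev/sdj | serial_from_smartctl`. Matching on the whole label rather than on the word "Serial" keeps unrelated lines that merely mention a serial from being picked up.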
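#!/bin/bash
# Print device name, serial number, and controller slot for every disk.
STORCLI=/opt/MegaRAID/storcli/storcli64
# Query the controller once instead of once per disk.
all_drives=$("$STORCLI" /c0/eall/sall show all)
# lsblk -d: whole disks only, -n: no header line.
for dev in $(lsblk -dn -o NAME,TYPE | awk '$2 == "disk" {print $1}')
do
    sn=$(smartctl -i "/dev/$dev" | awk '/Serial/ {print $3}')
    # Skip devices that report no serial number.
    [ -z "$sn" ] && continue
    # The "Drive /c0/eX/sY Device attributes" line sits just above the SN line.
    slot=$(echo "$all_drives" | grep -B5 -- "$sn" | awk '/Drive \/c0/ {print $2}')
    echo "$dev $sn $slot"
    # Uncomment to light the locate LED for this slot:
    # "$STORCLI" "$slot" start locate
done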
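The commented-out `start locate` line in the script lights the LED, and storcli has a matching `stop locate` to turn it off after the swap. A hedged sketch that wraps both and rejects an empty or malformed slot before calling the controller. `locate_slot` is a hypothetical helper name, and the `STORCLI` default is the path used in this post:

```shell
#!/bin/bash
# Path to storcli64 as used in this post; override STORCLI if it lives elsewhere.
STORCLI=${STORCLI:-/opt/MegaRAID/storcli/storcli64}

# locate_slot start|stop /cX/eY/sZ
# Hypothetical wrapper around "storcli <slot> start|stop locate".
locate_slot() {
    local action=$1 slot=$2
    case $action in
        start|stop) ;;
        *) echo "usage: locate_slot start|stop /cX/eY/sZ" >&2; return 2 ;;
    esac
    # Refuse to run when the slot lookup came back empty or malformed.
    case $slot in
        /c[0-9]*/s[0-9]*) ;;
        *) echo "bad slot: '$slot'" >&2; return 2 ;;
    esac
    "$STORCLI" "$slot" "$action" locate
}
```

For the disk in the example above this would be `locate_slot start /c0/e65/s14`, followed by `locate_slot stop /c0/e65/s14` once the replacement is done.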
Note: the author also has notes on replacing an OSD; consult them if needed: