A Filesystem Repair Case on SUSE 11

Operating system: SUSE Linux 11

Filesystem: ext3

 

Symptoms

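On day X an alert came in. Writes to the filesystem on /dev/sda1 were failing because it had gone read-only, and the IP storage was reporting alarms as well. /img was unmounted with umount, but after that it could not be mounted again.

fdisk -l reported I/O errors. After the application server was rebooted, the filesystem still could not be mounted and the I/O errors persisted.

The IP storage was still showing alarm messages. Once the storage vendor had fixed the storage problem, the application server was rebooted again, but the filesystem still could not be mounted.

The mount command hung for a long time without any response, yet /var/log/messages showed that the system was still working through blocks: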

 

Nov 2 06:04:53 linux11 kernel: [128293.578670] Buffer I/O error on device sda1, logical block 483584660
Nov 2 06:04:53 linux11 kernel: [128293.578672] lost page write due to I/O error on sda1
Nov 2 06:05:01 linux11 /usr/sbin/cron[15283]: (root) CMD ( /opt/hp/hp-health/bin/check-for-restart-requests)
Nov 2 06:05:53 linux11 kernel: [128353.584893] sd 9:0:0:0: [sda] Unhandled sense code
Nov 2 06:05:53 linux11 kernel: [128353.584898] sd 9:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Nov 2 06:05:53 linux11 kernel: [128353.584901] sd 9:0:0:0: [sda] Sense Key : Medium Error [current]
Nov 2 06:05:53 linux11 kernel: [128353.584905] sd 9:0:0:0: [sda] Add. Sense: Medium not present
Nov 2 06:05:53 linux11 kernel: [128353.584910] sd 9:0:0:0: [sda] CDB: Write(10): 2a 00 e6 97 59 5f 00 00 08 00
Nov 2 06:05:53 linux11 kernel: [128353.584916] end_request: I/O error, dev sda, sector 3868678495
Nov 2 06:05:53 linux11 kernel: [128353.584920] Buffer I/O error on device sda1, logical block 483584804
Nov 2 06:05:53 linux11 kernel: [128353.584922] lost page write due to I/O error on sda1
Nov 2 06:05:53 linux11 kernel: [128353.599875] sd 9:0:0:0: [sda] Unhandled sense code
Nov 2 06:05:53 linux11 kernel: [128353.599878] sd 9:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Nov 2 06:05:53 linux11 kernel: [128353.599880] sd 9:0:0:0: [sda] Sense Key : Medium Error [current]
Nov 2 06:05:53 linux11 kernel: [128353.599883] sd 9:0:0:0: [sda] Add. Sense: Medium not present
Nov 2 06:05:53 linux11 kernel: [128353.599886] sd 9:0:0:0: [sda] CDB: Write(10): 2a 00 e6 97 5f bf 00 00 08 00
Nov 2 06:05:53 linux11 kernel: [128353.599890] end_request: I/O error, dev sda, sector 3868680127
Nov 2 06:05:53 linux11 kernel: [128353.599893] Buffer I/O error on device sda1, logical block 483585008
Nov 2 06:05:53 linux11 kernel: [128353.599895] lost page write due to I/O error on sda1
Nov 2 06:05:53 linux11 kernel: [128353.600872] sd 9:0:0:0: [sda] Unhandled sense code
Nov 2 06:05:53 linux11 kernel: [128353.600875] sd 9:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Nov 2 06:05:53 linux11 kernel: [128353.600877] sd 9:0:0:0: [sda] Sense Key : Medium Error [current]
Nov 2 06:05:53 linux11 kernel: [128353.600879] sd 9:0:0:0: [sda] Add. Sense: Medium not present
Nov 2 06:05:53 linux11 kernel: [128353.600882] sd 9:0:0:0: [sda] CDB: Write(10): 2a 00 e6 97 62 47 00 00 08 00
Nov 2 06:05:53 linux11 kernel: [128353.600887] end_request: I/O error, dev sda, sector 3868680775

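The kernel messages above (highlighted in red in the original post) show that the system was still working. The engineer advised waiting, and after about 20 hours the mount command finally returned: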

linux11:~ #mount /dev/sda1 /mnt/
mount: wrong fs type, bad option, bad superblock on /dev/sda1,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so

linux11:~ #dmesg|tail -50
[138764.297170] sd 9:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[138764.297172] sd 9:0:0:0: [sda] Sense Key : Medium Error [current]
[138764.297175] sd 9:0:0:0: [sda] Add. Sense: Medium not present
[138764.297178] sd 9:0:0:0: [sda] CDB: Write(10): 2a 00 f2 1f f5 b7 00 00 10 00
[138764.297182] end_request: I/O error, dev sda, sector 4062180791
[138764.312193] sd 9:0:0:0: [sda] Unhandled sense code
[138764.312197] sd 9:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[138764.312199] sd 9:0:0:0: [sda] Sense Key : Medium Error [current]
[138764.312202] sd 9:0:0:0: [sda] Add. Sense: Medium not present
[138764.312204] sd 9:0:0:0: [sda] CDB: Write(10): 2a 00 f2 20 37 9f 00 00 08 00
[138764.312209] end_request: I/O error, dev sda, sector 4062197663
[138764.312224] sd 9:0:0:0: [sda] Unhandled sense code
[138764.312226] sd 9:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[138764.312228] sd 9:0:0:0: [sda] Sense Key : Medium Error [current]
[138764.312230] sd 9:0:0:0: [sda] Add. Sense: Medium not present
[138764.312233] sd 9:0:0:0: [sda] CDB: Write(10): 2a 00 f2 20 38 b7 00 00 08 00
[138764.312237] end_request: I/O error, dev sda, sector 4062197943
[138764.312242] sd 9:0:0:0: [sda] Unhandled sense code
[138764.312243] sd 9:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[138764.312245] sd 9:0:0:0: [sda] Sense Key : Medium Error [current]
[138764.312247] sd 9:0:0:0: [sda] Add. Sense: Medium not present
[138764.312250] sd 9:0:0:0: [sda] CDB: Write(10): 2a 00 f2 20 7f 87 00 00 08 00
[138764.312254] end_request: I/O error, dev sda, sector 4062216071
[138824.286688] sd 9:0:0:0: [sda] Unhandled sense code
[138824.286692] sd 9:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[138824.286696] sd 9:0:0:0: [sda] Sense Key : Medium Error [current]
[138824.286699] sd 9:0:0:0: [sda] Add. Sense: Medium not present
[138824.286704] sd 9:0:0:0: [sda] CDB: Write(10): 2a 00 f2 20 f2 bf 00 00 08 00
[138824.286710] end_request: I/O error, dev sda, sector 4062245567
[138824.286714] __ratelimit: 8 callbacks suppressed
[138824.286718] Buffer I/O error on device sda1, logical block 507780688
[138824.286719] lost page write due to I/O error on sda1
[138824.324706] sd 9:0:0:0: [sda] Unhandled sense code
[138824.324709] sd 9:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[138824.324711] sd 9:0:0:0: [sda] Sense Key : Medium Error [current]
[138824.324714] sd 9:0:0:0: [sda] Add. Sense: Medium not present
[138824.324717] sd 9:0:0:0: [sda] CDB: Write(10): 2a 00 f2 20 fa 1f 00 00 08 00
[138824.324722] end_request: I/O error, dev sda, sector 4062247455
[138824.324726] Buffer I/O error on device sda1, logical block 507780924
[138824.324727] lost page write due to I/O error on sda1
[138824.324741] sd 9:0:0:0: [sda] Unhandled sense code
[138824.324742] sd 9:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[138824.324744] sd 9:0:0:0: [sda] Sense Key : Medium Error [current]
[138824.324747] sd 9:0:0:0: [sda] Add. Sense: Medium not present
[138824.324749] sd 9:0:0:0: [sda] CDB: Write(10): 2a 00 f2 2e a1 17 00 00 08 00
[138824.324754] end_request: I/O error, dev sda, sector 4063142167
[138824.324756] Buffer I/O error on device sda1, logical block 507892763
[138824.324758] lost page write due to I/O error on sda1
[138824.324773] JBD: recovery failed
[138824.324774] EXT3-fs: error loading journal.
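
The log lines above can be cross-checked against each other. In a Write(10) CDB, bytes 2 to 5 hold the target sector (LBA) in big-endian order, which should match the "end_request ... sector" line, and the "logical block" reported for sda1 follows from the partition offset. The sketch below assumes that sda1 starts at sector 63 (typical for a partition starting at cylinder 1 with 63 sectors per track, but not stated in this post) and that one 4096-byte filesystem block covers 8 sectors of 512 bytes. The function names are made up for illustration.

```shell
#!/bin/sh
# Decode the LBA from a Write(10) CDB.
# Arguments: the ten CDB bytes in hex, e.g. 2a 00 f2 2e a1 17 00 00 08 00
# Bytes 2..5 (arguments 3..6) are the big-endian sector number.
cdb_lba() {
    echo $(( 0x$3$4$5$6 ))
}

# Map an absolute disk sector to a filesystem block inside a partition.
# $1 = absolute sector, $2 = partition start sector (assumed 63 here),
# $3 = sectors per filesystem block (4096 / 512 = 8)
sector_to_fs_block() {
    echo $(( ($1 - $2) / $3 ))
}

# Last failing write before "JBD: recovery failed" in the dmesg output.
lba=$(cdb_lba 2a 00 f2 2e a1 17 00 00 08 00)
echo "sector: $lba"
echo "logical block on sda1: $(sector_to_fs_block "$lba" 63 8)"
```

With these assumptions the script reproduces sector 4063142167 and logical block 507892763 from the log, which suggests the failing writes are ordinary 4 KiB block writes (transfer length 00 08, i.e. 8 sectors) inside sda1.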

 

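Repair plan

The engineer's initial diagnosis was a damaged superblock, and a repair plan was drawn up:

1. Use dd to make a raw copy of the original /dev/sda1 partition onto another partition. The original partition is 2T, so a slightly larger volume was allocated on the IP storage and attached to the application server to hold the backup.

2. Once the data is backed up, repair the filesystem with fsck.ext3.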

 

I. Data backup

Create the new partition /dev/sdb1

linux11:/var/log #fdisk /dev/sdb

The number of cylinders for this disk is set to 267075.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
(e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-267075, default 1):
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-267075, default 267075):
Using default value 267075

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
linux11:/var/log #
linux11:/var/log #
linux11:/var/log #
linux11:/var/log # fdisk -l

Disk /dev/cciss/c0d0: 300.0 GB, 299966445568 bytes
255 heads, 63 sectors/track, 36468 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x000bf615

Device Boot Start End Blocks Id System
/dev/cciss/c0d0p1 * 1 38 305203+ 83 Linux
/dev/cciss/c0d0p2 39 4215 33551752+ 82 Linux swap / Solaris
/dev/cciss/c0d0p3 4216 36468 259072222+ 83 Linux

Disk /dev/sda: 2097.2 GB, 2097152000000 bytes
255 heads, 63 sectors/track, 254964 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x000a13a0

Device Boot Start End Blocks Id System
/dev/sda1 1 254964 2047998298+ 83 Linux

Disk /dev/sdb: 2196.8 GB, 2196766720000 bytes
255 heads, 63 sectors/track, 267075 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x24828d3f

Device Boot Start End Blocks Id System
/dev/sdb1 1 267075 2145279906 83 Linux

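Note: at this point an attempt was made to format the new partition with mkfs. Because the filesystem is 2T, formatting took a very long time and the operation was eventually cancelled. Be aware that kill does not end it quickly either; the only option is to wait. The storage space was then allocated again and partitioned, this time without formatting.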
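
Formatting the target is not needed for this kind of backup, because dd copies the partition block by block and overwrites whatever filesystem was there. What does matter is that the target is at least as large as the source. A minimal sketch of that check, using the block counts from the fdisk listing above (fdisk counts 1 KiB blocks, and the trailing "+" marks an extra 512-byte sector); the helper name fits is an assumption for illustration:

```shell
#!/bin/sh
# Succeed when a destination of $2 bytes can hold a raw image of $1 bytes.
fits() {
    [ "$2" -ge "$1" ]
}

# Sizes taken from the fdisk -l output, converted to bytes.
src_bytes=$(( 2047998298 * 1024 + 512 ))   # /dev/sda1: 2047998298+ blocks
dst_bytes=$(( 2145279906 * 1024 ))         # /dev/sdb1: 2145279906 blocks

if fits "$src_bytes" "$dst_bytes"; then
    echo "target is large enough: $src_bytes <= $dst_bytes"
else
    echo "target is too small: $src_bytes > $dst_bytes"
fi
```

On a live system the two sizes could be read directly with blockdev --getsize64 /dev/sda1 and blockdev --getsize64 /dev/sdb1 instead of being computed by hand.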

Start the data backup

dd if=/dev/sda1 of=/dev/sdb1 bs=8M

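At first bs was not specified, so dd used its default block size of only 512 bytes. After about 30 hours of waiting, a speed measurement showed only about 1 MB/s, so the copy was interrupted and restarted with bs=8M.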
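
To get a feel for why the block size matters, it helps to count how many records (one read plus one write each) dd has to issue for the whole partition. The sketch below is only arithmetic on the partition size of 2097150257664 bytes; it does not measure anything, and the per-request overhead on the IP storage path is likely, though not proven here, to be what made the 512-byte run so slow.

```shell
#!/bin/sh
# Number of dd records needed to copy $1 bytes with a block size of $2 bytes,
# rounded up so that a final partial record is counted.
dd_records() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

total=2097150257664   # size of /dev/sda1 in bytes

echo "bs=512: $(dd_records "$total" 512) records"
echo "bs=8M:  $(dd_records "$total" $(( 8 * 1024 * 1024 ))) records"
```

With bs=512 this is roughly 4.1 billion records, against 250000 with bs=8M.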

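The stat package (presumably sysstat) is not installed on the application server, so here is an alternative way to measure the copy speed: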

>strace.log
time strace -o strace.log -p 11929

Let it run for a while, then stop it with Ctrl+C

 

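Count how many times write appears

grep -c write strace.log

echo "<write count> * 8 / <elapsed seconds reported by time>" | bc

The result is the estimated copy speed in MB per second (each write is 8 MB because bs=8M).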
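
The same estimate can be wrapped in a small function. This is a sketch under a few assumptions: the log was produced by strace -o on a single dd process, so each write call is one line starting with "write(", and every write transfers a full block. The function name and its arguments are made up for illustration.

```shell
#!/bin/sh
# Estimate dd throughput from an strace log.
# $1 = strace log file
# $2 = MiB transferred per write() call (8 when dd runs with bs=8M)
# $3 = number of seconds the trace was running
estimate_mib_per_s() {
    # Count only lines that begin with a write( call, so that other
    # lines which merely contain the word "write" are ignored.
    count=$(grep -c '^write(' "$1")
    echo $(( count * $2 / $3 ))
}

# Example call, assuming the trace ran for 60 seconds:
#   estimate_mib_per_s strace.log 8 60
```

Tracing slows the copy down while strace is attached, so the trace should be kept short. GNU dd also prints its own progress statistics when it receives SIGUSR1 (kill -USR1 followed by the PID), which avoids tracing altogether.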

 

The backup finished after about 36 hours (130468 s according to dd):

249999+1 records in
249999+1 records out
2097150257664 bytes (2.1 TB) copied, 130468 s, 16.1 MB/s
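
As a sanity check on the dd summary, the elapsed time and the average rate can be recomputed from the byte count and the seconds it reports. The helper names below are illustrative only; the rate uses decimal megabytes (10^6 bytes), which is what dd prints.

```shell
#!/bin/sh
# Convert a duration in seconds to hours, minutes and seconds.
hms() {
    echo "$(( $1 / 3600 ))h $(( $1 % 3600 / 60 ))m $(( $1 % 60 ))s"
}

# Average rate in MB/s (10^6 bytes per second), rounded to one decimal place.
# $1 = bytes copied, $2 = elapsed seconds
rate_mb_s() {
    tenths=$(( ($1 / $2 + 50000) / 100000 ))
    echo "$(( tenths / 10 )).$(( tenths % 10 ))"
}

hms 130468                        # prints 36h 14m 28s
rate_mb_s 2097150257664 130468    # prints 16.1
```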

 

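II. Recovery using the backup data

Because the original application server was still carrying production traffic on temporary space, the partition was attached through the IP storage to another machine running the same operating system, and the repair was done there. The first step is to find where the superblocks are located: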

linux11:~ #dumpe2fs /dev/sda1

dumpe2fs 1.41.9 (22-Aug-2009)
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: 34689bab-428f-4e84-b3b8-22351dfcbe9a
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file
Filesystem flags: signed_directory_hash
Default mount options: (none)
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 128000000
Block count: 511999574
Reserved block count: 25599978
Free blocks: 483185304
Free inodes: 126463697
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 901
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Filesystem created: Wed May 11 16:02:51 2011
Last mount time: Thu Aug 2 17:26:01 2012
Last write time: Thu Aug 2 17:26:01 2012
Mount count: 8
Maximum mount count: -1
Last checked: Wed May 11 16:02:51 2011
Check interval: 0 (<none>)
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: 7c5d0a45-f12f-4ce5-8e8a-cb1029acbf2d
Journal backup: inode blocks
Journal size: 128M


Group 0: (Blocks 0-32767)
Primary superblock at 0, Group descriptors at 1-123
Reserved GDT blocks at 124-1024
Block bitmap at 1025 (+1025), Inode bitmap at 1026 (+1026)
Inode table at 1027-1538 (+1027)
1255 free blocks, 7617 free inodes, 14 directories
Free blocks: 4442, 31514-32767
Free inodes: 576-8192
Group 1: (Blocks 32768-65535)
Backup superblock at 32768, Group descriptors at 32769-32891
Reserved GDT blocks at 32892-33792
Block bitmap at 33793 (+1025), Inode bitmap at 33794 (+1026)
Inode table at 33795-34306 (+1027)
7263 free blocks, 7415 free inodes, 16 directories
Free blocks: 34340-36863, 37296, 49851-52059, 52083-54611
Free inodes: 8970-16384
Group 2: (Blocks 65536-98303)
Block bitmap at 65536 (+0), Inode bitmap at 65537 (+1)
Inode table at 65538-66049 (+2)
4 free blocks, 7403 free inodes, 21 directories
Free blocks: 72068-72071
Free inodes: 17174-24576
Group 3: (Blocks 98304-131071)

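... there are many more superblock copies (the "Backup superblock at" lines, shown in red in the original post); the rest of the output is omitted ...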

 

The filesystem keeps backup copies of the superblock at several locations. For this repair the copy at block 32768 was used:
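
If the dumpe2fs listing is not available, the candidate block numbers can also be derived. The sketch below assumes the usual ext2/ext3 sparse_super layout (the feature is listed in the dumpe2fs output), in which backup superblocks are stored in group 1 and in groups whose number is a power of 3, 5 or 7, each at the first block of its group. The group count 15625 is the block count 511999574 divided by 32768 blocks per group, rounded up. The function name is made up for illustration.

```shell
#!/bin/sh
# List the block numbers of backup superblocks for a sparse_super filesystem.
# $1 = blocks per group, $2 = number of block groups
backup_superblocks() {
    blocks_per_group=$1
    group_count=$2
    {
        # Group 1 always carries a backup when it exists.
        if [ "$group_count" -gt 1 ]; then
            echo 1
        fi
        # Groups that are powers of 3, 5 and 7 carry the other backups.
        for base in 3 5 7; do
            g=$base
            while [ "$g" -lt "$group_count" ]; do
                echo "$g"
                g=$(( g * base ))
            done
        done
    } | sort -n | while read -r g; do
        echo $(( g * blocks_per_group ))
    done
}

backup_superblocks 32768 15625
```

The first value printed is 32768, the block passed to fsck.ext3 -b in this case. On a system where the device is readable, dumpe2fs /dev/sda1 | grep -i superblock shows the actual locations and is the safer source.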

linux01:~ #fsck.ext3 -y -b 32768 /dev/sda1
e2fsck 1.41.9 (22-Aug-2009)
Superblock needs_recovery flag is clear, but journal has data.
Recovery flag not set in backup superblock, so running journal anyway.
/dev/sda1: recovering journal
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: -(22281864--22282239) -(22282760--22282957) -(22282967--22283536) -(22283552--22284122) -(22284138--22284287) -(22284582--22285018) -(100555611--100556799) -(100580404--100581255)
Fix? yes

Free blocks count wrong for group #0 (31223, counted=1255).
Fix? yes

Free blocks count wrong for group #1 (31229, counted=7263).
Fix? yes

Free blocks count wrong for group #2 (32254, counted=4).
Fix? yes

Free blocks count wrong for group #3 (31229, counted=0).
Fix? yes

... omitted ...

Free inodes count wrong for group #15622 (8192, counted=7167).
Fix? yes

Directories count wrong for group #15622 (0, counted=34).
Fix? yes

Free inodes count wrong for group #15623 (8192, counted=6821).
Fix? yes

Directories count wrong for group #15623 (0, counted=52).
Fix? yes

Free inodes count wrong for group #15624 (8192, counted=7247).
Fix? yes

Directories count wrong for group #15624 (0, counted=21).
Fix? yes

Free inodes count wrong (127999989, counted=120656202).
Fix? yes


/dev/sda1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sda1: 7343798/128000000 files (2.8% non-contiguous), 259583311/511999574 blocks

 

The repair succeeded. The filesystem was mounted again without problems, and files and directories were accessible as normal.
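
When a repair like this is scripted, the exit status of e2fsck is worth checking in addition to the printed summary. The status is a bit mask, and the sketch below decodes it using the values documented in the e2fsck man page. The function name is an assumption for illustration, and the status of the run shown above was not recorded in this post; a run that ends with "FILE SYSTEM WAS MODIFIED" would presumably report 1.

```shell
#!/bin/sh
# Describe an e2fsck / fsck.ext3 exit status, one line per bit that is set.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    for pair in "1:file system errors corrected" \
                "2:system should be rebooted" \
                "4:file system errors left uncorrected" \
                "8:operational error" \
                "16:usage or syntax error" \
                "32:cancelled by user request" \
                "128:shared library error"; do
        bit=${pair%%:*}     # number before the colon
        text=${pair#*:}     # description after the colon
        if [ $(( status & bit )) -ne 0 ]; then
            echo "$text"
        fi
    done
}

# Example: run the check, then decode its status.
#   fsck.ext3 -y -b 32768 /dev/sda1; describe_fsck_status $?
describe_fsck_status 1
```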

