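Troubleshooting a Storage Deployment with a Single HBA
Environment:
Servers: DELL R410
Storage: MD3600F
HBA: Qlogic (QLE2560) 8Gbps PCI-E FC HBA, single port
Operating systems: server 1 runs RHEL 5.5; server 2 runs CentOS 5.3
Topology diagram: (not reproduced here)
Work completed in preparation:
1. Deployed a time synchronization server;
2. Set up the two servers to access the storage separately, with their data isolated from each other (multiple LUNs);
3. Installed the PostgreSQL database server.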
System partition layout:
/ 50G
/boot 2G
/data 100G
swap 4G
/usr 50G
The DELL storage is managed through the multipath software that ships with the operating system, namely the two packages device-mapper-1.02.28-2.el5 and device-mapper-multipath-0.4.7-23.el5.
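The two package strings above follow the usual RPM name-version-release layout. As a minimal sketch (the helper name split_nvr is hypothetical, not part of any tool), they can be split on their last two hyphens, which makes it easier to compare them with what is installed on a given server:

```python
def split_nvr(package):
    """Split an RPM 'name-version-release' string on its last two hyphens."""
    name, version, release = package.rsplit("-", 2)
    return name, version, release


for pkg in ("device-mapper-1.02.28-2.el5",
            "device-mapper-multipath-0.4.7-23.el5"):
    print(split_nvr(pkg))
```

Splitting from the right matters here because the package name itself contains hyphens.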
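Fault description:
After the operating system was installed and the LUNs were carved out, a large number of new DM devices appeared. Inspection showed that these devices were multipath maps for the partitions on the LUNs. Disk errors were also reported at boot; the relevant system log is reproduced below.
Troubleshooting attempts:
1. Modified the multipath software configuration.
2. Blacklisted the multipath devices in the configuration.
3. Deleted the redundant partitions and remounted the partitions.
4. Stopped the multipath service.
The problem persisted. (The detailed steps are omitted.)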
Oct 10 03:51:44 dataserver1 kernel: Netfilter messages via NETLINK v0.30.
Oct 10 03:51:45 dataserver1 dnsmasq[4761]: started, version 2.45 cachesize 150
Oct 10 03:51:45 dataserver1 dnsmasq[4761]: DHCP, IP range 192.168.122.2 -- 192.168.122.254, lease time 1h
Oct 10 03:51:46 dataserver1 xenstored: Checking store ...
Oct 10 03:51:46 dataserver1 xenstored: Checking store complete.
Oct 10 03:51:47 dataserver1 avahi-daemon[4316]: Withdrawing address record for 192.168.1.101 on eth0.
Oct 10 03:51:47 dataserver1 avahi-daemon[4316]: Interface eth0.IPv6 no longer relevant for mDNS.
Oct 10 03:51:47 dataserver1 NetworkManager: <info> (eth0): carrier now OFF (device state 1)
Oct 10 03:51:47 dataserver1 kernel: device vif0.0 entered promiscuous mode
Oct 10 03:51:47 dataserver1 pcscd: winscard.c:304:SCardConnect() Reader E-Gate 0 0 Not Found
Oct 10 03:51:47 dataserver1 kernel: ADDRCONF(NETDEV_UP): peth0: link is not ready
Oct 10 03:51:48 dataserver1 avahi-daemon[4316]: New relevant interface eth0.IPv4 for mDNS.
Oct 10 03:51:48 dataserver1 avahi-daemon[4316]: Interface eth0.IPv4 no longer relevant for mDNS.
Oct 10 03:51:49 dataserver1 NetworkManager: <info> (eth0): device state change: 2 -> 3
Oct 10 03:51:49 dataserver1 NetworkManager: <info> Activation (eth0) starting connection 'System eth0'
Oct 10 03:51:49 dataserver1 NetworkManager: <info> (eth0): device state change: 3 -> 4
Oct 10 03:51:49 dataserver1 NetworkManager: <info> Activation (eth0) Stage 2 of 5 (Device Configure) scheduled...
Oct 10 03:51:49 dataserver1 NetworkManager: <info> Activation (eth0) Stage 1 of 5 (Device Prepare) complete.
Oct 10 03:51:49 dataserver1 NetworkManager: <info> Activation (eth0) Stage 2 of 5 (Device Configure) starting...
Oct 10 03:51:49 dataserver1 NetworkManager: <info> (eth0): device state change: 4 -> 5
Oct 10 03:51:49 dataserver1 NetworkManager: <info> Activation (eth0) Stage 2 of 5 (Device Configure) successful.
Oct 10 03:51:49 dataserver1 NetworkManager: <info> Activation (eth0) Stage 3 of 5 (IP Configure Start) scheduled.
Oct 10 03:51:49 dataserver1 NetworkManager: <info> Activation (eth0) Stage 2 of 5 (Device Configure) complete.
Oct 10 03:51:49 dataserver1 NetworkManager: <info> Activation (eth0) Stage 3 of 5 (IP Configure Start) started...
Oct 10 03:51:49 dataserver1 NetworkManager: <info> (eth0): device state change: 5 -> 7
Oct 10 03:51:49 dataserver1 NetworkManager: <info> Activation (eth0) Stage 4 of 5 (IP Configure Get) scheduled...
Oct 10 03:51:49 dataserver1 NetworkManager: <info> Activation (eth0) Stage 3 of 5 (IP Configure Start) complete.
Oct 10 03:51:49 dataserver1 NetworkManager: <info> Activation (eth0) Stage 4 of 5 (IP Configure Get) started...
Oct 10 03:51:49 dataserver1 NetworkManager: <info> Activation (eth0) Stage 5 of 5 (IP Configure Commit) scheduled...
Oct 10 03:51:49 dataserver1 NetworkManager: <info> Activation (eth0) Stage 4 of 5 (IP Configure Get) complete.
Oct 10 03:51:49 dataserver1 NetworkManager: <info> Activation (eth0) Stage 5 of 5 (IP Configure Commit) started...
Oct 10 03:51:49 dataserver1 avahi-daemon[4316]: New relevant interface eth0.IPv4 for mDNS.
Oct 10 03:51:49 dataserver1 avahi-daemon[4316]: Joining mDNS multicast group on interface eth0.IPv4 with address 192.168.1.101.
Oct 10 03:51:49 dataserver1 avahi-daemon[4316]: Registering new address record for 192.168.1.101 on eth0.
Oct 10 03:51:49 dataserver1 kernel: device peth0 entered promiscuous mode
Oct 10 03:51:49 dataserver1 kernel: xenbr0: topology change detected, propagating
Oct 10 03:51:49 dataserver1 kernel: xenbr0: port 2(peth0) entering forwarding state
Oct 10 03:51:49 dataserver1 avahi-daemon[4316]: New relevant interface eth0.IPv6 for mDNS.
Oct 10 03:51:49 dataserver1 avahi-daemon[4316]: Joining mDNS multicast group on interface eth0.IPv6 with address fe80::7a2b:cbff:fe40:fb4f.
Oct 10 03:51:49 dataserver1 avahi-daemon[4316]: Registering new address record for fe80::7a2b:cbff:fe40:fb4f on eth0.
Oct 10 03:51:50 dataserver1 NetworkManager: <info> (eth0): device state change: 7 -> 8
Oct 10 03:51:50 dataserver1 NetworkManager: <info> Policy set 'System eth0' (eth0) as default for routing and DNS.
Oct 10 03:51:50 dataserver1 NetworkManager: <info> Activation (eth0) successful, device activated.
Oct 10 03:51:50 dataserver1 NetworkManager: <info> Activation (eth0) Stage 5 of 5 (IP Configure Commit) complete.
Oct 10 03:51:50 dataserver1 SMagent: SMagent started. PID=5126
Oct 10 03:51:50 dataserver1 smartd[5139]: smartd version 5.38 [i686-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Oct 10 03:51:50 dataserver1 smartd[5139]: Home page is http://smartmontools.sourceforge.net/
Oct 10 03:51:50 dataserver1 smartd[5139]: Opened configuration file /etc/smartd.conf
Oct 10 03:51:50 dataserver1 smartd[5139]: Configuration file /etc/smartd.conf was parsed, found DEVICESCAN, scanning devices
Oct 10 03:51:50 dataserver1 smartd[5139]: Problem creating device name scan list
Oct 10 03:51:50 dataserver1 smartd[5139]: Device: /dev/sda, opened
Oct 10 03:51:50 dataserver1 smartd[5139]: Device: /dev/sda, Bad IEC (SMART) mode page, err=4, skip device
Oct 10 03:51:50 dataserver1 smartd[5139]: Device: /dev/sdb, opened
Oct 10 03:51:50 dataserver1 smartd[5139]: Device: /dev/sdb, is SMART capable. Adding to "monitor" list.
Oct 10 03:51:50 dataserver1 smartd[5139]: Device: /dev/sdc, opened
Oct 10 03:51:50 dataserver1 smartd[5139]: Device: /dev/sdc, is SMART capable. Adding to "monitor" list.
Oct 10 03:51:50 dataserver1 smartd[5139]: Device: /dev/sdd, opened
Oct 10 03:51:50 dataserver1 smartd[5139]: Device: /dev/sdd, is SMART capable. Adding to "monitor" list.
Oct 10 03:51:50 dataserver1 smartd[5139]: Monitoring 0 ATA and 3 SCSI devices
Oct 10 03:51:50 dataserver1 smartd[5141]: smartd has fork()ed into background mode. New PID=5141.
Oct 10 03:51:51 dataserver1 S99SMmonitor: S99SMmonitor started. PID=5151
Oct 10 03:51:51 dataserver1 kernel: kjournald starting. Commit interval 5 seconds
Oct 10 03:51:51 dataserver1 kernel: EXT3 FS on dm-2, internal journal
Oct 10 03:51:51 dataserver1 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Oct 10 03:53:49 dataserver1 kernel: end_request: I/O error, dev sdd, sector 32
Oct 10 03:53:49 dataserver1 kernel: Buffer I/O error on device sdd, logical block 4
Oct 10 03:53:49 dataserver1 kernel: Buffer I/O error on device sdd, logical block 5
Oct 10 03:53:49 dataserver1 kernel: Buffer I/O error on device sdd, logical block 6
Oct 10 03:53:49 dataserver1 kernel: Buffer I/O error on device sdd, logical block 7
Oct 10 03:53:49 dataserver1 kernel: Buffer I/O error on device sdd, logical block 8
Oct 10 03:53:49 dataserver1 kernel: Buffer I/O error on device sdd, logical block 9
Oct 10 03:53:49 dataserver1 kernel: Buffer I/O error on device sdd, logical block 10
Oct 10 03:53:49 dataserver1 kernel: Buffer I/O error on device sdd, logical block 11
Oct 10 03:58:31 dataserver1 multipathd: dm-2: umount map (uevent)
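Most of the log above is routine boot noise; the lines that matter are the kernel I/O errors on sdd. The following is a minimal sketch, assuming the syslog format shown above, that counts such errors per device. The function name count_io_errors and the regular expression are illustrative choices, not part of any standard tool.

```python
import re
from collections import Counter

# Matches kernel block-layer error lines such as:
#   "kernel: end_request: I/O error, dev sdd, sector 32"
#   "kernel: Buffer I/O error on device sdd, logical block 4"
IO_ERROR = re.compile(r"kernel: .*I/O error(?:, dev| on device) ([\w-]+)")


def count_io_errors(lines):
    """Return a Counter mapping device name -> number of I/O error lines."""
    counts = Counter()
    for line in lines:
        match = IO_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = [
    "Oct 10 03:53:49 dataserver1 kernel: end_request: I/O error, dev sdd, sector 32",
    "Oct 10 03:53:49 dataserver1 kernel: Buffer I/O error on device sdd, logical block 4",
    "Oct 10 03:51:51 dataserver1 kernel: EXT3 FS on dm-2, internal journal",
]
print(count_io_errors(sample))
```

To run it against a real log, pass an open file instead of the sample list, for example count_io_errors(open("/var/log/messages")).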
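Solution:
Each server has only one HBA installed, so there is no redundant link and no need for multipath software to manage the disks. After reinstalling the operating system, I removed the device-mapper-multipath-0.4.7-23.el5 package outright and remounted the disk partitions, which resolved the problem.
Summary:
A problem that looks quite complicated is sometimes caused by one small detail that was overlooked, so every aspect of a deployment should be considered carefully to avoid surprises.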
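The reasoning above can be turned into a quick check before deciding whether to keep the multipath package. This is a rough sketch, assuming a Linux host whose FC driver exposes its ports under /sys/class/fc_host; the function names and the "two or more ports" threshold are illustrative assumptions that simply mirror the rule of thumb used here.

```python
import os


def count_fc_ports(sysfs_dir="/sys/class/fc_host"):
    """Count FC host ports the kernel exposes; 0 if the directory is unreadable."""
    try:
        entries = os.listdir(sysfs_dir)
    except OSError:
        return 0
    return sum(1 for name in entries if name.startswith("host"))


def multipath_warranted(port_count):
    # Assumption mirroring the reasoning above: with fewer than two
    # initiator ports there is no redundant link from this server.
    return port_count >= 2


if __name__ == "__main__":
    ports = count_fc_ports()
    print("FC ports:", ports, "multipath warranted:", multipath_warranted(ports))
```

This only counts initiator ports on the server. A topology in which a single port reaches both storage controllers through a switch would still present more than one path and needs a different check.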