Method used to identify the cause of a damaged file system in the Linux host


You can locate the cause of a damaged file system as follows:
1. Issue Description
How can I identify the cause of a damaged file system in the Linux host?
2. Solution
Fault location and rectification
If a damaged file system is caused by the operating system, rectify the problem on the operating system side.
If a damaged file system is caused by storage disks, rectify the problem on the storage side.
If a damaged file system has another cause, refer to the maintenance documentation.
Solution:
a. Symptoms: if the damaged file system is on an IPTV (Internet Protocol television) host, the following symptoms appear.
When you enter the storage directory, errors such as "Input/output error" are reported, or the file system fails to be mounted. Before troubleshooting further, confirm that the LUNs mapped by the storage system are correctly presented as disks on the host.
b. The damaged file system is caused by an operating system failure.
Check the host logs: go to the /var/log directory and search the messages log files, including the compressed archives of older logs (search the latest messages log first).
Search the host logs for the keyword err and check whether a line like the following is displayed (XFS is used as an example).
Feb 18 16:19:01 WX-BY-HMU2 kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 4534 of file fs/xfs/xfs_bmap.c. Caller 0xffffffff882c4f9c
If an internal error like this is found in the host logs, the damage is caused by an operating system failure.
Solution: Consult the relevant operating system personnel for troubleshooting. You can refer to maintenance documentation.
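
The log check in step b can be scripted. The following Python sketch is illustrative only: it assumes the logs are named messages* under /var/log and that older logs are compressed with gzip or bzip2, and its pattern is derived solely from the sample XFS line above. The function names are hypothetical, and other file systems or kernel versions may log internal errors in a different format.

```python
import bz2
import gzip
import re
from pathlib import Path

# Pattern derived from the sample log line above (an assumption, not a
# complete description of every XFS error message).
XFS_INTERNAL_ERROR = re.compile(
    r"XFS internal error (?P<code>\S+) at line (?P<line>\d+) "
    r"of file (?P<file>\S+?)\.(?:\s|$)"
)


def parse_xfs_internal_error(line):
    """Return (error code, source line, source file), or None if no match."""
    match = XFS_INTERNAL_ERROR.search(line)
    if match is None:
        return None
    return match.group("code"), int(match.group("line")), match.group("file")


def open_log(path):
    """Open a plain, .gz, or .bz2 log file as text."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt", errors="replace")
    if path.suffix == ".bz2":
        return bz2.open(path, "rt", errors="replace")
    return open(path, "rt", errors="replace")


def find_xfs_internal_errors(log_dir="/var/log", prefix="messages"):
    """Scan messages* logs, newest file first, for XFS internal errors."""
    paths = sorted(
        Path(log_dir).glob(prefix + "*"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    hits = []
    for path in paths:
        with open_log(path) as handle:
            for line in handle:
                parsed = parse_xfs_internal_error(line)
                if parsed is not None:
                    hits.append((path.name,) + parsed)
    return hits
```

Calling find_xfs_internal_errors() with no arguments scans /var/log and usually requires root privileges to read the logs. An empty result means only that no line matched this one pattern, not that the operating system is ruled out.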
c. The damaged file system is caused by a failure on the storage side.
Check the host logs: go to the /var/log directory and search the messages log files, including the compressed archives of older logs (search the latest messages log first).
Search the host logs for the keyword err and check whether a line like the following is displayed (XFS is used as an example).
Dec 7 15:03:00 gdby2-hms01 kernel: end_request: I/O error, dev sdc, sector 2093665280
If an I/O error is found in host logs, the error is caused by a disk fault in the storage system or a link fault between the host and the storage.
Solution: You can contact the storage R&D personnel for help.
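
When many I/O errors are logged, it can help to see which devices they concentrate on before contacting storage personnel. The following Python sketch is illustrative only: it matches just the "I/O error, dev ..., sector ..." portion of the sample line above, because the text before it can differ between kernel versions. The function names are hypothetical.

```python
import re
from collections import Counter

# Matches the device and sector fields as they appear in the sample line.
IO_ERROR = re.compile(r"I/O error, dev (?P<dev>[\w-]+), sector (?P<sector>\d+)")


def parse_io_error(line):
    """Return (device, sector) for an I/O error line, or None."""
    match = IO_ERROR.search(line)
    if match is None:
        return None
    return match.group("dev"), int(match.group("sector"))


def count_io_errors_by_device(lines):
    """Count I/O error lines per block device."""
    counts = Counter()
    for line in lines:
        parsed = parse_io_error(line)
        if parsed is not None:
            counts[parsed[0]] += 1
    return counts
```

The device name (sdc in the sample) identifies the host disk. Mapping it back to a LUN or a host-to-storage link depends on the multipathing setup and is not covered by this sketch.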
d. Other causes
The hosts and storage arrays were powered on and restarted after an abnormal power-off.
A transmission medium failed (for example, a damaged fiber or cable), or a data transmission link was disconnected and then recovered.
In these scenarios, I/Os delivered by the host may fail, which can then damage the file system.
Solution: Refer to the maintenance documentation.

Other related questions:
Method used to identify listening ports on the host
For each client, ISM subscribes to the information reported by the device, and a port is opened for listening on the PC. Because multiple ISM instances on one PC can manage the device at the same time, the ports that ISM listens on fall within a fixed range. For ISMV1R3, the listening ports are 8000 to 8090, 7890, and 8901.
In the Windows command line, run netstat -a to check whether ports in this range are in the listening state (for example, ISM may be using port 8011).
On the device, you can check the file that records the subscribed ports: /ISM/repository/root#cimv2/instances/CIM_IndicationSubscription.idx
In this file, entries starting with 1 are invalid and entries starting with 0 are valid.
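
The port check can be automated by filtering saved netstat output against the ISM port range. The following Python sketch is illustrative only: it assumes the English Windows netstat layout (Proto, Local Address, Foreign Address, State) and numeric port display, which netstat -an guarantees. The function name is hypothetical.

```python
import re

# Port range stated above for ISMV1R3: 8000 to 8090, plus 7890 and 8901.
ISM_PORTS = set(range(8000, 8091)) | {7890, 8901}

# A TCP row whose state is LISTENING; captures the local port number.
LISTEN_LINE = re.compile(r"^\s*TCP\s+\S+:(?P<port>\d+)\s+\S+\s+LISTENING")


def listening_ism_ports(netstat_output):
    """Return the sorted ISM-range ports that are in the LISTENING state."""
    ports = set()
    for line in netstat_output.splitlines():
        match = LISTEN_LINE.match(line)
        if match and int(match.group("port")) in ISM_PORTS:
            ports.add(int(match.group("port")))
    return sorted(ports)
```

To use it, save the command output to a file (for example, netstat -an > ports.txt) and pass the file contents to listening_ism_ports.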

Method used to expand host file system capacity of OceanStor T Series
You can expand host file systems of OceanStor T Series in either of the following ways:
1. Expand the capacity by using the volume management software delivered with the host (LVM or dynamic disks).
2. Expand the capacity by using third-party volume management software (VxVM).

Method used to restart the SUSE Linux operating system
Restarting the operating system interrupts services, so exercise caution when performing this operation. To restart the operating system, run the following command:
shutdown -r -t time now
Remote login users are logged out when the operating system restarts. The restart takes 3 to 5 minutes.

Method used to identify slow disks in the storage system
Slow disks are disks with poor performance in the storage system. A slow disk degrades the performance of the RAID group where it resides, and can degrade the performance of the whole service system. To keep storage system performance stable, perform the following steps to identify slow disks and replace them:
1. Log in to the storage device in command line mode and enter the debug mode.
For OceanStor S2600 V100R001, S5000 V100R001, S2600 R5C02, and S5000R5C02 storage, log in with your user name and password, then run debug in the CLI to enter the debug mode.
For S5500T, S5600T, S5800T, S6800T, S3900, S5900, and S6900 storage systems, specify ibc_os_hs as the user name and Storage@21st as the password to enter the debug mode in the CLI.
2. Run iostat to check, for each disk, the disk usage, the service time of each I/O, the average waiting time of I/O requests, and the number of I/Os waiting to be processed.
Note: If a disk's number of waiting I/Os, average I/O waiting time, per-I/O service time, and disk usage are all higher than those of the other disks, that disk is a slow disk.
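
The comparison described in the note can be sketched in Python once the iostat values are collected per disk. This is an illustration under stated assumptions, not the storage system's own method. The metric names follow the avgqu-sz, await, svctm, and %util columns of older iostat -x output, and the columns in the device's debug mode may differ. The rule "more than factor times the median of the other disks" and the default factor of 2.0 are hypothetical choices. The source says only "greater than other disks".

```python
from statistics import median

# Queue length, average wait (ms), service time (ms), utilization (%).
METRICS = ("avgqu_sz", "await", "svctm", "util")


def find_slow_disks(stats, factor=2.0):
    """Flag disks whose four metrics all exceed factor x the others' median.

    stats maps a disk name to a dict holding every key in METRICS.
    """
    slow = []
    for disk, values in stats.items():
        others = [v for name, v in stats.items() if name != disk]
        if not others:
            continue  # nothing to compare a single disk against
        if all(
            values[m] > factor * median(o[m] for o in others)
            for m in METRICS
        ):
            slow.append(disk)
    return sorted(slow)
```

A disk is flagged only when all four metrics are elevated, matching the note above. A disk that is merely busy (high usage but normal service time) is not reported.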
