Background
During system maintenance, engineers are often unsure how to perform the EC3.0 health check, including how to collect logs and which items to check. This article describes the procedure in detail.
Method introduction
1. Check items on the BMU
1. Collect the version information and take a screenshot of the BMU login page.
2. Collect the total number of accounts and take a screenshot of the account list.
(1) Select the UC service and take a screenshot.
(2) Select all accounts and take screenshots as shown in the following figure.
3. Collect the total number of SIP numbers and take screenshots of the SIP number maintenance page.
4. Collect the number of U1900s or USMs connected to the BMU. The following figure shows the unified gateway page.
5. Collect the BMU CDR interconnection configuration and take a screenshot of the CDR configuration page.
6. Collect information about the UM server. The screenshot of the UM server is as follows:
7. Collect information on the System Configuration page. On the System Configuration page, click the eSpace Service Configuration, Authentication Configuration, Log Configuration, and Audit Configuration tabs, and take four screenshots.
8. Collect license information. The following figure shows the license maintenance page.
9. Collect the configuration information for connecting the BMU to the LDAP. The following figure shows the LDAP integration page.
(1) LDAP integration page
(2) Screenshot of the LDAP Field Mapping tab page
10. Collect common information.
· Check whether the live network runs Windows or Linux.
· Check whether the network is the standard network planned in the product documentation. For details about the network information, see the LLD table.
· Check whether the product has reached end of service (EOS) and whether the maintenance service is still available.
· Check whether the problem occurs on the PC client or the mobile client.
· Record the fault symptom.
11. Collect error screenshots and error codes.
Record which operations were being performed on the live network when the problem occurred, and the time at which it occurred.
Identify the faulty account and whether one or multiple accounts are affected.
For mobile clients, check whether the problem occurs on Android or iOS.
Check whether the problem persists when you log in from another computer or mobile phone.
12. Collect logs.
Collect the server logs and client logs that cover the time when the problem occurred.
If the client is an IP phone, specify the IP phone version.
2. Check the server.
A. Check UC application software (based on Windows Server).
Reference Standard
All services are running properly.
Manually check the service status of the BMU, eServer, Meeting MS, UMServer, and MAA. Choose Start > Run, enter services.msc, and click OK. The Services window is displayed.
If the status of each service is Started, as shown in figure 2, the services are running properly.
If any service is not started, right-click the service and choose Start from the shortcut menu. If the service still cannot be started, contact the service provider for technical support.
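The same check can be scripted instead of being read from the Services window. The following is a minimal sketch, assuming the state is taken from the output of the Windows command sc query followed by the service name, which is a swapped-in alternative to services.msc and not the method described above. The service name ExampleUCService and the sample text are hypothetical; use the service names shown in the Services window.

```python
import re

# Hypothetical sample in the layout printed by `sc query <service name>`.
# "ExampleUCService" is a placeholder, not a real EC3.0 service name.
SAMPLE_SC_OUTPUT = """
SERVICE_NAME: ExampleUCService
        TYPE               : 10  WIN32_OWN_PROCESS
        STATE              : 4  RUNNING
                                (STOPPABLE, NOT_PAUSABLE, ACCEPTS_SHUTDOWN)
        WIN32_EXIT_CODE    : 0  (0x0)
"""


def service_state(sc_output: str) -> str:
    """Return the state word (for example RUNNING or STOPPED) from `sc query` output."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else "UNKNOWN"


print(service_state(SAMPLE_SC_OUTPUT))
```

A state other than RUNNING corresponds to a service that is not shown as Started in the Services window.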
B. Check UC application software (based on SUSE Linux).
Reference Standard
All services are running properly.
Manually check the service status of the BMU, eServer, MAA, and UMServer.
1. Log in to the OS where the software is installed as the ecs user.
2. Go to the software installation directory.
The default installation path is used as an example.
cd /opt/eSpace_UC/eSpace_UC_Server/Bin
3. Check the service status.
./showAllState.sh
If information similar to the following is displayed, the service is running properly:
If any service is not started, run the ./restartAll.sh command to restart all the services. If the fault persists, contact technical support of the service provider.
3. Check the operating system.
A. Check the server operating system (Windows Server).
If the CPU or disk usage is continuously high, services on the server may run slowly.
Reference Standard
The CPU usage is less than 60%.
The memory usage is less than 85%.
The disk usage (used space) is less than 80%.
Checking the CPU and Memory
1. Log in to each server as the administrator user.
2. Open Windows Task Manager.
3. On the Performance tab page, view the CPU and memory usage.
4. On the Processes tab page, view the processes with high CPU usage.
5. End processes that are not required by the UC services or by Windows to release memory.
Note: The process list of the EC service is as follows:
Checking the Disk Space
1. Log in to the server as the administrator user.
2. Choose Start > Run.
3. Enter compmgmt.msc and press Enter.
The Computer Management page is displayed.
4. In the navigation tree on the left, choose Disk Management to view the disk usage of the server system. Take a screenshot of the disk usage.
Reference standard: The disk space usage is less than 85%.
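The disk check above can also be expressed as a calculation. The following is a minimal sketch, assuming Python is available on the server; the function names are illustrative, and the default limit of 85 follows the reference standard stated in this step, so pass a different value if your site uses a stricter limit.

```python
import shutil


def disk_used_percent(total_bytes: int, used_bytes: int) -> float:
    """Return the used space as a percentage of the total space."""
    return round(used_bytes * 100 / total_bytes, 1)


def path_within_limit(path: str, limit_percent: float = 85.0) -> bool:
    """Return True if the volume that holds `path` is below the limit."""
    usage = shutil.disk_usage(path)  # named tuple with total, used, free (bytes)
    return disk_used_percent(usage.total, usage.used) < limit_percent


# Worked example with made-up numbers: 400 GB used out of 500 GB.
print(disk_used_percent(500, 400))
```

On a Windows server, call path_within_limit with a drive root such as "C:\\" for each volume shown in Disk Management.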
Checking OS Logs
Check whether the operating system logs contain error information to locate the system fault.
1. Log in to the server as the administrator user.
2. Choose Start > Run, enter eventvwr.msc, and press Enter.
The Event Viewer page is displayed.
3. In the Summary of Administrative Events area, check whether any error events were recorded on the current day.
If error information is displayed, analyze the cause and rectify the fault based on the content. If the fault persists, collect operating system logs and contact Huawei technical support.
B. Check the server operating system (SUSE Linux).
If the CPU or disk usage is continuously high, services on the server may run slowly.
Reference Standard
The CPU usage is less than 60%.
The memory usage is less than 85%.
The disk usage (used space) is less than 80%.
Checking the CPU and Memory
1. Log in to each server as the root user.
2. Check the CPU usage.
top
Information similar to the following is displayed:
Note: Press Ctrl+C to exit the top command.
o The average CPU idle rate (the value displayed before %id) should not stay below 20%.
o The CPU usage of a single process (the value in the %CPU column) should not stay above 60%.
Run the preceding command and take a screenshot.
Note: The process list of the EC service is as follows:
3. Check the memory usage.
free
Information similar to the following is displayed:
Memory usage = (used-buffers-cached)/total.
Reference standard: The memory usage is less than 85%.
If the value is greater than 85%, run the top command to check the value of %MEM and take a screenshot.
4. Check the disk usage.
df -h
Information similar to the following is displayed:
View the value of Use% and take a screenshot.
Reference standard: The disk space usage is less than 85%.
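The memory formula and the Use% check above can be sketched as follows. This is an illustration under stated assumptions: the free output is assumed to use the older column layout (total, used, free, shared, buffers, cached) that the formula refers to, and both sample outputs contain made-up numbers. Newer versions of free print a combined buff/cache column, which this sketch does not handle.

```python
# Made-up sample in the older `free` layout that has separate buffers and cached columns.
SAMPLE_FREE = """
             total       used       free     shared    buffers     cached
Mem:       8000000    7000000    1000000          0     500000    4500000
Swap:      2000000          0    2000000
"""

# Made-up sample of `df -h` output.
SAMPLE_DF = """
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2        50G   20G   28G  42% /
/dev/sda3       100G   90G   10G  90% /opt
"""


def memory_usage_percent(free_output: str) -> float:
    """Apply (used - buffers - cached) / total to the Mem: row."""
    rows = [line.split() for line in free_output.strip().splitlines()]
    header = rows[0]
    mem = next(row for row in rows if row[0] == "Mem:")
    values = dict(zip(header, (int(v) for v in mem[1:])))
    in_use = values["used"] - values["buffers"] - values["cached"]
    return round(in_use * 100 / values["total"], 2)


def df_use_percent(df_output: str) -> dict:
    """Map each mount point to its Use% value."""
    usage = {}
    for line in df_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 6 and fields[4].endswith("%"):
            usage[fields[5]] = int(fields[4].rstrip("%"))
    return usage


print(memory_usage_percent(SAMPLE_FREE))
print([mount for mount, pct in df_use_percent(SAMPLE_DF).items() if pct >= 85])
```

With the sample data, the buffers and cache are excluded from the used memory, which is why the result is far lower than the raw used/total ratio.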
4. Check the database.
A. Check the SQL Server database.
Reference Standard
The SQL Server service is started properly.
Procedure
1. Log in to the database server as the administrator user.
2. Choose Start > All Programs > Microsoft SQL Server 2008 R2 > Configuration Tools > SQL Server Configuration Manager.
3. Check whether the following services are running properly:
In the preceding service names, MSSQLSERVER indicates the database instance name.
If the service status is abnormal, right-click the service and choose Start or Restart from the shortcut menu.
If the service status is still abnormal after the restart, perform the following steps:
1. Check whether the database listening port is occupied. If yes, take a screenshot.
a. Log in to the database server as the administrator user.
b. Choose Start > All Programs > Microsoft SQL Server 2008 R2 > Configuration Tools.
c. Click SQL Server Configuration Manager.
d. Choose SQL Server Network Configuration > Protocols for MSSQLSERVER, double-click TCP/IP, and click the IP Addresses tab.
e. Check the TCP port number of IP2. In this example, the value is 1433.
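f. Run the following commands to check which process occupies port 1433. The first command lists the connections on the port together with the owning process ID (PID). The second command looks up the process name for that PID (16496 in this example).
netstat -ano | findstr 1433
tasklist | findstr 16496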
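Because findstr matches any line that contains the text 1433, the result can include unrelated ports such as 14330 or 51433. The following is a minimal sketch of a stricter filter, assuming the netstat -ano output has been saved as text; the sample addresses and the second PID are made up.

```python
# Made-up sample of lines returned by `netstat -ano | findstr 1433`.
SAMPLE_NETSTAT = """
  TCP    0.0.0.0:1433           0.0.0.0:0              LISTENING       16496
  TCP    192.168.1.10:1433      192.168.1.20:51433     ESTABLISHED     16496
  TCP    192.168.1.10:49152     192.168.1.30:14330     ESTABLISHED     2044
"""


def listening_pids(netstat_output: str, port: int) -> list:
    """Return the PIDs of processes that listen on exactly the given local port."""
    pids = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows have five columns: protocol, local address, remote address, state, PID.
        if len(fields) != 5 or fields[3] != "LISTENING":
            continue
        local_port = fields[1].rsplit(":", 1)[-1]
        if local_port == str(port):
            pids.add(int(fields[4]))
    return sorted(pids)


print(listening_pids(SAMPLE_NETSTAT, 1433))
```

The PID returned here is the value to pass to the tasklist command.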
B. Check the Oracle database.
Reference Standard
· The database status is READ WRITE.
· The instance status is OPEN.
· The tablespace is in the ONLINE state.
· The listener is in the READY or UNKNOWN state.
· The tablespace usage is less than 70%.
· If the CRS is deployed in a RAC cluster, the CRS cluster is in the ONLINE state.
If the query result is abnormal, collect database logs and contact the service provider.
Viewing Archive Logs
If Oracle archive logs are not deleted regularly, they can use up the Oracle disk space and interrupt ECS services.
1. Check whether the database is in archive mode as the oracle user and obtain the archive log path.
a. Log in to the Oracle database server as the oracle user.
b. Run the following command to query the archive log mode and path:
sqlplus / as sysdba
archive log list;
If the value of Database log mode is Archive Mode and the value of Automatic archival is Enabled, the automatic archive mode is used. In this case, check the space usage as described in the next step. Otherwise, no operation is required.
2. Query the space usage of archive logs.
The archive log path is specified by the Archive destination parameter. In this example, the archive log path is +DG_BACKUP. Set the path based on the site requirements.
Single-server scenario
Log in to the database server as the root user and run the df -m command to check whether the space on the local path is used up.
Two-node cluster scenario
a. Log in to the database server as the grid user.
b. Query the space usage and provide a screenshot.
asmcmd lsdg
Check whether the value of Free_MB/Total_MB for the DG_BACKUP disk group is less than 20%. A value below 20% means that the archive log space is running low.
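The Free_MB/Total_MB ratio can be computed from the asmcmd lsdg output as sketched below. The sample output is made up, and DG_DATA is a hypothetical second disk group added for contrast. Columns are looked up by header name, so the sketch assumes that the header row contains Total_MB, Free_MB, and Name.

```python
# Made-up sample of `asmcmd lsdg` output. DG_DATA is a hypothetical disk group.
SAMPLE_LSDG = """
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512   4096  1048576    204800    30720                0           30720              0             N  DG_BACKUP/
MOUNTED  EXTERN  N         512   4096  1048576    102400    81920                0           81920              0             N  DG_DATA/
"""


def disk_group_free_percent(lsdg_output: str) -> dict:
    """Map each disk group name to Free_MB / Total_MB as a percentage."""
    lines = [line for line in lsdg_output.strip().splitlines() if line.strip()]
    header = lines[0].split()
    result = {}
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        name = row["Name"].rstrip("/")
        result[name] = round(int(row["Free_MB"]) * 100 / int(row["Total_MB"]), 1)
    return result


free = disk_group_free_percent(SAMPLE_LSDG)
print(free)
print([name for name, pct in free.items() if pct < 20])
```

In the sample, only DG_BACKUP falls below the 20% line.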
Viewing Audit Logs
If Oracle audit logs are not cleared regularly, they can use up the Oracle disk space and interrupt ECS services.
1. Log in to the database server as the oracle user.
2. Run the following command to query the audit function mode and audit file path:
sqlplus / as sysdba
show parameter audit;
o audit_file_dest: indicates the path of the audit files. In this example, the path is /opt/oracle/admin/ora11g/adump. The path varies by site. Generally, the path is on the local server rather than on the disk array.
o audit_trail: indicates the mode of the audit function. The value can be NONE, DB, OS, TRUE, or FALSE.
NONE or FALSE: The audit function is disabled.
DB or TRUE: The audit function is enabled.
OS: Audit records are written to operating system files in the path specified by audit_file_dest.
If the value is DB, TRUE, or OS, audit logs need to be deleted.
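The decision rule above can be sketched as follows. The sample text imitates the layout of the show parameter audit output and is an assumption, as are the function names. Only the five values listed above are handled; any other value raises an error so that it is not classified silently.

```python
# Assumed sample in the layout printed by `show parameter audit;`.
SAMPLE_AUDIT = """
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
audit_file_dest                      string      /opt/oracle/admin/ora11g/adump
audit_sys_operations                 boolean     FALSE
audit_syslog_level                   string
audit_trail                          string      DB
"""


def parse_parameters(output: str) -> dict:
    """Map each parameter name to its value (empty string if no value is shown)."""
    params = {}
    for line in output.strip().splitlines():
        fields = line.split(None, 2)
        if not fields or fields[0] == "NAME" or fields[0].startswith("-"):
            continue
        params[fields[0]] = fields[2].strip() if len(fields) > 2 else ""
    return params


def audit_cleanup_needed(audit_trail: str) -> bool:
    """Return True if audit logs need to be deleted for this audit_trail value."""
    value = audit_trail.strip().upper()
    if value in ("NONE", "FALSE"):
        return False
    if value in ("DB", "TRUE", "OS"):
        return True
    raise ValueError("audit_trail value not covered by this check: " + value)


params = parse_parameters(SAMPLE_AUDIT)
print(params["audit_file_dest"], audit_cleanup_needed(params["audit_trail"]))
```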
Checking the Database Status and Tablespace Status
1. Log in to the database server as the oracle user (Oracle 11g server installation user).
2. Run the sqlplus "/as sysdba" command to connect to the database.
3. Run the select open_mode from v$database; statement to check the database status.
Information similar to the following is displayed:
4. Run the select TABLESPACE_NAME, STATUS from dba_tablespaces; statement to check the tablespace status.
Information similar to the following is displayed:
Viewing the Instance Status
1. Log in to the database server as the oracle user (Oracle 11g server installation user).
2. Run the sqlplus "/as sysdba" command to connect to the database.
3. Run the select INSTANCE_NAME, STATUS from v$instance; statement to check the database instance status.
Information similar to the following is displayed:
Note: If the Oracle database is deployed in a cluster, log in to the two database servers and run the following statement:
Checking the Listener Status
1. Log in to the database server as the oracle user.
su - oracle
2. Run the lsnrctl status command to check the listener status.
Note: If Oracle RAC is used, run the lsnrctl status listener_name command on both servers to check the status, where listener_name is the name of the listener. You can find the listener name in the listener parameter file, for example, /opt/oracle/product/11gR2/db/network/admin/listener.ora. In this example, the listener name is LISTENER_ORA.
Information similar to the following is displayed. (The output differs among single-node, two-node cluster, and cluster deployments. Pay attention to the status information.)
If information similar to that shown in bold below is displayed, the listener is normal.
If "TNS-12541 TNS:no listener" is displayed, the listener is not started. Run the lsnrctl start command (in Oracle RAC scenarios, run the lsnrctl start listener_name command) to start the listener.
If other error information is displayed, contact the technical support of the service provider.
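The listener check can be sketched as a text filter on the lsnrctl status output. The sample below is abbreviated and assumed, and the instance name ora11g is taken from the example path used earlier in this article. The accepted states, READY and UNKNOWN, follow the reference standard above.

```python
import re

# Abbreviated, assumed sample of `lsnrctl status` output.
SAMPLE_LSNRCTL = """
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Services Summary...
Service "ora11g" has 1 instance(s).
  Instance "ora11g", status READY, has 1 handler(s) for this service...
The command completed successfully
"""


def listener_ok(lsnrctl_output: str) -> bool:
    """Return True if every listed instance is in the READY or UNKNOWN state."""
    if "TNS-12541" in lsnrctl_output:
        return False  # no listener is running
    states = re.findall(r'Instance "[^"]+", status (\w+)', lsnrctl_output)
    return bool(states) and all(state in ("READY", "UNKNOWN") for state in states)


print(listener_ok(SAMPLE_LSNRCTL))
```

An output with no instance lines is treated as not normal, because no status can be confirmed from it.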
Checking the Tablespace Usage
1. Log in to the database server as the oracle user (Oracle 11g server installation user).
su - oracle
2. Run the sqlplus "/as sysdba" command to connect to the database.
3. Check the tablespace usage.
select ef.tablespace_name,
       round(ef.used_space/(1024*1024)) used_space,
       round(fs.total_space/(1024*1024)) total_space,
       round(ef.used_space/fs.total_space*100, 2) used_rate,
       round((fs.total_space-ef.used_space)/fs.total_space*100, 2) free_rate
from (select cf.tablespace_name, sum(df.bytes - cf.free_bytes) used_space
      from (select tablespace_name, file_id, sum(bytes) free_bytes
            from dba_free_space
            group by tablespace_name, file_id) cf,
           dba_data_files df
      where cf.tablespace_name = df.tablespace_name
        and cf.file_id = df.file_id
      group by cf.tablespace_name) ef,
     (select tablespace_name,
             sum(case when autoextensible = 'YES' then maxbytes else bytes end) total_space
      from dba_data_files
      group by tablespace_name) fs
where ef.tablespace_name = fs.tablespace_name;
Information similar to the following is displayed:
Note: Pay attention to the usage of the SYSTEM tablespace. If the usage is too high, the database cannot run properly.
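Once the query result is available, the comparison against the 70% reference standard can be sketched as follows. The rows are made-up values in the column order returned by the query (tablespace_name, used_space, total_space, used_rate, free_rate), and the function name is illustrative.

```python
# Made-up query result rows: (tablespace_name, used_mb, total_mb, used_rate, free_rate).
SAMPLE_ROWS = [
    ("SYSTEM", 720, 1000, 72.0, 28.0),
    ("SYSAUX", 300, 1000, 30.0, 70.0),
    ("USERS", 50, 500, 10.0, 90.0),
]


def tablespaces_over_limit(rows, limit_percent: float = 70.0) -> list:
    """Return the names of tablespaces whose used_rate reaches or exceeds the limit."""
    return [name for name, _used, _total, used_rate, _free in rows
            if used_rate >= limit_percent]


print(tablespaces_over_limit(SAMPLE_ROWS))
```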
Checking the CRS Cluster Status
If the Oracle RAC cluster is used, check the CRS cluster status. Perform this step on each node.
1. Log in to each server as the grid user.
2. Run the olsnodes command to check whether the cluster relationship is successfully established.
If information similar to the following is displayed on both nodes, the cluster relationship has been established:
3. Run the crsctl stat res -t command to check the cluster status.
If information similar to the following is displayed on both nodes, the cluster status is normal:
Conclusion
The procedures above were performed in a laboratory environment. They are provided for reference only and should not be treated as authoritative guidance for a specific problem. If you have further questions, collect the logs described above and send them to R&D, or contact the TAC support team.
Postscript
If you have any questions, feel free to reply. Comments and suggestions are welcome. Thank you.