hello dear
Share a case with you
The 1102 error may have the following possibilities:
1. In the HA system, other nodes have started instances, occupying resources shared by the two machines (such as raw devices on the disk array);
2. Explain that when Oracle was shut down abnormally, some resources were not released, generally there are the following possibilities,
1. Oracle's shared memory segment or semaphore has not been released;
2. Oracle background processes (such as SMON, PMON, DBWn, etc.) are not closed;
3. The files lk<sid> and sgadef<sid>.dbf used to lock the memory have not been deleted.
First of all, although our system is an HA system, the instance of the standby node is always in a closed state, which can be confirmed by checking the database status on the standby node.
Secondly, the database was down due to the system power failure. The system was restarted after the power was turned on, so we excluded the second possible point 1, 2 points. The most suspicious is point 3.
Check the $ORACLE_HOME/dbs directory:
$ cd $ORACLE_HOME/dbs
$ ls sgadef*
sgadef* not found
$ ls lk*
lkORA92
Sure enough, the lk<sid> file was not deleted. Delete it
$ rm lk*
Restart the database and succeed.
If you suspect that the shared memory has not been released, you can use the following command to view:
$ipcs -mop
IPC status from /dev/kmem as of Thu Jul 6 14:41:43 2006
T ID KEY MODE OWNER GROUP NATTCH CPID LPID
Shared Memory:
m 0 0x411c29d6 --rw-rw-rw- root root 0 899 899
m 1 0x4e0c0002 --rw-rw-rw- root root 2 899 901
m 2 0x4120007a --rw-rw-rw- root root 2 899 901
m 458755 0x0c6629c9 --rw-r----- root sys 2 9113 17065
m 4 0x06347849 --rw-rw-rw- root root 1 1661 9150
m 65541 0xffffffff --rw-r--r-- root root 0 1659 1659
m 524294 0x5e100011 --rw------- root root 1 1811 1811
m 851975 0x5fe48aa4 --rw-r----- oracle oinstall 66 2017 25076
Then its ID number clears the shared memory segment:
$ipcrm –m 851975
For semaphores, you can use the following command to view:
$ ipcs -sop
IPC status from /dev/kmem as of Thu Jul 6 14:44:16 2006
T ID KEY MODE OWNER GROUP
Semaphores:
s 0 0x4f1c0139 --ra------- root root
... ...
s 14 0x6c200ad8 --ra-ra-ra- root root
s 15 0x6d200ad8 --ra-ra-ra- root root
s 16 0x6f200ad8 --ra-ra-ra- root root
s 17 0xffffffff --ra-r--r-- root root
s 18 0x410c05c7 --ra-ra-ra- root root
s 19 0x00446f6e --ra-r--r-- root root
s 20 0x00446f6d --ra-r--r-- root root
s 21 0x00000001 --ra-ra-ra- root root
s 45078 0x67e72b58 --ra-r----- oracle oinstall
According to the semaphore ID, use the following command to clear the semaphore:
$ipcrm -s 45078
If the Oracle process is not closed, use the following command to find out the existing Oracle process:
$ ps -ef|grep ora
oracle 29976 1 0 Jun 22? 0:52 ora_dbw0_ora92
oracle 29978 1 0 Jun 22? 0:51 ora_dbw1_ora92
oracle 5128 1 0 Jul 5? 0:00 oracleora92 (LOCAL=NO)
...
Then use kill -9 command to kill the process
$kill -9 <PID>
When a 1102 error occurs, you can check and troubleshoot according to the following process:
If it is an HA system, check whether other nodes have started instances;
Check whether the Oracle process exists, if it exists, kill the process;
Check if the semaphore exists, if it exists, clear the semaphore;
Check if the shared memory segment exists, if it exists, clear the shared memory segment;
Check whether the lock memory files lk<sid> and sgadef<sid>.dbf exist, and if so, delete them.