2018年12月19日 星期三

LGWR (ospid: 40339): terminating the instance due to error 4021 (Active Data Guard )


在有一台 active data guard  Crash 掉 , 查看一下 log 出現



WARNING: inbound connection timed out (ORA-3136)
  Time: 19-DEC-2018 17:14:42
Wed Dec 19 17:14:42 2018
Errors in file /ORA/db/diag/rdbms/lg_st/LG/trace/LG_lgwr_189317.trc:
ORA-04021: timeout occurred while waiting to lock object 
    nt OS err code: 0
  Tracing not turned on.
  Tns error struct:
    ns main err code: 12535
WARNING: inbound connection timed out (ORA-3136)
    
TNS-12535: TNS:operation timed out
    ns secondary err code: 12606
    nt main err code: 0
    nt secondary err code: 0
    nt OS err code: 0
WARNING: inbound connection timed out (ORA-3136)
LGWR (ospid: 189317): terminating the instance due to error 4021


在這之前出現大量
WARNING: inbound connection timed out (ORA-3136) 


最主要是
LGWR (ospid: 189317): terminating the instance due to error 4021

LGWR 導致 DB Crash 掉 


首先先看一下 LGWR trace log

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /ORA/db/11.2.4
System name: Linux
Node name: ora-74
Release: 2.6.32-696.el6.x86_64
Version: #1 SMP Tue Feb 21 00:53:17 EST 2017
Machine: x86_64
Instance name: LG
Redo thread mounted by this instance: 1
Oracle process number: 21
Unix process pid: 189317, image: oracle@ora-74 (LGWR)


*** 2018-12-19 17:14:42.853
*** SESSION ID:(1261.1) 2018-12-19 17:14:42.853
*** CLIENT ID:() 2018-12-19 17:14:42.853
*** SERVICE NAME:(SYS$BACKGROUND) 2018-12-19 17:14:42.853
*** MODULE NAME:() 2018-12-19 17:14:42.853
*** ACTION NAME:() 2018-12-19 17:14:42.853
error 4021 detected in background process
ORA-04021: timeout occurred while waiting to lock object 
kjzduptcctx: Notifying DIAG for crash event
----- Abridged Call Stack Trace -----
ksedsts()+465<-kjzdssdmp()+267<-kjzduptcctx()+232<-kjzdicrshnfy()+63<-ksuitm()+5570<-ksbrdp()+3507<-opirip()+623<-opidrv()+603<-sou2o()+103<-opimai_real()+250<-ssthrdmain()+265<-main()+201<-__libc_start_main()+253 
----- End of Abridged Call Stack Trace -----

*** 2018-12-19 17:14:42.929
LGWR (ospid: 189317): terminating the instance due to error 4021
ksuitm: waiting up to [5] seconds before killing DIAG(189287)




這幾個狀況 蠻符合  metalink 上說的 

Bug 18242740 : ACTIVE STANDBY DATABASE CRASHED WITH ORA-4021
LGWR (ospid: Xxx): Terminating The Instance Due To Error 4021 (Doc ID 2177473.1)



LGWR需要獲取到解析鎖(parse lock)才能刷新,可能這個會話話費了大量時間解析。超過了設定時間,依然沒有獲取到解析鎖,就發生了ORA-4021崩潰


解決方法: 增加時間


alter system set "_adg_parselock_timeout" = 3000 scope = both ;



受影響的版本 :12.2 已下
也可透過官方 patch 包 修正



















2018年12月6日 星期四

Oracle Data guard move datafile



Standby database :


SQL> SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE cancel;

Database altered.

SQL> 
SQL> ALTER SYSTEM SET STANDBY_FILE_MANAGEMENT=MANUAL SCOPE=BOTH;

System altered.

SQL> 
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup nomount ;
ORACLE instance started.

SQL> alter database mount standby database;

Database altered.

SQL> host mv /ORA/TEST/TEST01  /ORA/OTHER_DATA/TEST/TEST01


SQL> alter database rename file ' /ORA/TEST/TEST01'  to  '/ORA/OTHER_DATA/TEST/TEST01';

Database altered.

SQL> 
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;

Database altered.

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE cancel;

Database altered.

SQL> alter database open read only;

Database altered.

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;


Database altered.

SQL> SQL> 
SQL> 

SQL> ALTER SYSTEM SET STANDBY_FILE_MANAGEMENT=AUTO SCOPE=BOTH;

System altered.

SQL> 
SQL> 



完成