Oracle DBA 生涯日誌: 11月 2018

前幾天在生產其中一套 RAC上, 發生有AP 會不定時 session 衝高, 雖然沒有到無法運作的狀況
但看它一直告警也覺得....心裡也覺得毛毛的

查看一下

em 及 AWR 出現大量 cursor: mutex X , 且 cursor: mutex X 佔了 34.6 % 的wait time , 連半夜產量較低的狀況下也是這樣

1.

發生的 cursor: mutex X 都是由一條 sql 引起的

SELECT *
FROM RELAY_NAME b
WHERE b.id = :1
AND b.name = :2
AND b.create_date >= TRUNC ( :3)
AND b.create_date < TRUNC ( :4 + 1)

SQL 非常簡單 index 也沒什麼問題 table 也不大

網路上有其他案例是後面變數太多導致引起bug , 但我這只有四個

本來想直接把 cursor_sharing 改成 force , 但這生產load 蠻高, 怕運氣不好整組壞掉

---------------------------

這邊不建議直接 cursor_sharing 改成 force , 有蠻多案例也會導致 cursor: mutex X 升高
---------------------------

先來看一下原因.

通過 V$SQL_SHARED_CURSOR的列值是Y / N來顯示當前指定SQL_ID不能被共享的原因

select * from v$sql_shared_cursor where sql_id = '168vjw7bg9p8g'

在眾多欄位上只有看到 bind_equiv_failure 是 Y

BIND_LENGTH_UPGRADEABLE 跟 TRANSLATION_MISMATCH 跟AUTH_CHECK_MISMATCH 都是 N

看起來問題沒有這麼複雜....

oracle 官方文檔上有說
BIND_EQUIV_FAILURE : the bind value’s selectivity does not match that used to optimize the existing child cursor.This definition together with the selectivity xml tag mentioned above gave me a first clue: Adaptive Cursor Sharing

the bind value’s selectivity does not match that used to optimize the existing child cursor

這句話讓我想起前幾天我有對這條 SQL 執行 SQL 調整建議程式根據 oracle 最佳方案去執行最優執行計畫

這邊就先把它停用

停用後此SQL立刻出現大波動.....過了1分後oracle大概有固定了變數...
之後就沒有再出現大量cursor: mutex X

以上當然是很特例的處理狀況

另外
bind_equiv_failure的意思就是，之所以沒有使用已經存在的子游標，是因為當前的綁定變量值和已經存在子游標當時使用的綁定變量值之間的選擇性有較大差別，出於性能考慮，不能和已經存在的子游標共享

要是綁定變量窺探和自適應游標沒關
那這時候就要猜想綁定變量的列上有直方圖存在是否有傾斜

以下再分享一個案例

在凌晨02:00的時間,某一個SQL load 飆高且event 都是 cursor : muter X

查詢原因:

通過 V$SQL_SHARED_CURSOR的列值是Y / N來顯示當前指定SQL_ID不能被共享的原因

select * from v$sql_shared_cursor where sql_id = '168vjw7bg9p8g':

在眾多欄位上只有看到 bind_equiv_failure 是 Y

大概就覺得直方圖存在已經有傾斜或者 table 的 statistics 一直不斷地在變動
看到這邊直覺就先把 table和index 重新收集資訊 analyze (使用GATHER_TABLE_STATS ) ,但無用

MOS 上有建議執行 alter system flush shared_pool 把 shared_pool 清除或者使用 _CURSOR_OBSOLETE_THRESHOLD 參數

但....這是生產的環境不太允許可以這樣測試

這時候突然想到出現 bind_equiv_failure

是因為當前的綁定變量值和已經存在子游標當時使用的綁定變量值之間的選擇性有較大差別

難道 table 的 statistics 變動頻率很高 ?!

執行 dbms_stats.lock_table_stats

結果 cursor : muter X 降低正常值, session 不再飆高

要是以上方法
table index analyze 及unlock table statistics 都無法解決的話
那就砍掉SQL在shared pool的執行計畫, 讓DB重新產生一份新的

1.先查詢SQL的位址
select SQL_TEXT,sql_id, address, hash_value, executions, loads, parse_calls, invalidations
from v$sqlarea where sql_id = '168vjw7bg9p8g';

這邊需要的是 address, hash_value 欄位的值

2.
單獨砍掉他的執行計畫

exec sys.dbms_shared_pool.purge('0000001D4A15D578,3606369551','c') ;

第三個參數 C 是 CURSOR

以下是可使用清單

P package/procedure/function

Q sequence

R trigger

T type

JS java source

JC java class

JR java resource

JD java shared data

C cursor

3.查詢SQL還在不在
select SQL_TEXT,sql_id, address, hash_value, executions, loads, parse_calls, invalidations
from v$sqlarea where sql_id = '168vjw7bg9p8g';

沒有資料! 刪除成功

4.重新產生新的執行計畫後, 就不再出現高 cursor:mutex X 及 cursor:mutex S

參考

Troubleshooting: High Version Count Issues (文档 ID 296377.1)
VIEW: “V$SQL_SHARED_CURSOR” Reference Note (文档 ID 120655.1)

https://hourim.wordpress.com/2015/04/06/bind_equiv_failure-or-when-you-will-regret-using-adaptive-cursor-sharing/

http://yong321.freeshell.org/computer/SharedPoolPurging.html

http://www.xifenfei.com/2013/05/dbms_shared_pool-purge%E5%B7%A5%E4%BD%9C%E5%8E%9F%E7%90%86%E7%8C%9C%E6%B5%8B.html

oracle data guard 備庫出現錯誤

Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE cancel
alter database open read only
Fri Nov 09 10:08:37 2018
Read of datafile '/ORA/db/oradata/TX/temp01.dbf' (fno 201) header failed with ORA-01203
Rereading datafile 201 header failed with ORA-01203
Errors in file /ORA/db/diag/rdbms/lg_st/TX/trace/LG_dbw0_450588.trc:
ORA-01186: file 201 failed verification tests
ORA-01122: database file 201 failed verification check
ORA-01110: data file 201: '/ORA/db/oradata/TX/temp01.dbf'
ORA-01203: wrong incarnation of this file - wrong creation SCN
File 201 not verified due to error ORA-01122
Fri Nov 09 10:08:37 2018

執行

select s.name tbsname,t.name,(t.bytes/1024/1024) bytes,status

from v$tablespace s,v$tempfile t

where s.ts# = t.ts#;

出現 ORA-01122 及 ORA-01110 錯誤

這是tmep 文件失效

解決方是新增一個datafile 並且砍掉原本

SQL> alter tablespace temp add tempfile '/ORA/db/oradata/TX/temp01_01.dbf' size 100m autoextend on;

Tablespace altered.

SQL> alter tablespace temp drop tempfile '/ORA/db/oradata/TX/temp01.dbf';

Tablespace altered.

SQL>

解決完成

執行

select s.name tbsname,t.name,(t.bytes/1024/1024) bytes,status

from v$tablespace s,v$tempfile t

where s.ts# = t.ts#;

出現新增的 datafile

alert.log 內容

Fri Nov 09 10:26:34 2018

alter tablespace temp add tempfile '/ORA/db/oradata/TX/temp01_01.dbf' size 100m autoextend on

Completed: alter tablespace temp add tempfile '/ORA/db/oradata/TX/temp01_01.dbf' size 100m autoextend on

Fri Nov 09 10:27:21 2018

alter tablespace temp drop tempfile '/ORA/db/oradata/TX/temp01.dbf'

Errors in file /ORA/db/diag/rdbms/lg_st/TX/trace/LG_ora_451105.trc:

ORA-01122: database file 201 failed verification check

ORA-01110: data file 201: '/ORA/db/oradata/TX/temp01.dbf'

ORA-01203: wrong incarnation of this file - wrong creation SCN

Errors in file /ORA/db/diag/rdbms/lg_st/TX/trace/LG_ora_451105.trc:

ORA-01258: unable to delete temporary file /ORA/db/oradata/TX/temp01.dbf

Completed: alter tablespace temp drop tempfile '/ORA/db/oradata/TX/temp01.dbf'

Oracle DBA 生涯日誌

2018年11月12日星期一

oracle cursor: mutex X

2018年11月8日星期四

Oracle temp ORA-01122 ORA-01110 驗證失敗錯誤

文章列表

搜尋此網誌

2018年11月12日 星期一