原文博客鏈接地址:數據庫open報錯ORA-01555: snapshot too old
今天正在東莞蜜月的時候,一個學生說他管理的測試庫出問題了,無法open,我們先來看看是什麼問題:
Recovery of Online Redo Log: Thread 1 Group 4 Seq 4 Reading mem 0 Mem# 0: /onlinelog/shr/redo04.log Completed redo application of 0.00MB Completed crash recovery at Thread 1: logseq 4, block 3, scn 7755957 0 data blocks read, 0 data blocks written, 0 redo k-bytes read Thread 1 advanced to log sequence 5 (thread open) Thread 1 opened at log sequence 5 Current log# 5 seq# 5 mem# 0: /onlinelog/shr/redo05.log Successful open of redo thread 1 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Thu Jun 19 13:31:35 2014 SMON: enabling cache recovery ORA-01555 caused by SQL statement below (SQL ID: 4krwuz0ctqxdt, SCN: 0x0000.007658ba): select ctime, mtime, stime from obj$ where obj# = :1 Errors in file /oracle/diag/rdbms/shr/shr/trace/shr_ora_5262.trc: ORA-00704: bootstrap process failure ORA-00704: bootstrap process failure ORA-00604: error occurred at recursive SQL level 1 ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6_1263032392$" too small Errors in file /oracle/diag/rdbms/shr/shr/trace/shr_ora_5262.trc: ORA-00704: bootstrap process failure ORA-00704: bootstrap process failure ORA-00604: error occurred at recursive SQL level 1 ORA-01555: snapshot too old: rollback segment number 6 with name "_SYSSMU6_1263032392$" too small Error 704 happened during db open, shutting down database USER (ospid: 5262): terminating the instance due to error 704 Instance terminated by USER, pid = 5262 ORA-1092 signalled during: ALTER DATABASE OPEN... opiodr aborting process unknown ospid (5262) as a result of ORA-1092 Thu Jun 19 13:31:37 2014 ORA-1092 : opitsk aborting process
從上面的錯誤來看,該數據庫之所以open失敗,是由於Oracle在bootstrap階段執行遞歸SQL時出現ora-01555錯誤,
這樣bootstrap過程無法繼續下去,也就導致數據庫無法open。我們可以看到報錯的SQL語句如下:
select ctime, mtime, stime from obj$ where obj# = :1
這是很熟悉的一個SQL,通過10046 trace跟蹤Oracle open的過程你會發現該SQL。
針對該錯誤,或許有人以為是回滾段的問題,實際上並不是,這種情況下推進下SCN 就可以很順利的把數據庫open。
但是這裡有個問題:該兄弟的數據庫是Oracle 11.2.0.4,已經不支持傳統的10015 event的方式了。
下面我們通過oradebug 來解決該問題:
SQL> conn /as sysdba Connected to an idle instance. SQL> startup mount ORACLE instance started. Total System Global Area 4275781632 bytes Fixed Size 2260088 bytes Variable Size 989856648 bytes Database Buffers 3271557120 bytes Redo Buffers 12107776 bytes Database mounted. SQL> SQL> oradebug poke 0x06001AE70 4 0x859AFA ORA-00074: no process has been specified SQL> oradebug setmypid Statement processed. SQL> oradebug DUMPvar SGA kcsgscn_ kcslf kcsgscn_ [06001AE70, 06001AEA0) = 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 6001AB50 00000000 SQL> oradebug poke 0x06001AE70 4 0x859AFA BEFORE: [06001AE70, 06001AE74) = 00000000 AFTER: [06001AE70, 06001AE74) = 00859AFA SQL> alter database open; Database altered. SQL>
這裡簡單解釋一下,4 為長度,0x859AFA是16進制,我在原來的v$datafile_header.checkpoint_change#的基礎之上
加上上1000000得到該值。
我們可以看到,順利打開了數據庫。最後再出觀察下alert log發現居然有ora-00600 4194錯誤。
Thu Jun 19 14:48:43 2014 Dumping diagnostic data in directory=[cdmp_20140619144843], requested by (instance=1, osid=9140 (MMON)), summary=[incident=132122]. Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x3D6C3836] [PC:0x97F4DF6, kgegpa()+40] [flags: 0x0, count: 1] Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x3D6C3836] [PC:0x97F386A, kgebse()+776] [flags: 0x2, count: 2] Thu Jun 19 14:48:46 2014 Flush retried for xcb 0x159c668c8, pmd 0x15870d270 Errors in file /oracle/diag/rdbms/shr/shr/trace/shr_pmon_9112.trc (incident=132017): ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], [] Incident details in: /oracle/diag/rdbms/shr/shr/incident/incdir_132017/shr_pmon_9112_i132017.trc Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. Thu Jun 19 14:48:47 2014 Errors in file /oracle/diag/rdbms/shr/shr/trace/shr_ora_9268.trc (incident=132209): ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], [] ORA-00001: unique constraint (SYSTEM.UNQ_PAIRS) violated Incident details in: /oracle/diag/rdbms/shr/shr/incident/incdir_132209/shr_ora_9268_i132209.trc Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. Errors in file /oracle/diag/rdbms/shr/shr/trace/shr_pmon_9112.trc: ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], [] PMON (ospid: 9112): terminating the instance due to error 472 System state dump requested by (instance=1, osid=9112 (PMON)), summary=[abnormal instance termination]. System State dumped to trace file /oracle/diag/rdbms/shr/shr/trace/shr_diag_9122_20140619144848.trc Dumping diagnostic data in directory=[cdmp_20140619144848], requested by (instance=1, osid=9112 (PMON)), summary=[abnormal instance termination]. Instance terminated by PMON, pid = 9112
這是一個非常常見的錯誤了,最簡單的方式就是通過重建undo或處理回滾段來解決,這裡我們用最簡單的方式:
修改undo_management=manual undo_tablespace='system' SQL> l 1* create pfile='/tmp/1.ora' from spfile SQL> startup mount pfile='/tmp/1.ora'; ORACLE instance started. Total System Global Area 4275781632 bytes Fixed Size 2260088 bytes Variable Size 989856648 bytes Database Buffers 3271557120 bytes Redo Buffers 12107776 bytes Database mounted. SQL> alter database open; Database altered. SQL> select name from v$datafile; NAME -------------------------------------------------------------------------------- /oracle/oradata/shr/system01.dbf /oracle/oradata/shr/sysaux01.dbf /oracle/oradata/shr/undotbs01.dbf /oracle/oradata/shr/users01.dbf /oradata/shr/jy_shr01.dbf /oradata/shr/jy_shr02.dbf /oradata/shr/jy_shr03.dbf /oradata/shr/eas_d_stand01.dbf /oradata/shr/eas_d_stand02.dbf /oradata/shr/eas_d_stand03.dbf /backup/eas/eas_d_stand04.dbf NAME -------------------------------------------------------------------------------- /backup/eas/eas_d_stand05.dbf /backup/eas/eas_d_stand06.dbf 13 rows selected. SQL> show parameter undo NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ undo_management string MANUAL undo_retention integer 900 undo_tablespace string SYSTEM SQL> create undo tablespace undotbs2 datafile '/oracle/oradata/shr/undotbs2_01.dbf' size 200m; Tablespace created. SQL> shutdown immediate
重建undo之後,再停庫,修改undo參數即可順利打開數據庫,如下:
SQL> create spfile from pfile='/tmp/1.ora'; File created. SQL> startup mount ORACLE instance started. Total System Global Area 4275781632 bytes Fixed Size 2260088 bytes Variable Size 989856648 bytes Database Buffers 3271557120 bytes Redo Buffers 12107776 bytes Database mounted. SQL> alter database open; Database altered. SQL>
處理完成之後alert log不再拋出任何錯誤。