本博文出自Bkjia博客客居天涯博主,有任何問題請進入博主頁面互動討論!
博文地址:http://tiany.blog.51cto.com/513694/1617646
Oracle DG(Dataguard)是目前比較常見的數據庫HA配置策略。通過實現Physical Standby和Logical Standby,可以實現數據冗余容錯機制。防止在主庫出現嚴重故障,不能支持服務的時候,沒有快速的後備支持環境。
在DG中,switchover和failover是兩個重要的概念,也是DG實現的核心。兩者共同點都是Primary和Standby角色切換,差異在於Planned和UnPlanned之分。Switchover關鍵點在於Planned,這個切換動作是在運維機構規劃范圍內的動作。比如,進行定期系統軟硬件升級、設備維修等動作。而Failover是真正出現嚴重系統故障,如數據庫宕機、軟硬件故障導致的Primary不能支持服務,從而進行的切換動作。
根據不同的DG配置,switchover和failover也是有差異的。理論上,Switchover是不會造成數據丟失的,Primary在切換之後也是在DG配置環境中,作為Standby存在的。但是Failover則不同,除了運行在最大保護(Maximum Protection)模式下,Primary突發的故障可能引起一部分Redo Log不能及時的傳遞到Standby端,切換之後很可能有數據損失的情況。更重要的是,Primary端在發生Failover之後,是不能夠直接加入回DG配置的!也就是說,Failover之後,Primary實際上就是被“拋出”了DG環境。
那麼,有什麼方法實現Primary回到原有的環境呢?這個問題的困難在於保持Primary和Standby一致。在正常情況下,Primary和Standby之間是關聯同步的,即使發生了Switchover,也在可控情況下。Failover過程中有數據的缺失,還有Primary修復問題。在目前流行版本(11g)中,有三個方法:
案例分析:
一、在主庫端模擬數據庫意外宕機
- 7scott@bjdb>conn /as sysdba
- Connected.
- sys@bjdb>alter system switch logfile;
- System altered.
- sys@bjdb>shutdown abort
- ORACLE instance shut down.
二、在備庫端
1、查看切換信息
- 5sys@shdb>select name,database_role,switchover_status from v$database;
- NAME DATABASE_ROLE SWITCHOVER_STATUS
- --------- ---------------- --------------------
- TESTDB12 PHYSICAL STANDBY NOT ALLOWED
- 可以看到此時備庫處於無法切換狀態
2、直接切換
- sys@shdb>alter database commit to switchover to primary;
- alert_log:(告警日志)
- Fatal NI connect error 12514, connecting to:
- (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=shsrv)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=shdb)(CID=(PROGRAM=oracle)(HOST=bjsrv)(USER=oracle))))
- VERSION INFORMATION:
- TNS for Linux: Version 11.2.0.3.0 - Production
- TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.3.0 - Production
- Time: 04-MAR-2015 21:25:13
- Tracing not turned on.
- Tns error struct:
- ns main err code: 12564
- TNS-12564: TNS:connection refused
- ns secondary err code: 0
- nt main err code: 0
- nt secondary err code: 0
- nt OS err code: 0
- Error 12514 received logging on to the standby
- FAL[client, MRP0]: Error 12514 connecting to shdb for fetching gap sequence
- Wed Mar 04 21:26:00 2015
- alter database commit to switchover to primary
- ALTER DATABASE SWITCHOVER TO PRIMARY (TestDB12)
- Maximum wait for role transition is 15 minutes.
- Switchover: Media recovery is still active
- Database not available for switchover
- End-Of-REDO archived log file has not been recovered
- Database not available for switchover
- End-Of-REDO archived log file has not been recovered
- Database not available for switchover
3、關閉standby MPR進程
- 35sys@shdb>ALTER DATABASE RECOVER managed standby database finish;
- ALTER DATABASE RECOVER managed standby database finish
- Terminal Recovery: request posted (TestDB12)
- Wed Mar 04 21:34:34 2015
- Begin: Standby Redo Logfile archival
- End: Standby Redo Logfile archival
- Terminal Recovery timestamp is '03/04/2015 21:34:34'
- Terminal Recovery: applying standby redo logs.
- Terminal Recovery: thread 1 seq# 34 redo required
- Media Recovery Waiting for thread 1 sequence 34
- Terminal Recovery: End-Of-Redo log allocation
- Terminal Recovery: standby redo logfile 4 created '/dsk4/arch_bj/arch_1_0_820054583.log'
- This standby redo logfile is being created as part of the
- failover operation. This standby redo logfile should be
- deleted after the switchover to primary operation completes.
- Media Recovery Log /dsk4/arch_bj/arch_1_0_820054583.log
- Terminal Recovery: log 4 reserved for thread 1 sequence 34
- Recovery of Online Redo Log: Thread 1 Group 4 Seq 34 Reading mem 0
- Mem# 0: /dsk4/arch_bj/arch_1_0_820054583.log
- Identified End-Of-Redo (failover) for thread 1 sequence 34 at SCN 0xffff.ffffffff
- Incomplete Recovery applied until change 1234252 time 03/04/2015 21:23:43
- MRP0: Media Recovery Complete (TestDB12)
- Terminal Recovery: successful completion
- Wed Mar 04 21:34:35 2015
- ARCH: Archival stopped, error occurred. Will continue retrying
- ORACLE Instance TestDB12 - Archival Error
- ORA-16014: log 4 sequence# 34 not archived, no available destinations
- ORA-00312: online log 4 thread 1: '/dsk4/arch_bj/arch_1_0_820054583.log'
- Forcing ARSCN to IRSCN for TR 0:1234252
- Attempt to set limbo arscn 0:1234252 irscn 0:1234252
- Resetting standby activation ID 2865247982 (0xaac836ee)
- MRP0: Background Media Recovery process shutdown (TestDB12)
- Terminal Recovery: completion detected (TestDB12)
- Completed: ALTER DATABASE RECOVER managed standby database finish
4、切換數據庫到Primary
- sys@shdb>select status from v$instance;
- STATUS
- ------------
- OPEN
- sys@shdb>select name,database_role,switchover_status from v$database;
- NAME DATABASE_ROLE SWITCHOVER_STATUS
- --------- ---------------- --------------------
- TESTDB12 PHYSICAL STANDBY TO PRIMARY
- sys@shdb>alter database commit to switchover to primary;
- Database altered.
- sys@shdb>alter database open;
- Database altered.
- 告警日志:
- alter database commit to switchover to primary
- ALTER DATABASE SWITCHOVER TO PRIMARY (TestDB12)
- Maximum wait for role transition is 15 minutes.
- All dispatchers and shared servers shutdown
- CLOSE: killing server sessions.
- CLOSE: all sessions shutdown successfully.
- Wed Mar 04 21:35:47 2015
- SMON: disabling cache recovery
- Backup controlfile written to trace file /u01/app/oracle/diag/rdbms/bjdb/TestDB12/trace/TestDB12_ora_3146.trc
- Standby terminal recovery start SCN: 1234251
- RESETLOGS after incomplete recovery UNTIL CHANGE 1234252
- Online log /dsk2/oradata/bjdb/redo01b.log: Thread 1 Group 1 was previously cleared
- Online log /dsk1/oradata/bjdb/redo01a.log: Thread 1 Group 1 was previously cleared
- Online log /dsk2/oradata/bjdb/redo02b.log: Thread 1 Group 2 was previously cleared
- Online log /dsk1/oradata/bjdb/redo02a.log: Thread 1 Group 2 was previously cleared
- Online log /dsk2/oradata/bjdb/redo03b.log: Thread 1 Group 3 was previously cleared
- Online log /dsk1/oradata/bjdb/redo03a.log: Thread 1 Group 3 was previously cleared
- Standby became primary SCN: 1234250
- Wed Mar 04 21:35:47 2015
- Setting recovery target incarnation to 3
- AUDIT_TRAIL initialization parameter is changed back to its original value as specified in the parameter file.
- Switchover: Complete - Database mounted as primary
- Completed: alter database commit to switchover to primary
三、原主庫修復後,開機
- sys@bjdb>startup
- ORACLE instance started.
- Total System Global Area 442601472 bytes
- Fixed Size 2229184 bytes
- Variable Size 281021504 bytes
- Database Buffers 155189248 bytes
- Redo Buffers 4161536 bytes
- Database mounted.
- Database opened.
- sys@bjdb>select name,database_role,switchover_status from v$database;
- NAME DATABASE_ROLE SWITCHOVER_STATUS
- --------- ---------------- --------------------
- TESTDB12 PRIMARY FAILED DESTINATION
現在原來的主庫被修復後,整個DataGuara架構已經被破壞了,所以必須把原來的主庫構建成新的備庫,重新恢復DataGuard的環境。
四、重新構建DataGuard
- 1sys@bjdb>select name,database_role from v$database;
NAME DATABASE_ROLE
-------------------------------------------------- ----------------
TESTDB12 PHYSICAL STANDBY