DG主备同步状态异常。alert日志报错ORA-01274,MRP进程异常中断。

Posted by wukaiqiang; tagged with none

DG主备同步状态异常。alert日志报错ORA-01274,MRP进程异常中断。
问题现象
具体现象
问题分析
问题原因
解决方案:
扩展
备库standby_file_management参数为auto
备库standby_file_management参数为manual
备库standby_file_management参数为auto
结论
问题现象
DG主备同步状态异常。alert日志报错ORA-01274,MRP进程异常中断。

具体现象
SQL> col client_pid for a10
SQL> SELECT inst_id, thread#, process, pid, status, client_process, client_pid,
2 sequence#, block#, active_agents, known_agents FROM gv$managed_standby ORDER BY thread#, pid;

INST_ID THREAD# PROCESS PID STATUS CLIENT_P CLIENT_PID SEQUENCE# BLOCK# ACTIVE_AGENTS KNOWN_AGENTS


   1        0 ARCH             2702 CONNECTED    ARCH     2702               0          0             0            0
   1        0 ARCH             2706 CONNECTED    ARCH     2706               0          0             0            0
   1        0 ARCH             2708 CONNECTED    ARCH     2708               0          0             0            0
   1        0 RFS              2840 IDLE         UNKNOWN  2830               0          0             0            0
   1        0 RFS              2858 IDLE         UNKNOWN  2824               0          0             0            0
   1        0 RFS              2869 IDLE         ARCH     2828               0          0             0            0
   1        1 ARCH             2704 CLOSING      ARCH     2704              62          1             0            0
   1        1 RFS              2860 IDLE         LGWR     2832              63       1439             0            0

8 rows selected.

日志应用状态:

SQL> 2 FROM (select thread# thrd, MAX(sequence#) almax
3 FROM v$archived_log
4 WHERE resetlogs_change#=(SELECT resetlogs_change# FROM v$database) GROUP BY thread#) al,
5 (SELECT thread# thrd, MAX(sequence#) lhmax
6 FROM v$log_history
7 WHERE resetlogs_change#=(SELECT resetlogs_change# FROM v$database) GROUP BY thread#) lh
8 WHERE al.thrd = lh.thrd;

Thread Last Seq Received Last Seq Applied

     1                65               62

1 row selected.

查看同步状态

SELECT * FROM v$dataguard_stats WHERE name LIKE '%lag%';SQL>

NAME VALUE UNIT TIME_COMPUTED DATUM_TIME


transport lag +00 00:00:00 day(2) to second(0) interval 05/09/2020 16:15:43 05/09/2020 16:15:43
apply lag +00 00:05:49 day(2) to second(0) interval 05/09/2020 16:15:43 05/09/2020 16:15:43

2 rows selected.

已经出现将近6min的lag。所以现在dg的同步状态是异常的。

查看主库alert日志:

Sat May 09 16:09:54 2020
alter tablespace zhuo add datafile size 20M
Completed: alter tablespace zhuo add datafile size 20M

查看备库的alert日志:

Sat May 09 16:09:54 2020
File #6 added to control file as 'UNNAMED00006' because
the parameter STANDBY_FILE_MANAGEMENT is set to MANUAL
The file should be manually created to continue.
MRP0: Background Media Recovery terminated with error 1274
Errors in file /u01/app/oracle/diag/rdbms/zhuodg/zhuodg/trace/zhuodg_mrp0_2906.trc:
ORA-01274: cannot add datafile '/u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_hcdsblpt_.dbf' - file could not be created
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Recovered data files to a consistent state at change 789594
MRP0: Background Media Recovery process shutdown (zhuodg)

发现主库在添加数据文件成功后,备库报错ORA-01274。并且MRP0进程异常关闭。

先再备库手动重启下MRP进程:

SQL> recover managed standby database using current logfile disconnect;
Media recovery complete.

alert日志:

Sat May 09 16:11:42 2020
ALTER DATABASE RECOVER managed standby database using current logfile disconnect
Attempt to start background Managed Standby Recovery process (zhuodg)
Sat May 09 16:11:42 2020
MRP0 started with pid=28, OS id=3140
MRP0: Background Managed Standby Recovery process started (zhuodg)
Serial Media Recovery started
Managed Standby Recovery starting Real Time Apply
Sat May 09 16:11:47 2020
Errors in file /u01/app/oracle/diag/rdbms/zhuodg/zhuodg/trace/zhuodg_dbw0_2668.trc:
ORA-01186: file 6 failed verification tests
ORA-01157: cannot identify/lock data file 6 - see DBWR trace file
ORA-01111: name for data file 6 is unknown - rename to correct file
ORA-01110: data file 6: '/u01/app/oracle/product/11.2.0/dbhome_1/dbs/UNNAMED00006'
File 6 not verified due to error ORA-01157
MRP0: Background Media Recovery terminated with error 1111
Errors in file /u01/app/oracle/diag/rdbms/zhuodg/zhuodg/trace/zhuodg_mrp0_3140.trc:
ORA-01111: name for data file 6 is unknown - rename to correct file
ORA-01110: data file 6: '/u01/app/oracle/product/11.2.0/dbhome_1/dbs/UNNAMED00006'
ORA-01157: cannot identify/lock data file 6 - see DBWR trace file
ORA-01111: name for data file 6 is unknown - rename to correct file
ORA-01110: data file 6: '/u01/app/oracle/product/11.2.0/dbhome_1/dbs/UNNAMED00006'
Managed Standby Recovery not using Real Time Apply
MRP0: Background Media Recovery process shutdown (zhuodg)
Completed: ALTER DATABASE RECOVER managed standby database using current logfile disconnect

也是有一大堆报错,都是备库不能锁定这个文件。
在主备库查询这个数据文件的状态:
主库:

 FILE# NAME                                                                                                 STATUS

     1 /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_system_gxd20h14_.dbf                                     SYSTEM
     2 /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_sysaux_gxd20k1y_.dbf                                     ONLINE
     3 /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_undotbs1_gxd20lnp_.dbf                                   ONLINE
     4 /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_users_gxd20pxk_.dbf                                      ONLINE
     5 /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_gxdcfr5s_.dbf                                       ONLINE
     6 /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_hcdsblpt_.dbf                                       ONLINE  ----<<<新添加的数据文件

6 rows selected.
备库

SQL> select file#,name,status from v$datafile;

 FILE# NAME                                                                                                 STATUS

     1 /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_system_05uvp9bj_.dbf                            SYSTEM
     2 /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_sysaux_06uvp9bj_.dbf                            ONLINE
     3 /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_undotbs1_07uvp9bj_.dbf                          ONLINE
     4 /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_users_08uvp9c4_.dbf                             ONLINE
     5 /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_zhuo_04uvp9bj_.dbf                              ONLINE
     6 /u01/app/oracle/product/11.2.0/dbhome_1/dbs/UNNAMED00006                                             RECOVER ----<<<同步过来的数据文件命名和状态都不对

问题分析
根据mos:Background Media Recovery terminated with ORA-1274 after adding a Datafile (Doc ID 739618.1)
1)查看参数

SQL> show parameter standby;

NAME TYPE VALUE


standby_archive_dest string ?/dbs/arch
standby_file_management string MANUAL

2)重命名数据文件

SQL> alter database create datafile '/u01/app/oracle/product/11.2.0/dbhome_1/dbs/UNNAMED00006' as new;

Database altered.

SQL> select file#,name,status from v$datafile;

 FILE# NAME                                                                                                 STATUS

     1 /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_system_05uvp9bj_.dbf                            SYSTEM
     2 /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_sysaux_06uvp9bj_.dbf                            ONLINE
     3 /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_undotbs1_07uvp9bj_.dbf                          ONLINE
     4 /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_users_08uvp9c4_.dbf                             ONLINE
     5 /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_zhuo_04uvp9bj_.dbf                              ONLINE
     6 /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_zhuo_hcdw395t_.dbf                              RECOVER

6 rows selected.

3)设置参数
SQL> alter system set standby_file_management=auto;

System altered.
4)重启mrp进程
SQL> recover managed standby database using current logfile disconnect;
Media recovery complete.

alert日志:

Sat May 09 17:00:11 2020
ALTER DATABASE RECOVER managed standby database using current logfile disconnect
Attempt to start background Managed Standby Recovery process (zhuodg)
Sat May 09 17:00:11 2020
MRP0 started with pid=19, OS id=3386
MRP0: Background Managed Standby Recovery process started (zhuodg)
Serial Media Recovery started
Managed Standby Recovery starting Real Time Apply
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Media Recovery Log /u01/app/oracle/oradata/arch/zhuodg/ZHUODG/archivelog/2020_05_09/o1_mf_1_63_hcdsmnt1_.arc
Media Recovery Log /u01/app/oracle/oradata/arch/zhuodg/ZHUODG/archivelog/2020_05_09/o1_mf_1_64_hcdsmx23_.arc
Media Recovery Log /u01/app/oracle/oradata/arch/zhuodg/ZHUODG/archivelog/2020_05_09/o1_mf_1_65_hcdsn2tw_.arc
Media Recovery Waiting for thread 1 sequence 66 (in transit)
Recovery of Online Redo Log: Thread 1 Group 4 Seq 66 Reading mem 0
Mem# 0: /u01/app/oracle/oradata/zhuodg/ZHUODG/onlinelog/o1_mf_4_hcdrt3nx_.log
Completed: ALTER DATABASE RECOVER managed standby database using current logfile disconnect

—日志传输已经正常,以前没应用的日志已经应用。

5)查看同步状态

SQL> SELECT * FROM v$dataguard_stats WHERE name LIKE '%lag%';
NAME VALUE UNIT TIME_COMPUTED DATUM_TIME


transport lag +00 00:00:00 day(2) to second(0) interval 05/09/2020 17:01:52 05/09/2020 17:01:51
apply lag +00 00:00:00 day(2) to second(0) interval 05/09/2020 17:01:52 05/09/2020 17:01:51

2 rows selected.

SQL> col client_pid for a10
SQL> SELECT inst_id, thread#, process, pid, status, client_process, client_pid,
2 sequence#, block#, active_agents, known_agents FROM gv$managed_standby ORDER BY thread#, pid;

INST_ID THREAD# PROCESS PID STATUS CLIENT_P CLIENT_PID SEQUENCE# BLOCK# ACTIVE_AGENTS KNOWN_AGENTS


   1        0 ARCH             2706 CONNECTED    ARCH     2706               0          0             0            0
   1        0 RFS              2840 IDLE         UNKNOWN  2830               0          0             0            0
   1        0 RFS              2858 IDLE         UNKNOWN  2824               0          0             0            0
   1        0 RFS              2869 IDLE         ARCH     2828               0          0             0            0
   1        1 ARCH             2702 CLOSING      ARCH     2702              64          1             0            0
   1        1 ARCH             2704 CLOSING      ARCH     2704              65          1             0            0
   1        1 ARCH             2708 CLOSING      ARCH     2708              63          1             0            0
   1        1 RFS              2860 IDLE         LGWR     2832              66      10444             0            0
   1        1 MRP0             3386 APPLYING_LOG N/A      N/A               66      10444             0            0

9 rows selected.

同步状态已经正常

问题原因
物理standby端的standby_file_management参数设置错误,导致在primary端添加数据文件或者创建表空间的时候,不能正常传输到备库。

解决方案:
alter database create datafile '/u01/app/oracle/product/11.2.0/dbhome_1/dbs/UNNAMED00006' as new;
alter system set standby_file_management=auto;
recover managed standby database using current logfile disconnect;

扩展
多数情况下,primary数据库的修改会随着redo数据传播到物理standby数据库端并被应用,不需要在物理standby端做额外的操作,不过根据实际配置的不同,也会有例外,
有些操作不是没有被传播到standby端,而是传播过去了,但不能正确执行,其中最常见的就是对表空间和日志文件的管理操作。

创建表空间或数据文件
初始化参数standby_file_management用来控制是否自动将primary数据库增加表空间或数据文件的改动,传播到物理standby数据库。该参数有两个值:
1)auto:如果该参数值设置为AUTO,则primary数据库执行的表空间创建操作也会传播到物理standby数据库上执行。
2)MANUAL:如果设置为MANUAL或者未设置任何值(默认值是MANUAL),需要手工复制新创建的数据文件到物理standby服务器。

standby_file_management参数特指primary数据库端的表空间或数据文件创建,如果数据文件是从其他数据库复制而来(比如通过TTS传输表空间),则不管standby_file_management参数值如何设置。
都必须手工复制到standby数据库,并重建物理standby数据库的控制文件。

备库standby_file_management参数为auto
SQL> show parameter standby;

NAME TYPE VALUE

standby_archive_dest string ?/dbs/arch
standby_file_management string AUTO
SQL>

主库创建表空间
SQL> create tablespace test datafile size 20M;

Tablespace created.
查看数据文件路径
SQL> col tbs for a10
SQL> col files for a100
SQL> select a.ts#,a.name tbs,b.name files from ts$ a,v$datafile b where a.ts#=b.ts#;

TS# TBS FILES
1

 0 SYSTEM     /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_system_gxd20h14_.dbf
 1 SYSAUX     /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_sysaux_gxd20k1y_.dbf
 2 UNDOTBS1   /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_undotbs1_gxd20lnp_.dbf
 4 USERS      /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_users_gxd20pxk_.dbf
 5 ZHUO       /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_gxdcfr5s_.dbf
 5 ZHUO       /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_hcdsblpt_.dbf
 6 TEST       /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_test_hcf6r72x_.dbf    ---《《《新创建的

7 rows selected.
备库查看数据文件路径:
SQL> col tbs for a10
SQL> col files for a100
SQL> set lines 200

TS# TBS FILES
1

 0 SYSTEM     /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_system_05uvp9bj_.dbf
 1 SYSAUX     /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_sysaux_06uvp9bj_.dbf
 2 UNDOTBS1   /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_undotbs1_07uvp9bj_.dbf
 4 USERS      /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_users_08uvp9c4_.dbf
 5 ZHUO       /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_zhuo_04uvp9bj_.dbf
 5 ZHUO       /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_zhuo_hcdw395t_.dbf
 6 TEST       /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_test_hcf6r765_.dbf  ---《《显示正常

7 rows selected.

SQL>

备库standby_file_management参数为manual
备库修改参数为manual
SQL> alter system set standby_file_management=manual;

System altered.
主库添加数据文件并查看数据文件路径:

SQL> alter tablespace test add datafile size 10m;

Tablespace altered.

SQL> select a.ts#,a.name tbs,b.name files from ts$ a,v$datafile b where a.ts#=b.ts#;

TS# TBS FILES
1

 0 SYSTEM     /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_system_gxd20h14_.dbf
 1 SYSAUX     /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_sysaux_gxd20k1y_.dbf
 2 UNDOTBS1   /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_undotbs1_gxd20lnp_.dbf
 4 USERS      /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_users_gxd20pxk_.dbf
 5 ZHUO       /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_gxdcfr5s_.dbf
 5 ZHUO       /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_hcdsblpt_.dbf
 6 TEST       /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_test_hcf6r72x_.dbf
 6 TEST       /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_test_hcf76186_.dbf

8 rows selected.

备库查看:
SQL> select a.ts#,a.name tbs,b.name files from ts$ a,v$datafile b where a.ts#=b.ts#;

TS# TBS FILES
1

 0 SYSTEM     /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_system_05uvp9bj_.dbf
 1 SYSAUX     /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_sysaux_06uvp9bj_.dbf
 2 UNDOTBS1   /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_undotbs1_07uvp9bj_.dbf
 4 USERS      /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_users_08uvp9c4_.dbf
 5 ZHUO       /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_zhuo_04uvp9bj_.dbf
 5 ZHUO       /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_zhuo_hcdw395t_.dbf
 6 TEST       /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_test_hcf6r765_.dbf
 6 TEST       /u01/app/oracle/product/11.2.0/dbhome_1/dbs/UNNAMED00008         ----《《《文件命名异常,有unname字样,路径在$ORACLE_HOME/dbs存放

8 rows selected.
alert日志:

Sat May 09 19:59:03 2020
Successfully added datafile 7 to media recovery
Datafile #7: ‘/u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_test_hcf6r765_.dbf’
Sat May 09 20:03:32 2020
WARNING: Heavy swapping observed on system in last 5 mins.
pct of memory swapped in [56.13%] pct of memory swapped out [37.44%].
Please make sure there is no memory pressure and the SGA and PGA
are configured correctly. Look at DBRM trace file for more details.
Sat May 09 20:06:09 2020
ALTER SYSTEM SET standby_file_management=‘MANUAL’ SCOPE=BOTH;
Sat May 09 20:06:25 2020
File #8 added to control file as ‘UNNAMED00008’ because
the parameter STANDBY_FILE_MANAGEMENT is set to MANUAL
The file should be manually created to continue.
MRP0: Background Media Recovery terminated with error 1274
Errors in file /u01/app/oracle/diag/rdbms/zhuodg/zhuodg/trace/zhuodg_mrp0_3386.trc:
ORA-01274: cannot add datafile ‘/u01/app/oracle/oradata/ZHUO/datafile/o1_mf_test_hcf76186_.dbf’ - file could not be created
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Recovered data files to a consistent state at change 808919
MRP0: Background Media Recovery process shutdown (zhuodg) ----《《《MRP进程异常停止

发现和上述案例的现象一致,就是因为参数为manual的原因。
处理方式就是上述方法,不在叙述。

备库standby_file_management参数为auto
主库删除表空间
SQL> drop tablespace test including contents and datafiles;

Tablespace dropped.

SQL> select a.ts#,a.name tbs,b.name files from ts$ a,v$datafile b where a.ts#=b.ts#;

TS# TBS FILES
1

 0 SYSTEM     /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_system_gxd20h14_.dbf
 1 SYSAUX     /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_sysaux_gxd20k1y_.dbf
 2 UNDOTBS1   /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_undotbs1_gxd20lnp_.dbf
 4 USERS      /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_users_gxd20pxk_.dbf
 5 ZHUO       /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_gxdcfr5s_.dbf
 5 ZHUO       /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_hcdsblpt_.dbf

6 rows selected.
备库查看
SQL> select a.ts#,a.name tbs,b.name files from ts$ a,v$datafile b where a.ts#=b.ts#;

TS# TBS FILES
1

 0 SYSTEM     /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_system_05uvp9bj_.dbf
 1 SYSAUX     /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_sysaux_06uvp9bj_.dbf
 2 UNDOTBS1   /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_undotbs1_07uvp9bj_.dbf
 4 USERS      /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_users_08uvp9c4_.dbf
 5 ZHUO       /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_zhuo_04uvp9bj_.dbf
 5 ZHUO       /u01/app/oracle/oradata/zhuodg/ZHUODG/datafile/o1_mf_zhuo_hcdw395t_.dbf

6 rows selected.

正常删除,同步到了备库

结论
综上,初始化参数standby_file_management为manual的时候,对于表空间和数据文件的增加和删除操作必须手工处理。
为auto的时候,对于表空间和数据文件的操作无需dba手工干预,物理standby能自动进行处理。