19c FSFO Observer – some findings

In a previous post we looked at some of the new features of the Data Guard Broker Observer in 19c, such as the ability to run the Observer in OBSERVE ONLY mode, as well as other features such as multiple Observers and multiple failover target standby databases.

In this note we look at how Master Observer failover happens, and at an important finding that can be a cause for concern.

If we host the Master Observer in the STANDBY data center and we lose both the Master Observer and the Data Guard standby database, we have seen a case where FSFO causes the Primary database to shut down as well!

Not very good. Please test the same scenario in your environment and let me know whether you see the same behavior.


#######################################################################################################
LISTENER STOPPED AND MASTER OBSERVER LOSES CONTACT WITH PRIMARY
#######################################################################################################



[root@rac01 dbs]# su - grid 
Last login: Mon Jun 15 14:19:20 AWST 2020
[grid@rac01 ~]$ lsnrctl stop 




DGMGRL> connect /@oradb1
Unable to connect to database using oradb1
ORA-12541: TNS:no listener

Failed.
Warning: You are no longer connected to ORACLE.


DGMGRL> connect / as sysdg
Connected to "oradb1"
Connected as SYSDG.



#######################################################################################################
OBSERVER STILL SHOWS NO ISSUE 
#######################################################################################################


DGMGRL> show configuration

Configuration - oradb1_dg

  Protection Mode: MaxPerformance
  Members:
  oradb1    - Primary database
    oradb1_sb - (*) Physical standby database 
    clouddb   - Physical standby database 

Fast-Start Failover: Enabled in Potential Data Loss Mode

Configuration Status:
SUCCESS   (status updated 49 seconds ago)

DGMGRL> show observer 

Configuration - oradb1_dg

  Primary:            oradb1
  Active Target:      oradb1_sb

Observer "rac01" - Master

  Host Name:                    rac01.localdomain
  Last Ping to Primary:         0 seconds ago
  Last Ping to Target:          3 seconds ago

Observer "rac02" - Backup

  Host Name:                    rac02.localdomain
  Last Ping to Primary:         2 seconds ago
  Last Ping to Target:          2 seconds ago




#######################################################################################################
NOW OBSERVER CANNOT CONNECT AFTER OBSERVERRECONNECT VALUE IS CHANGED  
#######################################################################################################


DGMGRL> edit configuration set property observerreconnect=30;
Property "observerreconnect" updated


DGMGRL> show observer 

Configuration - oradb1_dg

  Primary:            oradb1
  Active Target:      oradb1_sb

Observer "rac01" - Master

  Host Name:                    rac01.localdomain
  Last Ping to Primary:         6 seconds ago
  Last Ping to Target:          3 seconds ago

Observer "rac02" - Backup

  Host Name:                    rac02.localdomain
  Last Ping to Primary:         5 seconds ago
  Last Ping to Target:          2 seconds ago



#######################################################################################################
TRY AGAIN AFTER 30 SECONDS HAVE EXPIRED - MASTER OBSERVER HAS CHANGED
#######################################################################################################


DGMGRL> /

Configuration - oradb1_dg

  Primary:            oradb1
  Active Target:      oradb1_sb

Observer "rac02" - Master

  Host Name:                    rac02.localdomain
  Last Ping to Primary:         (unknown)
  Last Ping to Target:          (unknown)

Observer "rac01" - Backup

  Host Name:                    rac01.localdomain
  Last Ping to Primary:         40 seconds ago
  Last Ping to Target:          1 second ago
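
To avoid reading this output by eye each time, the role switch can be picked up by a small script. The following is only a sketch in Python: it parses text in the layout shown above and reports which observer is the Master. The function names `parse_show_observer` and `master_observer` are made up for illustration, and treating `(unknown)` as `None` is our own assumption, not something DGMGRL defines.

```python
import re

def parse_show_observer(text):
    """Parse DGMGRL 'show observer' output into a list of observer dicts."""
    observers = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        header = re.match(r'Observer "([^"]+)" - (Master|Backup)$', line)
        if header:
            current = {"name": header.group(1), "role": header.group(2)}
            observers.append(current)
            continue
        ping = re.match(r"Last Ping to (Primary|Target):\s+(.*)$", line)
        if ping and current is not None:
            # Assumption: anything that is not "<n> second(s) ago",
            # such as "(unknown)", is stored as None.
            age = re.match(r"(\d+) seconds? ago$", ping.group(2))
            key = "ping_" + ping.group(1).lower()
            current[key] = int(age.group(1)) if age else None
    return observers

def master_observer(observers):
    """Return the name of the observer reported as Master, or None."""
    for obs in observers:
        if obs["role"] == "Master":
            return obs["name"]
    return None

# Sample copied from the output above.
SAMPLE = '''
Observer "rac02" - Master

  Host Name:                    rac02.localdomain
  Last Ping to Primary:         (unknown)
  Last Ping to Target:          (unknown)

Observer "rac01" - Backup

  Host Name:                    rac01.localdomain
  Last Ping to Primary:         40 seconds ago
  Last Ping to Target:          1 second ago
'''

if __name__ == "__main__":
    parsed = parse_show_observer(SAMPLE)
    print("Master observer:", master_observer(parsed))
    for obs in parsed:
        print(obs)
```

Comparing the Master name between two successive runs is enough to notice that the role has moved from rac01 to rac02.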




#######################################################################################################
START THE LISTENER
#######################################################################################################


[grid@rac01 ~]$ lsnrctl start 

 

#######################################################################################################
OBSERVER CAN NOW PING THE PRIMARY AND MASTER OBSERVER IS BACK TO ORIGINAL
#######################################################################################################



DGMGRL> /

Configuration - oradb1_dg

  Primary:            oradb1
  Active Target:      oradb1_sb

Observer "rac01" - Master

  Host Name:                    rac01.localdomain
  Last Ping to Primary:         1 second ago
  Last Ping to Target:          1 second ago

Observer "rac02" - Backup

  Host Name:                    rac02.localdomain
  Last Ping to Primary:         2 seconds ago
  Last Ping to Target:          0 seconds ago



#######################################################################################################
SHUTDOWN STANDBY ORADB1_SB 
#######################################################################################################



[oracle@rac02 ~]$ sqlplus /@oradb1_sb as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Mon Jun 15 14:38:57 2020
Version 19.6.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.6.0.0.0

SQL> shutdown abort 
ORACLE instance shut down.
ERROR:
ORA-12514: TNS:listener does not currently know of service requested in connect
descriptor


Warning: You are no longer connected to ORACLE.


#######################################################################################################
ACTIVE TARGET STANDBY CHANGED TO CLOUDDB 
#######################################################################################################


DGMGRL> show observer 

Configuration - oradb1_dg

  Primary:            oradb1
  Active Target:      clouddb

Observer "rac01" - Master

  Host Name:                    rac01.localdomain
  Last Ping to Primary:         3 seconds ago
  Last Ping to Target:          3 seconds ago

Observer "rac02" - Backup

  Host Name:                    rac02.localdomain
  Last Ping to Primary:         3 seconds ago
  Last Ping to Target:          2 seconds ago



DGMGRL> show configuration

Configuration - oradb1_dg

  Protection Mode: MaxPerformance
  Members:
  oradb1    - Primary database
    clouddb   - (*) Physical standby database 
    oradb1_sb - Physical standby database 

Fast-Start Failover: Enabled in Potential Data Loss Mode

Configuration Status:
SUCCESS   (status updated 60 seconds ago)


#######################################################################################################
SHUTDOWN ACTIVE TARGET STANDBY CLOUDDB 
#######################################################################################################



[oracle@rac02 ~]$ . oraenv
ORACLE_SID = [oradb1sb] ? clouddb 
The Oracle base remains unchanged with value /u01/app/oracle
[oracle@rac02 ~]$ sqlplus sys as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Mon Jun 15 14:50:54 2020
Version 19.6.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

Enter password: 

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.6.0.0.0

SQL> shutdown abort;
ORACLE instance shut down.
SQL> 


#######################################################################################################
ACTIVE TARGET STANDBY DATABASE IS CHANGED TO ORADB1_SB
#######################################################################################################


DGMGRL> show observer 

Configuration - oradb1_dg

  Primary:            oradb1
  Active Target:      oradb1_sb

Observer "rac01" - Master

  Host Name:                    rac01.localdomain
  Last Ping to Primary:         1 second ago
  Last Ping to Target:          4 seconds ago

Observer "rac02" - Backup

  Host Name:                    rac02.localdomain
  Last Ping to Primary:         0 seconds ago
  Last Ping to Target:          4 seconds ago


#######################################################################################################
STOP LISTENER ON RAC01 AND RAC02 
#######################################################################################################


[grid@rac01 ~]$ lsnrctl stop 

[grid@rac02 ~]$ lsnrctl stop 



DGMGRL> show configuration

Configuration - oradb1_dg

  Protection Mode: MaxPerformance
  Members:
  oradb1    - Primary database
    Error: ORA-16778: redo transport error for one or more members

    oradb1_sb - (*) Physical standby database 
    clouddb   - Physical standby database 
      Error: ORA-12514: TNS:listener does not currently know of service requested in connect descriptor

Fast-Start Failover: Enabled in Potential Data Loss Mode

Configuration Status:
ERROR   (status updated 52 seconds ago)



DGMGRL> show fast_start failover 

Fast-Start Failover: Enabled in Potential Data Loss Mode

  Protection Mode:    MaxPerformance
  Lag Limit:          30 seconds

  Threshold:          30 seconds
  Active Target:      oradb1_sb
  Potential Targets:  "oradb1_sb,clouddb"
    oradb1_sb  valid
    clouddb    valid
  Observers:      (*) rac02
                      rac01
  Shutdown Primary:   TRUE
  Auto-reinstate:     TRUE
  Observer Reconnect: 30 seconds
  Observer Override:  FALSE

Configurable Failover Conditions
  Health Conditions:
    Corrupted Controlfile          YES
    Corrupted Dictionary           YES
    Inaccessible Logfile            NO
    Stuck Archiver                  NO
    Datafile Write Errors          YES

  Oracle Error Conditions:
    (none)
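
Two of the values above, `Threshold` and `Shutdown Primary`, matter for the primary shutdown seen at the end of this note, so it can help to capture them before each test. Below is a rough Python sketch that turns the `show fast_start failover` output into a dictionary. The helper names `parse_fsfo_settings` and `to_seconds` are hypothetical, and the parser simply splits each line on the first colon, which is an assumption about the layout rather than a documented format.

```python
import re

def parse_fsfo_settings(text):
    """Collect 'Name: value' lines from 'show fast_start failover' output."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue  # lines without a colon (e.g. target names) are skipped
        settings[key.strip()] = value.strip()
    return settings

def to_seconds(value):
    """Convert a value such as '30 seconds' to an int, else return None."""
    match = re.match(r"(\d+) seconds$", value)
    return int(match.group(1)) if match else None

# Sample copied from the output above.
SAMPLE = '''
Fast-Start Failover: Enabled in Potential Data Loss Mode

  Protection Mode:    MaxPerformance
  Lag Limit:          30 seconds

  Threshold:          30 seconds
  Active Target:      oradb1_sb
  Shutdown Primary:   TRUE
  Auto-reinstate:     TRUE
  Observer Reconnect: 30 seconds
  Observer Override:  FALSE
'''

if __name__ == "__main__":
    settings = parse_fsfo_settings(SAMPLE)
    print("Threshold (s):", to_seconds(settings["Threshold"]))
    print("Shutdown Primary:", settings["Shutdown Primary"])
    print("Observer Reconnect (s):", to_seconds(settings["Observer Reconnect"]))
```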



DGMGRL> show observer

Configuration - oradb1_dg

  Primary:            oradb1
  Active Target:      oradb1_sb

Observer "rac02" - Master

  Host Name:                    rac02.localdomain
  Last Ping to Primary:         (unknown)
  Last Ping to Target:          (unknown)

Observer "rac01" - Backup

  Host Name:                    rac01.localdomain
  Last Ping to Primary:         119 seconds ago
  Last Ping to Target:          3 seconds ago




#######################################################################################################
SHUTDOWN ABORT PRIMARY
#######################################################################################################



DGMGRL> quit
[oracle@rac01 trace]$ sqlplus sys as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Mon Jun 15 15:02:55 2020
Version 19.6.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

Enter password: 

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.6.0.0.0

SQL> shutdown abort 
ORACLE instance shut down.




#######################################################################################################
FSFO HAS HAPPENED - PRIMARY DATABASE IS NOW ORADB1_SB
#######################################################################################################


[oracle@rac02 ~]$  . oraenv
ORACLE_SID = [clouddb] ? oradb1sb
The Oracle base remains unchanged with value /u01/app/oracle
[oracle@rac02 ~]$ dgmgrl
DGMGRL for Linux: Release 19.0.0.0.0 - Production on Mon Jun 15 15:04:26 2020
Version 19.6.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

Welcome to DGMGRL, type "help" for information.
DGMGRL> connect / as sysdg
Connected to "ORADB1_SB"
Connected as SYSDG.

DGMGRL> show configuration

Configuration - oradb1_dg

  Protection Mode: MaxPerformance
  Members:
  oradb1_sb - Primary database
    oradb1    - (*) Physical standby database (disabled)
      ORA-16661: the standby database needs to be reinstated

    clouddb   - Physical standby database (disabled)
      ORA-16661: the standby database needs to be reinstated

Fast-Start Failover: Enabled in Potential Data Loss Mode

Configuration Status:
SUCCESS   (status updated 120 seconds ago)

DGMGRL> 


#######################################################################################################
TRY TO START ORIGINAL PRIMARY ORADB1 
#######################################################################################################


SQL> startup
ORACLE instance started.

Total System Global Area 1476391088 bytes
Fixed Size		    8896688 bytes
Variable Size		 1375731712 bytes
Database Buffers	   83886080 bytes
Redo Buffers		    7876608 bytes
Database mounted.
ORA-16649: possible failover to another database prevents this database from
being opened




#######################################################################################################
NEW PRIMARY IS ORADB1_SB 
#######################################################################################################


[oracle@rac02 ~]$ . oraenv
ORACLE_SID = [racdb2] ? oradb1sb
The Oracle base has been set to /u01/app/oracle

[oracle@rac02 ~]$ sqlplus sys as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Mon Jun 15 15:10:49 2020
Version 19.6.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

Enter password: 

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.6.0.0.0

SQL> select database_role,open_mode from v$database;

DATABASE_ROLE	 OPEN_MODE
---------------- --------------------
PRIMARY 	 READ WRITE



#######################################################################################################
ONCE LISTENER COMES UP ON RAC01 AND RAC02 OLD PRIMARY AND SECOND STANDBY ARE REINSTATED 
#######################################################################################################



ORADB1

SQL> select open_mode,database_role from v$database;

OPEN_MODE	     DATABASE_ROLE
-------------------- ----------------
MOUNTED 	     PRIMARY

SQL> /

OPEN_MODE	     DATABASE_ROLE
-------------------- ----------------
MOUNTED 	     PRIMARY

SQL> /

OPEN_MODE	     DATABASE_ROLE
-------------------- ----------------
READ ONLY WITH APPLY PHYSICAL STANDBY



DGMGRL> connect /@oradb1_sb
Connected to "ORADB1_SB"
Connected as SYSDBA.

DGMGRL> show configuration

Configuration - oradb1_dg

  Protection Mode: MaxPerformance
  Members:
  oradb1_sb - Primary database
    oradb1    - (*) Physical standby database 
    clouddb   - Physical standby database 

Fast-Start Failover: Enabled in Potential Data Loss Mode

Configuration Status:
SUCCESS   (status updated 59 seconds ago)


#######################################################################################################
CHANGE MASTER OBSERVER SITE 
#######################################################################################################



DGMGRL> SET MASTEROBSERVER TO rac01;
Succeeded.

DGMGRL> show observer

Configuration - oradb1_dg

  Primary:            oradb1_sb
  Active Target:      oradb1

Observer "rac01" - Master

  Host Name:                    rac01.localdomain
  Last Ping to Primary:         1 second ago
  Last Ping to Target:          2 seconds ago

Observer "rac02" - Backup

  Host Name:                    rac02.localdomain
  Last Ping to Primary:         1 second ago
  Last Ping to Target:          1 second ago



#######################################################################################################
CHANGE TO MAXAVAILABILITY 
#######################################################################################################



DGMGRL> edit configuration set protection mode maxavailability;
Error: ORA-16654: fast-start failover is enabled

Failed.

DGMGRL> disable fast_start failover;
Disabled.

DGMGRL>  edit configuration set protection mode maxavailability;
Error: ORA-16627: operation disallowed since no member would remain to support protection mode

Failed.

DGMGRL> edit database oradb1 set property LogXptMode='SYNC';
Property "logxptmode" updated

DGMGRL> edit database oradb1_sb set property LogXptMode='SYNC';
Property "logxptmode" updated

DGMGRL> edit database clouddb set property LogXptMode='SYNC';
Property "logxptmode" updated

DGMGRL>  edit configuration set protection mode maxavailability;
Succeeded.
DGMGRL> 


DGMGRL> enable fast_start failover 
Enabled in Zero Data Loss Mode.

DGMGRL> show configuration

Configuration - oradb1_dg

  Protection Mode: MaxAvailability
  Members:
  oradb1_sb - Primary database
    oradb1    - (*) Physical standby database 
    clouddb   - Physical standby database 

Fast-Start Failover: Enabled in Zero Data Loss Mode

Configuration Status:
SUCCESS   (status updated 47 seconds ago)
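
The order that finally worked above was: disable FSFO, set `LogXptMode` to SYNC on every member, change the protection mode, then enable FSFO again. As a sketch only, the Python snippet below builds that command list for a given set of database names so it can be reviewed or pasted into DGMGRL. It does not run anything, the function name `max_availability_commands` is made up, and the command text is copied from the session above rather than from the DGMGRL reference.

```python
def max_availability_commands(databases):
    """Build the DGMGRL commands, in the order that worked in the session above.

    Only strings are produced; nothing is executed against a database.
    """
    commands = ["disable fast_start failover;"]
    for db in databases:
        # SYNC redo transport was needed before the mode change was accepted
        # (the first attempt above failed with ORA-16627).
        commands.append(f"edit database {db} set property LogXptMode='SYNC';")
    commands.append("edit configuration set protection mode maxavailability;")
    commands.append("enable fast_start failover;")
    return commands

if __name__ == "__main__":
    for cmd in max_availability_commands(["oradb1", "oradb1_sb", "clouddb"]):
        print(cmd)
```

Whether every standby really needs SYNC transport is a design decision for your own environment; the list simply mirrors what was done in this test.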




#######################################################################################################
STOP RAC01 - MASTER OBSERVER IS RUNNING HERE
#######################################################################################################



#######################################################################################################
NEW PRIMARY ON RAC02 ALSO SHUTS DOWN!
#######################################################################################################


2020-06-15T15:29:43.910963+08:00
DMON: FSFP network call timeout. Killing process FSFP.
2020-06-15T15:29:43.913016+08:00
Process termination requested for pid 29389 [source = rdbms], [info = 2] [request issued by pid: 17941, uid: 54321]
2020-06-15T15:29:58.915933+08:00
Starting background process FSFP
2020-06-15T15:29:59.157480+08:00
FSFP started with pid=88, OS id=30541
2020-06-15T15:30:00.919770+08:00
Primary has heard from neither observer nor target standby within FastStartFailoverThreshold seconds.
It is likely an automatic failover has already occurred. Primary is shutting down.
2020-06-15T15:30:00.952957+08:00
Errors in file /u01/app/oracle/diag/rdbms/oradb1_sb/oradb1sb/trace/oradb1sb_lgwr_17908.trc:
ORA-16830: primary isolated from fast-start failover partners longer than FastStartFailoverThreshold seconds: shutting down
2020-06-15T15:30:01.674507+08:00
System state dump requested by (instance=1, osid=17908 (LGWR)), summary=[abnormal instance termination]. error - 'Instance is terminating.
'
System State dumped to trace file /u01/app/oracle/diag/rdbms/oradb1_sb/oradb1sb/trace/oradb1sb_diag_17863.trc
LGWR (ospid: 17908): terminating the instance due to ORA error 16830
2020-06-15T15:30:08.357552+08:00
Instance terminated by LGWR, pid = 17908
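
The ORA-16830 message lines up with the `Threshold: 30 seconds` and `Shutdown Primary: TRUE` values shown earlier by `show fast_start failover`. As far as we understand, `Shutdown Primary` reflects the broker property `FastStartFailoverPmyShutdown`, but please verify that against the Broker documentation for your version. To spot this event when repeating the test, here is a minimal Python sketch that scans alert log lines for an ORA code and pairs each hit with the timestamp line before it. The function name `find_ora_events` is hypothetical, and the code assumes the alert log layout shown above (an ISO timestamp on its own line, followed by message lines).

```python
from datetime import datetime

def find_ora_events(lines, ora_code="ORA-16830"):
    """Return (timestamp, message) pairs for lines containing ora_code."""
    events = []
    last_ts = None
    for line in lines:
        text = line.strip()
        try:
            # Timestamp lines look like 2020-06-15T15:30:00.952957+08:00
            last_ts = datetime.fromisoformat(text)
            continue
        except ValueError:
            pass  # not a timestamp line, treat it as a message line
        if ora_code in text:
            events.append((last_ts, text))
    return events

# Sample copied from the alert log excerpt above.
SAMPLE = """2020-06-15T15:30:00.919770+08:00
Primary has heard from neither observer nor target standby within FastStartFailoverThreshold seconds.
It is likely an automatic failover has already occurred. Primary is shutting down.
2020-06-15T15:30:00.952957+08:00
Errors in file /u01/app/oracle/diag/rdbms/oradb1_sb/oradb1sb/trace/oradb1sb_lgwr_17908.trc:
ORA-16830: primary isolated from fast-start failover partners longer than FastStartFailoverThreshold seconds: shutting down
2020-06-15T15:30:08.357552+08:00
Instance terminated by LGWR, pid = 17908
"""

if __name__ == "__main__":
    for ts, message in find_ora_events(SAMPLE.splitlines()):
        print(ts, message)
```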
Updated on June 2, 2021
