
Testing Oracle Standard Edition High Availability

In this post I’ll continue discussing SEHA. If you don’t know what it is, or you would like to read more about it, take a look at my previous post. I won’t discuss the solution and its concepts here; instead, I’ll try all kinds of scenarios and see how it behaves. I would also like to thank Markus Michalewicz for answering some questions and providing clarifications while I was writing this post.

Errors During Grid Infrastructure (GI) Command Execution

If you missed this in the previous post: you might encounter errors (like “PRCD-1146 : Database se1 is not a RAC One Node database”) when running GI commands. This is because the DB home’s srvctl is old (due to an old OCW patch that is not part of the DB RU). All you need to do is use the srvctl from the GI home instead.
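For example, something along these lines (the GI home path is just an example, adjust it to your environment):

# use the srvctl that ships with the GI home instead of the one in the DB home
export ORACLE_HOME=/u01/app/19.0.0/grid
$ORACLE_HOME/bin/srvctl status database -db se1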

Graceful Relocate

The nice thing about SEHA is that it integrates a Standard Edition database with the GI stack we all know. That’s why managing it is super intuitive and simple, unlike a custom-built solution.

The first thing we’ll do is relocate the database from one node to another:

[oracle@seha1 trace]$ srvctl status database -db se1
Instance se1 is running on node seha1
[oracle@seha1 trace]$ srvctl relocate database -db se1 -node seha2
[oracle@seha1 trace]$ srvctl status database -db se1
Instance se1 is running on node seha2

And when looking at the alert logs:

seha1:

2020-07-29T14:30:40.991548-07:00
Shutting down ORACLE instance (immediate) (OS id: 15473)
2020-07-29T14:30:40.991700-07:00
Shutdown is initiated by oraagent.bin@seha1 (TNS V1-V3).
...
2020-07-29T14:30:54.264438-07:00
Stopping background process RBAL
2020-07-29T14:31:00.409248-07:00
Instance shutdown complete (OS id: 15473)

seha2:

Starting ORACLE instance (normal) (OS id: 27905)
2020-07-29T14:31:04.080729-07:00
****************************************************
...
2020-07-29T14:31:48.432969-07:00
CJQ0 started with pid=58, OS id=28262
Completed: ALTER DATABASE OPEN /* db agent *//* {1:35177:12360} */

Note that the startup on seha2 only started after the shutdown on seha1 was completed. This is how SEHA behaves, as it cannot have two instances up at the same time, even for a very short period (unlike RAC or RAC One Node).
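If you want to watch this serialization yourself, follow the alert log on each node while relocating (the path below assumes the default ADR layout; adjust the database and instance names to your environment):

# on each node, follow the alert log during the relocate
tail -f $ORACLE_BASE/diag/rdbms/se1/se1/trace/alert_se1.log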

Failures

Before I start testing failures, I’d like to explain a few challenges, something I first ran into when configuring an Oracle database as a resource in a non-GI cluster I worked with.

When a database crashes, the cluster has two options: start it on the same node, or failover to the other node. Usually the preferred option is to start it on the same node. But what happens if the database crashes again? Or maybe there is a specific problem with this node? At some point the cluster will have to decide to failover to the other node. This is why clusters are usually configured to try to restart the database on the same node a specific number of times before failing over to the other node.

For a cluster to do that, it has to keep a counter of how many times it tried to restart the resource on the same node, so it knows when it’s time to fail over to the other node. For example, suppose this limit is one restart. When the database fails, the cluster will try to restart it on the same node once. If it fails again, the database will fail over. But what happens if the restart succeeds? Everything is great, but if the cluster still remembers that it already used its one restart, the next failure (whenever that might be) will cause a failover, which is not what we would expect. This is why the cluster also needs to reset this counter once everything looks good again.

Now, back to Oracle. Oracle GI, being a very comprehensive cluster stack, does exactly that. GI has a few relevant resource attributes (you can inspect them with crsctl, as shown right after this list):

  • RESTART_ATTEMPTS (default is 2) – the threshold at which Oracle stops trying to restart the database on the same node and fails over to the other node instead. The default of 2 means two restart attempts will be made on the same node before the failover
  • UPTIME_THRESHOLD (default is 1 hour) – this is the “reset” option. A default of 1 hour means that after a failure GI “remembers” that the failure happened for 1 hour before resetting this status. Another failure within this 1-hour window will be counted as a second failure (and a second restart attempt). The third restart will occur on the other node
  • RESTART_COUNT – the variable that counts how many restart operations were executed. Once this counter reaches 2 (the value of RESTART_ATTEMPTS), a failover will occur.
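Here is how you can see the current values on the database resource (and, if you really must, change them; note that modifying ora.* resources directly with crsctl normally requires the -unsupported flag, so treat the second command as a sketch rather than a recommendation):

# show the current values of the relevant attributes
crsctl status resource ora.se1.db -f | grep -E 'RESTART_ATTEMPTS|UPTIME_THRESHOLD'

# sketch only: changing an attribute on an ora.* resource
# crsctl modify resource ora.se1.db -attr "RESTART_ATTEMPTS=3" -unsupported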

Now that we understand how it works, let’s crash some databases:

DB Crash

This is a simple test: I’ll crash the database by killing pmon. The expected result is a restart on the same node, with RESTART_COUNT changing to 1:

[oracle@seha1 ~]$ crsctl status resource ora.se1.db -v -attr RESTART_COUNT
NAME=ora.se1.db 1 1
RESTART_COUNT=0

[oracle@seha1 ~]$ ps -ef|grep pmon
oracle    3877     1  0 15:03 ?        00:00:00 asm_pmon_+ASM1
oracle    4865     1  0 15:04 ?        00:00:00 ora_pmon_se1
oracle   31509 26488  0 17:01 pts/0    00:00:00 grep --color=auto pmon
[oracle@seha1 ~]$ kill -9 4865
[oracle@seha1 ~]$ crsctl status resource ora.se1.db
NAME=ora.se1.db
TYPE=ora.database.type
TARGET=ONLINE
STATE=ONLINE on seha1

[oracle@seha1 ~]$  crsctl status resource ora.se1.db -v -attr RESTART_COUNT
NAME=ora.se1.db 1 1
RESTART_COUNT=1

And the alert log shows:

2020-08-04T17:01:47.682715-07:00
USER(prelim) (ospid: 31681): terminating the instance
2020-08-04T17:01:48.792413-07:00
Instance terminated by USER(prelim), pid = 31681
2020-08-04T17:01:51.613966-07:00
Adjusting the requested value of parameter parallel_max_servers
from 0 to 1 due to running in CDB mode
Starting ORACLE instance (normal) (OS id: 31706)
2020-08-04T17:01:52.042268-07:00
****************************************************
 Sys-V shared memory will be used for creating SGA
 ****************************************************
...
2020-08-04T17:02:25.509562-07:00
CJQ0 started with pid=54, OS id=31924
Completed: ALTER DATABASE OPEN /* db agent *//* {0:1:11} */

Multiple Crashes

In this scenario I will abuse the database a bit more:

  • I will kill pmon again and expect it to start on seha1 (this would be the second failure, as I did it right after the previous test)
  • Then I will kill pmon once more, and this time I expect a failover

[oracle@seha1 ~]$  crsctl status resource ora.se1.db -v -attr RESTART_COUNT
NAME=ora.se1.db 1 1
RESTART_COUNT=1

[oracle@seha1 ~]$ ps -ef|grep pmon
oracle    3877     1  0 15:03 ?        00:00:00 asm_pmon_+ASM1
oracle   31726     1  0 17:01 ?        00:00:00 ora_pmon_se1
oracle   32706 26488  0 17:06 pts/0    00:00:00 grep --color=auto pmon
[oracle@seha1 ~]$ kill -9 31726
[oracle@seha1 ~]$ crsctl status resource ora.se1.db
NAME=ora.se1.db
TYPE=ora.database.type
TARGET=ONLINE
STATE=ONLINE on seha1

[oracle@seha1 ~]$  crsctl status resource ora.se1.db -v -attr RESTART_COUNT
NAME=ora.se1.db 1 1
RESTART_COUNT=2

[oracle@seha1 ~]$ ps -ef|grep pmon
oracle     398     1  0 17:06 ?        00:00:00 ora_pmon_se1
oracle     979 26488  0 17:07 pts/0    00:00:00 grep --color=auto pmon
oracle    3877     1  0 15:03 ?        00:00:00 asm_pmon_+ASM1
[oracle@seha1 ~]$ kill -9 398
[oracle@seha1 ~]$ crsctl status resource ora.se1.db
NAME=ora.se1.db
TYPE=ora.database.type
TARGET=ONLINE
STATE=ONLINE on seha2

[oracle@seha1 ~]$  crsctl status resource ora.se1.db -v -attr RESTART_COUNT
NAME=ora.se1.db 1 1
RESTART_COUNT=0

Note that the RESTART_COUNT was reset after the successful failover.

What happens if I try the same on se2 (which is not configured as SEHA, but only as Oracle Restart)?

[oracle@seha2 trace]$ crsctl status resource ora.se2.db -v -attr RESTART_COUNT
NAME=ora.se2.db 1 1
RESTART_COUNT=0

[oracle@seha2 trace]$ ps -ef|grep pmon
oracle    5740     1  0 15:06 ?        00:00:00 asm_pmon_+ASM2
oracle    6886     1  0 15:06 ?        00:00:00 ora_pmon_se2
oracle   26624     1  0 17:07 ?        00:00:00 ora_pmon_se1
oracle   27432 21534  0 17:11 pts/0    00:00:00 grep --color=auto pmon
[oracle@seha2 trace]$ kill -9 6886
[oracle@seha2 trace]$ crsctl status resource ora.se2.db -v -attr RESTART_COUNT
NAME=ora.se2.db 1 1
RESTART_COUNT=1

[oracle@seha2 trace]$ ps -ef|grep pmon
oracle    5740     1  0 15:06 ?        00:00:00 asm_pmon_+ASM2
oracle   26624     1  0 17:07 ?        00:00:00 ora_pmon_se1
oracle   27600     1  0 17:11 ?        00:00:00 ora_pmon_se2
oracle   28070 21534  0 17:12 pts/0    00:00:00 grep --color=auto pmon
[oracle@seha2 trace]$ kill -9 27600
[oracle@seha2 trace]$ crsctl status resource ora.se2.db -v -attr RESTART_COUNT
NAME=ora.se2.db 1 1
RESTART_COUNT=2

[oracle@seha2 trace]$ crsctl status resource ora.se2.db
NAME=ora.se2.db
TYPE=ora.database.type
TARGET=ONLINE
STATE=ONLINE on seha2

[oracle@seha2 trace]$ ps -ef|grep pmon
oracle    5740     1  0 15:06 ?        00:00:00 asm_pmon_+ASM2
oracle   26624     1  0 17:07 ?        00:00:00 ora_pmon_se1
oracle   28120     1  0 17:12 ?        00:00:00 ora_pmon_se2
oracle   28678 21534  0 17:14 pts/0    00:00:00 grep --color=auto pmon
[oracle@seha2 trace]$ kill -9 28120
[oracle@seha2 trace]$ crsctl status resource ora.se2.db
NAME=ora.se2.db
TYPE=ora.database.type
TARGET=ONLINE
STATE=OFFLINE

[oracle@seha2 trace]$ crsctl status resource ora.se2.db -v -attr RESTART_COUNT
NAME=ora.se2.db 1 1
RESTART_COUNT=2

Here, after GI attempted to restart the resource RESTART_ATTEMPTS (2) times, it gave up and left the resource down. Database se2 is offline now.
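The database will stay down until someone intervenes; a simple start request brings it back (and starts a new restart cycle):

srvctl start database -db se2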

Infinite Crashes

This is also something to test. What happens if the database has a problem and it cannot start at all? In the case of Oracle Restart (we saw that with the se2 database) the GI will give up and leave the database down. What will happen with SEHA?

To emulate this, I simply set the CONTROL_FILES parameter in the spfile to a non-existent file and then killed pmon:

[oracle@seha2 dbs]$ sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Tue Aug 4 17:42:26 2020
Version 19.8.0.0.0

Copyright (c) 1982, 2020, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Standard Edition 2 Release 19.0.0.0.0 - Production
Version 19.8.0.0.0

SQL> show parameter control

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
control_file_record_keep_time        integer     7
control_files                        string      +DATA/SE1/CONTROLFILE/current.
                                                 284.1046963929
control_management_pack_access       string      NONE

SQL> alter system set control_files='+DATA/blabla' scope=spfile;

System altered.

SQL> Disconnected from Oracle Database 19c Standard Edition 2 Release 19.0.0.0.0 - Production
Version 19.8.0.0.0
[oracle@seha2 dbs]$ ps -ef|grep pmon
oracle    1635     1  0 17:36 ?        00:00:00 ora_pmon_se1
oracle    3866 21534  0 17:42 pts/0    00:00:00 grep --color=auto pmon
oracle    5740     1  0 15:06 ?        00:00:00 asm_pmon_+ASM2
[oracle@seha2 dbs]$ kill -9 1635

I followed the alert logs on both servers and this is what happened (remember that se1 is now running on the seha2 node):

  1. The database was restarted on seha2 (unsuccessfully)
  2. The database was restarted on seha2 again (unsuccessfully)
  3. The database was restarted on seha1 (unsuccessfully)
  4. The database was restarted on seha1 again (unsuccessfully)
  5. Here GI gave up and left the database down.

When I checked, I found out that RESTART_COUNT wasn’t reset; it was left at 2. This means that Oracle resets it only upon a successful start (which makes sense). However, I’m not sure yet how GI knows not to fail over again to the other node (when the start on seha1 fails with RESTART_COUNT=2 it should try seha2 again, but apparently it knows that it couldn’t start on seha2 either, so it just gives up).

One last comment here: I’ve seen clusters that treat a startup failure as a “hard” failure: when a resource fails to start during a restart, they immediately fail over (unlike here, where we saw Oracle try to start the DB on seha2 again even though the start itself had failed). Oracle’s behavior is not too bad, as there is a chance the database WILL be able to start the second time; it’s just worth noting.
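For completeness, this is roughly how I would recover from the broken CONTROL_FILES setting (a sketch that relies on the original control file path shown earlier; NOMOUNT does not read the control files, so the spfile can be fixed in place):

sqlplus / as sysdba
SQL> startup nomount
SQL> alter system set control_files='+DATA/SE1/CONTROLFILE/current.284.1046963929' scope=spfile;
SQL> shutdown immediate
SQL> exit
srvctl start database -db se1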

Server Shutdown

Now let’s go back to normal scenarios and test what happens when I shut down one of the servers. This is the cluster status before I start:

[oracle@seha1 ~]$ crsctl status res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
               ONLINE  ONLINE       seha1                    STABLE
               ONLINE  ONLINE       seha2                    STABLE
ora.chad
               ONLINE  ONLINE       seha1                    STABLE
               ONLINE  ONLINE       seha2                    STABLE
ora.net1.network
               ONLINE  ONLINE       seha1                    STABLE
               ONLINE  ONLINE       seha2                    STABLE
ora.ons
               ONLINE  ONLINE       seha1                    STABLE
               ONLINE  ONLINE       seha2                    STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr(ora.asmgroup)
      1        ONLINE  ONLINE       seha1                    STABLE
      2        ONLINE  ONLINE       seha2                    STABLE
      3        ONLINE  OFFLINE                               STABLE
ora.DATA.dg(ora.asmgroup)
      1        ONLINE  ONLINE       seha1                    STABLE
      2        ONLINE  ONLINE       seha2                    STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       seha1                    STABLE
ora.asm(ora.asmgroup)
      1        ONLINE  ONLINE       seha1                    Started,STABLE
      2        ONLINE  ONLINE       seha2                    Started,STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.asmnet1.asmnetwork(ora.asmgroup)
      1        ONLINE  ONLINE       seha1                    STABLE
      2        ONLINE  ONLINE       seha2                    STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.cvu
      1        ONLINE  ONLINE       seha1                    STABLE
ora.qosmserver
      1        ONLINE  ONLINE       seha1                    STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       seha1                    STABLE
ora.se1.db
      1        ONLINE  ONLINE       seha1                    Open,HOME=/oracle/db
                                                             /19,STABLE
ora.se2.db
      1        ONLINE  ONLINE       seha2                    Open,HOME=/oracle/db
                                                             /19,STABLE
ora.seha1.vip
      1        ONLINE  ONLINE       seha1                    STABLE
ora.seha2.vip
      1        ONLINE  ONLINE       seha2                    STABLE
--------------------------------------------------------------------------------

Let’s shut down seha2 first. This is what happens after seha2 is down:

[oracle@seha1 ~]$ crsctl status res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
               ONLINE  ONLINE       seha1                    STABLE
ora.chad
               ONLINE  ONLINE       seha1                    STABLE
ora.net1.network
               ONLINE  ONLINE       seha1                    STABLE
ora.ons
               ONLINE  ONLINE       seha1                    STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr(ora.asmgroup)
      1        ONLINE  ONLINE       seha1                    STABLE
      2        ONLINE  OFFLINE                               STABLE
      3        ONLINE  OFFLINE                               STABLE
ora.DATA.dg(ora.asmgroup)
      1        ONLINE  ONLINE       seha1                    STABLE
      2        ONLINE  OFFLINE                               STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       seha1                    STABLE
ora.asm(ora.asmgroup)
      1        ONLINE  ONLINE       seha1                    Started,STABLE
      2        ONLINE  OFFLINE                               STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.asmnet1.asmnetwork(ora.asmgroup)
      1        ONLINE  ONLINE       seha1                    STABLE
      2        ONLINE  OFFLINE                               STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.cvu
      1        ONLINE  ONLINE       seha1                    STABLE
ora.qosmserver
      1        ONLINE  ONLINE       seha1                    STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       seha1                    STABLE
ora.se1.db
      1        ONLINE  ONLINE       seha1                    Open,HOME=/oracle/db
                                                             /19,STABLE
ora.se2.db
      1        ONLINE  OFFLINE                               Instance Shutdown,ST
                                                             ABLE
ora.seha1.vip
      1        ONLINE  ONLINE       seha1                    STABLE
ora.seha2.vip
      1        ONLINE  INTERMEDIATE seha1                    FAILED OVER,STABLE
--------------------------------------------------------------------------------

As expected, the se2 database is down. Now let’s start seha2 and shut down seha1. This is what happens:

[oracle@seha2 ~]$ crsctl status res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
               ONLINE  ONLINE       seha2                    STABLE
ora.chad
               ONLINE  ONLINE       seha2                    STABLE
ora.net1.network
               ONLINE  ONLINE       seha2                    STABLE
ora.ons
               ONLINE  ONLINE       seha2                    STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr(ora.asmgroup)
      1        ONLINE  OFFLINE                               STABLE
      2        ONLINE  ONLINE       seha2                    STABLE
      3        ONLINE  OFFLINE                               STABLE
ora.DATA.dg(ora.asmgroup)
      1        ONLINE  OFFLINE                               STABLE
      2        ONLINE  ONLINE       seha2                    STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       seha2                    STABLE
ora.asm(ora.asmgroup)
      1        ONLINE  OFFLINE                               STABLE
      2        ONLINE  ONLINE       seha2                    Started,STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.asmnet1.asmnetwork(ora.asmgroup)
      1        ONLINE  OFFLINE                               STABLE
      2        ONLINE  ONLINE       seha2                    STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.cvu
      1        ONLINE  ONLINE       seha2                    STABLE
ora.qosmserver
      1        ONLINE  ONLINE       seha2                    STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       seha2                    STABLE
ora.se1.db
      1        ONLINE  ONLINE       seha2                    Open,HOME=/oracle/db
                                                             /19,STABLE
ora.se2.db
      1        ONLINE  ONLINE       seha2                    Open,HOME=/oracle/db
                                                             /19,STABLE
ora.seha1.vip
      1        ONLINE  INTERMEDIATE seha2                    FAILED OVER,STABLE
ora.seha2.vip
      1        ONLINE  ONLINE       seha2                    STABLE
--------------------------------------------------------------------------------

As you can see, when seha1 was down, se1 failed over to seha2.

Also note that after seha1 came back online, the se1 database didn’t fail back to it. Failing back is something we need to do manually, and this is important: a failback requires downtime, so we do not want it to happen automatically.
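Failing back is just another graceful relocate, to be run in a maintenance window of your choosing:

srvctl relocate database -db se1 -node seha1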

Stopping and Starting the DB

What happens when we simply stop and start the database? The only interesting scenario here is when the se1 database is running on seha2. Let’s see:

[oracle@seha2 ~]$ srvctl status database -db se1
Instance se1 is running on node seha2
[oracle@seha2 ~]$ srvctl stop database -db se1
[oracle@seha2 ~]$ srvctl status database -db se1
Database is not running.
[oracle@seha2 ~]$ srvctl start database -db se1
[oracle@seha2 ~]$ srvctl status database -db se1
Instance se1 is running on node seha1

This means that a SEHA database will always start on its original (configured) node by default. You can specify the node if you’d like:

[oracle@seha2 ~]$ srvctl stop database -db se1
[oracle@seha2 ~]$ srvctl start database -db se1 -node seha2
[oracle@seha2 ~]$ srvctl status database -db se1
Instance se1 is running on node seha2
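If you want to see which nodes the database is configured for, the node list is part of the database configuration. A sketch (the -node list is the same syntax used when enabling SEHA in the first place):

# show the database configuration, including the configured nodes
srvctl config database -db se1

# sketch: change the candidate node list
# srvctl modify database -db se1 -node seha1,seha2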

Reset of RESTART_COUNT

As I explained above, the UPTIME_THRESHOLD attribute (1 hour by default) determines when RESTART_COUNT will be reset. Let’s check this:

[oracle@seha2 ~]$ ps -ef|grep pmon
oracle    4102     1  0 09:32 ?        00:00:00 asm_pmon_+ASM2
oracle    4304     1  0 09:32 ?        00:00:00 ora_pmon_se2
oracle   15056     1  0 09:45 ?        00:00:00 ora_pmon_se1
oracle   16442  4919  0 09:51 pts/0    00:00:00 grep --color=auto pmon
[oracle@seha2 ~]$ kill -9 15056
[oracle@seha2 ~]$ date
Wed Aug  5 09:51:41 PDT 2020
[oracle@seha2 ~]$ crsctl status resource ora.se1.db -v -attr RESTART_COUNT
NAME=ora.se1.db 1 1
RESTART_COUNT=1
 
[oracle@seha2 ~]$ date
Wed Aug  5 10:44:49 PDT 2020
[oracle@seha2 ~]$ crsctl status resource ora.se1.db -v -attr RESTART_COUNT
NAME=ora.se1.db 1 1
RESTART_COUNT=1
 
[oracle@seha2 ~]$ date
Wed Aug  5 11:39:11 PDT 2020
[oracle@seha2 ~]$ crsctl status resource ora.se1.db -v -attr RESTART_COUNT
NAME=ora.se1.db 1 1
RESTART_COUNT=1

What’s going on here? Shouldn’t RESTART_COUNT go back to zero after 1 hour? Apparently not. RESTART_COUNT is reset only during a resource state change (start, stop, relocate, etc.), so we shouldn’t expect this counter to be reset until something changes:

[oracle@seha2 ~]$ date
Wed Aug  5 11:39:11 PDT 2020

[oracle@seha2 ~]$ crsctl status resource ora.se1.db -v -attr RESTART_COUNT
NAME=ora.se1.db 1 1
RESTART_COUNT=1

[oracle@seha2 ~]$ ps -ef|grep pmon
oracle    4102     1  0 09:32 ?        00:00:00 asm_pmon_+ASM2
oracle    4304     1  0 09:32 ?        00:00:00 ora_pmon_se2
oracle    5188  4919  0 12:00 pts/0    00:00:00 grep --color=auto pmon
oracle   16492     1  0 09:51 ?        00:00:00 ora_pmon_se1
[oracle@seha2 ~]$ kill -9 16492
[oracle@seha2 ~]$ crsctl status resource ora.se1.db -v -attr RESTART_COUNT
NAME=ora.se1.db 1 1
RESTART_COUNT=1

When I crashed the database again it was restarted, but note that RESTART_COUNT was still 1 instead of being increased to 2. This is because more than an hour had passed, so the counter was reset and then increased to 1 by the new failure. Also note that, regardless of the time passed, every relocate and fresh start of the database resets the counter back to 0.

This is a matter of efficiency: we don’t want Oracle to constantly check the time and status just to reset a counter. That would be unnecessary overhead, so Oracle handles the counter reset only when it is handling the resource anyway.
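By the way, you don’t have to query the counters one by one; as far as I can tell, -v without -attr prints all the runtime attributes (RESTART_COUNT, FAILURE_COUNT, LAST_RESTART, LAST_STATE_CHANGE and friends) in one go:

crsctl status resource ora.se1.db -v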

Conclusion

SEHA is a built-in solution for high availability in Standard Edition. It is very easy to use, as the integration with the mature GI stack is excellent. This is a feature that should have been available a long time ago to allow managing SE databases in a failover cluster environment.


4 thoughts on “Testing Oracle Standard Edition High Availability”

  1. Thanks for this article.

    In the last section before the Conclusion, i.e. Reset of FAILURE_COUNT, you mentioned the FAILURE_COUNT variable but actually showed the behaviour of the RESTART_COUNT variable. I was actually looking for the behaviour of the FAILURE_COUNT variable. In case you have more details, please share. The reason being, every time a failure takes place, the FAILURE_COUNT value does not go up by 1. So there is some understanding that still needs to sink in.

    Thanks

    1. Sorry, it seems like I made a mistake and called it “FAILURE_COUNT” instead of “RESTART_COUNT”. I fixed the title.
      Will try to find time to look into FAILURE_COUNT.
      Thanks for the comment

  2. Excellent explanation. I’m working with SEHA now, and I really like its simplicity and functionality.
    Good job!
