Oracle database on NFS – Failure scenario testing

This post presents a simple set of test cases for the viability of building and running an Oracle database cluster on NFS storage. The basic configuration is shown, along with the test scenarios and their results. A complete installation of an Oracle RAC environment is out of scope.

At a high level, the installation and configuration require only the following:

Create an NFS server
Create and export a directory on the NFS server
Create two Linux servers
Create a cluster on the Linux servers using the NFS filesystem for storage
Create a database on the cluster using the NFS filesystem for storage
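
To make the storage steps concrete, the export on nfs01 and the client mounts on the RAC nodes might look like the following sketch. The paths, client subnet, and exact option set here are illustrative assumptions; Oracle publishes specific required NFS mount options (hard mounts, disabled attribute caching, and so on) per platform and version, so verify those before building anything real.

```shell
# /etc/exports on nfs01 -- path and client subnet are illustrative
/u01/oradata  192.168.56.0/24(rw,sync,no_root_squash)

# /etc/fstab entry on rac01 and rac02 -- option set along the lines
# Oracle recommends for RAC files on NFS (hard mount, actimeo=0)
nfs01:/u01/oradata  /u01/oradata  nfs  rw,bg,hard,nointr,tcp,vers=3,timeo=600,rsize=32768,wsize=32768,actimeo=0  0 0
```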

CONFIGURATION

All environments are virtualized, and consist of the following components:

Windows Server 2008 for Active Directory and DNS services
CentOS 6.5 (2.6.32-431.el6.x86_64 kernel) for the following guests:
* rac01
* rac02
* nfs01

On each of the Linux guests, there is a separate network interface for the public IP, the private cluster communication IP, and the storage IP.  This more closely resembles production and allows us to test the failure of individual components.

Below is the output of our base installation:

  rac01:oracle:nfsdb:/home/oracle>./crsstat.sh
  NAME                      TARGET     STATE           SERVER       STATE_DETAILS
  ------------------------- ---------- ----------      ------------ ------------------
  ora.LISTENER.lsnr         ONLINE     ONLINE          rac01
  ora.LISTENER.lsnr         ONLINE     ONLINE          rac02
  ora.asm                   OFFLINE    OFFLINE         rac01        Instance Shutdown
  ora.asm                   OFFLINE    OFFLINE         rac02
  ora.gsd                   OFFLINE    OFFLINE         rac01
  ora.gsd                   OFFLINE    OFFLINE         rac02
  ora.net1.network          ONLINE     ONLINE          rac01
  ora.net1.network          ONLINE     ONLINE          rac02
  ora.ons                   ONLINE     ONLINE          rac01
  ora.ons                   ONLINE     ONLINE          rac02
  ora.LISTENER_SCAN1.lsnr   ONLINE     ONLINE          rac02
  ora.cvu                   ONLINE     ONLINE          rac02
  ora.nfsdb.db              ONLINE     ONLINE          rac01        Open
  ora.nfsdb.db              ONLINE     ONLINE          rac02        Open
  ora.oc4j                  ONLINE     ONLINE          rac01
  ora.rac01.vip             ONLINE     ONLINE          rac01
  ora.rac02.vip             ONLINE     ONLINE          rac02
  ora.scan1.vip             ONLINE     ONLINE          rac02
  rac01:oracle:nfsdb:/home/oracle>

We show where the cluster-related files are stored…

  rac01:oracle:nfsdb:/u01/oradata/storage>ls  -lrt
  total  23540
  -rw-r-----.  1 root dba 272756736 Sep  2 09:55 ocr
  -rw-r-----.  1 grid dba  21004800 Sep  2 10:00 vdsk

We then show the database file locations…

  rac01:oracle:nfsdb:/u01/oradata/db>ls -lrt
  total 8
  drwxr-x---.  5 oracle dba 4096 Sep  2 09:00 NFSDB  
  drwxr-x---.  2 oracle dba 4096 Sep  2 09:53 nfsdb
  rac01:oracle:nfsdb:/u01/oradata/db>ls -lrt NFSDB/datafile/
  total 3022556
  -rw-r-----.  1 oracle dba  30416896 Sep  2 09:09 o1_mf_temp_b0chx0oq_.tmp
  -rw-r-----.  1 oracle dba 545267712 Sep  2 09:10 o1_mf_sysaux_b0chqslw_.dbf
  -rw-r-----.  1 oracle dba   5251072 Sep  2 09:10 o1_mf_users_b0chqsrt_.dbf
  -rw-r-----.  1 oracle dba 328343552 Sep  2 09:10 o1_mf_example_b0chxg1g_.dbf
  -rw-r-----.  1 oracle dba  26222592 Sep  2 09:10 o1_mf_undotbs2_b0cj8z4d_.dbf
  -rw-r-----.  1 oracle dba 104865792 Sep  2 09:10 o1_mf_undotbs1_b0chqsoo_.dbf
  -rw-r-----.  1 oracle dba 754982912 Sep  2 09:10 o1_mf_system_b0chqsc3_.dbf
  -rw-r-----.  1 oracle dba  20979712 Sep  2 09:53 o1_mf_temp_b0clwoqd_.tmp
  -rw-r-----.  1 oracle dba   5251072 Sep  2 09:54 o1_mf_users_b0clqtm9_.dbf
  -rw-r-----.  1 oracle dba  26222592 Sep  2 10:00 o1_mf_undotbs2_b0clxo1j_.dbf
  -rw-r-----.  1 oracle dba 524296192 Sep  2 10:00 o1_mf_sysaux_b0clqtgb_.dbf
  -rw-r-----.  1 oracle dba  36708352 Sep  2 10:00 o1_mf_undotbs1_b0clqtjh_.dbf
  -rw-r-----.  1 oracle dba 734011392 Sep  2 10:00 o1_mf_system_b0clqt70_.dbf
  rac01:oracle:nfsdb:/u01/oradata/db>

 
…and finally, the state of each instance of the database…

   
  SQL> select host_name,status from gv$instance;
   
  HOST_NAME            STATUS
  -------------------- ------------
  rac01.howard.local   OPEN
  rac02.howard.local   OPEN
   
  SQL>

TESTING

Oracle failure algorithm

Before describing our test scenarios, some background on how the Oracle software handles different failures will be helpful.

Oracle handles failures with a two-pronged approach:

1) Network heartbeat – Once per second, each node in the cluster checks that it can successfully reach the others over the private cluster interconnect
2) Voting disk checks – Once per second, the Oracle software issues a 512-byte pwrite() call at a node-specific offset in the cluster voting disk.  Each node in the cluster has its own 512-byte “slot” in the voting disk file for its status
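
The slot mechanism can be sketched as a toy model: each node owns a fixed 512-byte region of a shared file and overwrites it with its status via a positioned write. The class, method names, and status format below are illustrative, not Oracle's actual on-disk layout.

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Toy model of the voting disk slots: each node owns a fixed 512-byte
// region of a shared file and overwrites it with its current status.
public class VotingDiskSketch {
  static final int SLOT_SIZE = 512;

  // seek() + write() stands in for the single pwrite() call: the write
  // always lands at this node's fixed offset, so nodes never clobber
  // each other's slots
  public static void writeSlot(RandomAccessFile disk, int nodeId, String status) throws Exception {
    byte[] slot = new byte[SLOT_SIZE];                          // zero-padded to a full slot
    byte[] payload = status.getBytes(StandardCharsets.US_ASCII);
    System.arraycopy(payload, 0, slot, 0, payload.length);
    disk.seek((long) nodeId * SLOT_SIZE);
    disk.write(slot);
  }

  public static String readSlot(RandomAccessFile disk, int nodeId) throws Exception {
    byte[] slot = new byte[SLOT_SIZE];
    disk.seek((long) nodeId * SLOT_SIZE);
    disk.readFully(slot);
    return new String(slot, StandardCharsets.US_ASCII).trim();  // trim() drops the zero padding
  }

  public static void main(String[] args) throws Exception {
    File f = File.createTempFile("vdsk", ".tmp");
    f.deleteOnExit();
    try (RandomAccessFile disk = new RandomAccessFile(f, "rw")) {
      writeSlot(disk, 0, "rac01 ALIVE");
      writeSlot(disk, 1, "rac02 ALIVE");
      // each node can read every other node's slot to learn its status
      System.out.println(readSlot(disk, 1));
    }
  }
}
```

In the real clusterware the 512-byte write happens once per second per node; here it is one-shot just to show the offset arithmetic.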

If either of the above checks fails on a given server, or even on multiple servers, the failure can be handled in an orderly manner. The timeouts are defined as follows:

* Voting disk access failures will be allowed 200 seconds to self-heal, as long as the network heartbeats between nodes in the cluster are successful
* Network heartbeat failures will be allowed 30 seconds to self-heal, as long as the failing node(s) can continue to write their status to the shared voting disk for the other nodes to see
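
The two self-heal windows combine into a simple decision: a node stays in the cluster while the failing channel is still inside its window and the other channel keeps working. A toy version of that decision might look like the following; the constants mirror the timeouts above, but the method, thresholds, and names are illustrative, not Oracle's actual CSS implementation.

```java
// Toy model of the eviction decision: the constants mirror the post's
// timeouts, everything else is illustrative.
public class EvictionSketch {
  static final int DISK_TIMEOUT_SECS = 200;  // voting disk self-heal window
  static final int NET_TIMEOUT_SECS  = 30;   // network heartbeat self-heal window

  // Each heartbeat fires once per second, so more than a couple of
  // seconds of silence means that channel is failing.
  public static boolean shouldSelfEvict(int secsSinceDiskWrite, int secsSinceNetPing) {
    boolean diskFailing = secsSinceDiskWrite > 2;
    boolean netFailing  = secsSinceNetPing  > 2;
    // A window to self-heal only applies while the other channel still
    // works; with both channels down there is nothing left to wait for.
    if (diskFailing && netFailing) return true;
    return secsSinceDiskWrite > DISK_TIMEOUT_SECS
        || secsSinceNetPing  > NET_TIMEOUT_SECS;
  }

  public static void main(String[] args) {
    System.out.println(shouldSelfEvict(150, 0));  // disk slow but inside the 200s window -> false
    System.out.println(shouldSelfEvict(207, 0));  // window expired: the node evicts itself -> true
  }
}
```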

After the respective time period, if the failure condition is still in place, the failing node will “commit suicide” by taking itself out of the cluster.  Oracle 11.2.0 introduced fencing that does not always require a node reboot: the Oracle High Availability Services stack (“OHAS”) is used to trigger actions in the clusterware that let nodes remove and re-add themselves from and to the active list.

Test harness

Stun various servers by pausing the guest in the VirtualBox Manager
Run a Java command-line program that opens one threaded connection to each instance in the cluster.  This exercises the database and prints the results of that activity.  The program is included as APPENDIX A

Specific tests

Single server has a failed connection to the NFS server

When we run our test and fail the storage connection for node 2, notice that no activity occurs for 207 seconds after the last successful update.  This is because the row is locked by the failed node, and the lock is not released until that node has been evicted.  Notice also that this time consists of the 200-second timeout allowed for disk access failures to resolve, plus a few seconds for the actual eviction.

   
  Tue Sep 02 12:31:15 GMT-05:00 2014      updated table in thread 2
  Tue Sep 02 12:31:16 GMT-05:00 2014      updated table in thread 1
  Tue Sep 02 12:31:16 GMT-05:00 2014      updated table in thread 2
  Tue Sep 02 12:31:17 GMT-05:00 2014      updated table in thread 1
  Tue Sep 02 12:31:17 GMT-05:00 2014      updated table in thread 2
  Tue Sep 02 12:31:18 GMT-05:00 2014      updated table in thread 1
  java.sql.SQLException:  No more data to read from socket
          at oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1199)
          at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:308)          
          at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:199)
          at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:542)
          at testIt.run(testIt.java:32)
          at java.lang.Thread.run(Thread.java:637)
  Tue Sep 02 12:34:45 GMT-05:00 2014      updated table in thread 1
  Tue Sep 02 12:34:46 GMT-05:00 2014      updated table in thread 1
  Tue Sep 02 12:34:47 GMT-05:00 2014      updated table in thread 1
  Tue Sep 02 12:34:48 GMT-05:00 2014      updated table in thread 1
  Tue Sep 02 12:34:49 GMT-05:00 2014      updated table in thread 1

NFS server pause
 
In this test, we simulate a high-availability event in which a component of an enterprise NFS server stack fails. To simulate this, we pause the NFS server guest in the VirtualBox Manager UI for about ten seconds.

As we can see, the software was blocked for about 13 seconds, but no cluster reconfiguration occurred.

   
  rac01:oracle:nfsdb1:/home/oracle>java testIt
  Tue Sep 02 13:05:20 GMT-05:00 2014      updated table in thread 1
  Tue Sep 02 13:05:20 GMT-05:00 2014      updated table in thread 2
  Tue Sep 02 13:05:21 GMT-05:00 2014      updated table in thread 1
  Tue Sep 02 13:05:21 GMT-05:00 2014      updated table in thread 2
  Tue Sep 02 13:05:22 GMT-05:00 2014      updated table in thread 1
  Tue Sep 02 13:05:22 GMT-05:00 2014      updated table in thread 2
  Tue Sep 02 13:05:23 GMT-05:00 2014      updated table in thread 1
  Tue Sep 02 13:05:23 GMT-05:00 2014      updated table in thread 2
  Tue Sep 02 13:05:24 GMT-05:00 2014      updated table in thread 1  
  Tue Sep 02 13:05:24 GMT-05:00 2014      updated table in thread 2  
  Tue Sep 02 13:05:37 GMT-05:00 2014      updated table in thread 1
  Tue Sep 02 13:05:37 GMT-05:00 2014      updated table in thread 2  
  Tue Sep 02 13:05:38 GMT-05:00 2014      updated table in thread 1  
  Tue Sep 02 13:05:38 GMT-05:00 2014      updated table in thread 2
  Tue Sep 02 13:05:39 GMT-05:00 2014      updated table in thread 1
  Tue Sep 02 13:05:39 GMT-05:00 2014      updated table in thread 2
  Tue Sep 02 13:05:40 GMT-05:00 2014      updated table in thread 1
  Tue Sep 02 13:05:40 GMT-05:00 2014      updated table in thread 2

 
NFS server pause when the database storage is accessed separately from the cluster devices

This is not provided, as it is expected the storage will be accessed over a common network interface. If necessary, it would be trivial to add this to the test list.  All that would be required is a separate interface for the database storage, as well as a separately mounted filesystem.

APPENDIX A

  import java.sql.*;
  import java.util.*;

  // Test harness: one thread per instance, each updating the same row once
  // per second and logging each success, so a stall or an eviction shows up
  // as a gap or an exception in the output.
  public class testIt implements Runnable {
    String server;
    Thread t;
    static Random r;

    public static void main (String args[]) {
      r = new Random();
      // Thread "1" connects to rac01/nfsdb1, thread "2" to rac02/nfsdb2
      testIt t1 = new testIt("1");
      testIt t2 = new testIt("2");
    }

    testIt(String server) {
      try {
        t = new Thread(this);
        this.server = server;
        t.start();
      }
      catch (Exception e) {
        e.printStackTrace();
      }
    }

    public void run () {
      try {
        Class.forName("oracle.jdbc.driver.OracleDriver");
        // Connect directly to a single instance by SID, bypassing the SCAN,
        // so each thread is pinned to one node
        Connection conn = DriverManager.getConnection("jdbc:oracle:thin:system/welcome@rac0" + server + ":1521:nfsdb" + server);
        PreparedStatement pst = conn.prepareStatement("update test set c = ?");
        while (true) {
          pst.setInt(1, r.nextInt());
          pst.execute();
          System.out.println(new java.util.Date().toString() + '\t' + "updated table in thread " + this.server);
          Thread.sleep(1000);
        }
      }
      catch (Exception e) {
        e.printStackTrace();
      }
    }
  }
