{"id":89,"date":"2009-05-04T10:30:18","date_gmt":"2009-05-04T15:30:18","guid":{"rendered":"http:\/\/appcrawler.com\/wordpress\/?p=89"},"modified":"2011-07-06T09:54:31","modified_gmt":"2011-07-06T14:54:31","slug":"analyzing-cache-transfers-between-instances-in-a-rac","status":"publish","type":"post","link":"http:\/\/appcrawler.com\/wordpress\/2009\/05\/04\/analyzing-cache-transfers-between-instances-in-a-rac\/","title":{"rendered":"Analyzing cache transfers between instances in a RAC"},"content":{"rendered":"<p>We were initially taught by oracle that RAC was a panacea.  Just add nodes on the fly, and the workload of a database instance will be automatically distributed over all available serves in the cluster.  However, as time wore on, users began to realize that while moving database blocks between servers over the \u201chigh speed interconnect\u201d was much faster than getting them from disk, it was not free.  The process of managing the blocks in memory and who \u201cowned\u201d them proved to be fairly expensive.  As such, user groups began calling for segmenting \u201cworkloads\u201d on given servers in the cluster.  While this did not provide for the plug in play scalability originally marketed by oracle, it did provide some scalability, and definitely fault tolerance.  In other words, if I had four servers in my cluster, and decided to run updates on server4, I could expect my workload to continue processing if server4 failed.  This is because I was told to configure the service running on server4 to be \u201cavailable\u201d on servers 1, 2 and 3.  This proved to be workable.<\/p>\n<p>However, things like this always drive me crazy.  I decided to dig into a methodology to determine the provable cost of transferring blocks between instances.  Luckily, our ole\u2019 friend AWR came to the rescue.  There is a view called dba_hist_inst_cache_transfer while provides all this information.  The query below looks at the number of read consistent blocks found to be \u201cbusy\u201d during all windows contained in our AWR.<\/p>\n<pre lang=\"sql\" line=\"1\">\r\nselect a.begin_interval_time,\r\n       a.inst2 inst2_to_inst1,\r\n       a.inst3 inst3_to_inst1,\r\n       b.inst1 inst1_to_inst2,\r\n       b.inst3 inst3_to_inst2,\r\n       c.inst1 inst1_to_inst3,\r\n       c.inst2 inst2_to_inst3\r\nfrom \r\n(select * \r\n  from (select distinct begin_interval_time, \r\n                        inst2,\r\n                        inst3\r\n          from (select to_char(begin_interval_time,'YYYY_MM_DD HH24:MI') begin_interval_time,\r\n                       dhss.instance_number,\r\n                       cr_busy - lag(cr_busy, 1, 0) over (order by dhss.instance_number, dhss.snap_id) as value_delta\r\n                  from dba_hist_snapshot dhs,\r\n                       dba_hist_inst_cache_transfer dhss\r\n                  where dhs.snap_id = dhss.snap_id\r\n                    and dhs.instance_number = dhss.instance_number\r\n                    and class = 'data block'\r\n                    and instance = 1\r\n                  order by 1) c\r\n          model\r\n            ignore nav\r\n            dimension by(c.instance_number i,\r\n                         c.begin_interval_time begin_interval_time)\r\n            measures(c.value_delta value_delta, \r\n                     0 inst2, \r\n                     0 inst3)\r\n            rules(inst2[any,any] = value_delta[i = 2, cv(begin_interval_time)],\r\n                  inst3[any,any] = value_delta[i = 3, cv(begin_interval_time)])\r\n          order by 1)\r\n  where (inst2>= 0 and inst3>= 0)) a,\r\n  (select * \r\n    from (select distinct begin_interval_time, \r\n                          inst1,\r\n                          inst3\r\n            from (select to_char(begin_interval_time,'YYYY_MM_DD HH24:MI') begin_interval_time,\r\n                         dhss.instance_number,\r\n                         cr_busy - lag(cr_busy, 1, 0) over (order by dhss.instance_number, dhss.snap_id) as value_delta\r\n                    from dba_hist_snapshot dhs,\r\n                         dba_hist_inst_cache_transfer dhss\r\n                    where dhs.snap_id = dhss.snap_id\r\n                      and dhs.instance_number = dhss.instance_number\r\n                      and class = 'data block'\r\n                      and instance = 2\r\n                    order by 1) c\r\n            model\r\n              ignore nav\r\n              dimension by(c.instance_number i,\r\n                           c.begin_interval_time begin_interval_time)\r\n              measures(c.value_delta value_delta, \r\n                       0 inst1, \r\n                       0 inst3)\r\n              rules(inst1[any,any] = value_delta[i = 1, cv(begin_interval_time)],\r\n                    inst3[any,any] = value_delta[i = 3, cv(begin_interval_time)])\r\n            order by 1)\r\n  where (inst1>= 0 and inst3>= 0)) b,\r\n  (select * \r\n    from (select distinct begin_interval_time, \r\n                          inst1,\r\n                          inst2\r\n            from (select to_char(begin_interval_time,'YYYY_MM_DD HH24:MI') begin_interval_time,\r\n                         dhss.instance_number,\r\n                         cr_busy - lag(cr_busy, 1, 0) over (order by dhss.instance_number, dhss.snap_id) as value_delta\r\n                    from dba_hist_snapshot dhs,\r\n                         dba_hist_inst_cache_transfer dhss\r\n                    where dhs.snap_id = dhss.snap_id\r\n                      and dhs.instance_number = dhss.instance_number\r\n                      and class = 'data block'\r\n                      and instance = 3\r\n                    order by 1) c\r\n            model\r\n              ignore nav\r\n              dimension by(c.instance_number i,\r\n                           c.begin_interval_time begin_interval_time)\r\n              measures(c.value_delta value_delta, \r\n                       0 inst1, \r\n                       0 inst2)\r\n              rules(inst1[any,any] = value_delta[i = 1, cv(begin_interval_time)],\r\n                    inst2[any,any] = value_delta[i = 2, cv(begin_interval_time)])\r\n            order by 1)\r\n  where (inst1>= 0 and inst2>= 0)) c\r\n  where a.begin_interval_time = b.begin_interval_time\r\n    and a.begin_interval_time = c.begin_interval_time\r\n<\/pre>\n<p>When I ran what is above, I found a couple of periods where CR_BUSY was higher than for others.  I went to the dba_hist_sql_stat view for one of the periods using something similar to what is below.<\/p>\n<pre lang=\"sql\" line=\"1\">\r\nselect * \r\n  from (select dbms_lob.substr(sql_text,4000,1), \r\n               dhss.sql_id,\r\n               to_char(begin_interval_time,'YYYY_MM_DD HH24:MI'), \r\n               dhs.instance_number,\r\n               clwait_delta  \r\n          from dba_hist_sqltext dhst, \r\n               dba_hist_snapshot dhs, \r\n               dba_hist_sqlstat dhss \r\n          where dhst.sql_id = dhss.sql_id \r\n            and dhss.snap_id = dhs.snap_id \r\n            and dhss.instance_number = dhs.instance_number \r\n            and dhs.begin_interval_time between to_date('2009042522','YYYYMMDDHH24') \r\n                                            and to_date('2009042601','YYYYMMDDHH24')\r\n  order by 5 desc)\r\n  where rownum <= 10\r\n<\/pre>\n<p>A statement bubbled to the top, where I could begin further investigation.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>We were initially taught by oracle that RAC was a panacea. Just add nodes on the fly, and the workload of a database instance will be automatically distributed over all available serves in the cluster. However, as time wore on,&hellip;<\/p>\n<p class=\"more-link-p\"><a class=\"more-link\" href=\"http:\/\/appcrawler.com\/wordpress\/2009\/05\/04\/analyzing-cache-transfers-between-instances-in-a-rac\/\">Read more &rarr;<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_mi_skip_tracking":false,"footnotes":""},"categories":[19,22],"tags":[],"_links":{"self":[{"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/posts\/89"}],"collection":[{"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/comments?post=89"}],"version-history":[{"count":8,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/posts\/89\/revisions"}],"predecessor-version":[{"id":1300,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/posts\/89\/revisions\/1300"}],"wp:attachment":[{"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/media?parent=89"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/categories?post=89"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/appcrawler.com\/wordpress\/wp-json\/wp\/v2\/tags?post=89"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}