Thursday, June 2, 2011

Cardinality and the in clause

I was running a query that had an IN clause, and the cardinality estimate turned out to be wrong. After investigating, I found that the optimizer was double-applying the selectivity of the index.
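For context, here is a minimal setup sketch that could produce a similar test case. The DDL below is my assumption, not the original schema; all the plans really tell us is that tab1 has a unique index TAB1_PK on col1 and tab2 has a non-unique index TAB2_IX on col2, with tab2 holding many rows per value.

-- hypothetical setup: object names make convenient test data, since the
-- values in the IN list (WRH$_...) are AWR object names
create table system.tab1 as
  select distinct object_name col1 from dba_objects;

create unique index system.tab1_pk on system.tab1 (col1);

-- roughly 25 copies of every name, so each value returns about 25 rows
create table system.tab2 as
  select o.object_name col2
  from   dba_objects o,
         (select level n from dual connect by level <= 25);

create index system.tab2_ix on system.tab2 (col2);

exec dbms_stats.gather_table_stats('SYSTEM','TAB1')
exec dbms_stats.gather_table_stats('SYSTEM','TAB2')

The exact counts will differ on your system, but the shape of the plans should be similar.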

Here is my example:

select * from system.tab1 a,system.tab2 b
where a.col1 IN ('WRH$_PARAMETER_PK','WRH$_SEG_STAT','WRH$_DB_CACHE_ADVICE')
and col1=col2;

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 54186084

------------------------------------------------------------------------------------------
| Id  | Operation                      | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |         |     3 |   162 |    14   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                  |         |       |       |            |          |
|   2 |   NESTED LOOPS                 |         |     3 |   162 |    14   (0)| 00:00:01 |
|   3 |    INLIST ITERATOR             |         |       |       |            |          |
|   4 |     TABLE ACCESS BY INDEX ROWID| TAB1    |     3 |    75 |     7   (0)| 00:00:01 |
|*  5 |      INDEX UNIQUE SCAN         | TAB1_PK |     3 |       |     4   (0)| 00:00:01 |
|*  6 |    INDEX RANGE SCAN            | TAB2_IX |     1 |       |     2   (0)| 00:00:01 |
|   7 |   TABLE ACCESS BY INDEX ROWID  | TAB2    |     1 |    29 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("A"."COL1"='WRH$_DB_CACHE_ADVICE' OR "A"."COL1"='WRH$_PARAMETER_PK'
              OR "A"."COL1"='WRH$_SEG_STAT')
   6 - access("COL1"="COL2")
       filter("COL2"='WRH$_DB_CACHE_ADVICE' OR "COL2"='WRH$_PARAMETER_PK' OR
              "COL2"='WRH$_SEG_STAT')

23 rows selected.


Notice the cardinality of 3 rows. What made me wonder is what happened when I used just an '=' sign:

select * from system.tab1 a,system.tab2 b
where col1 ='WRH$_DB_CACHE_ADVICE'
and col1=col2;

------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1964798218

----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |    26 |  1404 |    23   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |         |    26 |  1404 |    23   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| TAB1    |     1 |    25 |     2   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | TAB1_PK |     1 |       |     1   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID| TAB2    |    26 |   754 |    21   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | TAB2_IX |    26 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("COL1"='WRH$_DB_CACHE_ADVICE')
   5 - access("COL2"='WRH$_DB_CACHE_ADVICE')



Notice the cardinality is 26? Why does the optimizer change the cardinality from 26 to 3 when I include an IN clause with additional values? I would expect the cardinality to be 78 (26 rows per value * 3 values). In fact, when I use a union it comes back about right.

select * from system.tab1 a,system.tab2 b

where col1 ='WRH$_DB_CACHE_ADVICE'
and col1=col2
union
select * from system.tab1 a,system.tab2 b
where col1 ='WRH$_SEG_STAT'
and col1=col2
union
select * from system.tab1 a,system.tab2 b
where col1 ='WRH$_PARAMETER_PK'
and col1=col2;

------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2805776637

---------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                    |    77 |  7014 |    71  (68)| 00:00:01 |
|   1 |  SORT UNIQUE                       |                    |    77 |  7014 |    71  (68)| 00:00:01 |
|   2 |   UNION-ALL                        |                    |       |       |            |          |
|   3 |    NESTED LOOPS                    |                    |    26 |  1404 |    23   (0)| 00:00:01 |
|   4 |     TABLE ACCESS BY INDEX ROWID    | TAB1               |     1 |    25 |     2   (0)| 00:00:01 |
|*  5 |      INDEX UNIQUE SCAN             | TAB1_PK            |     1 |       |     1   (0)| 00:00:01 |
|   6 |     TABLE ACCESS BY INDEX ROWID    | TAB2               |    26 |   754 |    21   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN              | TAB2_IX            |    26 |       |     2   (0)| 00:00:01 |
|   8 |    NESTED LOOPS                    |                    |       |       |            |          |
|   9 |     NESTED LOOPS                   |                    |    51 |  5610 |    46   (0)| 00:00:01 |
|  10 |      VIEW                          | VW_JF_SET$623BBB07 |     2 |   162 |     4   (0)| 00:00:01 |
|  11 |       SORT UNIQUE                  |                    |     2 |    50 |     4  (50)| 00:00:01 |
|  12 |        UNION-ALL                   |                    |       |       |            |          |
|  13 |         TABLE ACCESS BY INDEX ROWID| TAB1               |     1 |    25 |     2   (0)| 00:00:01 |
|* 14 |          INDEX UNIQUE SCAN         | TAB1_PK            |     1 |       |     1   (0)| 00:00:01 |
|  15 |         TABLE ACCESS BY INDEX ROWID| TAB1               |     1 |    25 |     2   (0)| 00:00:01 |
|* 16 |          INDEX UNIQUE SCAN         | TAB1_PK            |     1 |       |     1   (0)| 00:00:01 |
|* 17 |      INDEX RANGE SCAN              | TAB2_IX            |    26 |       |     2   (0)| 00:00:01 |
|  18 |     TABLE ACCESS BY INDEX ROWID    | TAB2               |    26 |   754 |    21   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("COL1"='WRH$_PARAMETER_PK')
   7 - access("COL2"='WRH$_PARAMETER_PK')
  14 - access("COL1"='WRH$_DB_CACHE_ADVICE')
  16 - access("COL1"='WRH$_SEG_STAT')
  17 - access("ITEM_1"="COL2")




So I ran a 10053 trace, and this is what I found.
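As an aside, if you want to capture the same kind of trace yourself, here is one way to do it (a sketch, assuming an 11g-style diagnostic setup; the event syntax is standard, but check the trace file location on your own system):

alter session set tracefile_identifier = 'card_test';  -- optional, makes the file easy to find
alter session set events '10053 trace name context forever, level 1';

explain plan for
select * from system.tab1 a,system.tab2 b
where a.col1 IN ('WRH$_PARAMETER_PK','WRH$_SEG_STAT','WRH$_DB_CACHE_ADVICE')
and col1=col2;

alter session set events '10053 trace name context off';

-- the trace file location is shown by:
-- select value from v$diag_info where name = 'Default Trace File';

The 10053 trace is only written when the statement is actually optimized (a hard parse), which is why I use explain plan here rather than re-running a cached statement.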

1) The table (tab2) has 1,280,422 rows with 49,840 distinct values.

Table Stats::

Table: TAB2 Alias: B
#Rows: 1280422 #Blks: 5972 AvgRowLen: 29.00 ChainCnt: 0.00
Column (#1): COL2(
AvgLen: 23 NDV: 49840 Nulls: 0 Density: 0.000020
Index Stats::
Index: TAB2_IX Col#: 1
LVLS: 2 #LB: 5787 #DK: 49840 LB/K: 1.00 DB/K: 17.00 CLUF: 884604.00
 
2) Oracle computes the cardinality for the table access as (1,280,422 / 49,840) * 3 = 77.
 
SINGLE TABLE ACCESS PATH



Single Table Cardinality Estimation for TAB2[B]
Table: TAB2 Alias: B
Card: Original: 1280422.000000 Rounded: 77 Computed: 77.07 Non Adjusted: 77.07
Access Path: TableScan
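The arithmetic is easy to verify: with no histogram, each IN-list value is expected to return #Rows/NDV rows, and there are three values in the list.

-- 1,280,422 rows / 49,840 distinct values = ~25.69 rows per value
-- 25.69 rows per value * 3 IN-list values = ~77  (matches "Rounded: 77" above)
select round(1280422 / 49840 * 3, 2) as tab2_card from dual;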
 
 
 
 
3) It then takes the cardinality of the first table (tab1) and the cardinality of the second table (tab2), and applies the join selectivity.
 
Join Card: 2.964306 = = outer (3.000000) * inner (77.071950) * sel (0.012821)
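Plugging the numbers from the trace into that formula shows where the estimate of 3 comes from: the inner cardinality of 77 already accounts for the three IN-list values, and then the join selectivity scales it right back down.

-- what the optimizer computed: 3 * 77.071950 * 0.012821 = ~2.96, which rounds to 3
select round(3 * 77.071950 * 0.012821, 2) as join_card from dual;

-- what I expected, based on the '=' plan: 26 rows per value * 3 values
select 26 * 3 as expected_join_card from dual;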
 
Here is my example and my 10053 trace


Wednesday, May 25, 2011

ORA-27303: additional information: requested interface xxx.xxx.xxx.xxx failed bind. Check output from ifconfig command

I am finally on the Exadata, and I made a simple mistake in my login script.

First I log on to the Exadata, and I can start and stop the default database. Everything looks fine. I then go to create my own database and I get this message:

ORA-27504: IPC error creating OSD context

ORA-27300: OS system dependent operation:bind_fail failed with status: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: skgxpvifconf
ORA-27303: additional information: requested interface xx.xx.xxx.xxx failed bind. Check output from ifconfig command

Huh? We are also having issues transferring files to the machine (scp's are stalling), so this looks like a network issue, right? Check output from ifconfig, bind failed for an IP address... all network-related clues.

Well, the issue turned out to be ORA_CRS_HOME being set incorrectly. Once I set it correctly, I was able to start up a new database and kick off my database duplicate process.

I just wanted to pass this along, and hopefully help someone else who gets this cryptic message.




Tuesday, May 24, 2011

Extended RAC vs Golden Gate


Every so often this question comes up... "Why don't we do Extended RAC?"

 

 
Well, here is part of my answer as to what we need to consider.

 
First, Extended RAC is setting up a RAC cluster across datacenters. Each datacenter has its own independent storage array. Some nodes are placed in one datacenter (A) and some are placed in the other (B). To keep it simple and for redundancy, let's say there are 2 nodes in each datacenter (4 total). These 4 nodes are all part of the same cluster, and they all share the same global cache. Disk writes are mirrored across datacenters. This means that all writes are synchronous, and each write must be acknowledged on both arrays (in both datacenters). Dark fiber is a must to accomplish this.

  
Issues? There are a number of concerns this brings up.

 

 
1) The global cache is shared across the datacenters. Any latency is felt by any processing that requires sharing data between nodes in opposite datacenters. This can cause some performance degradation.

 
2) Writes are synchronous across datacenters. This can increase the write time for any disk writes.

 
3) You are not protected against logical corruption, upgrade outages, etc.

 
Best practice is to also have a physical standby for HA and to allow patching (e.g., transient logical standby), etc.

 

 
In the end you end up with twice as much equipment, twice as much storage, and a much more complex system compared to just keeping a physical standby database. It is even recommended that your quorum be kept in a third datacenter.

 
The cost doubles, and the complexity doubles. The question you need to ask is: is it worth all this?

 
Another option is to utilize Golden Gate. Golden Gate gives you similar flexibility by going active-active across datacenters. You still need to have a physical standby, but the advantage of GG over Extended RAC is that the physical standby does not have to be the same class of server or storage ($$). The whole point of the physical standby is to keep the changes so that they can be sent to the other active cluster. The standby cluster never becomes a client-available primary.

 

 
I would summarize the two choices as follows:

 

 
Extended RAC

 
PROS
  • Guaranteed synchronous write across datacenters.

 CONS

 
  • Less availability for any database changes requiring a shutdown (parameter changes or patching).
  • Greater latency when data is sent between datacenters.
  • BCP requires a full-size footprint.
 
Golden Gate

 

 
PROS

  • The database remains available for changes that would otherwise require downtime.
  • Application releases can be applied in a rolling fashion.
  • Smaller footprint for BCP servers.
  • Less latency for reads and commits (though there is latency before data is visible).

 
CONS

 
  • Database updates are asynchronous. There will be a delay before an update is visible in the other datacenter (< 2 seconds).

 

 
Neither one of these will guarantee 99.999% uptime, but GG comes a lot closer. With Extended RAC, there is more planned downtime because it is a single database.
As you can see, the decision should be driven by how important it is for the application to have the data immediately available for reads across datacenters. If you can tolerate the < 2 second latency, GG is a better product. If the application can't tolerate the latency, Extended RAC is the only viable solution.