Friday, October 1, 2010

Exadata Storage Software

How else should I spend a Friday night, other than drinking hard cider and running performance numbers on the Exadata storage software?

This is my dive into the storage software, and WOW is it impressive. I am selecting from a 200+ million row table (no indexes). Without the storage software it takes 3 minutes to scan the whole table, which is really impressive. Then with storage indexing it takes 30 seconds to come back with a distinct column value. That is 6 times faster.

Then I was really impressed when I used a unique key lookup. With no index, it took 8 seconds to find the data with the storage software, compared to 189 seconds without it. That is about 23x faster with the storage software.

Next I made the table parallel 64. Now it comes back in 3 seconds with no storage software, and in 1 second with the storage software. Unbelievable numbers.
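For anyone who wants to check the ratios, here is a quick sketch of the arithmetic in Python. The timings are just the ones quoted above; the `speedup` helper and the labels are names made up for illustration, not anything from the Exadata tooling.

```python
# Timings in seconds, copied from the numbers quoted in this post.
# Each entry is (without storage software, with storage software).
timings = {
    "distinct column value": (180, 30),  # 3 minutes vs. 30 seconds
    "unique key lookup": (189, 8),
    "parallel 64": (3, 1),
}


def speedup(without_s, with_s):
    """Return how many times faster the second timing is than the first."""
    return without_s / with_s


for name, (without_s, with_s) in timings.items():
    print(f"{name}: {speedup(without_s, with_s):.1f}x faster")
```

The unique key lookup works out to a little over 23x, which is where the figure above comes from.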

One of the first things I noticed is that the Exadata makes you rethink your redo log sizes. When loading data, a lot of my waits are on the redo flushing out, because the redo logs are so small.

All in all the storage software looks pretty impressive.

Saturday, September 25, 2010

Rear view of an Exadata


Why HCC is Exadata only

First, the physics.

The SAS drives spinning at 15k rpm can produce 200 MB of data/second. Hybrid Columnar Compression gets, on average, a 50x compression rate. 200 MB x 50 = 10 GB of uncompressed data PER DISK is read every second.

There are 100+ disks that can be read. This causes 2 issues.

a) The data is actually compressed/uncompressed at the storage tier. All the CPUs in the storage servers are utilized to make this happen. Only the Exadata can take advantage of the storage CPUs through the storage software.

b) The uncompressed data is huge. The disks can return 20.8 GB of data per second, but if you do get 50x compression, you are now trying to work with 1 TB of data/second.

Even if you are running InfiniBand, the system can't handle this volume of data. The predicate elimination and column elimination will limit the data returned from the storage tier, making it possible to process the data.
Without the storage software, along with the CPUs at the storage level uncompressing data AND eliminating data, it is impossible to process the volume of data produced from HCC.
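
The back-of-the-envelope math above can be sketched in a few lines of Python. The inputs are the figures quoted in this post (200 MB/second per disk, 50x compression, 20.8 GB/second across the disks); the variable names, the `uncompressed_rate` helper, and the use of 1 GB = 1000 MB are assumptions made for illustration.

```python
MB_PER_GB = 1000  # assumption: decimal units, as in the rough math above

per_disk_mb_s = 200        # compressed data one disk can produce per second
compression_ratio = 50     # average HCC compression rate quoted above
all_disks_gb_s = 20.8      # compressed data all the disks can return per second


def uncompressed_rate(compressed_rate, ratio):
    """Data volume per second once the compressed stream is expanded."""
    return compressed_rate * ratio


# One disk: 200 MB/s compressed expands to 10 GB/s.
per_disk_gb_s = uncompressed_rate(per_disk_mb_s, compression_ratio) / MB_PER_GB

# All disks: 20.8 GB/s compressed expands to roughly 1 TB/s.
all_disks_uncompressed_gb_s = uncompressed_rate(all_disks_gb_s, compression_ratio)

print(f"per disk: {per_disk_gb_s:.0f} GB/s uncompressed")
print(f"all disks: {all_disks_uncompressed_gb_s:.0f} GB/s uncompressed")
```

The second number is the one that matters: roughly a terabyte per second is what the database tier would have to swallow if nothing were filtered out at the storage tier first.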