Some of my customers have been pushing for more availability in their Oracle database applications. They want to eliminate downtime completely even if they experience a site failure. Whether this is a real business requirement or a technology push, I’m not sure – I guess a bit of both.
Most of these customers have already implemented Oracle RAC (Real Application Clusters), which provides them with active/active server clustering for Oracle. If one of the servers in a RAC cluster fails, the others just keep running – no restart or recovery involved. This is a High Availability option, typically used within a local site.
For Disaster Recovery, most customers have some sort of replication in place: either storage replication (e.g. EMC SRDF/Synchronous or SRDF/Async) or Oracle Data Guard, which replicates data at the Oracle database level. This protects against site failures and offers zero or near-zero data loss for committed transactions in Oracle. Non-committed transactions are rolled back during the restart, and this is exactly one of the problems, by the way.