Wednesday, May 10, 2023

ZDLRA real-time redo demonstrated

One of the key features of the ZDLRA is the ability to capture changes from the database in real time, just like a standby database does. In this blog post I am going to demonstrate what happens during this process so that you can get a better understanding of how it works.

ZDLRA Real-time Redo


The GIF above illustrates the flow. I will explain what is happening, and then show the process with a demo.

The ZDLRA uses the same process as a standby database. In fact, if you follow the flow of the real-time redo, you will notice the redo blocks are sent BOTH to the local redo log files AND to the staging area on the ZDLRA. The staging area on the ZDLRA acts just like a standby redo log does on a standby database.

As the ZDLRA receives the redo blocks from the protected database, it validates them to confirm they contain well-formed Oracle redo block information. This ensures that a man-in-the-middle attack cannot alter any of the backup information. The validation also ensures that if the database is attacked by ransomware (which changes blocks), the redo received is not tainted.


The next piece of the process is what happens when a LOG SWITCH occurs. As we all know, when a log switch occurs on a database instance, the contents of the redo log are written to an archive log. With real-time redo, the log switch also causes the contents of the redo staging area on the ZDLRA (picture a standby redo log) to become a backup set of an archive log. The RMAN catalog on the ZDLRA is then updated with the internal location of the backup set.


Log switch operation

I am going to walk through a demo of what you see when this process occurs.

ZDLRA is configured as a redo destination

Below you can see that my database has "Log archive destination" 3 configured. The destination itself is the database on the ZDLRA (zdl9). Also notice that the log information will be sent for ALL_ROLES, meaning it is sent regardless of whether the database is a primary or a standby.
Archive Dest
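
For reference, the underlying setting is a standard remote log archive destination. Below is a minimal sketch of what it can look like; the service string is a placeholder, and a real ZDLRA setup also involves redo transport credentials and LOG_ARCHIVE_CONFIG:

    SQL> ALTER SYSTEM SET log_archive_dest_3=
           'SERVICE=<zdlra-scan>:1521/zdl9 ASYNC NOAFFIRM
            VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=zdl9'
           SCOPE=BOTH;
    SQL> ALTER SYSTEM SET log_archive_dest_state_3=ENABLE SCOPE=BOTH;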


List backup of recent archive logs from RMAN catalog


Before I demonstrate what happens with the RMAN catalog, I am going to list the current archive log backups. Below you can see that the most recent archive log backed up to the ZDLRA has SEQUENCE #10.

archive log backups prior
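
If you want to reproduce this step, a LIST command similar to the one below shows the recent archive log backups (the one-day window is just an example):

    RMAN> LIST BACKUP OF ARCHIVELOG ALL COMPLETED AFTER 'sysdate-1';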

Perform a log switch

As you see in the animation at the top of the post, when a log switch occurs, the contents of the redo log in the "redo staging area" are used to create an archive log backup that is stored and cataloged.  I am going to perform a log switch to force this process.

Log switch
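
The log switch itself is just the standard command:

    SQL> ALTER SYSTEM SWITCH LOGFILE;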


List backup of archive logs from RMAN catalog

Now that the log switch has occurred, you can see below that there is a new backup set created from the redo staging area.
There are a couple of interesting items to note when you look at the backup set created.

archive logs after


  1. The backup of the archive log is compressed. As part of the policy on the ZDLRA, you have the option to have the backup of the archive log compressed when it is created from the "staged redo". This does NOT require the ACO (Advanced Compression) license. The compressed archive log will be sent back to the DB host compressed during a restore operation, and the DB host will uncompress it. Standard compression is the default option, and I recommend changing it; if you decide to compress, then MEDIUM or LOW is recommended. Keep in mind that this may put more workload on the client to uncompress the backup sets, which may affect recovery times. NOTE: When using TDE, there will be little to no compression possible.
  2. The TAG is automatically generated. If you compare it with the timestamp in the RMAN catalog information, you can see that the TAG is built from that timestamp to make it unique.
  3. The handle begins with "$RSCN_". This is because the backup piece was generated by the ZDLRA itself; archive log backup sets created this way will begin with these characters.

Restore and Recovery using partial log information


Now I am going to demonstrate what happens when the database crashes, and there is no time for the database to perform a log switch.

List the active redo log and current SCN

Below you can see that my currently active redo log is sequence #12. This is where I am going to begin my test.

begin test
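
To capture this starting point yourself, two quick queries are enough:

    SQL> SELECT sequence#, status FROM v$log;
    SQL> SELECT current_scn FROM v$database;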


Create a table 

To demonstrate what happens when the database crashes, I am going to create a new table. In the table I am going to store the current date and the current SCN. Using that SCN, we will be able to determine which redo log contains the table creation.

table create
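
The table creation can be as simple as the statement below (rtr_test is just an illustrative name); it stores the date and the SCN in one shot:

    SQL> CREATE TABLE rtr_test AS
           SELECT sysdate AS created, current_scn
           FROM v$database;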


Abort the database


As you probably know, if I shut down the database gracefully, the DB will automatically clean out the redo logs and archive their contents. Because I want to demonstrate what happens with a crash, I am going to shut the database down with an ABORT to ensure the log switch doesn't occur, and then start the database in MOUNT mode so I can look at the current redo log information.

abort
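
The commands for this step are the standard ones:

    SQL> SHUTDOWN ABORT
    SQL> STARTUP MOUNT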


Verify that the log switch did not occur


Next I am going to look at the redo log information and verify that my table creation (SCN 32908369) is still in the active redo log and did not get archived during the shutdown.

Log switch doesn't occur
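
A quick way to verify this is to check v$log: sequence #12 should still show ARCHIVED=NO with a status of CURRENT, and its first_change# should be at or below the SCN stored in the table:

    SQL> SELECT sequence#, first_change#, archived, status
         FROM v$log;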

Restore the database


Next I am going to restore the database from backup.


restore
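
With the database mounted, this is a plain RMAN restore:

    RMAN> RESTORE DATABASE;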

Recover the database


This is where the magic occurs, so I am going to show what happens step by step.

Recover using archive logs on disk


The first step of the recovery uses the archive logs already on disk. You can see in the screenshot below that the database recovers using archive logs on disk up through sequence #11 for thread 1. This covers all of the archived changes for the thread, but does not include what is in redo log sequence #12, which contains the create table we are interested in.

archives on disk

Recover using partial redo log


This step is where the magic of the ZDLRA occurs. You can see from the screenshot below that the RMAN catalog on the ZDLRA returns redo log information for sequence #12, even though it was never archived. The ZDLRA was able to create an archive log backup from the partial contents it had in the redo staging area.

rtr recovery
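
From the RMAN prompt this is still just a normal recover; the difference is where the last piece of redo comes from. The handle of the sequence #12 backup piece starts with $RSCN_, since the ZDLRA generated it from the staged redo:

    RMAN> RECOVER DATABASE;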

Open the database and display the table contents


This is where it all comes together. Using the partial redo log information from redo log sequence #12, you can see that when the database is opened, the table creation transaction is indeed in the database, even though that redo never became an archive log on the protected database.
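
Since recovery in a scenario like this stops at the end of the partial redo (an incomplete recovery), the open is done with RESETLOGS, and the table from earlier (my illustrative rtr_test) can then be queried:

    SQL> ALTER DATABASE OPEN RESETLOGS;
    SQL> SELECT * FROM rtr_test;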


Conclusion: I hope this post gives you a better idea of how real-time redo works on the ZDLRA, and how it handles recovering transactions after a database crash.

Thursday, March 23, 2023

Why DBCS (Oracle Base Database Service) in OCI can make a DBA's life much easier (even with BYOL)

DBCS (now named Oracle Base Database Service, but I will call it DBCS throughout this post) in OCI can help make a DBA's life easier. When I was testing the new Autonomous Recovery Service for Oracle Database in OCI, I created a LOT of different DBCS systems to test backup and recovery. Along the way I learned a lot about the workings of DBCS, and I came to appreciate how much sense it makes, even if you are a BYOL (bring your own license) customer.




I'm more of an "old school" DBA, preferring the command line and my own scripted processes, and I am typically not a fan of automation. When using DBCS I was surprised by all the things it would do for me that I would otherwise have to do manually.

Install Oracle software and create a database

Having installed Oracle software hundreds of times, and having created plenty of test databases, I didn't think I would care much about automation that did this for me.

Central Software image management

What I found in OCI is that you can create your own software images to ensure each new database environment is consistent. OCI gives you the ability to create your own set of release images (which can include patches). This ensures that each time I create a new DBCS environment and choose my custom image, it is running the same version as every other environment. No more installing the base release, then patches, and then any possible one-off patches. This makes the installation of the database software much, much easier, and it ensures consistency.


Easy Database creation

Recently I've gotten familiar with performing silent database creations, since dbca isn't always easy to configure interactively. The tooling provided by DBCS will not only create a database for you, but will also configure TDE encryption (with a local wallet, or using OCI Vault). It can even create a RAC database across 2 nodes. And don't forget, it can create the standby for me also.
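
For comparison, this is roughly what a minimal silent creation looks like when you do it yourself; the flags below are a sketch, and a real run needs more options (passwords, memory settings, character set):

    dbca -silent -createDatabase \
         -templateName General_Purpose.dbc \
         -gdbName mydb -sid mydb \
         -storageType ASM -diskGroupName +DATA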


Configure ASM storage

Now this is the most interesting piece I found when using DBCS. Not only does the DBCS service create a disk group, it also automatically stripes multiple block volumes together, maximizing performance. This is a HUGE help in ensuring I am getting the best performance.
When I was going through what the configuration did, I built tables showing how the different storage sizes translate to storage configurations.
There are two sets of configurations and DB data storage sizes: one for Flex shapes, and one for Standard shapes.

Flex


First I looked at Flex; regardless of the performance level, these were the sizes.


Then within Flex, I looked at the "Balanced performance" configuration.

Balanced Performance configuration





You can see that as the available DB storage goes up, the number of disks goes up as well, allowing for higher possible IOPS than you would get from a single Block Storage device.

Below is the chart for "High Performance"

High Performance configuration



You can see that the IOPS is even higher, and it is using even more disks to get that performance.

Standard


Next I looked at Standard shapes; regardless of the performance level, these were the sizes. Note that with Standard shapes there are many more configuration options.


Balanced Performance configuration





High Performance configuration






Benefits of DBCS

I also went through what some of the other benefits of DBCS are, and below is the list I came up with.

  • When using the DBCS service, the storage cost is based on the Block Storage cost, the same cost you would pay in an IaaS service. Having the storage striped and configured for maximum IOPS makes this a huge plus.

  • DBCS allows you to purchase licenses if you don't have enough licenses to use the BYOL option.

  • The DBCS service price is based on OCPU and is the same regardless of the shape. Memory is included in the OCPU cost.

  • DBCS automatically configures RAC if you choose it.

  • DBCS provides tooling that automatically configures backups, can apply patches, and rotate encryption keys.

  • DBCS allows you to automate the cloning of your database, and automate any restores.

  • DBCS includes TDE, relieving you of having to own the ASO (Advanced Security Option) license.

Conclusion:

DBCS offers a lot more than you might realize. Take a deep dive into what it can do to save you time as a DBA, and you might find that tooling along with automation has its benefits.


Wednesday, March 1, 2023

Oracle Database recovery using Incremental merge, snapshots, OMF and "switch to copy"

I work with backup and recovery of the Oracle Database, and sometimes this means looking at the incremental merge backup strategy. I know this isn't the best backup/recovery strategy, and below are a few posts giving more detail on the topic.

They have some great points, and I typically don't recommend using incremental merge backups.  The incremental merge backup strategy is almost always paired with snapshots to increase the recovery window.
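
As a refresher, the strategy is usually driven by a pair of RMAN commands run on a schedule, along the lines of the sketch below (the tag incr_update is the same one reused later in this post):

    RMAN> RECOVER COPY OF DATABASE WITH TAG 'incr_update';
    RMAN> BACKUP INCREMENTAL LEVEL 1
          FOR RECOVER OF COPY WITH TAG 'incr_update'
          DATABASE;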

Below is an image of how these are typically paired with snapshots.



One of the biggest draws of using the incremental merge strategy with snapshots, is the ability to perform a "switch to copy" as a recovery strategy.
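
At its simplest, a whole-database switch to copy is only a few RMAN commands; this is a sketch assuming the database is mounted and the image copies are cataloged:

    RMAN> SWITCH DATABASE TO COPY;
    RMAN> RECOVER DATABASE;
    RMAN> ALTER DATABASE OPEN;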

NOTE: When you perform a "switch to copy", the database is then accessing its datafiles from the backup copy. On Exadata, this is not supported for any storage other than Oracle ZFS.

If you review the MOS note "Using External Storage with Exadata (Doc ID 2663308.1)" you will find that "Use of non-Oracle storage for database files is not supported."

Given all of that, I got the question: "I am using the incremental merge strategy on an Oracle ZFS appliance using snapshots. If I perform a switch-to-copy recovery of one or more datafiles, how do I avoid forcing a new full backup on the next incremental merge backup?"

I thought this was a great question, so I created a test database and started googlin'. Below are some of the posts I looked at.

I started by using the first post and walked through a testing scenario using a DBCS database in OCI.
My database was a 19.8 database using local storage (to make it easier to see the datafiles), and it was using OMF by default.
The piece that was missing from the first post was the "alter database move datafile 'xxx' to 'xxx' KEEP;" command.

What I found is that it wasn't so easy with my database using OMF, for 2 reasons.
  1. With OMF, you don't specify the "to 'xxx'" clause, since OMF will automatically name the destination datafile.
  2. "KEEP" is ignored when the source file is OMF. This means the original image copy being used by the database is removed when the move completes, so I couldn't catalog the image copy.
Since it took a bit of research to find the best strategy, I wanted to share the process I would recommend when dealing with OMF and non-OMF image copy backups with snapshots.

NON-OMF image copy backups

  1. Snap the backup storage just to preserve the starting point --> optional but recommended.
  2. Take the tablespaces offline.
  3. Perform a "switch to copy" of the datafiles --> This will use the incremental merge backup.
  4. Recover the datafiles.
  5. Bring the tablespaces online --> The application is now running on the external image copy.
  6. Perform an "alter database move datafile 'xxx' to 'xxx' KEEP;" --> Using KEEP preserves the original copy, but it only works if the image copy is NOT OMF. If the destination is an OMF file, you do not use the "to" clause.
  7. Catalog the image copy that was preserved with the "KEEP", ensuring you use the same tag used for the incremental merge: "catalog datafilecopy '+fra/ocm121/datafile/MONSTER.346.919594959' level 0 TAG 'incr_update';"
  8. The next incremental merge will pick up with the updated image copy. A condensed sketch of this flow follows below.
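
Here is that sketch for a single tablespace; the tablespace name, paths, and datafile names are hypothetical, and the tag matches the incremental merge tag used above:

    RMAN> SQL 'ALTER TABLESPACE users OFFLINE IMMEDIATE';
    RMAN> SWITCH TABLESPACE users TO COPY;
    RMAN> RECOVER TABLESPACE users;
    RMAN> SQL 'ALTER TABLESPACE users ONLINE';
    -- move the datafile back home, preserving the non-OMF image copy with KEEP
    SQL>  ALTER DATABASE MOVE DATAFILE '/backup/MYDB/users01.dbf'
            TO '/u02/oradata/MYDB/users01.dbf' KEEP;
    -- re-catalog the preserved copy under the incremental merge tag
    RMAN> CATALOG DATAFILECOPY '/backup/MYDB/users01.dbf'
          LEVEL 0 TAG 'incr_update';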

OMF image copy backups

  1. Snap the backup storage to create a copy for the switch to copy.
  2. Unmount the "current" image copy.
  3. Mount the snap copy using the same mount point as the "current" image copy.
  4. Take the tablespaces offline.
  5. Perform a "switch to copy" of the datafiles --> This will use the snap copy of the incremental merge backup on external storage.
  6. Recover the datafiles.
  7. Bring the tablespaces online --> The application is now running on the external copies.
  8. Perform an "alter database move datafile 'xxx';" --> Since the source is an OMF file, you cannot use "KEEP" to preserve the original copy; the original copy will be removed (see the sketch at the end of this post).
  9. Once all moves are complete, unmount the snapped copy.
  10. Mount the "current" copy. This copy is as of when you started this process.
  11. Catalog the image copy for all datafiles that performed the "switch to copy", ensuring you use the same tag used for the incremental merge: "catalog datafilecopy '+fra/ocm121/datafile/MONSTER.346.919594959' level 0 TAG 'incr_update';"
  12. You can now destroy the snap that was created to perform the switch to copy.
  13. The next incremental merge will pick up with the current image datafile copies where it left off.
As you can see, using OMF greatly complicates preserving the incremental merge backup, and forces you to pick back up from the last backup.
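
For completeness, here is what the move in step 8 of the OMF flow (referenced above) looks like; the OMF file name is hypothetical:

    -- no TO clause: OMF names the new file under db_create_file_dest, and
    -- KEEP is ignored for an OMF source, so the snap copy is removed
    SQL> ALTER DATABASE MOVE DATAFILE
           '/backup_snap/MYDB/datafile/o1_mf_users_k1abc2_.dbf';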