Bryan's Oracle Blog

Tuesday, May 12, 2020

ZDLRA and using stored scripts

One new item I learned with RMAN is the ability to utilize shared scripts. Using shared scripts can be very useful as you standardize your environment.

This is where it fits into the ZDLRA. The ZDLRA is an awesome product that helps you standardize your Oracle Backups, so adding shared scripts to your environment makes it even better !!

First, a little bit on shared scripts if you are new to them (I was).

Shared scripts are exactly as you expect by the name. They are a way to store a script in your RMAN catalog that can be shared by all the databases that utilize that RMAN catalog.

The list of commands to be used for shared scripts are.

[CREATE]/[CREATE OR REPLACE] {GLOBAL} SCRIPT
DELETE {GLOBAL} SCRIPT
EXECUTE {GLOBAL} SCRIPT
PRINT {GLOBAL} SCRIPT

Using the {GLOBAL} keyword is optional, and is only useful if you are are using multiple VPD users in your RMAN catalog. In a multi-VPD RMAN catalog, each VPD users can store scripts with the same name and they will be separate.

I will go through VPD users in a future post, as this is a topic on it's own.

Now let's go through how to best utilize stored scripts through an example. In my example, I have 2 ZDLRAs (let's pretend). With ZDLRA, the RMAN catalog is contained with the ZDLRA itself. This means that because I have 2 ZDLRAs, I have 2 RMAN catalogs. In this example, I will show how I can use a shared script to automatically switch backups to a different ZDLRA when the primary ZDLRA is not available during a patching window. Below is the picture of my fictitious environment.

In this case my backups go to ZDLRA_SF, and they are replicated to ZDLRA_NYC. My database is registered in both RMAN catalogs, but I only connect to the catalog on ZDLRA_SF if it is available.

Now let's create a script to put in each catalog that is specific to each ZDLRA.

Step #1 - add a script to both ZDLRAs that creates the channel configuration.

This is the script for the SF ZDLRA.

create or replace script ZDLRA_INCREMENTAL_BACKUP_L1
COMMENT "Normal level 1 incremental backups for San Francisco Primary ZDLRA"
  {
###
### Script to backup incremental level 1 to Primary San Francisco ZDLRA
###
###
###  First configure the channel for San Francisco ZDLRA
###
    CONFIGURE CHANNEL DEVICE TYPE 'SBT_TAPE' FORMAT '%d_%U' PARMS
      "SBT_LIBRARY=/u01/app/oracle/lib/libra.so,
      ENV=(RA_WALLET='location=file:/u01/app/oracle/wallet
       credential_alias=zdlrasf-scan:1521/zdlra:dedicated')";
###
###  Perform backup command
###
    backup device type sbt cumulative incremental level 1 filesperset 1 section size 64g database plus archivelog filesperset 32 not backed up;
   }


created script ZDLRA_INCREMENTAL_BACKUP_L1

This is the script for the NYC ZDLRA

create or replace script ZDLRA_INCREMENTAL_BACKUP_L1
COMMENT "Normal level 1 incremental backups for New York Primary ZDLRA"
  {
###
### Script to backup incremental level 1 to Primary New York ZDLRA
###
###
###  First configure the channel for New York ZDLRA
###
    CONFIGURE CHANNEL DEVICE TYPE 'SBT_TAPE' FORMAT '%d_%U' PARMS
      "SBT_LIBRARY=/u01/app/oracle/lib/libra.so,
      ENV=(RA_WALLET='location=file:/u01/app/oracle/wallet
       credential_alias=zdlranyc-scan:1521/zdlra:dedicated')";
###
###  Perform backup command
###
    backup device type sbt cumulative incremental level 1 filesperset 1 section size 64g database plus archivelog filesperset 32 not backed up;
   }

created script ZDLRA_INCREMENTAL_BACKUP_L1

Now, I have a script in each ZDLRA which allocates the channel properly for each ZDLRA.

Step #2 - Ensure the connection to the RMAN fails over if primary ZDLRA is not available.

The best way to accomplish this is the GDS - Global Data Services. In order to perform my test I am going to set up a failover in my tnsnames.ora file. Below is the entry in my file.


ZDLRA_SF=
       (DESCRIPTION_LIST=(FAILOVER=YES)(LOAD_BALANCE= NO)
          (DESCRIPTION=(FAILOVER= NO)
             (CONNECT_DATA=(SERVICE_NAME=ZDLRA_SF))
             (ADDRESS=(PROTOCOL=TCP)(HOST=localhost)(PORT=1522))
           )
          (DESCRIPTION=(FAILOVER= NO)
             (CONNECT_DATA=(SERVICE_NAME=ZDLRA_NYC))
             (ADDRESS=(PROTOCOL=TCP)(HOST=localhost)(PORT=1522))
           )
       )

Step #3 - Let's put it all together.

Since I don't have 2 ZDLRAs to test this, I will test it all by printing out the script that will be executed in RMAN.

I am going first going to show what happens under normal conditions. I am going to execute the following script I created in /tmp/rman.sql

print script  ZDLRA_INCREMENTAL_BACKUP_L1;
exit;

I am going run that script over and over to ensure I keep connecting properly. Below is the command.

 rman target / catalog rman/oracle@zdlra_sf @/tmp/rman.sql

Here is the output I get everytime showing that I am connecting and I will allocate channels to ZDLRA_SF

RMAN> print script  ZDLRA_INCREMENTAL_BACKUP_L1;
2> exit;
printing stored script: ZDLRA_INCREMENTAL_BACKUP_L1
{
###
### Script to backup incremental level 1 to Primary San Francisco ZDLRA
###
###
###  First configure the channel for San Francisco ZDLRA
###
    CONFIGURE CHANNEL DEVICE TYPE 'SBT_TAPE' FORMAT '%d_%U' PARMS
      "SBT_LIBRARY=/u01/app/oracle/lib/libra.so,
      ENV=(RA_WALLET='location=file:/u01/app/oracle/wallet
       credential_alias=zdlrasf-scan:1521/zdlra:dedicated')";
###
###  Perform backup command
###
    backup device type sbt cumulative incremental level 1 filesperset 1 section size 64g database plus archivelog filesperset 32 not backed up;
   }

Everything is working fine. Now let's simulate the primary being unavailable like below.

I shutdown the ZDLRA_SF database.

Now I am executing the same command as before.

 rman target / catalog rman/oracle@zdlra_sf @/tmp/rman.sql

My output now shows that it is using my New York ZDLRA.

RMAN> print script  ZDLRA_INCREMENTAL_BACKUP_L1;
2> exit;
printing stored script: ZDLRA_INCREMENTAL_BACKUP_L1
{
###
### Script to backup incremental level 1 to Primary New York ZDLRA
###
###
###  First configure the channel for New York ZDLRA
###
    CONFIGURE CHANNEL DEVICE TYPE 'SBT_TAPE' FORMAT '%d_%U' PARMS
      "SBT_LIBRARY=/u01/app/oracle/lib/libra.so,
      ENV=(RA_WALLET='location=file:/u01/app/oracle/wallet
       credential_alias=zdlranyc-scan:1521/zdlra:dedicated')";
###
###  Perform backup command
###
    backup device type sbt cumulative incremental level 1 filesperset 1 section size 64g database plus archivelog filesperset 32 not backed up;
   }

Recovery Manager complete.

SUMMARY :

With stored scripts, you can easily script a central backup script to be shared by all databases, and not have to worry about the individual channel configuration.

If there are any changes to the best practice for my backups (Doc ID 2176686.1), I just have to update the stored script in the RMAN catalog.

Wednesday, April 15, 2020

ZDLRA, Real-time redo compression

ZDLRA: Changes in the Protection Policy Compression Algorithms (Doc ID 2654539.1)

First I wanted to go through a little background on how compression works for archive logs.

As you probably know, database backups are automatically compressed. That is not necessarily the case for archive logs.

Archive logs can reach the ZDLRA by 2 different methods.

Real-time redo,
Log sweeps.

Regardless of how the Archive Logs reach the ZDLRA, they are stored as backup files.

Log Sweeps

Now for log sweeps, the backupset of archive logs is stored exactly as they are sent.

if you follow the best practices for backups, you send over archive logs as

- Plus archive logs not backed up filesperset=32;

As you can figure out, this creates a single, or multiple backupsets each backupset contains multiple archive logs up to a total of 32 archive logs per backupset.

Also note, that that by default the archive log backupsets are NOT compressed. You can RMAN compress the archive log backups (if you have the ACO license), but be careful because it is not recommended to compress the datafile backups.

Real-time redo

Real-time redo is where this MOS note takes affect. Up to the current release of ZDLRA, archive logs sent via real-time redo were compressed using basic compression. A "standby redo" log would mirror the redo log for the protected database, and a "BACKUP_ARCH" task would be queued to create a backupset from the "standby redo log". In the process of creating the RMAN backupset, basic compression was applied and a compress RMAN backupset is created.

This does a few things that I want to point to out.

During the creation of the archivelog backupset, CPU is used to compress the backupset, and the final result is a smaller backupset.
During the validation of backups for a database, CPU is used to uncompress the RMAN backupset to validate it. NOTE: this would also occur if you RMAN compress the archive logs during a log sweep.
When recovering a database, the compressed RMAN backupset is sent over the network (less network traffic), and the Database Host uncompresses the backupset to read the archive logs.

If you read this carefully, you will see that there a CPU cost for compression, and there is a space savings when using it.

New Feature to control log compression!

The new feature allows you to control the log compression for archive logs sent through real-time redo.

NOTE: When using compression on the ZDLRA, you don't need the ACO license.

Below is the information from the documentation.

log_compression_algorithm

The setting for the archivelog compression feature. 
This setting is used to adjust the compression level of NZDL/polled archivelog backups.


OFF means that the archivelogs will not be compressed. 
BASIC means the BASIC compression alogorithm will be used to compress the backups. 
LOW means the LOW compression alogorithm will be used to compress the backups. 
MEDIUM means the MEDIUM compression alogorithm will be used to compress the backups. 
HIGH means the HIGH compression alogorithm will be used to compress the backups.

This is a great lever to control what happens with archive logs.

Medium compression can be used if you have ample CPU and are concerned about space, OFF can be used if you have CPU concerns and ample space.

Remember, the deeper the compression, the higher the CPU. Like most things in IT, there are tradeoffs .

Monday, February 24, 2020

What are those processes in V$DATAGUARD_PROCESSES on the ZDLRA?

With 12.2 of Oracle there is a now detail in the V$DATAGUARD_PROCESS view on the downstream database. Keep in mind that the "downstream database" may not only be a standby database, it can include a Far Sync database, or a ZDLRA receiving Real-time-redo.

Below is a table of the processes that I've seen in this view from a ZDLRA, and I will go through what each one of these are.

NAME ACTION        CLIENT_ROLE      COUNT(*)
---- ------------ ---------------- ----------
rfs  IDLE          async SRL single      10
rfs  IDLE          async SRL multi       20
rfs  IDLE          async ORL single      30
rfs  IDLE          async ORL multi       40
rfs  IDLE          gap manager          100
rfs  IDLE          archive gap           10

First off. I want to point out what happens in a RAC environment. As you can image, each thread (node) in a RAC environment independently sends it's redo to the downstream. The same thing happens with a standby database sending it's redo to a downstream.
For example, if my primary DB cluster is a 4 Node RAC cluster, and I have a standby database that is a single node, redo sent from both the primary AND the standby cluster to a ZDLRA will be sent as 4 independent streams.

Now to walk through what each of these types of processes are, I will first define the types.
Since I am seeing multiple types, this example is from a ZDLRA (rather than a standby database) so I will define the terms from a ZDLRA standpoint.

SRL – Standby Redo Logs are sending the changes. These are standby databases

ORL – Online Redo Logs are sending the changes. These are primary databases

Multi - The Protected database is using the same log buffer to write both archive logs and REDO to the ZDLRA (and possibly another standby).

Single – The Protected database is using separate buffers for REDO to the ZDLRA. The ZDLRA could not keep up with the archive/standby database.

Async - Real-time redo processes

Gap manager – Process that keeps track of any gaps in redo, and can send archive logs to fill the gap.

Archive gap – Additional processes that send over any gaps in redo. This is controlled by the parameter “LOG_ARCHIVE_MAX_PROCESSES” on the protected databases. The Gap manager will send over the first archive log, but archive gap processes will be started in parallel with the Gap manager if more than 1 archive log is needed. This is controlled by the LOG_ARCHIVE_MAX_PROCESSES parameter which has a default value of 4.

So for my example, this is what we are seeing

“Async SRL single” processes of 10 is telling us that there are 10 standby treads being applied and sending real-time-redo using the same buffer on the protected database that the archive redo process is using.

Async SRL multi” processes of 20 is telling us that there are 20 standby threads being applied and sending real-time-redo that could not keep up with the archiver process and spawned an additional buffer.

A total of 30 combined processes for "Async SRL" is telling us that there are 30 threads sending real-time redo in total.

**** Notice I used the term "threads" above. The number of threads is the number of RAC nodes on the primary database regardless of how many instances are on the standby applying redo.

“Async ORL single” processes of 30 is telling us that there are 30 primary database instances sending real-time-redo using the same buffer on the protected database that the archive redo process is using.

“Async ORL Multi” processes of 40 is telling us that there are 40 primary database instances sending real-time-redo that could not keep up with the archiver process and spawned an additional buffer.

A total of 70 combined processes for "Async ORL" is telling us there are 70 primary instances sending real-time redo.

**** Notice I used the term "instances" above. The number of instances is the number of RAC nodes on the primary database rather than the number of databases.

"Gap Manager" process of 100 is telling us that there are 100 process that are monitoring the async process. Notice that the total number of "Gap Manager" processes matches the total number of async processes.

“archive gap” processes are temporary and should be killed once they are idle for 2 minutes since they completed their work. We should only see these processes if the ZDLRA falls behind in collecting redo.

I hope this helps explain what the processes are that are in use to manage the real-time redo.