Monday, July 24, 2023

RMAN - Create weekly archival backup from weekly full backups

 This blog post demonstrates a process to create KEEP archival backups dynamically by using backups pieces within a  weekly full/daily incremental backup strategy.  

Thanks to Battula Surya Shiva Prasad and Kameswara RaoIndrakanti for coming up process of doing this.


KEEP backups

First let's go through what keep a keep backup is and how it affects your backup strategy.

  1. A KEEP backup is a self-contained backupset.  The archive logs needed to de-fuzzy the database files are automatically included in the backupset.  
  2. The archive logs included in the backup are only the archive logs needed to de-fuzzy.
  3. The backup pieces in the KEEP backup (both datafile backups and included archive log pieces) are ignored in the normal incremental backup strategy, and in any log sweeps.
  4. When a recovery window is set in RMAN, KEEP backup pieces are ignored in any "Delete Obsolete" processing.
  5. KEEP backup pieces, once past the "until time" are removed using the "Delete expired" command.

Normal  process to create an archival KEEP backup.

  • Perform a weekly full backup and a daily incremental backup that are deleted using an RMAN recovery window.
  • Perform archive log backups with the full/incremental backups along with log sweeps. These are also deleted using the an RMAN recovery window.
  • One of these processes are used to create an archival KEEP backup.
    • A separate full KEEP backup is performed along with the normal weekly full backup
    • The weekly full backup (and archive logs based on tag) are copied to tape with "backup as backupset" and marked as "KEEP" backup pieces.

Issues with this process

  • The process of copying the full backup to tape using "backup as backupset" requires 2 copies of the same backup for a period of time.  You don't want to wait until the end of retention to copy it to tape.
  • If the KEEP full backups are stored on disk, along with the weekly full backups you cannot use the backup as backupset, you must perform a  second, separate backup.

Proposal to create a weekly KEEP backup

Problems with simple solution

The basic idea is that you perform a weekly full backup, along with daily incremental backups that are kept for 30 days. After the 30 day retention, just the full backups (along with archive logs to defuzzy) are kept for an additional 30 days.

The most obvious way to do this is to

  •  Set the RMAN retention 30 days
  • Create a weekly full backup that is a KEEP backup with an until time of 60 days in the future.
  • Create a daily incremental backup that NOT a keep backup.
  • Create archive backups as normal.
  • Allow delete obsolete to remove the "non-KEEP" backups after 30 days.
.
Unfortunately when you create an incremental backups, and there is only KEEP backups proceeding it, the incremental Level 1 backup is forced into an incremental level 0 backups.  And with delete obsolete, if you look through MOS note "RMAN Archival (KEEP) backups and Retention Policy (Doc ID 986382.1)" you find that the incremental backups and archive logs are kept for 60 days because there is no proceeding non-KEEP backup.


Solution

The solution is to use tags, mark the weekly full as a keep after a week, and use the "delete backups completed before tag='xx'" command.

Weekly full backup scripts

run
{
   backup archivelog all filesperset=20  tag ARCHIVE_ONLY delete input;
   change backup tag='INC_LEVEL_0'  keep until time 'sysdate+53';
   backup incremental level 0 database tag='INC_LEVEL_0' filesperset=20  plus archivelog filesperset=20 tag='INC_LEVEL_0';

  delete backup completed before 'sysdate-61' tag= 'INC_LEVEL_0';
  delete backup completed before 'sysdate-31' tag= 'INC_LEVEL_1';
  delete backup completed before 'sysdate-31' tag= 'ARCHIVE_ONLY';
}

Daily Incremental backup scripts

run
{
  backup incremental level 1 database tag='INC_LEVEL_1'  filesperset=20 plus archivelog filesperset=20 tag='INC_LEVEL_1';
}

Archive log sweep backup scripts

run
{
  backup archivelog all tag='ARCHIVE_ONLY' delete input;
}


Example

I then took these scripts, and built an example using a 7 day recovery window.  My full backup commands are below.
run
{
   backup archivelog all filesperset=20  tag ARCHIVE_ONLY delete input;
   change backup tag='INC_LEVEL_0'  keep until time 'sysdate+30';
   backup incremental level 0 database tag='INC_LEVEL_0' filesperset=20  plus archivelog filesperset=20 tag='INC_LEVEL_0';

  delete backup completed before 'sysdate-30' tag= 'INC_LEVEL_0';
  delete backup completed before 'sysdate-8' tag= 'INC_LEVEL_1';
  delete backup completed before 'sysdate-8' tag= 'ARCHIVE_ONLY';
}


First I am going to perform a weekly backup and incremental backups for 7 days to see how the settings affect the backup pieces in RMAN.

for Datafile #1.


 File# Checkpoint Time   Incr level Incr chg# Chkp chg# Incremental Typ Keep Keep until Keep options    Tag
------ ----------------- ---------- --------- --------- --------------- ---- ---------- --------------- ---------------
     3 06-01-23 00:00:06          0         0   3334337 FULL            NO                              INC_LEVEL_0
     3 06-02-23 00:00:03          1   3334337   3334513 INCR1           NO                              INC_LEVEL_1
     3 06-03-23 00:00:03          1   3334513   3334665 INCR1           NO                              INC_LEVEL_1
     3 06-04-23 00:00:03          1   3334665   3334805 INCR1           NO                              INC_LEVEL_1
     3 06-05-23 00:00:03          1   3334805   3334949 INCR1           NO                              INC_LEVEL_1
     3 06-06-23 00:00:03          1   3334949   3335094 INCR1           NO                              INC_LEVEL_1
     3 06-07-23 00:00:03          1   3335094   3335234 INCR1           NO                              INC_LEVEL_1

for  archive logs

Sequence# First chg# Next chg# Create Time       Keep Keep until Keep options    Tag
--------- ---------- --------- ----------------- ---- ---------- --------------- ---------------
      625    3333260   3334274 15-JUN-23         NO                              ARCHIVE_ONLY
      626    3334274   3334321 01-JUN-23         NO                              INC_LEVEL_0
      627    3334321   3334375 01-JUN-23         NO                              INC_LEVEL_0
      628    3334375   3334440 01-JUN-23         NO                              ARCHIVE_ONLY
      629    3334440   3334490 01-JUN-23         NO                              INC_LEVEL_1
      630    3334490   3334545 02-JUN-23         NO                              INC_LEVEL_1
      631    3334545   3334584 02-JUN-23         NO                              ARCHIVE_ONLY
      632    3334584   3334633 02-JUN-23         NO                              INC_LEVEL_1
      633    3334633   3334695 03-JUN-23         NO                              INC_LEVEL_1
      634    3334695   3334733 03-JUN-23         NO                              ARCHIVE_ONLY
      635    3334733   3334782 03-JUN-23         NO                              INC_LEVEL_1
      636    3334782   3334839 04-JUN-23         NO                              INC_LEVEL_1
      637    3334839   3334876 04-JUN-23         NO                              ARCHIVE_ONLY
      638    3334876   3334926 04-JUN-23         NO                              INC_LEVEL_1
      639    3334926   3334984 05-JUN-23         NO                              INC_LEVEL_1
      640    3334984   3335023 05-JUN-23         NO                              ARCHIVE_ONLY
      641    3335023   3335072 05-JUN-23         NO                              INC_LEVEL_1
      642    3335072   3335124 06-JUN-23         NO                              INC_LEVEL_1
      643    3335124   3335162 06-JUN-23         NO                              ARCHIVE_ONLY
      644    3335162   3335211 06-JUN-23         NO                              INC_LEVEL_1
      645    3335211   3335273 07-JUN-23         NO                              INC_LEVEL_1
      646    3335273   3335311 07-JUN-23         NO                              ARCHIVE_ONLY


Next I'm going to execute the weekly full backup script that changes the last backup to a keep backup to see how the settings affect the backup pieces in RMAN.

for Datafile #1.
 File# Checkpoint Time   Incr level Incr chg# Chkp chg# Incremental Typ Keep Keep until Keep options    Tag
------ ----------------- ---------- --------- --------- --------------- ---- ---------- --------------- ---------------
     3 06-01-23 00:00:06          0         0   3334337 FULL            YES  08-JUL-23  BACKUP_LOGS     INC_LEVEL_0
     3 06-02-23 00:00:03          1   3334337   3334513 INCR1           NO                              INC_LEVEL_1
     3 06-03-23 00:00:03          1   3334513   3334665 INCR1           NO                              INC_LEVEL_1
     3 06-04-23 00:00:03          1   3334665   3334805 INCR1           NO                              INC_LEVEL_1
     3 06-05-23 00:00:03          1   3334805   3334949 INCR1           NO                              INC_LEVEL_1
     3 06-06-23 00:00:03          1   3334949   3335094 INCR1           NO                              INC_LEVEL_1
     3 06-07-23 00:00:03          1   3335094   3335234 INCR1           NO                              INC_LEVEL_1
     3 06-08-23 00:00:07          0         0   3335715 FULL            NO                              INC_LEVEL_0


for archive logs


Sequence# First chg# Next chg# Create Time       Keep Keep until Keep options    Tag
--------- ---------- --------- ----------------- ---- ---------- --------------- ---------------
      625    3333260   3334274 15-JUN-23         NO                              ARCHIVE_ONLY
      626    3334274   3334321 01-JUN-23         YES  08-JUL-23  BACKUP_LOGS     INC_LEVEL_0
      627    3334321   3334375 01-JUN-23         YES  08-JUL-23  BACKUP_LOGS     INC_LEVEL_0
      628    3334375   3334440 01-JUN-23         NO                              ARCHIVE_ONLY
      629    3334440   3334490 01-JUN-23         NO                              INC_LEVEL_1
      630    3334490   3334545 02-JUN-23         NO                              INC_LEVEL_1
      631    3334545   3334584 02-JUN-23         NO                              ARCHIVE_ONLY
      632    3334584   3334633 02-JUN-23         NO                              INC_LEVEL_1
      633    3334633   3334695 03-JUN-23         NO                              INC_LEVEL_1
      634    3334695   3334733 03-JUN-23         NO                              ARCHIVE_ONLY
      635    3334733   3334782 03-JUN-23         NO                              INC_LEVEL_1
      636    3334782   3334839 04-JUN-23         NO                              INC_LEVEL_1
      637    3334839   3334876 04-JUN-23         NO                              ARCHIVE_ONLY
      638    3334876   3334926 04-JUN-23         NO                              INC_LEVEL_1
      639    3334926   3334984 05-JUN-23         NO                              INC_LEVEL_1
      640    3334984   3335023 05-JUN-23         NO                              ARCHIVE_ONLY
      641    3335023   3335072 05-JUN-23         NO                              INC_LEVEL_1
      642    3335072   3335124 06-JUN-23         NO                              INC_LEVEL_1
      643    3335124   3335162 06-JUN-23         NO                              ARCHIVE_ONLY
      644    3335162   3335211 06-JUN-23         NO                              INC_LEVEL_1
      645    3335211   3335273 07-JUN-23         NO                              INC_LEVEL_1
      646    3335273   3335311 07-JUN-23         NO                              ARCHIVE_ONLY
      647    3335311   3335652 07-JUN-23         NO                              ARCHIVE_ONLY
      648    3335652   3335699 08-JUN-23         NO                              INC_LEVEL_0
      649    3335699   3335760 08-JUN-23         NO                              INC_LEVEL_0
      650    3335760   3335833 08-JUN-23         NO                              ARCHIVE_ONLY


Finally I'm going to execute the weekly full backup script that changes the last backup to a keep backup and this time it will delete the older backup pieces to see how the settings affect the backup pieces in RMAN.

for Datafile #1.

File# Checkpoint Time   Incr level Incr chg# Chkp chg# Incremental Typ Keep Keep until Keep options    Tag
------ ----------------- ---------- --------- --------- --------------- ---- ---------- --------------- ---------------
     3 06-01-23 00:00:06          0         0   3334337 FULL            YES  15-JUL-23  BACKUP_LOGS     INC_LEVEL_0
     3 06-08-23 00:00:07          0         0   3335715 FULL            YES  15-JUL-23  BACKUP_LOGS     INC_LEVEL_0
     3 06-09-23 00:00:03          1   3335715   3336009 INCR1           NO                              INC_LEVEL_1
     3 06-10-23 00:00:03          1   3336009   3336183 INCR1           NO                              INC_LEVEL_1
     3 06-11-23 00:00:03          1   3336183   3336330 INCR1           NO                              INC_LEVEL_1
     3 06-12-23 00:00:03          1   3336330   3336470 INCR1           NO                              INC_LEVEL_1
     3 06-13-23 00:00:03          1   3336470   3336617 INCR1           NO                              INC_LEVEL_1
     3 06-14-23 00:00:04          1   3336617   3336757 INCR1           NO                              INC_LEVEL_1
     3 06-15-23 00:00:07          0         0   3336969 FULL            NO                              INC_LEVEL_0



for archive logs

Sequence# First chg# Next chg# Create Time       Keep Keep until Keep options    Tag
--------- ---------- --------- ----------------- ---- ---------- --------------- ---------------
      626    3334274   3334321 01-JUN-23         YES  15-JUL-23  BACKUP_LOGS     INC_LEVEL_0
      627    3334321   3334375 01-JUN-23         YES  15-JUL-23  BACKUP_LOGS     INC_LEVEL_0
      647    3335311   3335652 07-JUN-23         NO                              ARCHIVE_ONLY
      648    3335652   3335699 08-JUN-23         YES  15-JUL-23  BACKUP_LOGS     INC_LEVEL_0
      649    3335699   3335760 08-JUN-23         YES  15-JUL-23  BACKUP_LOGS     INC_LEVEL_0
      650    3335760   3335833 08-JUN-23         NO                              ARCHIVE_ONLY
      651    3335833   3335986 08-JUN-23         NO                              INC_LEVEL_1
      652    3335986   3336065 09-JUN-23         NO                              INC_LEVEL_1
      653    3336065   3336111 09-JUN-23         NO                              ARCHIVE_ONLY
      654    3336111   3336160 09-JUN-23         NO                              INC_LEVEL_1
      655    3336160   3336219 10-JUN-23         NO                              INC_LEVEL_1
      656    3336219   3336258 10-JUN-23         NO                              ARCHIVE_ONLY
      657    3336258   3336307 10-JUN-23         NO                              INC_LEVEL_1
      658    3336307   3336359 11-JUN-23         NO                              INC_LEVEL_1
      659    3336359   3336397 11-JUN-23         NO                              ARCHIVE_ONLY
      660    3336397   3336447 11-JUN-23         NO                              INC_LEVEL_1
      661    3336447   3336506 12-JUN-23         NO                              INC_LEVEL_1
      662    3336506   3336544 12-JUN-23         NO                              ARCHIVE_ONLY
      663    3336544   3336594 12-JUN-23         NO                              INC_LEVEL_1
      664    3336594   3336639 13-JUN-23         NO                              INC_LEVEL_1
      665    3336639   3336677 13-JUN-23         NO                              ARCHIVE_ONLY
      666    3336677   3336734 13-JUN-23         NO                              INC_LEVEL_1
      667    3336734   3336819 14-JUN-23         NO                              INC_LEVEL_1
      668    3336819   3336857 14-JUN-23         NO                              ARCHIVE_ONLY
      669    3336857   3336906 14-JUN-23         NO                              ARCHIVE_ONLY
      670    3336906   3336953 15-JUN-23         NO                              INC_LEVEL_0
      671    3336953   3337041 15-JUN-23         NO                              INC_LEVEL_0
      672    3337041   3337113 15-JUN-23         NO                              ARCHIVE_ONLY


Result

For my datafiles, I still have the weekly full backup, and it is a keep backup. For my archive logs, I still have the archive logs that were part of the full backup which are needed to de-fuzzy my backup.


Restore Test


Now for the final test using the next chg# on the June 1st archive logs 3334375;


RMAN> restore database until scn=3334375;

Starting restore at 15-JUN-23
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=259 device type=DISK
...
channel ORA_DISK_1: piece handle=/u01/ocidb/backups/da1tiok6_1450_1_1 tag=INC_LEVEL_0
channel ORA_DISK_1: restored backup piece 1
...
channel ORA_DISK_1: reading from backup piece /u01/ocidb/backups/db1tiola_1451_1_1
channel ORA_DISK_1: piece handle=/u01/ocidb/backups/db1tiola_1451_1_1 tag=INC_LEVEL_0
channel ORA_DISK_1: restored backup piece 1

RMAN> recover database until scn=3334375;
channel ORA_DISK_1: starting archived log restore to default destination
channel ORA_DISK_1: restoring archived log
archived log thread=1 sequence=627
channel ORA_DISK_1: reading from backup piece /u01/ocidb/backups/dd1tiom8_1453_1_1
channel ORA_DISK_1: piece handle=/u01/ocidb/backups/dd1tiom8_1453_1_1 tag=INC_LEVEL_0
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
archived log file name=/u01/app/oracle/product/19c/dbhome_1/dbs/arch1_627_1142178912.dbf thread=1 sequence=627
media recovery complete, elapsed time: 00:00:00
Finished recover at 15-JUN-23
RMAN> alter database open resetlogs;

Statement processed



Success !

Monday, June 5, 2023

Autotuned_reserved_space is a new feature on the ZDLRA that you should be using

 Autotuned_reserved_space is a new policy setting that got released with 21.1 and you should be using it. When I talk to customers about how to manage databases on a ZDLRA, the biggest confusion comes in when I talk about reserved space.  Reserved space needs to be understood, and properly managed. This new feature in 21.1 allows the ZDLRA to handle the reserved space for you, and I explain how to use it in this blog post.  First let's go through space usage, and reserved space in general.

space usage ZDLRA

Space usage on the ZDLRA. 


Recovery Window goal (which drives the space utilization)

The recovery window goal is set at the policy level, and this value (in days) is the number of days that you want to keep as a recovery window for all databases that are a member of this policy.  This will drive the space utilization.

Total space

The ZDLRA comes with all the space pre-allocated.  When you are looking at OEM, or in the SAR report you will see the total space listed. You want to make sure that you have enough space for your database backups and any incoming new backups.

Used Space

When the ZDLRA purges backups beyond the the Recovery Window Goal that you set, if does a bulk purge of backups.  This can be controlled by setting the maximum disk backup retention in days (which defaults to 1.5 times the recovery window goal).  Because of the bulk purge, more space is shown as used than is needed to support your recovery window goal.

Recovery Window Space

This is the amount space that is needed to support the recovery window goal.  Because, of the bulk purge, the recovery window space is less than the used space.


Reserved space

In order to control what happens with space, the concept of reserved space is used.  When a database is added to the ZLDRA, the reserved space value is set for this database.  This value should be updated regularly to ensure that there is enough space for the database backups to be stored.

The important things to know about reserved space are:
  • The sum of all the reserved space cannot be greater than the total space available on the ZDLRA.
  • When adding a new database, it's reserved space must fit within the unreserved space.
  • When a new database is added, the reserved space must be set to least the size of the database, and defaults to 2.5 times the size of the database.
  • The reserved space for a database needs to be at least the size of the largest datafile.
  • The reserved space should be larger than the amount of space needed to support the recovery window goal space for the database.  For databases with fluctuation, you need to reserve space for the peak usage. 
The reserved space serves two purposes when properly set
  1. It can be used to determine how much space is available for new database backups.
  2. If the ZDLRA determines that it does not have enough space to support the recovery window goal of the supported databases, space is reclaimed from databases whose reserved space is too small.
It is critical to keep the reserved space updated, and many customers have used an automated process to set the reserved space to "recovery window space needed" + 10%

Unfortunately configuring an automated process for all databases does not take into account any fluctuations in usage.  Let's say I have a database which is much busier at months end, I want to make that sure my reserved space is not adjusted down to the low value, I want it to stay adjusted based on the highest space usage value.

Autotuned_reserved_space 


This where autotuned reserved space can help you manage the reserved space.  This setting is controlled at the policy level.  

AUTOTUNED_RESERVED_SPACE

This value is set at the protection policy level and contains either "YES" or "NO", and defaults to "NO". "YES" will allow the ZDLRA to manage reserved space automatically for all databases (whose disk_reserved_space is not set) and are a member of this policy.

MAX_RESERVED_SPACE


This value is also set at the protection policy level.  This value is optional for autotuned_reserved_space, but if set, it will control the maximum amount of reserved space that can be set for an individual database in the protection policy. 

AUTOTUNE_SPACE_LIMIT


This value is set at the storage level for ALL databases. This sets a reserved space usage limit, where autotuning can slow down large reserved space increases. When reached, autotune will limit databases from increasing their reserved space growth to 10% per week.  This value is optional and will default to the total space if not set.  


SUMMARY:

  • autotuned_reserved_space - Enables autotuning of space within a protection policy
  • max_reserved_space - Controls the maximum reserved space of databases in a protection policy
  • autotune_space_limit - Slows the reserved space growth when a specified space limit is reached.

What does autotune reserved space do ?

  • On a regular basis, if needed, the reserved space for each autotune controlled database is adjusted to reserve space for the recovery window goal, and incoming backups.
  • If the database has a disk_reserved_space set, autotuning will not be used for this database.  It is assumed that the disk_reserved_space will be set manually for this database

Autotune  will replace the need for the ZDLRA admin to constantly update the reserved space for each database, as it's space needs change over time. It will also allow them to configure a constant reserved space for databases with fluctuating storage usage.

Thursday, May 11, 2023

ZDLRA Validation is your best protection against Ransomware

 Validation of Oracle backups on ZDLRA is often one the most overlooked features of the product. With the rise of ransomware, the question "how to I ensure that I have validated Oracle backups" is critical.



I know there are a lot of vendors out there that provide a great solution for most generic backups. But, as you probably know, Oracle Database backups are different from other system backups and they provide unique challenges which include

  • The backup of a large database consists of 100s, if not 1000s of backup pieces. All of which are necessary to successfully restore the database.
  • Oracle Database backups won't contain "ransomware signatures" or any easy way of determining if the backup pieces are tainted.
  • Oracle Database backups are in a proprietary format that can only be validated by performing a "restore validate" which reads, and validates  the contents of Oracle database backup pieces.

How ZDLRA provides superior validation

Backups land on flash during ingest


When backup pieces are sent to the ZDLRA during backup, they land on Flash Storage and are quarantined  within the ZDLRA waiting to be validated.  

Backup pieces are validated


The ZDLRA will then examine arriving backup pieces.  The internal metadata is read and the contents of the backup pieces are validated block-by-block.  This ensures that before storing the backup pieces, they are confirmed to be Oracle Database backup pieces, containing valid Oracle Blocks.

 Backup pieces are stored and virtual full created


Once the backup piece is examined, and the metadata is read, the individual validated blocks are stored on disk compressed.  The blocks are indexed, and a virtual full backup is built.  The final step in the process is to update the RMAN catalog on the ZDLRA with an entry pointing to the virtual full.

Weekly validation for both block content, and restore continuity


On a weekly basis all backups on the ZDLRA undergo a "restore validate" which will validate that all the backup pieces are valid, usable backup pieces. This is critical with an "incremental forever" strategy to ensure that unchanged blocks are valid.  Along with checking for the integrity of the backup piece, the ZDLRA also checks for "Restore Continuity". I know this a term I made up. The idea is that whatever time/SCN you choose within the recovery window, the ZDLRA ensures that ALL backup pieces needed to recover are available.  This is similar to performing a "restore preview" of all time periods to ensure that all backup pieces are available for recovery.


Validation during replication

Replication of backup pieces from one ZDLRA to another takes this process one step further.
Along with all the same validation that  occurs when the ZDLRA receives backups from databases,  the upstream ZDLRA also catalogs the replicated copy of the backup pieces.

ZDLRA in a Cyber vault





This is where all the pieces come together. The ZDLRA not only utilizes it's validated, incremental forever strategy to keep replication traffic to a minimum, but it also ensures that backups pieces are validated PRIOR to cataloging them.

The ZDLRA has a number of advantages in a Cyber vault scenario
  • Replication traffic is much smaller than most solutions which require a Weekly Full backup. The ZDLRA uses incremental forever.
  • Backup pieces are quarantined after arrival in the vault to ensure tainted backups are not included in restore plans. This process is similar to what other vendors do to check for ransomware. The ZDLRA goes one step further by using the proprietary knowledge of Oracle Blocks to ensure all backup, and blocks within the backups are valid.
  • Backups stored within the ZDLRA in the vault are validated on a weekly basis for both content, and continuity to ensure a restore will be successful.
  • The upstream sending the backup pieces catalogs what backups are in the vault, and can resend any backup pieces if necessary.

I hope this helps you understand better why the ZDLRA provides superior ransomware protection.