Tuesday, October 13, 2020

ZFS Appliance - Your on-premise cloud store

 This is the first in a series of posts about using ZFS as a cloud store for your S3 bucket, or your OCI (Oracle Cloud Infrastructure) bucket in your datacenter.



I am going to walk through in the next few posts how to utilize a ZFS appliance as a cloud store.

There maybe a number of reasons why you might want to do this. Maybe you want to learn more about using cloud buckets for storage, or maybe you are already familiar with the the concept, and you want to jump right in.

As my starting point for this I am going to go through how I configured this in my home office lab so you can too.. That is provided that you have enough hardware sitting around to set it up.


Step 1 : Download an Virtual box. 

If you haven't worked with virtualbox yet, now is a good time to get started.  In order to go through and emulate what I am explaining, you are going to need at least 1 virtualbox environment for the ZFS emulator, and a second environment (virtual or bare metal) for a DB server.  Virtual box will allow you to download an image (which includes the software install) and start it up immediately.  This saves a lot of time and trouble in downloading and configuring an environment (patches, libraries, settings) etc.

Step 2 : Download Database image, and ZFS image


At a bare minimum to go through my demo, you will need 2 environments, The ZFS environment must be a virtual image, but the DB environment is up to you. Below are the 2 Virtual Box images to get you started

VB 1 : Database Virtual Box Appliance / Virtual Machine. DB Version 19.3 (as of writing)
VB 2 : Oracle ZFS Storage Simulator. Version 8.8 (as of writing)

Step 3: Patch ZFS image to latest release.


This can be done by downloading the latest patch from My Oracle support, and applying it to the running ZFS image.  This is critical for the ZFS image, as there have been recent patches to to make these features work seamlessly.

At the end of completing these 3 steps, you will have a ZFS Simulator running on at least release 8.26 (the current release as of writing) , and a database running release 19.3 (mine is running 19.6)

FULL DISCLOSURE : Since I use my DB host for a lot of other testing, I am actually executing my demos on a bare metal environment using OEL (Oracle Enterprise Linux) and  DB version 19.6.

In this first post I am going to go through the steps to configure my ZFS simulator to present itself with both an S3 bucket and an OCI bucket.

From there I will write 2 more posts on setting up security, connecting to the buckets, and backing up my database.

For documentation, these are few of the places I found start with.

To make it easy, I created one big storage pool "mypool" Double Parity. I used 10 disks, and the pool is around 35 GB. Enough to perform some simple tests with. I also created a single share "myshare" using all the default values.

Step 5: Enable S3 and OCI services.

I logged onto the ZFS storage console through my web browser as root, using the IP address I assigned during the first startup, and the password I entered during the first startup of the VB image .

Start by going to the services page for the appliance with "Configuration --> Services"
On this page you will notice that the "HTTP" service (under Data services) is greyed out and disabled.
Click on the "power button" to enable the service.



Once the service starts, it will appear green like other online Data Services.
Now click on the "HTTP" itself bringing up the configuration screen.

First let's enable S3 for the appliance.  

Note: under HTTP, I clicked on the S3 tab bringing up the screen below.   
Click on "Enable S3" and press the "APPLY" button in the top right under "Logs".


Next let's enable OCI for the appliance.

Note: under HTTP, I clicked on the S3 tab bringing up the screen below.   
Click on "Enable OCI" and press the "APPLY" button in the top right under "Logs".


Now that I enabled the S3, and OCI bucket at the appliance I need to enable it for my share. This can be done at different levels, at the Project, or at the Share.  To make things easy (since I only have 1 project and 1 share), I enabled it at the project level.
You can see below that I changed both the "S3 API Mode" and the "OCI API Mode" from "None" to "Read/Write" and applied the changes.





You can also see the URL from the project level

S3    : http://10.0.0.115/s3/v1/export/
OCI : http://10.0.0.115/oci/n/export/

When I look at the share and the protocol setting, I can see the HTTP settings are inherited from the project, and my S3 and OCI bucket URLs are shown on the page.


You can also see the URL from the share level (inherited from the project)

S3    : http://10.0.0.115/s3/v1/export/myshare
OCI : http://10.0.0.115/oci/n/export/myshare

That's all there is to configuring my ZFS appliance to act as a cloud store for both S3 buckets and OCI buckets.

My next 2 posts (I am reserving the spots here to add the links later) will cover.




Tuesday, September 29, 2020

ZDLRA - DISK_RESERVED_SPACE checkup

 I wanted to go through some very basic items to think about on disk reserved space for databases backed up to a ZDLRA.


There are couple of posts that have been written on this by both myself and Sudhakar Kotagiri.

One of the key items to concentrate on is the DISK_RESERVED_SPACE for each database.  A simple explanation of the DISK_RESERVED_SPACE is this setting represents the amount of space you set aside on a database-by-database basis to keep a backup window to support the recovery window goal.

A simple example of the DISK_RESERVED_SPACE is.....

My Full backup takes up 40 TB.  Keeping 14 days of recovery (which is my recovery window goal) takes up an additional 1 TB/day.

In this example, I need 54 TB of storage to keep 14 days of recovery.

For this database, I would set the reserved space to be  56.5 TB to ensure I have an extra 5% of space available to handle any unexpected peaks.

Easy right ?  The value for RECOVERY_WINDOW_SPACE in the RA_DATABASE view gives you the space needed to support the Recovery Window.


But.. the reason I called this a checkup is that I wanted to make sure some thought is given to the setting. If your database is changing (which it almost always is), then this needs to be reviewed and adjusted.


Below are some simple rules of thumb of what to think about when adjusting DISK_RESERVED_SPACE


  • Stable Mature Databases - If your database is mature, and the workload is consistent, the DISK_RESERVED_SPACE should be 5-10% larger than the RECOVERY_WINDOW_SPACE. This setting should be reviewed at least 4 times year to be sure it is accurate.
  • Actively Changing Databases -  If the database has a changing workload. Maybe it is still growing, or maybe new features are being added to the application.  The DISK_RESERVED_SPACE should be set at 5-10% larger than RECOVERY_WINDOW_SPACE  + Include a percentage for growth. This should be reviewed monthly (at a minimum) OR if a big growth spurt is planned.
  • Databases with Peaks -  For some business, there may be databases that have peaks. Maybe they support "Black Friday", or maybe they support huge sales around the superbowl.  The DISK_RESERVED_SPACE should be 5-10% larger than the RECOVERY_WINDOW_SPACE needed during this peak.  This will ensure that the space is available when the peak comes.
  • TDE databases - when a database migrates to TDE, there is a time period where storage is needed for the Pre-TDE backup, and the Post-TDE backup.  You need to adjust the DISK_RESERVED_SPACE to take this into account.  NOTE: Staggering the migration when you migrate to TDE can avoid running out of DISK_RESERVED_SPACE for all databases.
  • Databases with ILM - if you have databases performing ILM activities this affects backup space needed.  A simple example would be a database whose historical data is moved to an HCC tablespace when it becomes inactive.  Space needs to be reserved in DISK_RESERVED_SPACE to hold the old structure, the new structure, and the archive logs created during this change.

My suggestion to simplify this is to use PROTECTION POLICIES.  Each type of database can be in it's own protection policy.  Review the DISK_RESERVED_SPACE at the appropriate time for each policy.

It's that easy. :)


Thursday, September 24, 2020

ZDLRA - How to do a storage checkup

 One of the items that comes up with the ZDLRA is a storage checkup.  The DBAs want to know more detail about the storage utilization of each database.



Once you understand the above concepts you realize that are there 2 major pieces that affect the storage utilization for a database.

1) How much space a level 0 backup takes.  Since the ZDLRA virtualizes full backups, each database has at least 1 copy of each block on the ZDLRA.  It would be only 1 if it doesn't change, or it could 30 copies of the same block if it changes every day (like system tablespace data). What you are interested is the size of 1 full backup

2) The amount of storage 1 day of changes takes up (on average).  This would be the stored size of an incremental backup (if you perform an incremental every day), and it would be the stored size of the archive logs for a day of workload.

By combining these 2 pieces you can then calculate how much storage is needed for X number of days of backups.

Now how do I do this ? below is the query I use, and I will explain the columns in it.

select db_unique_name,
               trunc(size_estimate,0) estimated_db_size,
               recovery_window_goal,
               trunc(space_usage,0) space_usage,
               trunc(estimate_zero_day_space - ((estimate_seven_day_space - estimate_one_day_space)/6),0) level_0_size,
               trunc((estimate_seven_day_space - estimate_one_day_space)/6,1) one_day_space,
               trunc(recovery_window_space,0) recovery_window_space,
               disk_reserved_space,
estimate_rwg_space
from
(Select db_unique_name,
       Space_usage,
       extract(day from recovery_window_goal) recovery_window_goal,
       dbms_ra.estimate_space (DB_UNIQUE_NAME,numtodsinterval(1,'day')) estimate_zero_day_space,
       dbms_ra.estimate_space (DB_UNIQUE_NAME,numtodsinterval(1,'day')) estimate_one_day_space,
       dbms_ra.estimate_space (DB_UNIQUE_NAME,numtodsinterval(7,'day')) estimate_seven_day_space,
       dbms_ra.estimate_space (DB_UNIQUE_NAME,recovery_window_goal) estimate_rwg_space,
        RECOVERY_WINDOW_SPACE,
       disk_reserved_space,
       size_estimate
               from ra_database);


 What's returned

DB_UNIQUE_NAME

DB name

RECOVERY_WINDOW_GOAL

How long backups are kept

SPACE_USAGE

How much space (GB) is the DB using in total ?

LEVEL_0_SIZE

Estimated size (GB) of just the full backup

ONE_DAY_SPACE

Estimated space usage (GB) for a single day of backups

RECOVERY_WINDOW_SPACE

How much space is needed for a 14 day recovery window.

DISK_RESERVED_SPACE

How much space is set aside for backups ?

ESTIMATE_DB_SIZE

How big is the database (GB) estimated to be ?

ESTIMATE_RWG_SPACE

This returns the space (GB) needed for the recovery window from RA_DATABASE which may not match calculating using the columns returned.


Now let's take a look at what I can do with this..

This is an example, where I summarize the space utilization for a couple of  ZDLRAs.



And here I looked at the detail for the databases.


This report (just above this) gives me some really useful information.

  • I can see DB02 has a really big change rate. The database is only about 2.5 TB, but it is using 14 TB of storage.
  • I can see the disk_reserved_space is way too small for DB01.  DB01 needs about 15 TB to keep it's recovery window goal, but the disk_reserved_space is only 500 GB
  • DB03 looks good.  Change rate is not significant, and disk_reserved_space is set a little higher than the RECOVERY_WINDOW_SPACE.


Now finally, I was able to use the one_day_space, and graph out the space utilization for each days Recovery window.
This graph shows each day of RWG and it's storage needs,  the currently USED, and the USABLE space.  I can see that even though my used space is close to the usable space, there is still room for growth. I can also use this to see what happens to my storage utilization if I changed the RWG to 10 days.

I highly recommend periodically doing a health check on your storage utilization, and review the disk_reserved_space.

I hope this gives some information you can use to take a closer look.

**NOTE ** this query is accurate as 9/30/20.  It might need to be adjusted with future releases.

Also, it is only as accurate as the data. The longer a database has been backing up with a consistent workload, the better the estimate.