Data Sharing has become a big topic recently, and Oracle Cloud has added some new services to allow you to share data from an Autonomous Database. But how do you do this with your on-premises database ? In this post I show you how to use ZFS as your data sharing platform.
Data Sharing
Being able to securely share data between applications is a critical feature in todays world. The Oracle Database is often used to consolidate and summarize collected data, but is not always the platform for doing analysis. The Oracle Database does have the capability to analyze data, but tools such as Jupyter Notebooks, Tableu, Power Bi, etc are typically the favorites of Data Scientists and data analysts.
The challenge is how to give access to specific pieces of data in the database without providing access to the database itself. The most common solution is to use object storage and pre-authenticated URLs. Data is extracted from the database based on the user and stored in object storage in a sharable format (JSON for example). With this paradigm, and pictured above, you can create multiple datasets that contain a subset of the data specific to the users needs and authorization. The second part is the use of a pre-authenticated URL. This is a dynamically created URL that allows access to the object without authentication. Because it contains a long string of random characters, and is only valid for a specified amount of time, it can be securely shared with the user.
My Environment
For this post, I started with an environment I had previously configured to use DBMS_CLOUD. My database is a 19.20 database. In that database I used the steps specified in the MOS note and my blog (information can be found here) to use DBMS_CLOUD.
My ZFSSA environment is using 8.8.63, and I did all of my testing in OCI using compute instances.
For preparation to test I had
Installed DBMS_CLOUD packages into my database using MOS note #2748362.1
Downloaded the certificate for my ZFS appliance using my blog post and added them to wallet.
Added the DNS/IP addresses to the DBMS_CLOUD_STORE table in the CDB.
Created a user in my PDB with authority to use DBMS_CLOUD
Created a user on ZFS to use for my object storage authentication (Oracle).
Configured the HTTP service for OCI
Added my public RSA key from my key par to the OCI service for authentication.
Created a bucket
NOTE: In order to complete the full test, there were 2 other items I needed to do.
1) Update the ACL to also access port 80. The DBMS_CLOUD configuration sets ACLs to access websites using port 443. During my testing I used port 80 (http vs https).
2) I granted execute on DBMS_CRYPTO to my database user for the testing.
Step #1 Create Object
The first step was to create an object from a query in the database. This simulated pulling a subset of data (based on the user) and writing it to a object so that it could be shared. To create the object I used the DBMS_CLOUD.EXPORT_DATA package. Below is the statement I executed.
BEGIN
DBMS_CLOUD.EXPORT_DATA(
credential_name => 'ZFS_OCI2',
file_uri_list =>'https://zfs-s3.zfsadmin.vcn.oraclevcn.com/n/zfs_oci/b/db_back/o/shareddata/customer_sales.json',
format => '{"type" : "JSON" }',
query => 'SELECT OBJECT_NAME,OBJECT_TYPE,STATUS FROM user_objects');
END;
/
In this example:
CREDENTIAL_NAME - Refers to my authentication credentials I had previously created in my database.
FILE_URI_LIST - The name and location of the object I want to create on the ZFS object storage.
FORMAT - The output is written in JSON format
QUERY - This is the query you want to execute and store the results in the object storage.
As you can see, it would be easy to create multiple objects that contain specific data by customizing the query, and naming the object appropriately.
In order to get the proper name of the object I then selected the list of objects from object storage.
set pagesize 0
SET UNDERLINE =
col object_name format a25
col created format a20
select object_name,to_char(created,'MM/DD/YY hh24:mi:ss') created,bytes/1024 bytes_KB
from dbms_cloud.list_objects('ZFS_OCI2', 'https://zfs-s3.zfsadmin.vcn.oraclevcn.com/n/zfs_oci/b/db_back/o/shareddata/');
customer_sales.json_1_1_1.json 12/19/23 01:19:51 3.17382813
From the output, I can see that my JSON file was named 'customer_sales.json_1_1_1.json'.
Step #2 Create Pre-authenticated URL
The Package I ran to do this is below. I am going to break down the pieces into multiple sections. Below is the full code.
Step #2a Declare variables
The first part of the pl/sql package declares the variables that will be used in the package. Most of the variables are normal VARCHAR variables, but there a re a few other variable types that are specific to the packages used to encrypt and send the URL request.
sType,kType - These are constants used to sign the URL request with RSA 256 encryption
utl_http.req,utl_http.resp - These are request and response types used when accessing the object storage
json_obj - This type is used to extract the url from the resulting JSON code returned from the object storage call.
Step #2b Set variables
In this section of code I set the authentication information along with the host, and the private key part of my RSA public/private key pair.
I also set a variable with the current date time, in the correct GMT format.
NOTE: This date time stamp is compared with the date time on the ZFSSA. It must be within 5 minutes of the ZFSSA date/time or the request will be rejected.
Step #2c Set JSON body
In this section of code, I build the actual request for the pre-authenticated URL. The parameters for this are...
accessType - I set this to "ObjectRead" which allows me to create a URL that points to a specific object. Other options are Write, and ReadWrite.
bucketListingAction - I set this to "Deny", This disallows the listing of objects.
bucketName - Name of the bucket
name - A name you give the request so that it can be identified later
namespace/namepaceName - This is the ZFS share
objectName - This is the object name on the share that I want the request to refer to.
timeExpires - This is when the request expires.
NOTE: I did not spend the time to do a lot of customization to this section. You could easily make the object name a parameter that is passed to the package along with the bucketname etc. You could also dynamically set the expiration time based on sysdate. For example you could have the request only be valid for 24 hours by dynamically setting the timeExpires.
The last step in this section of code is to create a sha256 digest of the JSON "body" that I am sending with the request. I created it using the dbms_crypto.hash.
Step #2d Create the signing string
This section builds the signing string that is encrypted. This string is set in a very specific format. The string that is build contains.
(request-target): post /oci/n/{share name}/b/{bucket name}/p?compartment={compartment}
date: {date in GMT}
host: {ZFS host}
x-content-sha256: {sha256 digest of the JSON body parameters}
content-type: application/json
content-length: {length of the JSON body parameters}
NOTE: This signing string has to be created with the line feeds.
The final step in this section is sign the signing string with the private key.
In order to sign the string the DBMS_CRYPTO.SIGN package is used.
Step #2e Build the signature from the signing string
This section takes the signed string that was built in the prior step and encodes the string in Base 64. This section uses the utl_encode.base64_encode package to sign the raw string and it is then converted to a varchar.
Note: The resulting base64 encoded string is broken into 64 character sections. After creating the encoded string, I loop through the string, and combine the 64 character sections into a single string.
This took the most time to figure out.
Step #2f Create the authorization header
This section dynamically builds the authorization header that is sent with the call. This section includes the authentication parameters (tenancy OCID, User OCID, fingerprint), the headers (these must be in the order they are sent), and the signature that was created in the last 2 steps.
Step #2g Send a post request and header information
The next section sends the post call to the ZFS object storage followed by each piece of header information. After header parameters are sent, then the JSON body is sent using the utl_http.write_text call.
Step #2h Loop through the response
This section gets the response from the POST call, and loops through the response. I am using the json_object_t.parse call to create a JSON type, and then use the json_obj.get to retrieve the unique request URI that is created.
Finally I display the resulting URI that can be used to retrieve the object itself.
Documentation
There were a few documents that I found very useful to give me the correct calls in order to build this package.
Signing request documentation : This document gave detail on the parameters needed to send get or post requests to object storage. This document was very helpful to ensure that I had created the signature properly.
OCI rest call walk through : This post was the most helpful as it gave an example of a GET call and a PUT call. I was first able to create a GET call using this post, and then I built on it to create a GET call.
The latest release of ZFS (8.8.63) contains 2 new features associated with File Retention lock.
File retention (deletion or hold) after file expiration
Allow permission changes on retained files.
File retention on expiry policy
This new setting for projects/shares defaults to "off" which is the normal behavior of unlocking files, but leaving them on the filesystem. In order to delete files you need to wait until the retention period expires, and then you can delete the file.
There are 2 new settings you can use to work with locked files to change this behavior.
Delete
When set to "Delete", files will be immediately deleted when their retention lock expires. This can be very useful if you want files to be automatically cleaned up at the end of their retention without having to create a deletion process.
There are a few items to be aware of the automatic deletion process.
DO NOT use this with an RMAN retention window. Customers typically use a weekly full/daily incremental backup strategy with RMAN. With this strategy, a weeks worth of backups (all dependent on the oldest full backup) are deleted together. Deleting backups as soon as they expire would delete backups too soon. Even with archival backups I recommend letting RMAN perform the deletions, otherwise you risk having a file deleted too early.
Be careful changing this setting on an existing share. This setting takes effect immediately and will affect ALL files that have a retention lock. Any files that were locked, but their retention lock expired will be deleted when this setting is applied.
Hold
When set to "Hold", any files that have, or have had a retention lock will be affected. This setting immediately prevents the deletion of all retention locked files until the hold is removed regardless of when the lock is set to expire. Keep in mind that while a hold is in place, the files still have a retention lock with an expiration date.
Removing the hold: When you remove the hold, the normal expiration date takes effect. If you remove the hold by changing the expiry policy to "Delete", ALL files that have an expired retention will be immediately deleted. If you change the expire policy to "Off", the files remain, and you must manually delete them.
NOTE: Be very careful when changing the Expiry Policy. The new setting immediately affects existing files, not just new files going forward, unlike the other file retention settings.
Allow permission changes on retained files
What happens normally : When a file has a retention lock set, you are protecting this file from both being deleting AND from being updated. Because you were not allowed to update the file permissions, you were not able to change the settings from the default of -r--r-----+ while the file was locked.
This could be an issue depending on what type of file you are protecting. There are some cases where you want to make this file either
An executable file, not just a read only file.
readable by any user.
When you attempted to make this change to a locked file the update would fail with a "Operation not permitted".
-r--r-----+ 1 oracle oinstall 792 Dec 4 2023 testfile
[oracle@ssh-server rmanbackups]$ ls -al testfile
-r--r-----+ 1 oracle oinstall 792 Dec 4 20:26 testfile
[oracle@ssh-server rmanbackups]$ chmod 550 testfile
chmod: changing permissions of 'testfile': Operation not permitted
[oracle@ssh-server rmanbackups]$ chmod 444 testfile
chmod: changing permissions of 'testfile': Operation not permitted
[oracle@ssh-server rmanbackups]$
What this setting does: When you check the setting for "Allow permission changes on retained files" you are IMMEDIATELY able to change the permissions on files that are locked. The files are still protected from making them writable, but you can adjust both the "r" - read and "x" execute settings for all users.
NOTE: this setting does take affect immediately and will affect all currently locked files regardless of when they were created.
One topic that has been coming a lot as customers look at options for offsite protected backups, is the use of the Oracle Database Backup Cloud Service. This service can be used either directly from the database itself leveraging an RMAN tape library, or by performing a copy-to-cloud from the ZDLRA. In this post I will try to consolidate all the information I can find on this topic to get you started.
Overview
The best place to start is by downloading, and reading through this technical brief.
This document walks you through what the service is and how to implement it. Before you go forward with the Backup Cloud Service I suggest you download the install package and go through how to install it.
The key points I saw in this document are
RMAN encryption is mandatory - In this brief you will see that the backups being sent to OCI MUST be encrypted, and the brief explains how to create an encrypted backup. Included in the Backup Cloud Service is the use of encryption and compression (beyond basic compression) without requiring the ASO, or ACO license.
How to install the client files - The brief explains the parameters that are needed to install the client files, and what the client files are that get installed. I will go into more detail later on explaining additional features that have been added recently.
Config file settings including host - The document explains the contents of the configuration file used by the Backup Cloud Service library. It also explains how to determine the name of the host (OCI endpoint) based on the region you are sending the backups to.
Channel configuration example - There is an example channel configuration to show you how to connect to the service.
Best practices - The document includes sample scripts and best practices to use when using the Backup Cloud Service.
Lifecycle policies and storage tiers - This is an important feature of using the Backup Cloud Service, especially for long term archival backups. You most likely want have backups automatically moved to low cost archival storage after uploading to OCI.
NOTE: When using lifecycle polies to manage the storage tiers it is best to set the "-enableArchiving" and "-archiveAfterBackup" parameters when installing the backup module for a new bucket. There are small metadata files that MUST remain in standard storage, and the installation module creates a lifecycle rule with the bucket that properly archives backup pieces, leaving the metadata in standard storage.
Download
The version of the library on OTN (at the time I am writing this) is NOT the current release of the library, and that version does not support retention lock of objects.
Documentation on the newer features can be found here, using retention lock can be found here, and there is a oci_readme.txt file that contains all the parameters available.
Updates
There were a few updates since the tech brief was written, and I will summarize the important ones here. I also spoke the PM who is working on an updated brief that will contain this new information.
newRSAKeyPair - The installer is now able to generate the key pair for you making it much easier to generate new key pair. In order to have the installer ONLY create a new key pair pair, just pass the installer the "walletDir" parameter. The installer will generate both a public and private key, and place them in the walletDir (see below).
/u01/app/oracle/product/19c/dbhome_1/jdk/bin/java -jar oci_install.jar -newRSAKeyPair -walletDir /home/oracle/oci/wallet
Oracle Database Cloud Backup Module Install Tool, build 19.18.0.0.0DBBKPCSBP_2023-09-21
OCI API signing keys are created:
PRIVATE KEY --> /home/oracle/oci/wallet/oci_pvt
PUBLIC KEY --> /home/oracle/oci/wallet/oci_pub
Please upload the public key in the OCI console.
Once you generate the public/private key, you can upload the public key to the OCI console. This will show you the fingerprint, and you can execute the installer using the private key file.
"immutable-bucket" and "temp-metadata-bucket" - The biggest addition to library is the ability to support the use of retention rules on buckets containing backups. The uploading of backups is monitored by using a "heartbeat" file, and this file is deleted when the upload is successful. Because all objects in a bucket are locked, the "heartbeat" object must be managed from a second bucket without retention rules. This is the temp-metadata-bucket. When using retention rules you MUST have both buckets set in the config file.
NOTE:
I ran into 2 issues when executing this script.
1) When trying to execute the jar file, I used the default java version in my OCI tenancy that is located in "/user/bin". The installer received a java error
In order to properly execute the installer, I used the java executable located in $ORACLE_HOME/jdk/bin
2) When executing the jar file with my own RSA key that I had been previously used with OCI object storage, I received a java error.
Exception in thread "main" java.lang.RuntimeException: Could not produce a private key at oracle.backup.util.FileDownload.encode(FileDownload.java:823) at oracle.backup.util.FileDownload.addBmcAuthHeader(FileDownload.java:647) at oracle.backup.util.FileDownload.addHttpAuthHeader(FileDownload.java:169) at oracle.backup.util.FileDownload.addHttpAuthHeader(FileDownload.java:151) at oracle.backup.opc.install.BmcConfig.initBmcConnection(BmcConfig.java:437) at oracle.backup.opc.install.BmcConfig.initBmcConnection(BmcConfig.java:428) at oracle.backup.opc.install.BmcConfig.testConnection(BmcConfig.java:393) at oracle.backup.opc.install.BmcConfig.doBmcConfig(BmcConfig.java:250) at oracle.backup.opc.install.BmcConfig.main(BmcConfig.java:242) Caused by: java.security.spec.InvalidKeySpecException: java.security.InvalidKeyException: IOException : algid parse error, not a sequence
I found that this was caused by the PKCS format. I was using a PKCS1 key, and the java installer was looking for a PKCS8 key. The header in my private key file contained "BEGIN RSA PRIVATE KEY".
In order to convert my private PKCS1 key "oci_api_key.pem" to a PKCS8 key "pkcs8.key" I ran.
The next step is to execute the install. For my install I also wanted configure a lifecycle rule that would archive backups after 14 days. In order to implement this, I had the script create a new bucket "bsgtest". Below is parameters I used (note I used "..." to obfuscate the OCIDs).
This created a new bucket "bsgtest" containing a lifecycle rule.
I then added a 14 day retention rule to this bucket, and created a second bucket "bsgtest_meta" for the temporary metadata. If you want to make this rule permanent you enable retention rule lock which I highlighted on the screenshot below.
I then updated the config file to use the metadata bucket because I set a retention rule on the main bucket. Note that there is also a parameter that determines how long archival objects are cached in standard storage before they are returned to archival storage.
Once you execute the installer you will be able to begin backing up to OCI object storage. Don't forget that you need to:
Change the default device type to SBT_TAPE
Change the compression algorithm. I recommend "medium" compression.
Configure encryption for database ON.
Configure the device type SBT_TAPE to send COMPRESSED BACKUPSET to optimize throughput and storage in OCI.
Create a default channel configuration for SBT_TAPE (or allocate channels manually) that use the library that was downloaded, and point to the configuration file for the database.
If you do not use ACO and don't have a wallet , manually set an encryption password in your session.
I recommend sending a "small" backup piece first to ensure that everything is properly configured. My favorite command is
RMAN>backup incremental level 0 datafile 1;
Datafile 1 is always the system tablespace.
Below is what my configuration looks like for RMAN specifically for what I changed to use the Backup Cloud Service.
CONFIGURE BACKUP OPTIMIZATION ON;
CONFIGURE DEFAULT DEVICE TYPE TO 'SBT_TAPE';
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE SBT_TAPE TO '%F'; # default
CONFIGURE DEVICE TYPE 'SBT_TAPE' PARALLELISM 4 BACKUP TYPE TO COMPRESSED BACKUPSET;
CONFIGURE CHANNEL DEVICE TYPE 'SBT_TAPE' PARMS 'SBT_LIBRARY=/home/oracle/oci/lib/libopc.so ENV=(OPC_PFILE=/home/oracle/oci/config/backupconfig.ora)';
CONFIGURE ENCRYPTION FOR DATABASE ON;
CONFIGURE ENCRYPTION ALGORITHM 'AES256'; # default
CONFIGURE COMPRESSION ALGORITHM 'MEDIUM' AS OF RELEASE 'DEFAULT' OPTIMIZE FOR LOAD TRUE;
Network Performance
One of the big areas that comes up with using the Backup Cloud Service, is understanding the network capabilities.
The best place to start is with this MOS note
RMAN> run {
2> allocate channel foo device type sbt PARMS 'SBT_LIBRARY=/home/oracle/oci/lib/libopc.so ENV=(OPC_PFILE=/home/oracle/oci/config/backupconfig.ora)';
3> send channel foo 'NETTEST 1000M';
4> }
allocated channel: foo
channel foo: SID=431 device type=SBT_TAPE
channel foo: Oracle Database Backup Service Library VER=19.0.0.1
released channel: foo
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of send command at 11/22/2023 14:12:04
ORA-19559: error sending device command: NETTEST 1000M
ORA-19557: device error, device type: SBT_TAPE, device name:
ORA-27194: skgfdvcmd: sbtcommand returned error
ORA-19511: non RMAN, but media manager or vendor specific failure, error text:
KBHS-00402: NETTEST sucessfully completed
KBHS-00401: NETTEST RESTORE: 1048576000 bytes received in 15068283 microseconds
KBHS-00400: NETTEST BACKUP: 1048576000 bytes sent
Executing Backups
Now to put it all together I am going to execute a backup of datafile 1. My database is encrypted, so I am going to set a password along with the encryption key.
set encryption on identified by oracle;
executing command: SET encryption
RMAN> backup incremental level 0 datafile 1;
Starting backup at 22-NOV-23
allocated channel: ORA_SBT_TAPE_1
channel ORA_SBT_TAPE_1: SID=404 device type=SBT_TAPE
channel ORA_SBT_TAPE_1: Oracle Database Backup Service Library VER=19.0.0.1
allocated channel: ORA_SBT_TAPE_2
channel ORA_SBT_TAPE_2: SID=494 device type=SBT_TAPE
channel ORA_SBT_TAPE_2: Oracle Database Backup Service Library VER=19.0.0.1
allocated channel: ORA_SBT_TAPE_3
channel ORA_SBT_TAPE_3: SID=599 device type=SBT_TAPE
channel ORA_SBT_TAPE_3: Oracle Database Backup Service Library VER=19.0.0.1
allocated channel: ORA_SBT_TAPE_4
channel ORA_SBT_TAPE_4: SID=691 device type=SBT_TAPE
channel ORA_SBT_TAPE_4: Oracle Database Backup Service Library VER=19.0.0.1
channel ORA_SBT_TAPE_1: starting incremental level 0 datafile backup set
channel ORA_SBT_TAPE_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/app/oracle/oradata/ACMEDBP/system01.dbf
channel ORA_SBT_TAPE_1: starting piece 1 at 22-NOV-23
channel ORA_SBT_TAPE_1: finished piece 1 at 22-NOV-23
piece handle=8t2c4fmi_1309_1_1 tag=TAG20231122T150554 comment=API Version 2.0,MMS Version 19.0.0.1
channel ORA_SBT_TAPE_1: backup set complete, elapsed time: 00:00:35
Finished backup at 22-NOV-23
Starting Control File and SPFILE Autobackup at 22-NOV-23
piece handle=c-1654679317-20231122-01 comment=API Version 2.0,MMS Version 19.0.0.1
Finished Control File and SPFILE Autobackup at 22-NOV-23
Restoring
Restoring is very easy as long as you have the entries in your controlfile. If you don't then there is a
script included in the installation that can catalog the backup pieces and I go through that process here.
This also allows you to display what's in the bucket.
Buckets 1 vs many
If you look at what created when executing backup you will see that there is a set format for the backup pieces. Below are the 2 backup pieces that I created
8t2c4fmi_1209_1_1 - This is the backup of datafile 1 for my database ACMEDBP
c-16546791317-20231122-01 - This is the controlfile backup for this database
Notice that the DB name is not in the name of the backup pieces, or in the visible nesting.
If you think about a medium sized database (let's say 100 datafiles), that has 2 weeks of backups (14 days), you would have 1,400 different backup pieces for the datafiles within the "sbt_catalog" directory.
My recommendation is to group small databases together in the same bucket (keeping the amount of backup pieces to a manageable level).
For large database (1,000+ datafiles), you can see where a 30 day retention could become 30,000+ backup pieces.
Having a large number of objects within a bucket increases the time to report the available backup pieces. There is no way to determine which database the object is a member of without looking at the metadata.
Keep this in mind when considering how many buckets to create.