Monday, September 29, 2025

Oracle DB 23ai supports external Apache Iceberg tables

One area that I have been spending a lot of time on is accessing data stored in object storage from within Oracle DB 23ai using external table definitions.

Parquet objects

I started by creating external tables on Parquet objects, and recently I have been testing accessing objects that are cataloged in Apache Iceberg.

Alexey Filanovskiy wrote a great blog post on Oracle Tables and parquet objects you can find here.

Reading through that blog post gives you a good foundation for why parquet formatted objects make sense for external tables. Oracle DB is able to leverage the metadata in parquet objects to optimize queries.

Defining external tables on parquet objects

Below is an example table definition for an external table accessing parquet objects. In the below example, the parquet objects were stored on ZFSSA using the OCI API.

CREATE TABLE "DW"."CUSTOMER_ADDRESS" 
   (    "CA_ADDRESS_SK" NUMBER(*,0), 
    "CA_ADDRESS_ID" VARCHAR2(100 BYTE) COLLATE "USING_NLS_COMP", 
    "CA_STREET_NUMBER" VARCHAR2(100 BYTE) COLLATE "USING_NLS_COMP", 
    "CA_STREET_NAME" VARCHAR2(100 BYTE) COLLATE "USING_NLS_COMP", 
    "CA_STREET_TYPE" VARCHAR2(100 BYTE) COLLATE "USING_NLS_COMP", 
    "CA_SUITE_NUMBER" VARCHAR2(100 BYTE) COLLATE "USING_NLS_COMP", 
    "CA_CITY" VARCHAR2(100 BYTE) COLLATE "USING_NLS_COMP", 
    "CA_COUNTY" VARCHAR2(100 BYTE) COLLATE "USING_NLS_COMP", 
    "CA_STATE" VARCHAR2(100 BYTE) COLLATE "USING_NLS_COMP", 
    "CA_ZIP" VARCHAR2(100 BYTE) COLLATE "USING_NLS_COMP", 
    "CA_COUNTRY" VARCHAR2(100 BYTE) COLLATE "USING_NLS_COMP", 
    "CA_GMT_OFFSET" NUMBER, 
    "CA_LOCATION_TYPE" VARCHAR2(100 BYTE) COLLATE "USING_NLS_COMP"
   )  DEFAULT COLLATION "USING_NLS_COMP" 
   ORGANIZATION EXTERNAL 
    ( TYPE ORACLE_BIGDATA
      DEFAULT DIRECTORY "DATA_PUMP_DIR"
      ACCESS PARAMETERS
      ( com.oracle.bigdata.credential.name=OCI_ZFS
        com.oracle.bigdata.fileformat=parquet
          com.oracle.bigdata.trimspaces=notrim
       )
      LOCATION
       ( 'oraclebmc://pmdemo-zs71-01-client.us.oracle.com/n/tcptest/b/tcpds_10000g_parquet/o/customer_address/*'
       )
    )
   REJECT LIMIT UNLIMITED 
  PARALLEL ;

What you will notice from this table definition is

It uses the "ORACLE_BIGDATA" type. This is necessary to create external tables using big data object types, which is currently [csv|textfile|avro|parquet|orc].
I am specifying a credential. I had already created a credential that contains the login credentials for the OCI API on my ZFSSA.
The object type for my objects is parquet.
The location of the object(s) is specified. In my case I am using an "*" as there are multiple objects within this path that make up the table.
The format of the URL is in this case "ORACLEBMC". The options are

ORACLEBMC - Specifies this is an OCI API object storage.
S3 - Specifies amazon S3 object storage
S3h- Specifies a non-standard S3 object store

You can find additional information here.

Notes:

The format of the URL is not well documented, I also found that in some cases "s3a" can be used for non-AWS s3 object storage.
DBMS_CLOUD does not allow you to use URLs other than "https". You will most likely have to create tables using the above syntax rather than using the DBMS_CLOUD package.

Apache Iceberg

Once I was able to successfully access parquet objects directly, I began testing cataloging parquet objects with Apache Iceberg using Spark.

I found the most difficult part of this was properly creating an Apache Iceberg manifest file that could be used to build an external table definition against Iceberg.

Environment to build manifest

My testing environment to build the Apache Iceberg manifest contains the following.

Python

Python 3.9.18
Below is a list of my python packages that are installed

Python 3.9.18
Package             Version
------------------- -----------
aiobotocore         2.23.0
aiohappyeyeballs    2.6.1
aiohttp             3.12.14
aioitertools        0.12.0
aiosignal           1.4.0
annotated-types     0.7.0
async-timeout       5.0.1
attrs               20.3.0
Babel               2.9.1
bcc                 0.28.0
boto3               1.39.7
botocore            1.38.27
cachetools          5.5.2
certifi             2020.12.5
cffi                1.14.5
chardet             4.0.0
circuitbreaker      1.3.2
click               8.1.8
cloud-init          23.4
cockpit             311.1
colorama            0.4.6
configobj           5.0.6
configshell-fb      1.1.30
cryptography        36.0.1
dasbus              1.4
dbus-python         1.2.18
distro              1.5.0
docutils            0.16
file-magic          0.4.0
frozenlist          1.7.0
fsspec              2025.7.0
gpg                 1.15.1
greenlet            3.2.3
halo                0.0.31
idna                2.10
Jinja2              2.11.3
jmespath            1.0.1
jsonpatch           1.21
jsonpointer         2.0
jsonschema          3.2.0
kmod                0.1
Ksplice Uptrack     1.2.84
libcomps            0.1.18
lockfile            0.12.2
log-symbols         0.0.14
markdown-it-py      3.0.0
MarkupSafe          1.1.1
mdurl               0.1.2
mmh3                5.1.0
multidict           6.6.3
netaddr             0.8.0
netifaces           0.10.6
nftables            0.1
numpy               2.0.2
nvmetcli            0.7
oauthlib            3.1.1
oci                 2.126.3
oci-utils           0.14.0
oracledb            3.2.0
pandas              2.3.1
parquet_tools       0.2.16
pciutils            2.3.6
pcp                 5.0
perf                0.1
pexpect             4.8.0
pip                 25.1.1
ply                 3.11
prettytable         0.7.2
propcache           0.3.2
psutil              5.8.0
ptyprocess          0.6.0
py4j                0.10.9.9
pyarrow             20.0.0
pycparser           2.20
pycurl              7.43.0.6
pydantic            2.11.7
pydantic_core       2.33.2
Pygments            2.19.2
PyGObject           3.40.1
pyiceberg           0.9.1
pyOpenSSL           19.0.0
pyparsing           3.2.3
pyrsistent          0.17.3
pyserial            3.4
PySocks             1.7.1
pyspark             4.0.0
python-daemon       2.2.4
python-dateutil     2.9.0.post0
python-dmidecode    3.12.2
python-linux-procfs 0.7.3
pytz                2021.1
pyudev              0.22.0
PyYAML              5.4.1
requests            2.25.1
rhnlib              2.8.6
rich                13.9.4
rpm                 4.16.1.3
s3fs                2025.7.0
s3transfer          0.13.0
sdnotify            0.3.2
selinux             3.6
sepolicy            3.6
setools             4.4.4
setroubleshoot      3.3.32
setuptools          53.0.0
six                 1.15.0
sortedcontainers    2.4.0
sos                 4.7.0
spinners            0.0.24
SQLAlchemy          2.0.41
strictyaml          1.7.3
systemd-python      234
tabulate            0.9.0
tenacity            9.1.2
termcolor           3.1.0
thrift              0.16.0
typing_extensions   4.14.0
typing-inspection   0.4.1
tzdata              2025.2
urllib3             1.26.5
urwid               2.1.2
wrapt               1.17.2
yarl                1.20.1

PySpark & Spark runtime

PySpark version: 4.0.0
Java version: 21.0.7
Scala version: version 2.13.16

Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 4.0.0
      /_/

Using Scala version 2.13.16, OpenJDK 64-Bit Server VM, 21.0.7
Branch HEAD
Compiled by user wenchen on 2025-05-19T07:58:03Z
Revision fa33ea000a0bda9e5a3fa1af98e8e85b8cc5e4d4
Url https://github.com/apache/spark

Environment variables

HADOOP_OPTS= -Djava.library.path=/usr/local/hadoop/hadoop-3.4.1/lib/ -Djava.library.path=/usr/local/hadoop/hadoop-3.4.1/lib/ -Djava.library.path=/usr/local/hadoop/hadoop-3.4.1/lib/
JAVA_HOME=/usr/lib/jvm/java-21-openjdk-21.0.7.0.6-1.0.1.el9.x86_64
HADOOP_HOME=/usr/local/hadoop/hadoop-3.4.1
SPARK_HOME=/usr/local/spark
HADOOP_COMMON_LIB_NATIVE_DIR=/usr/local/hadoop/hadoop-3.4.1/lib/native/

Script to build manifest and store python files

I used the NY State Taxi data to build my objects, and it can be found here.

For my testing, I used the yellow taxi datasets.

Below is the section of my code that builds the spark catalog on OCI S3 object storage, and sets the environment using my credentials.

NOTE: It points to my endpoint and uses my access key and secret. I removed those from the script.

Python script to set the environment and hadoop catalog (Part #1)

from pyspark.sql import SparkSession

ICEBERG = "org.apache.iceberg:iceberg-spark-runtime-4.0_2.13:1.10.0"
HADOOP_CLOUD = "org.apache.spark:spark-hadoop-cloud_2.13:4.0.0"

spark = (
    SparkSession.builder
      .appName("iceberg-oci-s3")
      .config("spark.jars.packages", ",".join([ICEBERG, HADOOP_CLOUD]))
      # Iceberg catalog -> Hadoop catalog in an OCI S3 warehouse
      .config("spark.sql.catalog.oci", "org.apache.iceberg.spark.SparkCatalog")
      .config("spark.sql.catalog.oci.type", "hadoop")
      .config("spark.sql.catalog.oci.warehouse", "s3a://icebergs3/iceberg_warehouse")
      # OCI S3-compatible endpoint
      .config("spark.hadoop.fs.s3a.endpoint", "https://{my tenancy/namespace}.compat.objectstorage.us-ashburn-1.oraclecloud.com")
      .config("spark.hadoop.fs.s3a.path.style.access", "true")
      .config("spark.hadoop.fs.s3a.endpoint.region", "us-ashburn-1")
      .config("spark.hadoop.fs.s3a.aws.credentials.provider", "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider")
      .config("spark.hadoop.fs.s3a.list.version", "1")
      .getOrCreate()
)


hconf = spark.sparkContext._jsc.hadoopConfiguration()
hconf.set("fs.s3a.access.key",    "{my access key}")            # OCI 'Customer Secret Key' ID
hconf.set("fs.s3a.secret.key",    "{my secret for that key}")   # the secret key
hconf.set("fs.s3a.connection.ssl.enabled", "true")

Python script to create namespace and create table (Part #2)

spark.sql("""
  CREATE NAMESPACE IF NOT EXISTS oci_db
""")

import pyarrow.parquet as pq


spark.sql("""
  CREATE NAMESPACE IF NOT EXISTS oci.tripdata
""")

spark.sql("""
  CREATE TABLE IF NOT EXISTS oci.tripdata.yellow (
  `VendorID` INT,
  `tpep_pickup_datetime` TIMESTAMP,
  `tpep_dropoff_datetime` TIMESTAMP,
  `passenger_count` BIGINT,
  `trip_distance` DOUBLE,
  `RatecodeID` BIGINT,
  `store_and_fwd_flag` STRING,
  `PULocationID` INT,
  `DOLocationID` INT,
  `payment_type` BIGINT,
  `fare_amount` DOUBLE,
  `extra` DOUBLE,
  `mta_tax` DOUBLE,
  `tip_amount` DOUBLE,
  `tolls_amount` DOUBLE,
  `improvement_surcharge` DOUBLE,
  `total_amount` DOUBLE,
  `congestion_surcharge` DOUBLE,
  `Airport_fee` DOUBLE
  )
  USING iceberg
  PARTITIONED BY  (months(tpep_pickup_datetime))
""")

Python scrypt to read parquet object/set type/append to iceberg table (Part #3)

sdf = spark.read.parquet("{parquet object}")  # Location and name of parquet object to load

from pyspark.sql import functions as F
sdf_cast = (
    sdf
    .withColumn("VendorID", F.col("VendorID").cast("int"))
    .withColumn("tpep_pickup_datetime", F.col("tpep_pickup_datetime").cast("timestamp"))
    .withColumn("tpep_dropoff_datetime", F.col("tpep_dropoff_datetime").cast("timestamp"))
    .withColumn("passenger_count", F.col("passenger_count").cast("bigint"))
    .withColumn("trip_distance", F.col("trip_distance").cast("double"))
    .withColumn("RatecodeID", F.col("RatecodeID").cast("bigint"))
    .withColumn("store_and_fwd_flag", F.col("store_and_fwd_flag").cast("string"))
    .withColumn("PULocationID", F.col("PULocationID").cast("int"))
    .withColumn("DOLocationID", F.col("DOLocationID").cast("int"))
    .withColumn("payment_type", F.col("payment_type").cast("bigint"))
    .withColumn("fare_amount", F.col("fare_amount").cast("double"))
    .withColumn("extra", F.col("extra").cast("double"))
    .withColumn("mta_tax", F.col("mta_tax").cast("double"))
    .withColumn("tip_amount", F.col("tip_amount").cast("double"))
    .withColumn("tolls_amount", F.col("tolls_amount").cast("double"))
    .withColumn("improvement_surcharge", F.col("improvement_surcharge").cast("double"))
    .withColumn("total_amount", F.col("total_amount").cast("double"))
    .withColumn("congestion_surcharge", F.col("congestion_surcharge").cast("double"))
    .withColumn("Airport_fee", F.col("Airport_fee").cast("double"))
)

sdf_cast.writeTo("oci.tripdata.yellow").append()

Summary :

Combining the 3 parts of the script, filling in the credentials along with the endpoint, and bucket, along with specifying the parquet object will load the object storage properly including the manifest file.

Investigating the resulting manifest and objects

Below is a list of the objects created in the metadata directory.

You can see that I updated the data and added more data, and each time, it created a new version of the manifest along with snapshot information.

When writing the data, I chose to partition it, below is the partitioned data. You can see that when Apache Iceberg wrote the data, it automatically created parquet objects in directories for each partition, and added to the current directory with new data.

Creating an Oracle table on top of Iceberg

Now that I have created my iceberg table, and it is stored in object storage, I can create an Oracle table that will read the manifest file.

In my example, the most recent manifest object is "iceberg_warehouse/tripdata/yellow/metadata/v7.metadata.json".

My External Table definition is below.

CREATE TABLE CUSTOMERS_ICEBERG
(
VendorID        integer,
tpep_pickup_datetime    timestamp,
tpep_dropoff_datetime    timestamp,
passenger_count        number,
trip_distance        number,
RatecodeID        number,
store_and_fwd_flag    blob,
PULocationID        number,
DOLocationID        number,
payment_type        number,
fare_amount        number,
extra            number,
mta_tax            number,
tip_amount        number,
tolls_amount        number,
improvement_surcharge    number,
total_amount        number,
congestion_surcharge    number,
Airport_fee        number
)
ORGANIZATION EXTERNAL
(
    TYPE ORACLE_BIGDATA
    DEFAULT DIRECTORY DATA_PUMP_DIR
    ACCESS PARAMETERS
    (
        com.oracle.bigdata.fileformat=parquet
        com.oracle.bigdata.credential.name=OCI_S3
        com.oracle.bigdata.access_protocol=iceberg
    )
   LOCATION ('iceberg:https://{namespace}.compat.objectstorage.us-ashburn-1.oci.customer-oci.com/icebergs3/iceberg_warehouse/tripdata/yellow/metadata/v7.metadata.json')
)
PARALLEL;

NOTE: my location is "iceberg:https".

I can now select from my new table and it will read the Iceberg manifest file.

Conclusion:

Oracle Database 23ai not only supports creating external tables on top of Parquet objects, but it also supports creating external tables on top of Apache Iceberg manifest objects.

Wednesday, September 17, 2025

Building a Cyber Vault ? Don't forget your keys

When building a cyber vault, one of the most important items to manage is encryption keys. Encrypting your data is a fundamental pillar of ransomware protection, but encryption key management is often forgotten.

Ensuring your Cyber Vault has a current copy of your encryption keys associated with your Oracle Databases is critically important for a successful restore and recover after an attack.

Starting with the Oracle Key Vault (OKV) 21.11 release, Oracle Key Vault includes a preview of the Key Transfer Across Oracle Key Vault Clusters Integration Accelerator. You can use this feature to transfer security objects from one OKV cluster to another.

You can find more detail on this feature here, and I will describe it's benefits in this post.

The diagram below shows a typical cyber vault architecture to protect Oracle Databases with the Zero Data Loss Recovery Appliance (ZDLRA) and OKV .

Transferring keys into a cyber vault with OKV

Encryption Key Architecture

Encryption key management is a critical piece of data protection, and it is important to properly manage your keys. Good cyber protection begins with proper encryption key management.

Local Wallets

With Oracle databases, the default (and simplest) location for encryption keys is in a locally management wallet file. The keys are often stored in an auto-login wallet, which is automatically opened by the database at startup making key management transparent and simple, but not very secure.

Why aren't wallets secure ?

They are often auto-login (cwallet.sso) which allows the database to open the wallet without requiring a password. This wallet file gives full access to the encryption keys.
The wallet file is stored with the database files. A privileged administrator, such as a DBA has access to both the database and the keys to decrypt the data directly from the hosts. They also have the ability to delete the wallet file.
Often the wallet file is backed up with the database, which includes an auto-login wallet. This allows anyone who has access to backups, to also be able to decrypt the data.
Securely backing up the wallet file separate from the database is often forgotten, especially when ASM is used as the wallet location. Not having the wallet file when restoring the database makes restoration and recovery impossible.

Steps you can take

Create both a passworded wallet (ewallet.p12) and local auto-login wallet (cwallet.sso). With a local auto-login wallet, the wallet can only be opened on the host where the wallet was created.
Backup the passworded wallet only (ewallet.p12). The auto-login wallet can always be recreated from the passworded wallet.
Properly store the password for you passworded encryption wallet. You will need the password to rotate your encryption keys, and create the auto-login wallet.

Oracle Key Vault (OKV)

The best way to securely manage encryption keys is with OKV.

Why ?

Keys are managed outside of the database and cannot be accessed locally outside of the database.
Access to keys is granted to a specific database instance.
OKV is clustered for High Availability, and the OKV cluster can be securely backed up.
Key access is audited to allow for early detection of access.

Encryption Keys in a cyber vault architecture

Below is a diagram of a typical cyber vault architecture using ZDLRA. Because the backups are encrypted, either because the databases are using TDE and/or they are creating RMAN encrypted backups sent to the ZDLRA, the keys need to be available in the vault also.

Not only is the cyber vault a separate, independent database and backup architecture, the vault also contains a separate, independent OKV cluster.

This isolates the cyber vault from any attack to the primary datacenter, including any attack that could compromise encryption key availability.

Encryption Keys in an advanced cyber vault architecture

Below is a diagram of an advanced cyber vault architecture using ZDLRA. Not only are the backups replicated to a separate ZDLRA in the vault, they are internally replicated to an Isolated Recovery Environment (IRE). In this architecture, the recover area is further isolated, and the OKV cluster is even further isolated from the primary datacenter. This provides the highest level of protection.

OKV Encryption Key Transfer

This blog post highlights the benefit of the newly released (21.11) OKV feature to allow for the secure transfer of encryption keys.

Periodic rotation of encryption keys is a required practice to protect encryption keys, and ensuring you have the current key available in a cyber vault is challenging.

OKV solves this challenge by providing the ability to transfer any new or changed keys between clusters.

Implementing OKV in a cyber Vault

When building a cyber vault, it is recommend to build an independent OKV cluster. The OKV cluster in the vault is isolated from the primary datacenter and protected by an airgap. The nodes in this cluster are not be able to communicate with the OKV cluster outside of the vault.

The OKV cluster in the vault can be created using a full, secure, backup from the OKV cluster in the primary datacenter. The backup can be transferred into the vault, and then restored to the new, independent OKV cluster providing a current copy of the encryption keys.

Keeping OKV managed keys updated in a cyber Vault

The challenge, once creating an isolated OKV cluster has been keeping the encryption keys within the cluster current when new keys are created. This was typically accomplished by transferring a full backup of OKV into the vault, and rebuilding the cluster using this backup.

OKV 21.11 provides the solution with secure Encryption Key Transfer. Leveraging this feature you can securely transfer just the keys that have recently changed allowing you to manage independent OKV clusters that are synchronized on a regular basis.

The diagram below shows the flow of the secure Encryption Key Transfer package from the primary OKV cluster into the vault when the air-gap is opened.

This new OKV feature provides a much better way to securely manage encryption keys in a Cyber Vault.

Summary

As ransomware attacks increase, it is critical to protect a backup copy of your critical database in a cyber vault. It is also critical to protect a copy of your encryption keys to ensure you can recovery those databases. OKV provides the architecture for key management in both your primary datacenter and in a cyber vault. The new secure key transfer feature within OKV allows you to synchronize keys across independent OKV clusters.

Friday, August 29, 2025

Oracle Master Encryption Key Management: Wallets, OKV, and OCI KMS Explained

Data Encryption Keys (DEK) are used by encryption algorithms (typically AES256) to protect data at rest in tablespaces, backups and even datapump exports. Since DEKs are stored with the data, they need to be protected. Master Encryption Keys (MEK) are stored outside the database in local wallets or in an external keystore, and MEKs are used to encrypt the DEK which provides a separation between the data that is encrypted and an external key.

DEKs are not only found on Tablespaces (TDE), but they are also found in RMAN encrypted backups, Space Efficient Encrypted backups to ZDLRA/Recovery Service and encrypted datapump exports.

The diagram below shows that relationship between MEKs and DEKs.

You can see in the diagram that MEKs are not only used to encrypt Tablespace Encryption Keys to support TDE, they are also used to encrypt RMAN backups to disk, Tape or ZDLRA/Recovery Service, along with Datapump encryption keys.

NOTE: When implementing encryption you should make sure you meet ASO licensing restrictions

Creating encrypted tablespaces or encrypted columns in tables requires ASO, unless you are leveraging an OCI Cloud Database environment (including multicloud).
Creating RMAN encrypted disk backups requires licensing.
Creating RMAN encrypted tape backups is license included and only supported when writing to Oracle products.

OSB - Oracle Secure Backup is license included for creating encrypted backups.
OCI Object Storage - The OCI cloud library requires encryption and can only be leveraged for writing backups directly to OCI object storage in OCI, or writing backups from a ZDLRA to OCI object storage in cloud or on ZFSSA
ZDLRA - Creating Space Efficient Encrypted backups is license included on ZDLRA and will create encrypted backups on ZDLRA.

Creating Datapump Encrypted backups requires licensing.

MEKs are commonly called "TDE keys", but they are much more, and I will refer to them as MEKs throughout this post.

MEK with a Wallet File

In this first section I am going to talk about how the Oracle Database manages an MEK when using a Wallet file.

Wallet settings

With wallet-based MEKs, the database stores and reads keys from a wallet file on disk (or +ASM or Exascale).

There are 2 parameters in the database (spfile) that help guide the database to the wallet file.

WALLET_ROOT - This replace the "ENCRYPTION_WALLET_LOCATION" parameter that was previous set in the sqlnet.ora file. Having it set at the database level allows multiple databases to independently share the same $ORACLE_HOME. It even allows PDBs to have individual wallet files. When using a wallet file, the WALLET_ROOT points to top directory where the encryption keys will be kept.

TDE_CONFIGURATION - This tells the database what kind of MEK will be used. For wallets you would specify "TDE_CONFIGURATION=FILE". This tells the database to use find the encryption keys in an encryption wallet.

Encryption key location

Within the WALLET_ROOT directory the encryption wallet containing the MEKs, the wallet must be stored in a subdirectory named "tde".

Within that directory you will find 2 wallet files

ewallet.p12 - The password-protected wallet (contains keys; required for updates).
ewallet.sso — the auto-login wallet (allowing no password prompt on open); The auto-login wallet can also be created as a “local” autologin wallet bound to a single host.

On startup (or when you open the keystore), the database retrieves the CDB/PDB MEKs from the wallet file and caches them in memory.

Database interaction with MEK

When using a wallet file, the auto-login wallet file is opened when the database is opened, and the database constantly interacts with this file. If no auto-login wallet file is created, you must open the wallet manually providing the password.

Viewing V$ENCRYPTION_WALLET

What you’ll typically see for FILE based wallets:

PDB_NAME        WALLET_TYPE  STATUS  WRL_PARAMETER                KEYSTORE
--------------  -----------  ------  ----------------------------  --------
CDB$ROOT        AUTOLOGIN    OPEN    {WALLET_ROOT}/tde             NONE
DB23AI_PDB      AUTOLOGIN    OPEN                                  UNITED

MEK with OKV/OCI KMS

When using an external keystore, the database communicates with the OKV/OCI KMS client libraries installed on the host in order to retrieve the MEKs.

OKV/OCI KMS settings

There are 2 parameters in the database (spfile) that help guide the database to the OKV/OCI KMS installation.

WALLET_ROOT - When using OKV/OCI KMS, the WALLET_ROOT points to top directory of the client install of the OKV/OCI KMS software.

TDE_CONFIGURATION - For OKV/OCI KMS specify "TDE_CONFIGURATION=OKV" or "TDE_CONFIGURATION=HSM". It is possible to combine both wallets and OKV/OCI KMS if there are older keys that are still stored a wallet file. In that case you would see "TDE_CONFIGURATION=OKV|FILE" for example, and the database would read older encryption keys from a wallet file along with reading the keys from OKV/OCI MKS.

Encryption key location

Within the WALLET_ROOT directory, the software to communicate with the HSM is typically installed in the "okv" or "hsm" subdirectory. The keys themselves are stored externally, and could be securely cached locally depending on your configuration.

The credential used to access OKV/OCI KMS is typically stored under the /tde directory in an auto-login wallet. This allows the database to connect to OKV/OCI KMS at startup. The external password store entries (if used) live under /tde_seps which allows up to manage keys without entering a password.

Database interaction with MEK

The MEK is always permanently stored externally. The database interacts with the external keystore through a library. This library is named liborapkcs.so. The library itself is vendor specific and is used by the database to interact with different vendors keystore,. Because there is a single library, you cannot use multiple external key store types on the same DB host.

When the database starts up, it reads the password for the external keystore out of the wallet, opens up a connection to the keystore (after authentication and encryption handshake) and caches the MEK. The database periodically will communicate with the external keystore through that library.

OKV/OCI MKS in Practice: What You’ll See

When both FILE and OKV configurations exist (e.g., during migration), you may observe dual rows in V$ENCRYPTION_WALLET showing the FILE wallet (no active master key) and the active OKV keystore:

PDB_NAME        TYPE  STATUS             WALLET_TYPE  WRL_PARAMETER
--------------  ----  -----------------  -----------  ---------------------------
CDB$ROOT        FILE  OPEN_NO_MASTERKEY  AUTOLOGIN    {WALLET_ROOT}/tde
CDB$ROOT        OKV   OPEN               OKV
DB23AI_PDB      FILE  OPEN_NO_MASTERKEY  AUTOLOGIN
DB23AI_PDB      OKV   OPEN               OKV

Best Practices & Notes

Back up only the passworded wallet when using a wallet file. (ewallet.p12) and store it securely, separate from database backups.
Use local autologin wallets if you require host binding for additional security.
When migrating to OKV/OCI KMS, consider a staged approach (OKV|FILE) so you can fall back if needed.
Always test wallet/keystore open/close operations and RMAN/Data Pump encryption workflows in lower environments before production.
Keep client libraries (liborapkcs.so and OKV client) patched and consistent across RAC nodes.

Last updated: August 29, 2025

Monday, September 29, 2025

Oracle DB 23ai supports external Apache Iceberg tables

Parquet objects

Defining external tables on parquet objects

Notes:

Apache Iceberg

Environment to build manifest

Python

PySpark & Spark runtime

Environment variables

Script to build manifest and store python files

Python script to set the environment and hadoop catalog (Part #1)

Python script to create namespace and create table (Part #2)

Python scrypt to read parquet object/set type/append to iceberg table (Part #3)

Summary :

Investigating the resulting manifest and objects

Creating an Oracle table on top of Iceberg

Conclusion:

Wednesday, September 17, 2025

Building a Cyber Vault ? Don't forget your keys

Encryption Key Architecture

Local Wallets

Why aren't wallets secure ?

Steps you can take

Why ?

Encryption Keys in a cyber vault architecture

Encryption Keys in an advanced cyber vault architecture

OKV Encryption Key Transfer

Implementing OKV in a cyber Vault

Keeping OKV managed keys updated in a cyber Vault

Summary

Friday, August 29, 2025

Oracle Master Encryption Key Management: Wallets, OKV, and OCI KMS Explained

MEK with a Wallet File

Wallet settings

Encryption key location

Database interaction with MEK

Viewing V$ENCRYPTION_WALLET

What you’ll typically see for FILE based wallets:

MEK with OKV/OCI KMS

OKV/OCI KMS settings

Encryption key location

Database interaction with MEK

OKV/OCI MKS in Practice: What You’ll See

Best Practices & Notes

About Me

Blog Archive