The Databricks workspace packs so many features into its platform UI that it could be a case study in solid product development for other SaaS platforms to learn from. One workspace default worth highlighting is the default catalog for Unity Catalog.
Not too long ago, the hive_metastore was the default location for running queries, landing data, and so on. With the introduction of and focus on Unity Catalog, newer Databricks accounts and workspaces are set up with a Unity Catalog right away, and Databricks typically makes it the default.
But what if you want or need to change the default catalog for your workspace?
Here’s how you can set the default Unity Catalog:
- Navigate to your workspace
- In the upper-right corner of the platform, click your user icon
- Select “Settings” from the list of options
- On the resulting Settings page, select the Advanced link in the left menu
- Scroll to the “Other” section in the main area to find the “Default catalog for the workspace” setting
- Enter your new default catalog in the field provided
- Click Save when ready
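If you’d rather script the change than click through the UI, the workspace default catalog is exposed through the Databricks SDK’s Default Namespace API. This is a minimal sketch, assuming a recent `databricks-sdk` for Python with authentication already configured (e.g. via `DATABRICKS_HOST`/`DATABRICKS_TOKEN`); the catalog name `retail_prod` is just an example:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.settings import DefaultNamespaceSetting, StringMessage

w = WorkspaceClient()

# Read the current workspace default catalog (the "default namespace" setting)
current = w.settings.default_namespace.get()
print(current.namespace.value)

# Update it; field_mask tells the API which field is being changed, and
# allow_missing lets the call create the setting if it was never set before
w.settings.default_namespace.update(
    allow_missing=True,
    field_mask="namespace.value",
    setting=DefaultNamespaceSetting(namespace=StringMessage(value="retail_prod")),
)
```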
As the instructions state, once you click Save (or apply the change via the API), the setting will not take effect until you restart any compute (SQL warehouses or clusters). The Databricks documentation summarizes the behavior:
> Setting the default catalog for the workspace determines the catalog that is used when queries do not reference a fully qualified 3 level name. For example, if the default catalog is set to ‘retail_prod’ then a query ‘SELECT * FROM myTable’ would reference the object ‘retail_prod.default.myTable’ (the schema ‘default’ is always assumed).
>
> If the default catalog is in Unity Catalog (set to any value other than ‘hive_metastore’ or ‘spark_catalog’), MLflow client code that reads or writes models will target that catalog by default. Otherwise, models will be written to and read from the workspace model registry.
>
> Creating new registered models in workspace model registry is disabled if the default catalog is in Unity Catalog (set to any value other than ‘hive_metastore’ or ‘spark_catalog’).
>
> This setting requires a restart of clusters and SQL warehouses to take effect. Additionally, this setting only applies to Unity Catalog compatible compute, i.e. when the workspace has an assigned Unity Catalog metastore, and the cluster is in access mode ‘Shared’ or ‘Single User’, or in SQL warehouses.
Source: https://docs.databricks.com/aws/en/catalogs/default
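To see that resolution behavior in practice, here’s a small PySpark sketch you could run in a notebook attached to Unity Catalog-enabled compute (`retail_prod` and `myTable` are the example names from the excerpt above):

```python
# In a Databricks notebook, `spark` (the SparkSession) is predefined.

# Confirm which catalog and schema unqualified names will resolve against
spark.sql("SELECT current_catalog(), current_schema()").show()

# Unqualified reference: resolves to <default catalog>.default.myTable
df_unqualified = spark.sql("SELECT * FROM myTable")

# Fully qualified three-level name: unambiguous regardless of the default
df_qualified = spark.sql("SELECT * FROM retail_prod.default.myTable")

# Override the default for this session only, without touching the workspace setting
spark.sql("USE CATALOG retail_prod")
```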
Basically, from the summary above, be sure to consider the fully qualified three-level name (catalog.schema.table) when referencing a table in a general query. There’s also an impact on newer Databricks features, such as MLflow model registration, that require Unity Catalog instead of the legacy hive_metastore. Lastly, as mentioned above, updating the default catalog requires a restart of the compute.
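On the MLflow point specifically, client code doesn’t have to rely on the workspace default; it can pin the registry explicitly. A hedged sketch (the run ID and model names are placeholders):

```python
import mlflow

# Explicitly target the Unity Catalog model registry; models use
# three-level names (catalog.schema.model)
mlflow.set_registry_uri("databricks-uc")
mlflow.register_model("runs:/<run_id>/model", "retail_prod.default.my_model")

# Or explicitly target the legacy workspace model registry (flat names).
# Note: per the doc excerpt above, creating *new* workspace-registry models
# is disabled once the default catalog is in Unity Catalog.
mlflow.set_registry_uri("databricks")
mlflow.register_model("runs:/<run_id>/model", "my_model")
```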
Hopefully that helps as you start making more use of your Databricks environments.
If you’d like to dig deeper into the nuances of this topic, take a look at the Databricks Manage the default catalog page.