These are the docs for the Metabase master branch. Some features documented here may not yet be available in the latest release. Check out the docs for the latest version, Metabase v0.55.

Databricks

To add a database connection, click on the gear icon in the top right, and navigate to Admin settings > Databases > Add a database. Then select Databricks.

You can edit these settings at any time. Just remember to save your changes.

Edit connection details

Display name

The display name for the database in the Metabase interface.

Host

Your database’s IP address, or its domain name (e.g., xxxxxxxxxx.cloud.databricks.com or adb-xxxxx.azuredatabricks.net). This is the Server Hostname value of your Databricks compute resource.

See Compute settings for the Databricks JDBC Driver.

HTTP path

The HTTP Path value of your Databricks compute resource. This value is often a SQL warehouse endpoint in the format /sql/1.0/endpoints/abcdef1234567890. See Connect to a SQL warehouse.

Additionally, see Compute settings for the Databricks JDBC Driver.

Authentication

You can authenticate with Databricks in two ways: with a personal access token (PAT), or with a service principal using OAuth (OAuth M2M).

The Databricks driver supports both options. Use the toggle to select the authentication method you want to use.

Personal access token authentication

See Personal Access Token (PAT).

Authenticate access with a service principal using OAuth (OAuth M2M)

See Authenticate access with a service principal using OAuth.

Enable multiple catalogs

Toggle on to sync multiple catalogs. If you enable this, you’ll be able to specify which catalogs to sync.

Default catalog

Required. You must specify a default catalog (so you don’t have to deal with catalog qualification in native queries).

Note that you can’t sync Databricks’s legacy catalogs, including the samples and hive_metastore catalogs.

Catalogs and schemas

You can specify which catalogs and schemas you want to sync and scan. Options are:

All
Only these…
All except…

For the Only these and All except options, you can input a comma-separated list of values to tell Metabase which catalogs and schemas you want to include (or exclude). For example:

foo,bar,baz

You can use the * wildcard to match multiple schemas.

Let’s say you have three schemas: foo, bar, and baz.

If you have Only these… set, and enter the string b*, you’ll sync with bar and baz.
If you have All except… set, and enter the string b*, you’ll just sync foo.

Note that only the * wildcard is supported; you can’t use other special characters or regexes.
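The filtering rules above can be sketched in a few lines of Python. This is only an illustration of the behavior described here, not Metabase’s actual implementation: the function names matches and select_schemas and the mode strings are hypothetical, and the sketch assumes case-sensitive matching.

```python
import re

def matches(pattern: str, name: str) -> bool:
    """Return True if name matches pattern, where * is the only wildcard."""
    # Escape every literal piece so that no other character is treated as special.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, name) is not None

def select_schemas(schemas, mode, patterns=""):
    """Pick the schemas to sync. mode is "all", "only", or "except" (hypothetical names)."""
    pats = [p.strip() for p in patterns.split(",") if p.strip()]
    if mode == "all":
        return list(schemas)
    hit = lambda s: any(matches(p, s) for p in pats)
    if mode == "only":
        return [s for s in schemas if hit(s)]
    if mode == "except":
        return [s for s in schemas if not hit(s)]
    raise ValueError(f"unknown mode: {mode}")

schemas = ["foo", "bar", "baz"]
print(select_schemas(schemas, "only", "b*"))    # Only these...
print(select_schemas(schemas, "except", "b*"))  # All except...
```

With the three schemas from the example, the "only" call keeps bar and baz, and the "except" call keeps foo.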

Additional JDBC connection string options

You can append options to the connection string that Metabase uses to connect to your database. For example: IgnoreTransactions=0.

See Compute settings for the Databricks JDBC Driver.
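To see how the Host, HTTP path, and additional options relate to each other, here is a rough sketch of a Databricks JDBC URL built from those fields. It assumes the jdbc:databricks://host:443;httpPath=... URL shape used by the Databricks JDBC driver; the helper name build_jdbc_url is hypothetical, and the string Metabase actually generates may differ (for example, it will also carry authentication settings).

```python
def build_jdbc_url(host: str, http_path: str, extra_options: str = "") -> str:
    """Assemble an illustrative Databricks JDBC URL (hypothetical helper)."""
    parts = [f"jdbc:databricks://{host}:443", f"httpPath={http_path}"]
    if extra_options:
        # Additional options are semicolon-separated key=value pairs.
        parts.append(extra_options.strip(";"))
    return ";".join(parts)

print(build_jdbc_url(
    "adb-xxxxx.azuredatabricks.net",
    "/sql/1.0/endpoints/abcdef1234567890",
    "IgnoreTransactions=0",
))
```

The point of the sketch is that anything you enter in the additional options field ends up as extra key=value pairs at the end of the connection string.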

Re-run queries for simple explorations

Turn this option OFF if people want to click Run (the play button) before applying any summarizations or filters in the query builder.

By default, Metabase will execute a query as soon as you choose a grouping option from the Summarize menu or a filter condition from the drill-through menu. If your database is slow, you may want to disable re-running to avoid loading data on each click.

Choose when syncs and scans happen

See syncs and scans.

Periodically refingerprint tables

Periodic refingerprinting will increase the load on your database.

Turn this option ON to scan a sample of values every time Metabase runs a sync.

A fingerprinting query examines the first 10,000 rows from each column and uses that data to guesstimate how many unique values each column has, what the minimum and maximum values are for numeric and timestamp columns, and so on. If you leave this option OFF, Metabase will only fingerprint your columns once during setup.
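As a rough mental model, a fingerprint for a single column could be computed like the sketch below. This is a simplified illustration rather than Metabase’s actual code: the function name fingerprint and the returned keys are hypothetical, and it only handles numeric columns for the minimum and maximum.

```python
from itertools import islice

def fingerprint(values, sample_size=10_000):
    """Summarize the first sample_size values of a column (hypothetical helper)."""
    # Only the first sample_size rows are examined; missing values are ignored.
    sample = [v for v in islice(values, sample_size) if v is not None]
    fp = {"distinct_count": len(set(sample))}
    # Minimum and maximum are only recorded for numeric samples in this sketch.
    if sample and all(isinstance(v, (int, float)) for v in sample):
        fp["min"], fp["max"] = min(sample), max(sample)
    return fp

print(fingerprint([3, 1, 2, 3, None]))
```

Because only a sample is read, the statistics are estimates. That is also why refingerprinting on every sync adds load: each run repeats this sampling for your columns.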

Model features

There aren’t (yet) any model features available for Databricks.

Danger zone

See Danger zone.
