Databricks on AWS
Before you start
To connect Databricks to Select Star, you will need...
A Databricks instance on AWS. For details, see Databricks' documentation.
Account admin permissions on the Databricks instance
Workspace admin permissions on the Databricks instance
Complete all of the following steps to see Databricks metadata, lineage, and popularity in Select Star.
1. Create a Service Principal in Databricks
What is a Service Principal?
A service principal is an identity that you create in Databricks for use with automated tools, jobs, and applications. Service principals give automated tools and scripts API-only access to Databricks resources, which is more secure than using users or groups. Using a service principal also prevents jobs and automations from failing if a user leaves your organization or a group is modified. For details, see Manage Service Principal.
Add a service principal to your Databricks account
Account admins can add service principals to your Databricks account using the account console or the System for Cross-domain Identity Management (SCIM) Account API.
Add service principals to your account using the account console
To add a service principal to the account using the account console:
As an account admin, log in to the account console.
Click User management.
On the Service principals tab, click Add service principal.
Enter a name (SelectStar) for the service principal.
Click Add.
To add a service principal via REST API, see Add service principals to your account using the SCIM (Account) API.
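As a rough illustration of the REST API route, the sketch below builds the SCIM (Account) API request that creates the service principal. The function names, the placeholder account ID, and the bearer-token authentication are assumptions for illustration; check the Databricks API reference for the authentication method your account supports.

```python
import json
import urllib.request

ACCOUNTS_HOST = "https://accounts.cloud.databricks.com"


def build_create_service_principal_request(account_id, token, display_name="SelectStar"):
    """Build (but do not send) the SCIM request that creates a service principal."""
    url = f"{ACCOUNTS_HOST}/api/2.0/accounts/{account_id}/scim/v2/ServicePrincipals"
    body = {
        "schemas": ["urn:ietf:params:scim:schemas:core:2.0:ServicePrincipal"],
        "displayName": display_name,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Assumption: an account-admin bearer token is used for authentication.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/scim+json",
        },
        method="POST",
    )


def send_json(request):
    """Send a prepared request and decode the JSON response (needs network access)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage (placeholders, not real values):
# created = send_json(build_create_service_principal_request("<account-id>", "<token>"))
# print(created.get("applicationId"))  # UUID needed later as application_id
```

The request is built separately from sending so it can be inspected before any change is made to the account.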
Add a service principal to a workspace
Account admins can add service principals to identity-federated workspaces using the following:
The account console
The Workspace Assignment API
Workspace admins can manage service principals in their workspace using the following:
The workspace admin console (if the workspace is enabled for identity federation)
The workspace-level SCIM (ServicePrincipals) API
The Workspace Assignment API (if the workspace is enabled for identity federation)
Assign a service principal to a workspace using the account console
To add service principals to a workspace using the account console, the workspace must be enabled for identity federation.
As an account admin, log in to the account console.
Click Workspaces.
On the Permissions tab, click Add permissions.
Search for and select the service principal SelectStar, assign the permission level (workspace Admin), and click Save.
To add a service principal to a workspace via admin console or REST API, see Add a service principal to a workspace.
These are the minimum permissions required for Select Star to collect basic metadata and query history. Query history is also used to generate Data Lineage.
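For the Workspace Assignment API option mentioned above, a minimal sketch of the request might look like the following. The function name and all ID values are placeholders; principal_id here is assumed to be the numeric internal ID of the service principal, not its application UUID.

```python
import json
import urllib.request


def build_workspace_assignment_request(account_id, workspace_id, principal_id, token):
    """Build the PUT request that assigns a principal to a workspace as admin."""
    url = (
        f"https://accounts.cloud.databricks.com/api/2.0/accounts/{account_id}"
        f"/workspaces/{workspace_id}/permissionassignments/principals/{principal_id}"
    )
    # "ADMIN" matches the workspace Admin permission level chosen in the console steps.
    body = {"permissions": ["ADMIN"]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumption: account-admin bearer token
            "Content-Type": "application/json",
        },
        method="PUT",
    )


# Usage (placeholders): send with urllib.request.urlopen(request) once the IDs are real.
# request = build_workspace_assignment_request("<account-id>", "<workspace-id>", "<principal-id>", "<token>")
```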
Grant SQL and Workspace access for a service principal
To grant SQL Warehouse access for a service principal using the workspace admin console, the workspace must be enabled for identity federation.
As a workspace admin, log in to the Databricks workspace.
Click your username in the top bar of the Databricks workspace and select Admin Console.
Click Settings and select Service principals.
On the Service principals tab, click the service principal that was created in the previous steps.
Select the checkbox for Databricks SQL access and Workspace access, and click Update.
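The same two entitlements can, as far as we know, also be granted through the workspace-level SCIM (ServicePrincipals) API. The sketch below builds such a PATCH request; the function name, host, and ID arguments are hypothetical placeholders, and service_principal_id is assumed to be the workspace SCIM ID of the principal.

```python
import json
import urllib.request


def build_entitlements_request(workspace_host, service_principal_id, token):
    """Build the SCIM PATCH request that adds SQL and workspace access entitlements."""
    url = (
        f"https://{workspace_host}/api/2.0/preview/scim/v2/"
        f"ServicePrincipals/{service_principal_id}"
    )
    body = {
        "schemas": ["urn:ietf:params:scim:api:messages:2.0:PatchOp"],
        "Operations": [
            {
                "op": "add",
                "path": "entitlements",
                "value": [
                    {"value": "databricks-sql-access"},  # Databricks SQL access
                    {"value": "workspace-access"},       # Workspace access
                ],
            }
        ],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumption: workspace-admin token
            "Content-Type": "application/scim+json",
        },
        method="PATCH",
    )


# Usage (placeholders):
# request = build_entitlements_request("<deployment name>.cloud.databricks.com", "<sp-id>", "<token>")
```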
Grant permissions to a catalog for a service principal
Log in to a workspace that is linked to the metastore.
Click Data.
Click the catalog you want to grant access to, and select Permissions.
Click Grant.
In the Grant dialog, configure the following:
Under Principals, click the dropdown, and select the Principal that you created in the previous step.
For Privilege presets, the value should be "Custom".
Check BROWSE from the list of privileges. For more information on what this privilege entails, see the Databricks documentation on Unity Catalog privileges.
At the bottom of the dialog, click Grant to confirm.
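If you prefer SQL to the UI, the grant above corresponds roughly to a GRANT statement run by a user allowed to manage the catalog. The helper below is a hypothetical sketch that only builds the statement text; it assumes the service principal is referenced by its application ID (UUID), and the catalog name is a placeholder.

```python
def quote_identifier(identifier):
    """Backtick-quote a SQL identifier, doubling any embedded backticks."""
    return "`" + identifier.replace("`", "``") + "`"


def build_grant_browse_sql(catalog, application_id):
    """Return a GRANT statement giving BROWSE on one catalog to a service principal."""
    return (
        f"GRANT BROWSE ON CATALOG {quote_identifier(catalog)} "
        f"TO {quote_identifier(application_id)}"
    )


# Usage (placeholders): run the resulting statement in a Databricks SQL editor or notebook.
# print(build_grant_browse_sql("<catalog-name>", "<application-id>"))
```

Repeat the statement for each catalog that Select Star should see.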

Grant permission to a workspace for a service principal
This step is required to show notebooks in the catalog and notebook lineage.
Log in to a workspace that is linked to the metastore.
Click Workspace and select the top folder.
Click the Share button.
Select the user/group (here, the service principal created in Step 1), then select the permission "Can view", and click Add.
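The folder share can probably also be scripted with the Databricks Permissions API. This sketch rests on two assumptions that you should verify against the API reference: that the UI label "Can view" corresponds to the CAN_READ permission level, and that directory_id is the object ID of the folder (for example, as returned by the workspace get-status endpoint). The function name and arguments are hypothetical.

```python
import json
import urllib.request


def build_folder_permission_request(workspace_host, directory_id, application_id, token):
    """Build the PATCH request that adds read access on a workspace folder."""
    url = f"https://{workspace_host}/api/2.0/permissions/directories/{directory_id}"
    body = {
        "access_control_list": [
            {
                "service_principal_name": application_id,  # application ID (UUID)
                "permission_level": "CAN_READ",  # assumption: equals "Can view" in the UI
            }
        ]
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )


# Usage (placeholders):
# request = build_folder_permission_request("<deployment name>.cloud.databricks.com", "<directory-id>", "<application-id>", "<token>")
```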
2. Generate a Personal Access Token
To authenticate a service principal to APIs on Databricks, an administrator can create a Databricks personal access token on behalf of the service principal.
Grant the Can Use token permission to the service principal.
Create a Databricks personal access token on behalf of the service principal using the POST /token-management/on-behalf-of/tokens operation in the token management REST API. An administrator can also list personal access tokens and delete them using the same API.
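The first step above (granting the Can Use token permission) can be sketched with the Permissions API as follows. The function name and arguments are placeholders for illustration, and the request assumes a workspace-admin bearer token.

```python
import json
import urllib.request


def build_token_permission_request(workspace_host, application_id, token):
    """Build the PATCH request that grants CAN_USE on tokens to a service principal."""
    url = f"https://{workspace_host}/api/2.0/permissions/authorization/tokens"
    body = {
        "access_control_list": [
            {
                "service_principal_name": application_id,  # application ID (UUID)
                "permission_level": "CAN_USE",
            }
        ]
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )


# Usage (placeholders):
# request = build_token_permission_request("<deployment name>.cloud.databricks.com", "<application-id>", "<token>")
```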
Generate a Personal Access Token
POST https://<deployment name>.cloud.databricks.com/api/2.0/token-management/on-behalf-of/tokens/
Use this request when you want the Databricks API to generate a personal access token on behalf of a user or service principal. Use the token value generated from this response as the API key.
Request Body
application_id (String, required): UUID of the Service Principal, which can be found at https://accounts.cloud.databricks.com/users/serviceprincipals/
comment (String)
lifetime_seconds (String): Use the value -1 for the token to live indefinitely.
Example response (use token_value as the API key):
{
  "token_value": "dapia.....",
  "token_info": {
    "token_id": "4305bc67998.........",
    "creation_time": 1671720121149,
    "expiry_time": -1,
    "comment": "Service Principal Token. API Test",
    "created_by_id": 355825636633264,
    "created_by_username": "[email protected]",
    "owner_id": 4012126671306509
  }
}
For detailed, step-by-step instructions for creating access tokens for service principals, see Service principals for Databricks automation.
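To make the request in this step concrete, here is a minimal Python sketch that builds the on-behalf-of token request and pulls token_value out of the response. The function names are hypothetical, the deployment name and tokens are placeholders, and lifetime_seconds is sent as a JSON number here, which is an assumption; adjust it if your workspace expects a different type.

```python
import json
import urllib.request


def build_obo_token_request(deployment_name, admin_token, application_id,
                            comment="Select Star", lifetime_seconds=-1):
    """Build the POST request that creates a token on behalf of a service principal."""
    url = (
        f"https://{deployment_name}.cloud.databricks.com"
        "/api/2.0/token-management/on-behalf-of/tokens"
    )
    body = {
        "application_id": application_id,      # UUID of the service principal
        "comment": comment,
        "lifetime_seconds": lifetime_seconds,  # -1 per this guide: no expiry
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {admin_token}",  # workspace admin's own token
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_token_value(response_text):
    """Return the token_value field, which is the API key to enter in Select Star."""
    return json.loads(response_text)["token_value"]


def create_token(request):
    """Send the request and return the new token (needs network access)."""
    with urllib.request.urlopen(request) as response:
        return extract_token_value(response.read().decode("utf-8"))


# Usage (placeholders):
# token = create_token(build_obo_token_request("<deployment name>", "<admin-token>", "<application-id>"))
```

Treat the returned value like a password: it is generally shown only once, so store it in a secrets manager rather than in source control.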
3. Connect Databricks to Select Star
Go to the Select Star Settings. Click Data in the sidebar, then + Add to create a new Data Source.

Choose Databricks in the Source Type dropdown and provide the following information:

Display Name: This value is Databricks by default, but you can override it if desired.
Workspace URL: This is the address of the Workspace. It should include <deployment name>.cloud.databricks.com. The deployment name can be found at https://accounts.cloud.databricks.com/workspaces
Access Token: This is the Personal access token from Step 2, which is used to authenticate access to Databricks.
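Before entering these values, it can help to confirm that the token works against the workspace. One optional check, sketched below under the assumption that the Unity Catalog list-catalogs endpoint is available in your workspace, is to list the catalogs visible to the token. The function names and host are placeholders and are not part of Select Star.

```python
import json
import urllib.request


def build_list_catalogs_request(workspace_host, access_token):
    """Build a GET request that lists the catalogs visible to the given token."""
    url = f"https://{workspace_host}/api/2.1/unity-catalog/catalogs"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )


def catalog_names(response_text):
    """Extract catalog names from the JSON response body."""
    return [catalog["name"] for catalog in json.loads(response_text).get("catalogs", [])]


def list_catalogs(request):
    """Send the request and return catalog names (needs network access)."""
    with urllib.request.urlopen(request) as response:
        return catalog_names(response.read().decode("utf-8"))


# Usage (placeholders):
# print(list_catalogs(build_list_catalogs_request("<deployment name>.cloud.databricks.com", "<token>")))
```

If a catalog you granted in Step 1 is missing from the result, recheck its BROWSE grant before adding the data source.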
4. Choose catalog and schemas
After you fill in the information, you'll be asked to select the catalogs you'd like to load into Select Star.
You can change the loaded catalogs and schemas later if needed.
Select the catalogs and click Next.

For each catalog you selected, you'll be able to select the schemas.

Your metadata should start loading automatically. Please allow 24-48 hours for popularity and lineage to be fully generated.
When the sync is complete, you'll be able to explore Databricks in Select Star.
See the link below for more information on Databricks in Select Star.
Getting Started: Databricks