Databricks

Getting Started with Databricks Data Source Configuration

Requirements:

  • Active Databricks cluster.

  • Allow connections from DataBrain to your Databricks cluster.

    • For details on setting up IP whitelisting and ensuring secure connectivity, refer to our guide on Allow Access to our IP.

  • Choose the DataBrain Workspace to which you wish to connect the data.

Setup Guide:

  1. Ensure Cluster Accessibility:

    • Ensure your Databricks cluster is active and accessible from the machine running DataBrain.

    • Accessibility depends on your Databricks user privileges and network settings. The easiest way to verify that DataBrain can connect to your Databricks cluster is the check connection tool in the UI. For detailed setup and permissions, refer to the Databricks documentation.

  2. Fill In Connection Info:

    • Provide the necessary information to connect to your Databricks cluster:

      • Data Source Name: [Pick a name to help you identify this data source in DataBrain]

      • Databricks Cluster Details:

        • Host: [Host Endpoint of the Databricks Cluster, e.g., dbc-a1b2345c-d6e7.cloud.databricks.com]

        • Path: [HTTP path of the SQL endpoint, e.g., /sql/1.0/endpoints/a1b234c5678901d2]

      • Authentication Details:

        • Token: [Access token, e.g., dapi1ab2c34defabc567890123d4efa56789]
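The three connection fields above line up with the parameters of Databricks' own Python SQL connector, so the same values can be tried outside DataBrain when a connection fails. The following is a minimal sketch, not DataBrain's connection code: it assumes the databricks-sql-connector package is installed, and the class name DatabricksConnectionInfo and the function check_connection are hypothetical names chosen for illustration. The values are the placeholder examples from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabricksConnectionInfo:
    """Hypothetical container for the three fields entered in DataBrain."""

    host: str   # "Host" field, hostname only
    path: str   # "Path" field
    token: str  # "Token" field

    def to_connector_kwargs(self) -> dict:
        # Parameter names used by databricks-sql-connector's sql.connect().
        return {
            "server_hostname": self.host,
            "http_path": self.path,
            "access_token": self.token,
        }


def check_connection(info: DatabricksConnectionInfo) -> bool:
    """Run a trivial query; returns True if the endpoint answers."""
    # Imported lazily so the rest of this sketch runs without the package.
    from databricks import sql

    with sql.connect(**info.to_connector_kwargs()) as connection:
        with connection.cursor() as cursor:
            cursor.execute("SELECT 1")
            row = cursor.fetchone()
    return row is not None and row[0] == 1


# Placeholder values copied from the examples above; replace with your own.
info = DatabricksConnectionInfo(
    host="dbc-a1b2345c-d6e7.cloud.databricks.com",
    path="/sql/1.0/endpoints/a1b234c5678901d2",
    token="dapi1ab2c34defabc567890123d4efa56789",
)
```

Calling check_connection(info) with real values from the machine or network that DataBrain uses gives a quick signal on whether a failure is caused by credentials or by network access.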

Encryption:

  • Ensure you have SSL/TLS set up for your Databricks connection if you require encrypted connections from DataBrain for enhanced security.

Permissions:

  • Permission to read information_schema.

  • Whitelist the DataBrain IP address (see Allow Access to our IP).

  • Grant usage on the schema as well as read access to its tables; table-level access alone may not suffice in certain databases.
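As an illustration of the schema-plus-tables point, the sketch below builds the GRANT statements a Databricks admin might run for a read-only user. It assumes a workspace governed by Unity Catalog, where the relevant privileges are USE CATALOG, USE SCHEMA, and SELECT; on the legacy Hive metastore with table access control, the schema-level privilege is USAGE instead. The function name read_only_grants and the catalog, schema, and principal values are hypothetical.

```python
def read_only_grants(catalog: str, schema: str, principal: str) -> list[str]:
    """Build read-only GRANT statements (assumes Unity Catalog privilege names)."""
    # Backticks quote identifiers and principals in Databricks SQL.
    qualified_schema = f"`{catalog}`.`{schema}`"
    grantee = f"`{principal}`"
    return [
        f"GRANT USE CATALOG ON CATALOG `{catalog}` TO {grantee};",
        f"GRANT USE SCHEMA ON SCHEMA {qualified_schema} TO {grantee};",
        # SELECT on the schema covers the tables inside it.
        f"GRANT SELECT ON SCHEMA {qualified_schema} TO {grantee};",
    ]


# Hypothetical names; substitute your own catalog, schema, and user.
statements = read_only_grants("main", "sales", "databrain@example.com")
for statement in statements:
    print(statement)
```

Review the generated statements against your own governance setup before running them in a Databricks SQL editor.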

Replace the placeholders inside the square brackets with the actual values when filling in the details.

Locating the Configuration Details in Databricks

  1. Data Source Name:

    • This is a custom name you decide for identification within DataBrain. Choose a name that is relevant and descriptive of your Databricks source.

  2. Host:

    • Log in to your Databricks workspace. The hostname portion of the URL in your browser's address bar is your Host Endpoint: leave out the https:// prefix and anything after the domain. It typically has the format dbc-a1b2345c-d6e7.cloud.databricks.com.

  3. Path:

    • This is specific to the SQL endpoint you want to connect to in Databricks. Navigate to the SQL tab in your Databricks workspace and select the desired SQL endpoint. The endpoint details show the full path, which typically looks like /sql/1.0/endpoints/a1b234c5678901d2. Newer Databricks workspaces call SQL endpoints "SQL warehouses", and the path may then contain /warehouses/ instead of /endpoints/.

  4. Token:

    • In Databricks, navigate to the user settings by clicking on the user icon in the top right corner.

    • Choose "User Settings" and then go to the "Access Tokens" tab.

    • Generate a new token or use an existing one. Note that once you generate a token and close the window, you cannot view the token again for security reasons. Ensure you copy and store it securely.
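A common slip when copying these values is pasting the full browser URL into the Host field. The sketch below, using only the Python standard library, reduces a pasted URL to its hostname and loosely checks that a path has the shape shown in the examples on this page. The function names are hypothetical, and the path pattern is an assumption drawn from those examples, not a rule published by Databricks.

```python
import re
from urllib.parse import urlsplit


def host_from_browser_url(url: str) -> str:
    """Return only the hostname part of a pasted workspace URL."""
    candidate = url.strip()
    # urlsplit only recognises the host when a scheme or '//' precedes it.
    if "//" not in candidate:
        candidate = "//" + candidate
    hostname = urlsplit(candidate).hostname
    if not hostname:
        raise ValueError(f"no hostname found in {url!r}")
    return hostname


# Assumed shape, based on the example path on this page.
_PATH_PATTERN = re.compile(r"^/sql/\d+\.\d+/(endpoints|warehouses)/[0-9A-Za-z]+$")


def looks_like_sql_path(path: str) -> bool:
    """Loose sanity check; a False result is a hint, not proof of an error."""
    return bool(_PATH_PATTERN.match(path.strip()))


host = host_from_browser_url(
    "https://dbc-a1b2345c-d6e7.cloud.databricks.com/?o=1234567890#workspace"
)
print(host)
print(looks_like_sql_path("/sql/1.0/endpoints/a1b234c5678901d2"))
```

The token is deliberately left out of these checks: it should be copied once at creation time and stored in a secrets manager, not passed through ad hoc scripts.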
