BigQuery

Getting Started with BigQuery Destination Configuration

Requirements:

  • Active Google Cloud Platform (GCP) project with BigQuery enabled.

  • Allow connections from DataBrain to your BigQuery dataset.

    • For details on setting up IP whitelisting and ensuring secure connectivity, refer to our guide on Allow Access to our IP.

  • Choose the DataBrain Workspace to which you wish to connect the data.

Setup Guide:

  1. Ensure Project Accessibility:

    • Ensure your GCP project with BigQuery is active and accessible from the machine running DataBrain.

    • Accessibility depends on your GCP permissions and IAM settings. The easiest way to verify that DataBrain can connect to your BigQuery project is through the Add a Data Source UI.

  2. Grant Necessary Permissions:

    • Read Access on Datasets, Tables, and information_schema: Grant read access permissions to the datasets, tables, and the information_schema dataset within BigQuery. This allows DataBrain to retrieve necessary information and replicate data accurately. You can assign the predefined role roles/bigquery.dataViewer to provide read access to datasets and tables.

    • Add jobs.create Permissions: Additionally, grant permission to create jobs in BigQuery by assigning the roles/bigquery.jobUser role. This allows DataBrain to create and manage jobs, such as running queries in BigQuery.

    • Note on Project & Dataset IDs:

      • Project IDs must contain 6-63 lowercase letters, digits, or dashes. Some project IDs also include a domain name separated by a colon. IDs must start with a letter and may not end with a dash.

      • Dataset IDs follow different rules: they may contain up to 1,024 characters, limited to letters (uppercase or lowercase), digits, and underscores. Dashes are not allowed. Make sure both IDs comply with these rules, otherwise the connection will not work.
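The project ID rule above can be checked locally before the value is entered in DataBrain. The following is a minimal sketch, not part of DataBrain: the function name `is_valid_project_id` is hypothetical, and the handling of domain-scoped IDs (only the part after the colon is checked) is an assumption.

```python
import re

# Rule as stated above: 6-63 characters, lowercase letters, digits, or dashes,
# starting with a letter and not ending with a dash.
_PROJECT_ID = re.compile(r"[a-z][a-z0-9-]{4,61}[a-z0-9]")


def is_valid_project_id(project_id: str) -> bool:
    """Return True if project_id matches the rule described in this guide."""
    # Assumption: for domain-scoped IDs ("example.com:my-project") only the
    # part after the colon is checked; the domain just has to be non-empty.
    domain, colon, name = project_id.rpartition(":")
    if colon and not domain:
        return False
    return _PROJECT_ID.fullmatch(name) is not None
```

For example, `is_valid_project_id("my-project-123")` passes, while an ID that ends with a dash or starts with a digit is rejected.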

  3. Fill In Connection Info:

    • Provide the necessary information to connect to your BigQuery:

      • Destination Name: [Pick a name to help you identify this destination in DataBrain]

      • Project ID: [The GCP project ID for the project containing the target BigQuery dataset]

      • Default Dataset ID: [The default BigQuery Dataset ID that tables are replicated to if the source doesn't specify a namespace]

      • Service Account Key JSON: [The JSON value of the Service Account Key used to authenticate as your service account. This is mandatory for Cloud and optional for Open-Source versions of DataBrain]

    • Note on Dataset Configuration:

      • Ensure the default dataset ID is set correctly for proper data synchronization. This is where tables will be replicated if the source does not specify a namespace.
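The connection details from the setup steps can also be checked outside DataBrain before they are entered in the form. The sketch below assumes the `google-cloud-bigquery` Python client library is installed; the function names `information_schema_query` and `check_access` are hypothetical. It runs a small query against the dataset's INFORMATION_SCHEMA.TABLES view, which should only succeed if the service account can both create jobs and read the dataset.

```python
from typing import List


def information_schema_query(project_id: str, dataset_id: str) -> str:
    """Build a small query against the dataset's INFORMATION_SCHEMA.TABLES view."""
    return (
        f"SELECT table_name FROM `{project_id}.{dataset_id}.INFORMATION_SCHEMA.TABLES` "
        "LIMIT 5"
    )


def check_access(key_path: str, project_id: str, dataset_id: str) -> List[str]:
    """Run the query with the service account key; raises if access is missing."""
    # Imported here so the helper above works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client.from_service_account_json(key_path, project=project_id)
    # Running any query needs the jobs.create permission; reading this view
    # needs read access on the dataset.
    rows = client.query(information_schema_query(project_id, dataset_id)).result()
    return [row["table_name"] for row in rows]
```

Calling `check_access("key.json", "my-project", "analytics")` with your own key file path, Project ID, and Default Dataset ID returns up to five table names, or raises an exception that usually names the missing permission.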

Encryption:

  • All BigQuery connections via DataBrain are secure, leveraging GCP's built-in security features.

Permissions:

  • Permission to read information_schema.

  • Whitelist the DataBrain IP address.

  • Allow job creation and provide read access to datasets, ensuring dataset IDs adhere to standard rules.
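One way to grant the two roles named in this guide is with the gcloud CLI. The sketch below only builds and prints the commands so they can be reviewed before being run. The project ID and service account email are placeholders, and the helper name `grant_commands` is hypothetical. It assumes the roles are granted at the project level, which gives read access to every dataset in the project; dataset-level grants are narrower and may be preferable.

```python
import shlex
from typing import List

# The two predefined roles named in this guide.
ROLES = ["roles/bigquery.dataViewer", "roles/bigquery.jobUser"]


def grant_commands(project_id: str, service_account_email: str) -> List[List[str]]:
    """Build one `gcloud projects add-iam-policy-binding` command per role."""
    return [
        [
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member=serviceAccount:{service_account_email}",
            f"--role={role}",
        ]
        for role in ROLES
    ]


# Placeholder values; replace them with your own project and service account.
for argv in grant_commands("my-project", "databrain@my-project.iam.gserviceaccount.com"):
    print(shlex.join(argv))
```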

Replace the placeholders inside the square brackets with actual values when filling in the details.

Locating the Configuration Details in BigQuery

  1. Destination Name:

    • This is a custom name you decide for identification within DataBrain. Choose a name that is relevant and descriptive of your BigQuery setup.

  2. Project ID:

    • Navigate to the Google Cloud Console.

    • In the top bar, next to the Google Cloud logo, the current project name is displayed. Clicking it opens a list of all your projects.

    • The ID shown beside each project name is the Project ID.

  3. Default Dataset ID:

    • In the Google Cloud Console, navigate to BigQuery.

    • In the left sidebar, under the project name, you'll see a list of datasets. The name of each dataset is its Dataset ID.

  4. Service Account Key JSON:

    • In the Google Cloud Console, navigate to "IAM & Admin" > "Service Accounts".

    • Find the service account you want to use or create a new one.

    • Once you have the service account, click on the three dots (options) for that account, and select "Manage keys".

    • Click on "Add Key", select "Create new key", and choose "JSON" as the key type.

    • Once created, the JSON key file will be downloaded to your computer. The contents of this file are the value you need for the Service Account Key JSON field.
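Before pasting the downloaded key into DataBrain, it can help to confirm that the file is a complete service account key. The following is a minimal sketch using only the Python standard library. The helper name `key_problems` is hypothetical, and the set of fields it checks is an assumption about what a connection needs, not a list taken from DataBrain.

```python
import json
from typing import List

# Fields present in a downloaded service account key that a connection
# typically relies on (assumption: this subset is enough for a sanity check).
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def key_problems(key_json: str) -> List[str]:
    """Return a list of problems found in a service account key; empty if none."""
    try:
        key = json.loads(key_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["expected a JSON object"]
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]
    key_type = key.get("type")
    # Downloaded service account keys carry "type": "service_account".
    if key_type and key_type != "service_account":
        problems.append(f"unexpected type: {key_type}")
    return problems
```

An empty list means the key has the expected shape. It does not prove that the key is still active or that the service account has the required permissions.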
