Ensure your GCP project with BigQuery is active and accessible from the machine running Databrain.
Accessibility depends on your GCP permissions and IAM settings. The easiest way to verify that Databrain can connect to BigQuery is through the Add a Data Source UI.
Grant Necessary Permissions:
Read Access on Datasets, Tables, and information_schema: Grant read access to the datasets, tables, and the information_schema dataset within BigQuery. This allows Databrain to retrieve schema metadata and replicate data accurately. You can assign the predefined role roles/bigquery.dataViewer to provide read access to datasets and tables.
Add jobs.create Permissions: Additionally, grant permission to create jobs in BigQuery by assigning the roles/bigquery.jobUser role, which includes the bigquery.jobs.create permission. This allows Databrain to create and manage the jobs it uses to query data in BigQuery.
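The two role grants above can be scripted. The following is a minimal sketch that only builds the gcloud commands as strings, so an administrator can review them before running anything. It assumes the roles are granted at the project level to a service account; the function name, project ID, and service account email are hypothetical placeholders, not values Databrain supplies.

```python
import shlex

# The two predefined roles named in this section.
REQUIRED_ROLES = ["roles/bigquery.dataViewer", "roles/bigquery.jobUser"]


def grant_commands(project_id, service_account_email):
    """Build one `gcloud projects add-iam-policy-binding` command per role.

    Nothing is executed here; the commands are returned as strings for review.
    """
    member = f"serviceAccount:{service_account_email}"
    return [
        shlex.join([
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member={member}",
            f"--role={role}",
        ])
        for role in REQUIRED_ROLES
    ]


# Hypothetical project and service account, for illustration only.
for command in grant_commands(
    "my-project", "databrain@my-project.iam.gserviceaccount.com"
):
    print(command)
```

Granting roles/bigquery.dataViewer on individual datasets instead of the whole project is a narrower alternative if Databrain should only see some datasets.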
Note on Project & Dataset IDs:
Project IDs must contain 6–63 lowercase letters, digits, or dashes. Some project IDs also include a domain name separated by a colon. IDs must start with a letter and may not end with a dash.
Dataset IDs follow different rules: they may contain up to 1,024 letters (uppercase or lowercase), digits, and underscores, and they cannot contain dashes or spaces. Use IDs that comply with these rules so that Databrain can resolve your project and dataset in BigQuery.
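The project ID rules above can be checked before the value is entered in the connection form. This is a rough sketch, not Databrain's own validation: the function name is hypothetical, and it assumes the 6–63 length limit applies to the part after an optional domain prefix and that the domain prefix consists of lowercase letters, digits, dots, and dashes.

```python
import re

# Optional "domain:" prefix, then 6-63 characters that start with a letter,
# contain only lowercase letters, digits, or dashes, and do not end with a dash.
PROJECT_ID_PATTERN = re.compile(
    r"(?:[a-z0-9.-]+:)?"            # assumed shape of the domain prefix
    r"[a-z][a-z0-9-]{4,61}[a-z0-9]"
)


def is_valid_project_id(project_id):
    """Return True if project_id matches the rules described in this note."""
    return PROJECT_ID_PATTERN.fullmatch(project_id) is not None


print(is_valid_project_id("my-project-123"))          # True
print(is_valid_project_id("example.com:my-project"))  # True
print(is_valid_project_id("my-project-"))             # False
```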
Fill In Connection Info:
Provide the necessary information to connect to your BigQuery:
Destination Name: [Pick a name to help you identify this destination in Databrain]
Project ID: [The GCP project ID for the project containing the target BigQuery dataset]
Default Dataset ID: [The default BigQuery Dataset ID that tables are replicated to if the source doesn’t specify a namespace]
Service Account Key JSON: [The JSON value of the Service Account Key used to authenticate as your Service Account. This is mandatory for Cloud and optional for Open-Source versions of Databrain]
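Before pasting the Service Account Key JSON, it can help to confirm that the value is well-formed. The sketch below checks for a few fields that Google-issued service account key files normally contain (type, project_id, private_key, client_email). The function name and the choice of which fields to check are assumptions for illustration; it does not contact GCP or verify that the key is still active.

```python
import json

# Fields a service account key file is expected to carry (assumed minimal set).
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(raw_json):
    """Return a list of problems; an empty list means the key looks usable."""
    try:
        key = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level JSON value must be an object"]

    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)
    ]
    key_type = key.get("type")
    if key_type and key_type != "service_account":
        problems.append(f"unexpected type: {key_type!r}")
    return problems
```

A typical use is to read the downloaded key file, pass its contents to check_service_account_key, and only paste the value into Databrain when the returned list is empty.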
Note on Dataset Configuration:
Ensure the default dataset ID is set correctly for proper data synchronization. This is where tables will be replicated if the source does not specify a namespace.
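The fallback described in this note can be pictured as a small lookup. This is an illustrative sketch of the rule only, not Databrain's implementation; the function name and the example IDs are hypothetical.

```python
def resolve_table(project_id, default_dataset_id, table, namespace=None):
    """Return the fully qualified BigQuery table name.

    The source's namespace is used as the dataset when present;
    otherwise the default dataset ID is used.
    """
    dataset = namespace or default_dataset_id
    return f"{project_id}.{dataset}.{table}"


# No namespace from the source: the default dataset is used.
print(resolve_table("my-project", "analytics", "orders"))
# Namespace provided by the source: it takes precedence.
print(resolve_table("my-project", "analytics", "orders", namespace="sales"))
```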