Requirements:

  • Active AWS account with Athena and S3 access.
  • Proper IAM permissions to access Athena and the related S3 buckets.
  • A selected Databrain Workspace where this source will be added.

Setup Guide:

  1. Ensure Athena and S3 Accessibility:
    • Make sure your Athena database is active and queries can write results to the designated S3 bucket.
    • Ensure that both Athena and S3 are accessible using the credentials you plan to use with Databrain.
    • The S3 bucket should be in an AWS region where Athena is available (e.g., us-east-1).
  2. Grant Necessary Permissions: Assign the appropriate IAM policies to the user or role whose credentials you will use with Databrain. These should include:
    • Access to Athena for querying.
    • Access to Glue Data Catalog, if used.
    • Access to S3 for reading/writing query results.
  3. Fill In Connection Info: Provide the following fields in Databrain to configure your Athena source:
    • Destination Name: A custom name to identify this Athena connection in Databrain.
      Example: Athena Destination Spec
    • S3 Region: The AWS region where your Athena query result bucket is located.
      Example: us-east-1
    • S3 Access Key ID: Your AWS Access Key ID for authentication.
    • S3 Secret Access Key: Your AWS Secret Access Key associated with the Access Key ID.
    • Database: The name of the Athena database you want to connect to.
      Example: iceberg_db
    • S3 Bucket Name: The name of the S3 bucket where Athena stores query results.
      Example: output-bucket-name
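
The three permission areas in step 2 can be sketched as a single IAM policy document. The Python snippet below is a minimal sketch, not a policy published by Databrain: the function name build_athena_policy is hypothetical, and the specific action lists are an assumption about what querying Athena, reading the Glue Data Catalog, and reading/writing query results typically require. If your table data lives in a different bucket than the query results, that bucket would need read access as well.

```python
import json


def build_athena_policy(results_bucket: str) -> dict:
    """Build an IAM policy document for Athena, Glue, and the results bucket.

    The action lists are an assumption for illustration; trim or extend
    them to match your own security requirements.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Access to Athena for querying.
                "Sid": "AthenaQuery",
                "Effect": "Allow",
                "Action": [
                    "athena:StartQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                    "athena:StopQueryExecution",
                ],
                "Resource": "*",
            },
            {
                # Access to the Glue Data Catalog, if used.
                "Sid": "GlueCatalogRead",
                "Effect": "Allow",
                "Action": [
                    "glue:GetDatabase",
                    "glue:GetDatabases",
                    "glue:GetTable",
                    "glue:GetTables",
                    "glue:GetPartitions",
                ],
                "Resource": "*",
            },
            {
                # Access to S3 for reading/writing query results.
                "Sid": "QueryResultsBucket",
                "Effect": "Allow",
                "Action": [
                    "s3:GetBucketLocation",
                    "s3:ListBucket",
                    "s3:GetObject",
                    "s3:PutObject",
                ],
                "Resource": [
                    f"arn:aws:s3:::{results_bucket}",
                    f"arn:aws:s3:::{results_bucket}/*",
                ],
            },
        ],
    }


if __name__ == "__main__":
    # "output-bucket-name" is the example bucket name used in this guide.
    print(json.dumps(build_athena_policy("output-bucket-name"), indent=2))
```

The printed JSON can be pasted into the IAM policy editor and attached to the user or role. Using "*" as the resource for the Athena and Glue statements keeps the sketch short; scoping those to specific workgroups and catalogs is a stricter alternative.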

Locating the Configuration Details in AWS:

  1. Destination Name: Choose any descriptive name to label your Athena connection in Databrain. This does not affect AWS resources.
  2. S3 Region:
    • Log in to the AWS Management Console and open the S3 service.
    • Select the bucket used by Athena for output.
    • Then, find the Region under the bucket’s Properties tab.
  3. S3 Access Key ID & Secret Access Key:
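    • Open the IAM console and select the IAM user whose credentials Databrain will use (access keys belong to users, not roles).
    • Then, go to the Security Credentials tab and create an access key. The Secret Access Key is shown only when the key is created, so save it at that point; it cannot be retrieved later.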
  4. Database:
    • Open the Athena service in the AWS console and select your desired database from the left sidebar (e.g., iceberg_db).
  5. S3 Bucket Name:
    • In the Athena console, go to Settings.
    • Then, find the Query result location (an S3 URI such as s3://your-bucket-name/path).
    • Use only the bucket name from this URI, that is, the part between s3:// and the next slash (e.g., your-bucket-name).
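
The last step, pulling the bucket name out of the query result location, can be illustrated with a short Python sketch. The helper name bucket_from_s3_uri is hypothetical and is not part of Databrain or any AWS SDK; it relies only on the standard library's URL parser.

```python
from urllib.parse import urlparse


def bucket_from_s3_uri(uri: str) -> str:
    """Return the bucket name from an S3 URI such as s3://bucket/path."""
    parsed = urlparse(uri.strip())
    # The bucket is the "host" part of the URI; the path after it is the prefix.
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc


if __name__ == "__main__":
    print(bucket_from_s3_uri("s3://your-bucket-name/path"))
```

Only the returned bucket name goes into the S3 Bucket Name field; the path portion of the URI is not part of it.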