CloudQuery + Databricks Integration: Complete Setup Guide

Hey everyone,

We’ve been working on running CloudQuery syncs directly within Databricks, and after hitting a few walls, we’ve got a solid setup that’s working well in production. Figured I’d share the complete walkthrough since several folks have asked about this integration.

Our team needed centralized cloud asset inventory across AWS accounts, but didn’t want to spin up separate infrastructure just for CloudQuery. Since we’re already heavy Databricks users, running everything as Jobs made sense.

The setup breakdown:

Secrets management (this part’s crucial): We’re using the Databricks CLI to create a secrets scope. You’ll need these keys:

AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY / AWS_SESSION_TOKEN
CLOUDQUERY_API_KEY  
DATABRICKS_CATALOG / DATABRICKS_SCHEMA
DATABRICKS_ACCESS_TOKEN / DATABRICKS_HOSTNAME / DATABRICKS_HTTP_PATH
DATABRICKS_STAGING_PATH
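
The scope creation looks roughly like this with the newer unified Databricks CLI (the scope name `cloudquery` is our choice, not a requirement; the legacy CLI uses `databricks secrets put --scope ... --key ...` instead):

```shell
# Create a secrets scope to hold everything CloudQuery needs.
databricks secrets create-scope cloudquery

# Store each key; --string-value supplies the value inline.
databricks secrets put-secret cloudquery AWS_ACCESS_KEY_ID --string-value "$AWS_ACCESS_KEY_ID"
databricks secrets put-secret cloudquery AWS_SECRET_ACCESS_KEY --string-value "$AWS_SECRET_ACCESS_KEY"
databricks secrets put-secret cloudquery CLOUDQUERY_API_KEY --string-value "$CLOUDQUERY_API_KEY"
# ...repeat for the remaining keys in the list above.
```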

You can check out the complete walkthrough here: How to Work with CloudQuery Syncs within Databricks | CloudQuery Blog
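
Inside the Job itself, the secrets then get pulled into environment variables before CloudQuery runs. Here’s a minimal sketch of that step; the helper name and the `cloudquery` scope are our conventions, not CloudQuery requirements (in a notebook you’d pass `lambda k: dbutils.secrets.get("cloudquery", k)` as the lookup):

```python
import os

# The env vars our CloudQuery sync reads (mirrors the key list above).
SECRET_KEYS = [
    "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN",
    "CLOUDQUERY_API_KEY",
    "DATABRICKS_CATALOG", "DATABRICKS_SCHEMA",
    "DATABRICKS_ACCESS_TOKEN", "DATABRICKS_HOSTNAME", "DATABRICKS_HTTP_PATH",
    "DATABRICKS_STAGING_PATH",
]

def load_secrets_into_env(get_secret, keys=SECRET_KEYS):
    """Fetch each secret via get_secret(key) and export it as an env var.

    get_secret is any callable mapping a key name to its value, e.g.
    lambda k: dbutils.secrets.get("cloudquery", k) inside Databricks.
    """
    for key in keys:
        os.environ[key] = get_secret(key)
```

Keeping the lookup as a plain callable makes the step trivially testable outside Databricks, since `dbutils` only exists on a cluster.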
