Gather multiple AWS orgs data into same Postgres DB with CloudQuery

Gather data from multiple AWS orgs into the same database with CloudQuery

Hey there! We are gathering data from one AWS organization into a Postgres DB with CloudQuery, and we want to extend this to cover an additional AWS org. I’m not sure how we can achieve this. Currently, we use the credentials of the role attached to the EC2 instance we use to run CloudQuery, and CloudQuery uses that role to list the accounts in the org.

Hi @certain-alpaca - The best way to do this is to create a trust relationship between a role in the admin account of the new AWS Org and the role attached to the EC2 instance.

kind: source
spec:
  name: aws
  path: cloudquery/aws
  registry: cloudquery
  version: "v23.2.0"
  tables: ['aws_s3_buckets']
  destinations: ["postgresql"]
  spec:
    aws_debug: false
    org:
      admin_account:
        role_arn: "<ROLE_ARN_IN_NEW_ORG>"
      member_role_name: "<REPLACE_WITH_ROLE_DEPLOYED_TO_MEMBER_ACCOUNTS>"
    regions:
      - '*'

Hi @ben! Thanks for your answer.

Currently, my config looks like this:

org:
  admin_account:
    local_profile: ""
  member_role_name: "<member-accounts-role>"

So CloudQuery leverages the instance profile role. If I point to the role ARN in the new org, I will lose that, I guess. I’m not sure how to make both things coexist.

Also, when it comes to the tables in Postgres, I’d like to have a way to distinguish the tables of one org versus the tables from another. What’s the best way to do this? Another database? Another schema?

So if you are using the default credentials on the AWS EC2 instance, CloudQuery will automatically use those credentials when assuming a role.

If you want each org to have separate tables, then the best way to do that is to use different schemas. You can easily switch the schema in your Postgres destination by setting the search_path in your connection string:

postgres://jack:secret@localhost:5432/mydb?search_path=myschema_org_1
postgres://jack:secret@localhost:5432/mydb?search_path=myschema_org_2

Hello!

How does CloudQuery know how to list all accounts in the new org?

What would the final configuration of the aws.yml file look like for covering both orgs? Should I create two different config files (one per AWS org)?

The AWS plugin only allows a single org config. So you would do something like this:

kind: source
spec:
  name: aws-old-org
  path: cloudquery/aws
  registry: cloudquery
  version: "v23.2.0"
  tables: ['aws_s3_buckets']
  destinations: ["postgresql-1"]
  spec:
    aws_debug: false
    org:
      admin_account:
        local_profile: ""
      member_role_name: "<member-accounts-role>"
    regions:
      - '*'
---
kind: source
spec:
  name: aws-new-org
  path: cloudquery/aws
  registry: cloudquery
  version: "v23.2.0"
  tables: ['aws_s3_buckets']
  destinations: ["postgresql-2"]
  spec:
    aws_debug: false
    org:
      admin_account:
        role_arn: "<ROLE_ARN_IN_NEW_ORG>"
      member_role_name: "<REPLACE_WITH_ROLE_DEPLOYED_TO_MEMBER_ACCOUNTS>"
    regions:
      - '*'
---
kind: destination
spec:
  name: "postgresql-1"
  path: "cloudquery/postgresql"
  registry: "cloudquery"
  version: "v7.1.2"

  spec:
    connection_string: "postgres://jack:secret@localhost:5432/mydb?search_path=myschema_org_1"
---
kind: destination
spec:
  name: "postgresql-2"
  path: "cloudquery/postgresql"
  registry: "cloudquery"
  version: "v7.1.2"

  spec:
    connection_string: "postgres://jack:secret@localhost:5432/mydb?search_path=myschema_org_2"

The AWS Org integration works by assuming a role in the management account or root of the org and running ListAccounts to enumerate all of the accounts and then assuming a role in each one to actually run the sync.

If you want to hop on a 15-minute call to discuss how this might work for your organization(s), feel free to schedule a meeting on my calendar: Schedule a Meeting

Hey there!

Sorry, I was out on vacation. That makes sense, I’ll try that, and if I’m not able to make it work, I’ll schedule a 15-minute call with you as suggested. Thanks! :slightly_smiling_face:

Hi! The approach you suggested worked like a charm. :sunglasses: I’m trying to figure out now if CloudQuery is able to target specific OUs only. I didn’t see any reference to that in the docs; is that feature available?

That is great to hear! Yes, CQ can sync specific OUs.

Here is an example:

kind: source
spec:
  name: aws-old-org
  path: cloudquery/aws
  registry: cloudquery
  version: "v23.2.0"
  tables: ['aws_s3_buckets']
  destinations: ["postgresql-2"]
  spec:
    aws_debug: false
    org:
      admin_account:
        role_arn: "<ROLE_ARN_IN_NEW_ORG>"
      member_role_name: "<REPLACE_WITH_ROLE_DEPLOYED_TO_MEMBER_ACCOUNTS>"
      organization_units:
        - ou-<ID-1>
    regions:
      - '*'

The available configurations are all documented here: CloudQuery AWS Configuration Documentation

Ups, my bad then. For some reason, I didn’t see it.
Thanks!