Request for AWS plugin to sync account IDs from aws_organizations_accounts table

mutual-krill · April 9, 2024, 8:36am

Has there been any thought regarding the AWS plugin to pull the account IDs to sync from the aws_organizations_accounts table? For many organizations, new accounts aren’t added frequently, and by pulling from this table where the status is ACTIVE, suspended accounts can be excluded as well. Perhaps this could be a togglable capability that’s off by default, allowing customers to choose? This would help cut down on API rate limiting at scale.

The GCP plugin could also benefit from this. Historically, I’ve worked with customers who have tens of thousands of GCP projects, and the gcp_resourcemanager_projects table could be utilized.

I’ll put in a GitHub issue for this capability, but figured I’d ask first. For a bit of background, we run CloudQuery with a number of different jobs that sync specific tables at different cadences. If a customer is syncing every AWS table, then this option wouldn’t make sense. Perhaps it’s something that’s only supported/recognized if the target tables to sync don’t include aws_organizations_accounts.

erez · April 9, 2024, 9:03am

So, if you don’t configure projects, the GCP plugin will discover all active projects and use them in the sync. We later use that information for gcp_resourcemanager_projects.

The AWS plugin also supports accounts discovered via the org: config. See the AWS Organization Example.

Is that what you were looking for? This is useful if you don’t split the sync into multiple jobs (e.g., a job per account).

mutual-krill · April 9, 2024, 9:05am

Yes, we’re using account discovery. All of our customers onboard at the organization level. What I’m suggesting is an option to use autodiscovery only to populate the aws_organizations_accounts table; for all other tables, the plugin would read the account IDs to sync from the locally synced copy of that table instead of running autodiscovery again. Hopefully, that makes sense.
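To illustrate the idea, here is a rough sketch of what this could look like if done outside the plugin today. It assumes rows from aws_organizations_accounts have already been read from the destination database as dicts with `id` and `status` keys, and that each account is reached through a role of the same name. The helper names and the role name are hypothetical, and the `id`/`role_arn` entry shape is an assumption about the plugin spec, not confirmed behavior.

```python
from typing import Dict, Iterable, List, Mapping


def active_account_ids(rows: Iterable[Mapping[str, str]]) -> List[str]:
    """Return de-duplicated IDs of accounts whose status is ACTIVE.

    `rows` is assumed to come from a previous sync of
    aws_organizations_accounts (column names are an assumption).
    """
    seen = set()
    ids: List[str] = []
    for row in rows:
        if row.get("status") != "ACTIVE":
            continue  # skip suspended or pending-closure accounts
        account_id = row["id"]
        if account_id not in seen:
            seen.add(account_id)
            ids.append(account_id)
    return ids


def build_accounts_entries(ids: Iterable[str], role_name: str) -> List[Dict[str, str]]:
    """Build a static account list to use in place of org autodiscovery.

    The entry shape (id + role_arn) is an assumption; `role_name` is a
    hypothetical role present in every member account.
    """
    return [
        {"id": account_id, "role_arn": f"arn:aws:iam::{account_id}:role/{role_name}"}
        for account_id in ids
    ]


if __name__ == "__main__":
    cached_rows = [
        {"id": "111111111111", "status": "ACTIVE"},
        {"id": "222222222222", "status": "SUSPENDED"},
        {"id": "333333333333", "status": "ACTIVE"},
    ]
    for entry in build_accounts_entries(active_account_ids(cached_rows), "cq-sync"):
        print(entry)
```

With something like this, only the job that syncs aws_organizations_accounts would call the Organizations API; the other jobs would reuse the cached list.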

erez · April 9, 2024, 9:08am

Ah, got it. So you’re suggesting to use aws_organizations_accounts as a cache for the next sync, did I get it right?

erez · April 9, 2024, 9:10am

Cool, a GitHub issue would be great then.

mutual-krill · April 9, 2024, 9:16am

For further details, please refer to the issue on GitHub: cloudquery/cloudquery#17567.
