CloudQuery sync time for AWS data in PG database on EKS

erez · October 20, 2023, 12:34pm

Yes. Then you can run cloudquery sync aws-most-tables.yml and cloudquery sync aws-slow-tables.yml separately, each as often as it needs.
For example, some of the tables we support hold AWS static data that almost never changes, like the parameter_groups tables, so you don’t need to sync them often.
Some are slow, like the accessed_details tables, since we call an AWS API to generate the report and then wait until it finishes.
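
As a rough sketch of that split, the two spec files could look something like the following. The source names, the version placeholder, the postgresql destination name, and the glob patterns are all assumptions made for illustration; the patterns in particular are guesses based on the table names mentioned above and should be checked against the plugin's table list. A destination spec named postgresql is assumed to be defined alongside each file.

```yaml
# aws-most-tables.yml: everything except the slow or rarely changing tables
kind: source
spec:
  name: aws-most                 # hypothetical name
  path: cloudquery/aws
  version: "vX.Y.Z"              # placeholder: use the plugin version you actually run
  destinations: ["postgresql"]   # assumed destination name
  tables: ["*"]                  # or keep your existing table list
  skip_tables:
    - "*accessed_details"        # assumed pattern for the slow report-based tables
    - "*parameter_groups"        # assumed pattern for the near-static tables
---
# aws-slow-tables.yml: only the tables skipped above, synced less often
kind: source
spec:
  name: aws-slow                 # hypothetical name
  path: cloudquery/aws
  version: "vX.Y.Z"              # placeholder, same as above
  destinations: ["postgresql"]
  tables:
    - "*accessed_details"
    - "*parameter_groups"
```

The two documents are shown in one block only for brevity; each would live in its own file and be run by its own cloudquery sync command.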

unified-reptile · October 20, 2023, 12:39pm

Cool.
And to increase the concurrency, should I just add this?

spec:
    scheduler: "shuffle"

erez · October 20, 2023, 12:45pm

kind: source
spec:
  name: aws
  ...
  spec:
    concurrency: 50000

The scheduler setting only changes the strategy used to order the sync, as described in the CloudQuery documentation; concurrency is a separate setting. Please note that higher concurrency doesn’t always mean a faster sync, as you can hit AWS rate limits. So, sometimes more goroutines mean a slower sync.

Higher concurrency also takes more resources from the machine, so you’d need a stronger machine.
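
For reference, the two settings discussed here can sit side by side. This is only a sketch, assuming both keys belong in the nested plugin-level spec block as in the snippets above; the concurrency value is simply the one quoted earlier, not a recommendation.

```yaml
kind: source
spec:
  name: aws
  # path, version, tables and destinations stay as in your existing spec
  spec:
    concurrency: 50000     # value from the reply above; tune it to your rate limits and machine size
    scheduler: "shuffle"   # affects the order in which tables are synced, not the concurrency
```

If the sync gets slower or starts hitting rate-limit errors after raising concurrency, lowering the value again is the first thing to try.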

brave-bengal · October 21, 2023, 5:24pm

Do we have a list of tables published that take a lot of time, such as cloudtrail*? I see the same issue as shared by @unified-reptile. It takes way too long and is almost unusable for me.

erez · October 23, 2023, 8:30am

Hi @brave-bengal, the list is here: CloudQuery AWS Configuration - Skip Tables
