Yes. Then you can run cloudquery sync aws-most-tables.yml and cloudquery sync aws-slow-tables.yml.
For example, some tables we support hold AWS static data that almost never changes, like the parameter_groups tables, so you don’t need to sync them often.
Some are slow, like accessed_details, since we call an AWS API to generate the report and then wait until it finishes.
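A rough sketch of what the two spec files could look like, in case it helps. This is not taken from the docs: the version string, the destination name, the source names, and the table patterns are placeholders and assumptions (the patterns are only modeled on the parameter_groups and accessed_details examples above), so replace them with the exact table names from the AWS plugin table list.

```yaml
# aws-most-tables.yml (illustrative): sync everything except the slow tables
kind: source
spec:
  name: aws-most                 # hypothetical source name
  path: cloudquery/aws
  version: "vX.Y.Z"              # placeholder, use the plugin version you run
  tables: ["*"]
  skip_tables:
    - "aws_*_accessed_details"   # assumed pattern, check the exact table names
  destinations: ["my-destination"]   # placeholder destination name
---
# aws-slow-tables.yml (illustrative, a separate file): only the slow tables
kind: source
spec:
  name: aws-slow                 # hypothetical source name
  path: cloudquery/aws
  version: "vX.Y.Z"              # placeholder
  tables:
    - "aws_*_accessed_details"   # assumed pattern, check the exact table names
  destinations: ["my-destination"]   # placeholder destination name
```

With a split like this, the first file can be synced on a frequent schedule and the second one much less often.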
Cool. And to increase the concurrency, I should just add:
spec:
  scheduler: "shuffle"

kind: source
spec:
  name: aws
  ...
  spec:
    concurrency: 50000
The scheduler option uses a different strategy to order the sync. As described in the CloudQuery documentation, please note that higher concurrency doesn’t always mean a faster sync, as you can hit AWS rate limits. So sometimes more goroutines mean a slower sync.
It also takes more resources from the machine, so you’d need a stronger machine.
Do we have a published list of tables that take a lot of time, such as cloudtrail*? I see the same issue as shared by @unified-reptile. It takes way too long and is almost unusable for me.
Hi @brave-bengal, the list is here: CloudQuery AWS Configuration - Skip Tables