Is concurrency a valid option in CloudQuery spec today

creative-iguana · February 3, 2024, 2:18pm

Is “concurrency” a valid option in the spec today? The Rate Limiting page and the Source Spec Reference describe it differently, and I get an error when I try to use it:

failed to decode spec: json: unknown field "concurrency"

yevgenyp · February 3, 2024, 2:24pm

Thanks for posting about the inconsistency. We will update it. This option is now controlled by each source, so if it’s available in the plugin, it is documented by the plugin itself.

For example, it is available in the GCP, Azure, and AWS plugins; see the CloudQuery GCP Plugin Documentation.

creative-iguana · February 3, 2024, 2:33pm

OK, thanks. I’ll check that out. Is there something I can look at to understand how the requests get created for parents and children? Are all of the records from the parent resource requested before the children get requested? Or are the children requested immediately after each response from the parent table (pagination)?

yevgenyp · February 3, 2024, 2:35pm

In which SDK? In GoLang?

yevgenyp · February 3, 2024, 2:37pm

This is the scheduler: https://github.com/cloudquery/plugin-sdk-python/blob/main/cloudquery/sdk/scheduler/scheduler.py. As far as I recall, it resolves the parent table first and then has X threads available for child tables, and so on. Basically, a concurrent DFS. You can also write your own scheduler if for some reason this one is not a fit for the specific API.
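To make the "concurrent DFS" idea concrete, here is a minimal sketch in plain Python. It is not the SDK's scheduler: `Table`, `sync`, and the resolver signature are hypothetical names made up for illustration. It also assumes that child tables are queued as soon as each parent resource is produced, which the thread does not confirm for the real implementation.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional


@dataclass
class Table:
    """Hypothetical table definition (not the SDK's class)."""
    name: str
    # resolver(parent_resource) -> resources; parent is None for top-level tables
    resolver: Callable[[Optional[dict]], Iterable[dict]]
    relations: List["Table"] = field(default_factory=list)


def sync(tables: List[Table], concurrency: int) -> List[tuple]:
    """Resolve tables depth-first using at most `concurrency` worker threads."""
    results: List[tuple] = []
    lock = threading.Lock()
    done = threading.Condition(lock)  # shares `lock`, signals when work runs out
    pending = 0  # number of submitted tasks that have not finished yet

    def submit(pool, table, parent):
        nonlocal pending
        with lock:
            pending += 1
        pool.submit(resolve, pool, table, parent)

    def resolve(pool, table, parent):
        nonlocal pending
        try:
            for resource in table.resolver(parent):
                with lock:
                    results.append((table.name, resource))
                # Assumption: children are queued per parent resource,
                # without waiting for the whole parent table to finish.
                for child in table.relations:
                    submit(pool, child, resource)
        finally:
            with done:
                pending -= 1
                if pending == 0:
                    done.notify_all()

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        for table in tables:
            submit(pool, table, None)
        with done:
            done.wait_for(lambda: pending == 0)
    return results
```

Workers never block waiting on their children, so a small `concurrency` value limits parallel requests without risking a deadlock in the pool. The order of `results` depends on thread timing.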

creative-iguana · February 3, 2024, 2:38pm

Thanks!

Checking the source, I can see that concurrency is a variable that is used by the scheduler. Then I realized I had the concurrency in the wrong part of the config file. I had it as spec.concurrency instead of the correct spec.spec.concurrency.
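For anyone hitting the same error, the difference between the two placements can be sketched in Python. This is only an illustration: the dicts stand in for the outer `spec` block of a source config, `TOP_LEVEL_FIELDS` and `decode_source_spec` are hypothetical and do not reproduce the CLI's actual decoder, and the value `100` is arbitrary.

```python
# Assumed set of fields accepted directly under the outer `spec`
# (illustrative only; check the Source Spec Reference for the real list).
TOP_LEVEL_FIELDS = {"name", "path", "registry", "version", "tables", "destinations", "spec"}


def decode_source_spec(outer_spec: dict) -> dict:
    """Reject unknown top-level fields, mimicking a strict JSON decoder."""
    for key in outer_spec:
        if key not in TOP_LEVEL_FIELDS:
            raise ValueError(f'failed to decode spec: json: unknown field "{key}"')
    return outer_spec


# Wrong: concurrency sits next to name/path/tables (spec.concurrency).
wrong = {"name": "gcp", "path": "cloudquery/gcp", "tables": ["*"], "concurrency": 100}

# Right: concurrency lives in the plugin-specific inner spec (spec.spec.concurrency).
right = {"name": "gcp", "path": "cloudquery/gcp", "tables": ["*"], "spec": {"concurrency": 100}}

decode_source_spec(right)  # accepted
try:
    decode_source_spec(wrong)
except ValueError as err:
    print(err)
```

The inner `spec` is passed through to the plugin untouched in this sketch, which is why plugin-level options such as `concurrency` only validate there.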

yevgenyp · February 3, 2024, 2:47pm

Yeah, the spec.spec thing is confusing. We want to address it in a future configv2, but we want to postpone that a bit, as it’s mostly a “frontend” issue and it will require a migration for users (even if the CLI supports both config formats for a migration period).

creative-iguana · February 3, 2024, 3:08pm

The concurrency adjustment works quite well for what I need. Bravo!

yevgenyp · February 3, 2024, 3:08pm

