CloudQuery skips organization checks when specific project ids are provided

For GCP, when one or more specific projects are listed with the project_ids option, wouldn’t it make sense for CloudQuery to skip this step:

2024-01-02T14:57:53Z INF No organization_ids or organization_filter specified - assuming all organizations module=gcp-src
2024-01-02T14:57:53Z INF Listing organizations... module=gcp-src
2024-01-02T14:57:53Z INF Found 1 projects in folders module=gcp-src projects=["igneous-walker-406911"]
2024-01-02T14:57:53Z INF Found 0 organizations in folders module=gcp-src organizations=[]
2024-01-02T14:57:53Z INF Retrieved organizations module=gcp-src orgs=null

It seems like a wasted step since CloudQuery is specifically being instructed on the project(s) to sync.

The reason for that is some tables do not depend on projects; they depend on an organization, and during the initialization phase, the plugin doesn’t know what tables are going to be synced.

Let me find a few examples…

gcp_resourcemanager_folders - Depends on an Organization
gcp_securitycenter_folder_findings - Depends on a Folder
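
To make the distinction concrete, here is a minimal sketch (not the plugin's actual code) that maps a few tables to the kind of parent resource they depend on. The organization and folder entries come from the examples above; `gcp_compute_instances` as a project-scoped table, the `TABLE_PARENT` mapping, and the `needs_org_discovery` helper are assumptions made up for illustration.

```python
# Hypothetical mapping of table name -> parent resource type.
# Only the first two entries come from the thread; the third is an assumed example.
TABLE_PARENT = {
    "gcp_resourcemanager_folders": "organization",
    "gcp_securitycenter_folder_findings": "folder",
    "gcp_compute_instances": "project",
}


def needs_org_discovery(tables):
    """Return True if any selected table is not known to be project-scoped.

    Unknown tables are treated conservatively, as if they needed discovery.
    """
    return any(TABLE_PARENT.get(t) != "project" for t in tables)


print(needs_org_discovery(["gcp_compute_instances"]))
print(needs_org_discovery(["gcp_compute_instances", "gcp_resourcemanager_folders"]))
```

A check like this only works once the table list is known. Since the plugin doesn't know which tables will be synced during initialization, it can't make this decision up front, which is why the discovery step runs regardless.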

We have an open issue to better document which tables depend on which type of resource (project, folder, or organization): Issue #15563

That makes sense. This is an extension of what I brought up back on Dec 22nd about pausing/skipping syncing on specific accounts.

I’m managing separate configurations/connections for each member account/project. I was hoping to avoid rate limiting on the GCP side, since every per-project sync repeats the organization/folder discovery, which adds up to hundreds or thousands of iterations.

Completely understand! I just spent some time looking into this. I will open an issue requesting a feature that lets users disable these types of discovery calls.

In the meantime, after looking at the code, a short-term workaround could be to make sure the service account/role you use to run the sync doesn’t have permission to call SearchOrganizations. The GCP API should then return a 403. In that case, the call should fail without counting towards the rate limit, and the sync will continue with only a log message indicating the call error.
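
The behavior described above (a failed discovery call is logged and the sync carries on) can be sketched roughly as follows. This is an illustration of the pattern only, not the plugin's code: the `PermissionDenied` class, `discover_organizations`, and the `denied` stub are all hypothetical names standing in for the real API call and its 403 response.

```python
import logging

logger = logging.getLogger("gcp-src")


class PermissionDenied(Exception):
    """Hypothetical stand-in for the 403 returned when a permission is missing."""


def discover_organizations(search_organizations):
    """Run the discovery call; on a 403, log the error and continue with no orgs."""
    try:
        return list(search_organizations())
    except PermissionDenied as err:
        logger.warning("failed to list organizations, continuing: %s", err)
        return []


def denied():
    # Simulates a service account without permission to search organizations.
    raise PermissionDenied("403: caller lacks permission to search organizations")


orgs = discover_organizations(denied)
project_ids = ["igneous-walker-406911"]  # project taken from the log output above
print(orgs, project_ids)
```

With this shape, the explicitly configured projects are still synced even though organization discovery returned nothing.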

GitHub Issue #15954

Thanks, Ben. I’ve :+1:’d it.