GCP resourcemanager folders table remains empty despite logging folder discovery

:wave: I’ve noticed that the gcp_resourcemanager_folders table in our dataset is always empty. Doing a bit of troubleshooting this AM and I’m kinda stumped on what’s going on. I dropped the logging down to debug and I can see in the logs that it’s getting all the folders:

2023-12-04T12:41:29Z INF Found 4137 projects. First 100: [an array of a bunch of projects]

But then the table is still just always 0 bytes with 0 rows. I’m running gcp source plugin 9.9.2, bigquery destination 3.3.8, and cli 3.29.2.

For what it’s worth, my config I’m debugging with:

kind: source
spec:
  name: gcp-folder-test
  path: cloudquery/gcp
  version: "v9.9.2"
  scheduler: round-robin
  concurrency: 1500
  otel_endpoint: 127.0.0.1:4317
  otel_endpoint_insecure: true
  tables:
    - gcp_resourcemanager_folders
  destinations: ["bigquery"]
  spec:
    backoff_retries: 10
    backoff_delay: 60
    enabled_services_only: false
    folder_recursion_depth: 100
    project_filter: "NOT name:sys-*"
---
kind: destination
spec:
  name: bigquery
  path: cloudquery/bigquery
  version: "v3.3.8"
  write_mode: "append"
  spec:
    project_id: my-cool-project
    dataset_id: cloudquery
    time_partitioning: day

The other thing I’m noticing is that even though I have project_filter: "NOT name:sys-*" set, it’s still pulling those projects into the array shown in the logs. Even if I try gcp_resourcemanager_subfolders, I get the same result.

Hi @sure-hound :wave:, I’m taking a look at this now.

The account also has the Folder Viewer role, so it has what I think are the right permissions as well, if that helps.

Link to GitHub

I’m still trying to learn Go, but I think this indicates that I have to pass in the orgId for it to fetch folders?

It is worth trying to specify the organization (spec docs here), but I think it will take the organization from your account credentials if not supplied.

I believe this part of the code fetches the organizations using your credentials if none are supplied in the spec. Could you check your logs for the "No organization_ids or organization_filter specified - assuming all organizations" message?

2023-12-04T13:13:51Z INF No organization_ids or organization_filter specified - assuming all organizations module=gcp-src
2023-12-04T13:13:51Z INF Listing organizations... module=gcp-src

and then a few lines later I see

2023-12-04T13:13:51Z INF Found 0 organizations in folders module=gcp-src organizations=[]
2023-12-04T13:13:51Z INF Retrieved organizations module=gcp-src orgs=null

These are all the permissions the service account has. FWIW, we’re able to sync everything else without problems, just the folders don’t like us :frowning_with_open_mouth:

What do you get if you run the following with the same credentials as the CloudQuery sync:

gcloud resource-manager folders list --organization=<orgid>

Gimme a few mins to figure out how to run that :sweat_smile: This account can only access GCP via our CI using workload identity federation, so I can’t run it locally.

Easier than I thought, I just replaced the cq image in the pipeline with the gcloud one lol.

But it shows all our top-level folders (this uses all the same auth and whatnot used by CloudQuery).

Redacted one this time :grimacing:

Thanks for that, definitely rules out permissions then.

What I am noticing is that if I specifically pass in organization_id, then I get a permissions issue because it needs resourcemanager.organizations.get to manually query and get the orgId.

Hi,

I think that message is relevant in this case. My colleague and I were looking at these docs here and we think the second paragraph is related to what we are seeing:

Search will only return organizations on which the user has the permission resourcemanager.organizations.get or has super admin privileges.

The behaviour looks to be that it will return an empty list for the organizations if you don’t have sufficient permissions, and we only see the actual permission error if we target an organization.

Maybe the next test should be to add that permission (resourcemanager.organizations.get) and try again with the configuration where you are requesting a specific organization_id.
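
For reference, a minimal sketch of what that test config could look like. It assumes organization_ids (the option named in the log message above) goes in the plugin-level spec and takes a list of IDs as strings; <orgid> is a placeholder for your numeric organization ID. The destination block stays as it was.

```yaml
kind: source
spec:
  name: gcp-folder-test
  path: cloudquery/gcp
  version: "v9.9.2"
  tables:
    - gcp_resourcemanager_folders
  destinations: ["bigquery"]
  spec:
    # Placeholder: replace with your numeric organization ID.
    # Assumption: the option is a list of string IDs.
    organization_ids:
      - "<orgid>"
    folder_recursion_depth: 100
```

With the organization pinned like this, a missing resourcemanager.organizations.get permission should surface as an explicit error instead of an empty organization list.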

More than happy to try and get more permissions added to the account, but that behavior is curious seeing as how CloudQuery can inventory everything else just fine (including all the projects and the folder they belong to).

Only a few resources use the orgID specifically: top-level org folders, org-level tags, and organization findings in the security center. So, other resources don’t specifically need it to work.

Got the permissions added and it’s working!
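
For anyone landing here later: how the permission was granted isn't shown above. One way to do it, as a minimal sketch, is an organization-level custom role that carries only the missing permission. The file name, title, and description below are hypothetical; the permission name is the one from this thread.

```yaml
# role.yaml (hypothetical file name)
title: "CloudQuery Organization Reader"
description: "Lets the CloudQuery service account look up its organization"
stage: "GA"
includedPermissions:
  # The permission the organization lookup needs, per the thread above.
  - resourcemanager.organizations.get
```

A file like this can be passed to gcloud iam roles create with the --organization and --file flags, and the resulting role then bound to the service account at the organization level. The predefined Organization Viewer role should also include this permission if you'd rather not maintain a custom role.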

So the TL;DR of what happened was that it was a permissions issue under the hood, but that just wasn’t being bubbled up as a visible error.

Yes, I believe that is correct. Specifically, the organization search API can be called without permissions on every organization it might return. It only returns the organizations the caller can access, so if you have access to none of them you get an empty list rather than a permission error.