I’ve noticed that the gcp_resourcemanager_folders table in our dataset is always empty. Doing a bit of troubleshooting this AM and I’m kinda stumped on what’s going on. I dropped the logging down to debug and I can see in the logs that it’s getting all the folders:
2023-12-04T12:41:29Z INF Found 4137 projects. First 100: [an array of a bunch of projects]
But then the table is still just always 0 bytes with 0 rows. I’m running gcp source plugin 9.9.2, bigquery destination 3.3.8, and CLI 3.29.2.
For what it’s worth, my config I’m debugging with:
The other thing I’m noticing is that even though I have project_filter: NOT name:sys-*, it’s still pulling those projects into the array displayed in the logs. Even if I try gcp_resourcemanager_subfolders, I get the same result.
It is worth trying to specify the organization (spec docs here), but I think it will take the organization from your account credentials if not supplied.
I believe this part of the code would fetch the organization using your credentials - if not supplied in the spec. Could you check your logs for the No organization_ids or organization_filter specified - assuming all organizations log message?
Gimme a few mins to figure out how to run that. This account can only access GCP via our CI using workload identity federation, so I can’t run it locally.
What I am noticing is that if I specifically pass in organization_id, then I get a permissions issue because it needs resourcemanager.organizations.get to manually query and get the orgId.
I think that message is relevant in this case. My colleague and I were looking at these docs here and we think the second paragraph is related to what we are seeing:
Search will only return organizations on which the user has the permission resourcemanager.organizations.get or has super admin privileges.
The behaviour looks to be that it will return an empty list for the organizations if you don’t have sufficient permissions, and we only see the actual permission error if we target an organization.
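To make that difference concrete, here is a minimal toy model in Python. It does not call GCP or CloudQuery. `FakeResourceManager`, `resolve_org_ids`, and `PermissionDenied` are hypothetical names made up for illustration, and the fallback logic is only my assumption about what the plugin does based on the log message quoted above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


class PermissionDenied(Exception):
    """Raised when the caller targets an organization it cannot read."""


@dataclass
class FakeResourceManager:
    # Hypothetical in-memory stand-in for the organizations API (not a real client).
    all_orgs: set[str]
    readable_orgs: set[str] = field(default_factory=set)

    def search_organizations(self) -> list[str]:
        # Search filters silently: orgs the caller cannot read are simply omitted.
        return sorted(self.all_orgs & self.readable_orgs)

    def get_organization(self, org_id: str) -> str:
        # A targeted lookup surfaces the missing permission as an error.
        if org_id not in self.readable_orgs:
            raise PermissionDenied(
                f"resourcemanager.organizations.get denied on {org_id}"
            )
        return org_id


def resolve_org_ids(api: FakeResourceManager, configured_ids: list[str]) -> list[str]:
    # Assumed behaviour: explicit IDs are fetched one by one,
    # otherwise fall back to searching "all organizations".
    if configured_ids:
        return [api.get_organization(org_id) for org_id in configured_ids]
    return api.search_organizations()


if __name__ == "__main__":
    api = FakeResourceManager(all_orgs={"organizations/123"})
    print(resolve_org_ids(api, []))  # no error, just nothing to sync
    try:
        resolve_org_ids(api, ["organizations/123"])
    except PermissionDenied as err:
        print(f"explicit ID fails loudly: {err}")
```

If the plugin behaves roughly like this, an account without resourcemanager.organizations.get ends up with an empty organization list and no error, which would explain an org-scoped table syncing zero rows while project-scoped tables work fine.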
Maybe the next test should be to add that permission (resourcemanager.organizations.get) and try again with the configuration where you are requesting a specific organization_id.
More than happy to try and get more permissions added to the account, but that behavior is curious seeing as how CloudQuery can inventory everything else just fine (including all the projects and the folder they belong to).
Only a few resources use the orgID specifically: top-level org folders, org-level tags, and organization findings in the security center. So, other resources don’t specifically need it to work.
Yes, I believe that is correct. Specifically, the organization search API lets you call it without a full set of permissions across the resources it may return, and it returns an empty list if the user doesn’t have sufficient permissions on any of them.