CloudQuery sync issues with GCP projects and empty tables

:wave: - It’s me again, back with more stuff I broke. I’m noticing some weird syncing issues and can’t quite figure out what’s going on. I’ll post more details in this thread, but to summarize: I can see in gcp_projects that the _cq_sync_time of a particular project is definitely getting updated on our scheduled CloudQuery runs, but I’m not seeing other tables actually being synced.

Looking at this example, the project is definitely getting synced, but the other tables aren’t showing any new syncing activity. Since this is BigQuery and it’s append-only, I’d expect to see a new row appended each time the CloudQuery sync runs.

The only thing that’s changed is that we modified the source name in the config file, but I wouldn’t expect that to matter. This same behavior shows up in a number of different projects (we have about 3900 GCP projects, for what it’s worth).

I discovered this because the GCP console told me a particular bucket in this project was public, but the gcp_storage_bucket_policies table for this project was completely empty, leading me to believe the sync isn’t actually running for it.

It shouldn’t be a permissions issue; the service account has the Viewer and Security Reviewer roles at the org level (to be honest, probably more than is needed) and has access to all projects in our org. I tried tailing the log file, but nothing permissions-related is coming up.

Here’s the config file we’re running:

  • Latest CLI version
  • Source Plugin version 9.3.3
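
The spec itself looks roughly like this (trimmed down; the folder ID, destination version, and BigQuery project/dataset names below are placeholders rather than our real values):

kind: source
spec:
  name: gcp-test
  path: cloudquery/gcp
  version: "v9.3.3"
  tables: ["*"]                            # trimmed; we sync a long list of tables
  destinations: ["bigquery"]
  spec:
    folder_ids: ["folders/000000000000"]   # placeholder folder ID
    enabled_services_only: true
---
kind: destination
spec:
  name: bigquery
  path: cloudquery/bigquery
  version: "v3.3.0"                        # placeholder version
  write_mode: append
  spec:
    project_id: "my-admin-project"         # placeholder
    dataset_id: "cloudquery"               # placeholder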

Interesting. I will investigate and get back to you.

I was first thrown by _cq_source_name not matching, but that would mean it hasn’t successfully run/fetched since you’ve renamed the source.

Well, to make it even weirder… it syncs some tables but not others? Now I’m even more confused :joy:

I know sifting through the logs might be a pain with that many projects, but is there anything in the logs?

enabled_services_only might be consuming most of the quota and you might need higher backoff_* settings.

If you can isolate the sync to a single project (and/or just gcp_storage_buckets maybe) and if it works out fine then it’s most likely quota/backoff.
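
Something along these lines, with a placeholder project ID (and, if I’m remembering the option names right, bumped backoff settings):

kind: source
spec:
  name: gcp-test
  path: cloudquery/gcp
  version: "v9.3.3"
  tables: ["gcp_storage_buckets"]
  destinations: ["bigquery"]
  spec:
    project_ids: ["my-suspect-project"]    # placeholder: the one project that isn't syncing
    enabled_services_only: true
    backoff_retries: 10                    # example values, tune as needed
    backoff_delay: 30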

I grep’d the logs but didn’t see anything at all for the project in question.
I’m just gonna break this sync up into a couple smaller jobs either way and see if that helps :crossed_fingers:

Interesting, are we sure it’s still in the correct folder specified in the config? (If it wasn’t, I’m not sure if it would show up in gcp_projects or not)

Yeah, it’s in the correct folder. I think the folder is just so large that it’s either timing out (CI job has a 3-hour limit :grimacing:) or it’s quota related.

I broke the job up into smaller jobs, and they ran for much longer than I expected; I’m taking that as a good sign that the sync actually finished, but the _cq_sync_time for the project I’m looking at in gcp_storage_buckets still isn’t updating.

The only errors in the logs for this project are about the osconfig inventory API not being enabled; beyond that, there’s nothing in the logs.

Would it be possible to just sync that project specifically, using project_ids (instead of folder_ids), just to make sure we can fetch it normally and there are no funny permission (or destination-related, although I wouldn’t expect it) issues?

Sorry for the delay here - so I re-ran the CloudQuery sync with just the one project. The _cq_sync_time in gcp_storage_buckets still isn’t updating, and there’s no new entry in BigQuery.

The only log produced is the ERR that the osconfig API isn’t enabled in the project.

That’s just weird. Is it possible to try with another destination, e.g. the file plugin maybe? Looking at the number of resources fetched might also help. Is it 0 or a positive count?
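
For a quick test, a file destination along these lines should do it (the version, path template, and format here are just example values):

kind: destination
spec:
  name: file
  path: cloudquery/file
  version: "v3.4.0"                        # placeholder version
  spec:
    path: "cq-output/{{TABLE}}/{{UUID}}.{{FORMAT}}"
    format: json

and then point destinations: ["file"] at it in the source spec.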

My query from the other day that pulled which tables synced isn’t working in BigQuery now, so let me see if I can figure out an easy way to see which tables actually synced on this last run. We do all our CloudQuery runs inside a CI job, so I don’t think writing to a file would be super easy without exporting a CI artifact that has sensitive values in it (see details here).

But let me see if there’s a way to quickly write to GCS or something without reverse engineering our CI, lol.
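
Assuming the GCS destination plugin takes roughly the same options as the file one, maybe something like this would do it (bucket name and version are placeholders):

kind: destination
spec:
  name: gcs
  path: cloudquery/gcs
  version: "v3.0.0"                        # placeholder version
  spec:
    bucket: "my-cq-debug-bucket"           # placeholder
    path: "{{TABLE}}/{{UUID}}.{{FORMAT}}"
    format: json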

Yeah, it’s still only syncing a subset of tables for this project. I’m going to delete this row from the DB and see if that forces a sync to occur.

Well… that did nothing :joy:. It’s literally just not syncing this bucket? This is so strange. The service account has permissions at the org level, and I can clearly see on the IAM tab of the bucket that it has view access to it. So something is happening that isn’t making it into the error logs?

Any applicable warning logs?

Updating the CI job now to log at the debug level. Normally we only log at the error level, but YOLO, I’ll turn ’em all on.

Another idea: Is the BigQuery table partitioned (and if so, how)? I think partitioning can delay when newly written rows show up in queries.
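
If I’m remembering the BigQuery destination options right, that’s controlled by time_partitioning in its plugin spec, something like:

  spec:
    time_partitioning: day    # roughly: none / hour / day

so it might be worth double-checking what that’s set to.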

Yeah, it’s literally just not even trying to sync the table at all.
Our BigQuery is partitioned by day, but that’s across the board, and it’s managing to sync in 90% of our other projects just fine.

2023-09-21T14:50:12Z WRN the top-level `scheduler` option is deprecated. Please use the plugin-level scheduler option instead field=scheduler module=cli source=gcp-test
2023-09-21T14:50:12Z WRN the top-level `concurrency` option is deprecated. Please use the plugin-level concurrency option instead field=concurrency module=cli source=gcp-test

are the only warning messages too :confused:
It’s like it’s just skipping the other resources in the project entirely.
I’m going to specify just gcp_storage_* and see what happens then.
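
i.e. swapping the tables list in the source spec to something like:

  tables: ["gcp_storage_*"]    # wildcard match on just the storage tables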

Are you still using enabled_services_only: true? I’m assuming the Cloud Storage service is enabled for the project…

Okay… I think we’re getting somewhere.
So I have enabled_services_only: true in the config… and when I look at the project, it doesn’t have the Cloud Storage API enabled… but the project 100% has a bucket in it.
But it looks like the Cloud Storage API doesn’t need to be enabled to create/delete buckets: in my Terraform project I have a GCS bucket created with Terraform, and the Cloud Storage API isn’t enabled there either?

enabled_services_only: true specifically checks whether a service is enabled before queueing that service’s resources to be synced.
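
So as a workaround while we dig into this, you could drop that check for this run and see if the bucket shows up, e.g.:

  spec:
    enabled_services_only: false    # sync even if the project doesn't report the API as enabled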

Right - but at least for GCS… it doesn’t look like that API needs to be enabled for buckets to exist or be created? I have multiple projects with buckets and that API disabled.

Looks like it… It seems possible to create buckets and even upload objects from the console with the service disabled.

Opened an issue for this to track and prioritize/discuss next steps.