Hey, how do you guys handle duplicates in larger-scale deployments of CloudQuery? For example, overwrite-delete-stale works when running CloudQuery in one container, but running 200 containers at once seems to leave a lot of orphaned resources behind.
Does the deletion of old records happen after the sync? Would it be better to have a PostgreSQL trigger that automatically deletes stale rows based on NOW() - _cq_sync_time?
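I'm picturing something that runs on a schedule, roughly like this (just a sketch, not a CloudQuery feature; the table name and interval are made up):

```sql
-- Hypothetical scheduled cleanup (not a built-in CloudQuery feature).
-- Table name and interval are placeholders; every synced table would
-- need an equivalent statement.
DELETE FROM aws_ec2_instances
WHERE _cq_sync_time < NOW() - INTERVAL '24 hours';
```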
overwrite-delete-stale is designed to work when running in parallel containers. The key is to ensure that each config in each container uses a unique name. More details can be found here.
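For example, giving each container its own source name in its spec (a rough sketch; the version and tables here are just placeholders):

```yaml
# Sketch of one container's source spec: the name is what ends up in
# _cq_source_name, so it must be unique per container for delete-stale
# to only touch that container's rows.
kind: source
spec:
  name: "aws-container-001"   # unique per container
  path: "cloudquery/aws"
  version: "v22.0.0"          # placeholder
  tables: ["*"]
  destinations: ["postgresql"]
```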
I currently have each _cq_source_name set to a unique name. The issue is that if a sync fails, the rows are not deleted. It appears as though the delete only occurs after a successful sync. Is that the case? In our case, these rows sometimes end up orphaned.
Yes, deletion only occurs after a successful sync. Only a panic or some other very serious error should result in the sync failing. Which plugin are you using that doesn’t reliably sync successfully?
Also, the next time the sync runs, it should clean up all of the stranded records, so nothing should stay orphaned permanently.
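Conceptually, the cleanup at the end of a successful sync boils down to something like this (simplified; the table name and sync-time value are only illustrative):

```sql
-- Simplified view of the post-sync cleanup: rows belonging to this source
-- that were not re-written during the current sync (i.e. have an older
-- _cq_sync_time) get deleted.
DELETE FROM aws_s3_buckets
WHERE _cq_source_name = 'aws-container-001'
  AND _cq_sync_time < :current_sync_time;
```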
The AWS plugin is the one with the problem. Now, it has only occurred twice in ~300 syncs, but the issue is that on the next successful run, the resources are not removed.