Duplicate records appearing in daily CloudQuery syncs with no log visibility

Hi CQ team,

I run a sync once per day, but I see duplicate records for a particular day. The sync starts at 12 AM every day and completes in about 20 minutes. I currently have no visibility into the logs because the pods that run the sync have already been terminated. Could this be caused by built-in retries, or am I missing something?

Interesting. Can you share more details about your setup? For example, is this running on a VM or in a container, and what handles the scheduling? Can you also share a redacted version of the config?

Also, just so you know, you can use the --log-console flag together with --log-format json so that all logs are written to the console. If you are using something like ECS, the logs will then be available in CloudWatch Logs.

Here is the config:

kind: source
spec:
  name: "aws-${REGION}"
  registry: local
  path: /app/plugins/aws
  tables:
    - aws_dynamodb_tables
    - aws_rds_instances
    - aws_rds_clusters
    - aws_rds_reserved_instances
    - aws_secretsmanager*
  destinations: ["postgresql"]
  spec:
    concurrency: 100
    initialization_concurrency: 4
    aws_debug: false
    regions:
      - ${REGION}
---
kind: destination
spec:
  name: postgresql
  registry: local
  path: /app/plugins/postgresql
  write_mode: append
  spec:
    connection_string: ${PG_CONNECTION_STR}

I run the sync using:

/app/cloudquery sync <config_file> --log-console --no-log-file --log-format json --log-level debug

The deployment is done using a Helm chart, with the following schedule:
schedule: 0 0 * * *

CloudQuery itself does not retry when an entire sync fails, but Kubernetes may: a Job (including one created by a CronJob) can re-run failed pods. I am not sure whether that is what is happening here.
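
If a Kubernetes-level retry turns out to be the cause, one thing to check is the Job settings rendered by the chart. A Job retries failed pods up to backoffLimit times (the Kubernetes default is 6), and because the destination uses write_mode: append, each retried run would add its rows on top of whatever the failed attempt had already written. Below is a minimal sketch of a CronJob with retries disabled. The resource name, image, and ttlSecondsAfterFinished value are assumptions for illustration, and your Helm chart may expose these settings under different keys.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cloudquery-sync              # hypothetical name
spec:
  schedule: "0 0 * * *"              # same schedule as in the thread
  concurrencyPolicy: Forbid          # never start a run while the previous one is still active
  jobTemplate:
    spec:
      backoffLimit: 0                # do not re-run the pod if the sync fails
      ttlSecondsAfterFinished: 86400 # assumed value: remove the finished Job and its pods after one day
      template:
        spec:
          restartPolicy: Never       # do not restart the container in place either
          containers:
            - name: cloudquery
              image: your-registry/cloudquery-sync:tag   # hypothetical image
              command: ["/app/cloudquery"]
              args:
                - sync
                - "<config_file>"    # placeholder, as in the original command
                - --log-console
                - --no-log-file
                - --log-format
                - json
                - --log-level
                - debug
```

To confirm whether a day was actually synced more than once, you could check whether that day's rows carry more than one distinct value in the _cq_sync_time column that CloudQuery adds to every table.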