When running CloudQuery from PostgreSQL to S3 using the PostgreSQL source plugin and the S3 destination plugin, I noticed that it creates the _cq_sync_time and _cq_source_name columns twice, because the PostgreSQL database was itself populated by CloudQuery and already contains them.
I am not clear about the use case. You are syncing data from a Postgres database to an S3 bucket and are seeing duplicated columns, specifically _cq_sync_time. How does the data get into Postgres initially?
The data is being synced via CQ into Postgres. I then export to S3 with the S3 plugin, and I’m seeing duplicate _cq_source_name keys, since the S3 plugin also creates the same column names that already exist in Postgres.
For example, Plugins 1, 2, and 3 sync to Postgres. Then the Postgres source plugin is used to export with the S3 destination plugin. This second sync duplicates the columns, which causes a HIVE_CANNOT_OPEN_SPLIT error.
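To illustrate the duplication described above, here is a minimal Python sketch. It assumes the destination prepends its own _cq_sync_time and _cq_source_name columns to whatever columns the source table exposes. The function names and the example table columns (account_id, arn) are hypothetical and are not CloudQuery's actual code.

```python
from collections import Counter

# Assumption: these are the internal columns a destination adds on write.
CQ_INTERNAL_COLUMNS = ["_cq_sync_time", "_cq_source_name"]


def add_internal_columns(source_columns):
    """Hypothetical model of a destination prepending its internal columns."""
    return CQ_INTERNAL_COLUMNS + list(source_columns)


def duplicated_columns(columns):
    """Return the sorted names that appear more than once in a schema."""
    counts = Counter(columns)
    return sorted(name for name, n in counts.items() if n > 1)


# A Postgres table that was itself written by CloudQuery already has both columns.
postgres_columns = ["_cq_sync_time", "_cq_source_name", "account_id", "arn"]
exported = add_internal_columns(postgres_columns)
print(duplicated_columns(exported))  # ['_cq_source_name', '_cq_sync_time']
```

Running a check like duplicated_columns against the exported file's schema is one way to confirm whether a given file is affected before querying it.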
What version of the CLI and plugins are you using?
CLI 4.4, Source Postgres 3.0.1, and Destination S3 4.10.0.
Your issue has already been fixed in the latest version of the CLI (v5.15.0). Also, I would suggest upgrading your other plugins as they are pretty old, and we have improved performance and fixed a lot of issues since they were released.