Is there an established pattern for adding and maintaining indexes to columns in postgres when the plugin runs?
Hey @alfredgamulo
First of all, thanks for reaching out and joining our community.
Now to answer your question, it mostly depends on your exact needs, that’s why we do not enforce predefined patterns.
There are no big limitations around indexes, just try not to add additional constraints or to modify the data / data types in any way as otherwise the CloudQuery CLI will try to migrate the data during the next sync and if it won’t succeed you’ll either have to migrate it manually to the specified format or to run a forced migration (drop & recreate).
Could you also let us know what source plugins, tables and columns you are interested in optimizing by adding indexes, for future reference?
Does this answer your question?
for no specific source plugin or table, but any table with a jsonb column – I’m curious if I can run something like CREATE index ON <table> USING gin (<data>)
and if there would be any issues with settings such as “overwrite-delete-stale” or any source/destination plugin version upgrades?
No, I would not expect there to be any issues as all. CloudQuery does not change indexes and as such, they won’t be a problem during migrations.
The only ones I can foresee are the forced migrations which will drop the table entirely, but those are quite rare and only occur between major version bumps of a certain plugin; we also try to maintain backwards compatibility to the best of our abilities, but where the target APIs change there’s not much we can do.
They are also documented in a plugin’s release notes on the hub (see the last major for the Github source plugin, for example).
Either way, what I’d recommend is storing all of these DDL scripts so they can be easily replicated trough a single command in the future if needed.
has there been any thought of letting us set single column indexes as part of the column definition?
Hi @Duncan_Mapes we usually leave indexes configuration to the user (it can be done via running cloudquery migrate
, then applying indexes, then running cloudquery sync --no-migrate
) as those are both use case specific and DB specific (not all destinations has the concept of indexes).
Can you share what you had in mind?
I was looking to just add a simple index on a column that we query by in a plugin, and it would be nice to be able to have an attribute on createColumn() index:true, similar to how you have the primaryKey attribute.
Got it, since indexes are a DB optimization and depend on how the tables are used, enforcing them at the source schema level might not fit for all users of the plugin.
It could make sense to have it in a specific destination spec, (e.g. a list of indexes to create per table), but that would also make it harder to migrate tables.
I would say our recommendation is still to use cloudquery migrate
then cloudquery sync --no-migrate
and do any schema manipulation that’s out of scope for the plugins
So…whatever you guys want to do, its your platform.
The Migration headache that introducing indexes might create is real, and might be a good enough reason to skip it it all together.
What I will say is that you already provide destination specific attributes. For example, the PrimaryKey attribute isn’t relevant for an S3 Destination even if it is for a RDMS.