Cloudquery tables lack foreign key relationships for improved data flexibility

concise-oryx · March 5, 2024, 9:56am

Why do the CloudQuery tables not have any foreign key relationships between them?

It might be easier for us to find the relationships between the different resource tables if we had foreign keys enabled for these tables.

erez · March 5, 2024, 10:39am

Hi @concise-oryx,

I commented on the issue here.

concise-oryx · March 5, 2024, 10:56am

failed to download plugin: failed downloading from URL [{https://github.com/cloudquery/cloudquery/releases/download/plugins-source-aws-v23.1.0/aws_windows_amd64.zip true} {https://github.com/cloudquery/cq-source-aws/releases/download/v23.1.0/cq-source-aws_windows_amd64.zip false}]. Error not found

I’m getting this error when trying to use the AWS version as v23.1.0.

erez · March 5, 2024, 10:58am

Hi @concise-oryx, can you share your CLI version and configuration content (please redact any sensitive information)?

cloudquery --version

concise-oryx · March 5, 2024, 10:59am

cloudquery version 2.5.1

erez · March 5, 2024, 11:02am

Can you please update to the latest version? That version is quite old and doesn’t support the CloudQuery registry where new versions of plugins are hosted.
You can use this guide to update: CloudQuery Quickstart

concise-oryx · March 5, 2024, 11:02am

Can I directly update it to the latest version??
Or will it have any dependencies on the AWS and Azure and the Postgres versions that I am using?
I have used this command, but still, I am getting the same version which I am currently using and unable to update it.

Command:

curl -L https://github.com/cloudquery/cloudquery/releases/download/cli-v5.8.1/cloudquery_windows_amd64.exe -o cloudquery.exe

erez · March 5, 2024, 11:09am

Can you paste the command you’re running to check the version?
It’s best to use the latest CLI with new plugin versions, but it should be compatible with old versions too. However, the latest plugin versions do require a new CLI and won’t work with an old one.

concise-oryx · March 5, 2024, 11:12am

Okay… makes sense checking on this… thanks for your immediate support @erez.
Now I am able to update the CloudQuery CLI version but unable to init the AWS plugin.

Error: 
 unexpected error from sync client receive: rpc error: code = Unknown desc = failed to sync records: failed to sync unmanaged client: skip_tables include a pattern aws_redshift_endpoint_access with no matches

But I can see that the version of AWS Source plugin which I am using is having this table “aws_redshift_endpoint_access”.

Versions I am Using:

AWS: 23.1.0
PostgreSQL: 7.0.0
CLI: 5.8.1

erez · March 5, 2024, 11:49am

Hi @concise-oryx,

The table name is aws_redshift_endpoint_accesses. I think you’re missing es at the end.

concise-oryx · March 5, 2024, 12:41pm

Hi @erez,

Now I am able to run CloudQuery for my required version, but I am getting an issue while running it.

Error:

unexpected error from sync client receive: rpc error: code = Unknown desc = failed to sync records: failed to sync unmanaged client: your configuration references the following premium tables: "aws_accessanalyzer_analyzer_findings_v2, aws_autoscaling_warm_pools, aws_backupgateway_gateways, aws_budgets_budgets, aws_budgets_actions, aws_cloudwatch_metrics, aws_cloudwatch_metric_statistics, aws_cod...". Please run `cloudquery login` or use a valid API Key which can be generated via https://cloud.cloudquery.io to allow the sync to succeed

Syncing resources… (0/-, 0 resources/hr) [0s]

erez · March 5, 2024, 12:41pm

Hi @concise-oryx,

As the error suggests, you’d need to run

cloudquery login

and authenticate to sync those tables.

See more about this in this blog post.

concise-oryx · March 5, 2024, 3:06pm

But they are premium tables, right?
Do we forcefully have to be upgraded to v25.0 in CloudQuery since all tables are premium now?

Or can we use the old versions for some time?

erez · March 5, 2024, 3:07pm

You can still use older versions.

concise-oryx · March 5, 2024, 3:08pm

And also, there aren’t any foreign key relationships between the tables in CloudQuery. Then how do we find the relation between the tables of different resources?

For example, suppose EC2 has ELbv2LoadBalancers, ElbV2Listeners, and Ec2Eips, etc. How can we find the relation between these tables?

There might be some use cases where we need the Load Balancers and Listener information of EC2 instances. How do we find the relation between them?

Thanks, makes sense.

erez · March 5, 2024, 3:13pm

Each table has a _cq_id and _cq_parent_id columns, so you should be able to use those to link between children and parents.

You can get all the relations via the cloudquery tables command, as explained in this GitHub issue.

Can you specify the table names you’d like to correlate? For example, aws_elbv2_listeners is a child of aws_elbv2_load_balancers, so aws_elbv2_listeners will have _cq_parent_id pointing to _cq_id of aws_elbv2_load_balancers.

concise-oryx · March 5, 2024, 3:35pm

So you are saying that for the child table, we have _cq_parent_id, and for the parent table, we have _cq_id. So can we perform a join directly, or should we create a foreign key relationship between them?

And also, if I need to join aws_elbv2_target_group_target_health_descriptions and aws_elbv2_listeners tables, how do I achieve this? Because both of these tables have different parents.

erez · March 5, 2024, 3:42pm

I think you should do a join. I would not create a foreign key, as syncing data from multiple APIs is not an atomic operation, so data can get stale while we sync it or even have transient errors.

What are you trying to achieve with the join between aws_elbv2_listeners and aws_elbv2_target_group_target_health_descriptions?

concise-oryx · March 5, 2024, 3:43pm

Sorry, I didn’t get this. You mean I should create new foreign keys for them?
We can get the listeners that are not attached to the target group like that?

erez · March 5, 2024, 3:45pm

I would recommend not creating foreign keys as that might make the schema too hard to modify/extend.

Looks like aws_elbv2_target_groups has a load_balancer_arns column.

Topic		Replies	Views
Question about cloudquery neo4j destination and aws entity relations CloudQuery Plugins	3	9	May 15, 2024
Error when upgrading CloudQuery version related to AWS plugin sync issues CloudQuery Plugins	1	12	March 25, 2024
Understanding premium tier limitations for cloudquery sync features CloudQuery Plugins	6	26	March 26, 2024
Inquiry about CloudQuery plugins and their current source status CloudQuery Plugins	3	19	February 9, 2024
CloudQuery update issue with aws and postgres plugins migration error CloudQuery Plugins	6	23	November 16, 2023

Cloudquery tables lack foreign key relationships for improved data flexibility

Related topics