CloudQuery Python sync for AWS environments: not finding the proper method

balanced-tick · June 5, 2024, 10:12pm

Hey y’all,

I’m looking into how to use CloudQuery to sync AWS environments’ inventories/resources in Python. I’ve done it before in the CLI, and was trying to figure out a way to do it programmatically. However, I was unable to find the right way to go about this.

I landed on the Python SDK eventually, but it seems like the cloudquery-plugin-sdk is more for creating new plugins for data sources that don’t have official CloudQuery support.

How would I go about kicking off CloudQuery syncs in Python? Is the only way to do this via YAML files and the CLI, and will I have to create/change those dynamically in code?

Appreciate the help!

yevgenyp · June 5, 2024, 10:14pm

Hey! The easiest way right now would be to just fork-exec the CLI from Python, as in this example: https://github.com/cloudquery/cq_dagster_embedded_elt. The Dagster piece isn't necessary for your case; only the Python bit matters.

balanced-tick · June 5, 2024, 10:17pm

Awesome, thank you! I’ll take a look!

Is it accurate to say that the logic of the code is the below:

1. Define a string that has the same information you’d put into the source and destination YAML files when using the CLI.
2. Create a temporary YAML file by writing that string to it.
3. Use some module to help you run the cloudquery sync CLI command and point it at the temp file that was created.
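
For reference, here is a minimal sketch of those three steps using only the standard library. It is not taken from the linked repo; the `run_sync` helper, the `vX.Y.Z` versions, the table name, and the `PG_CONNECTION_STRING` variable are placeholders I made up for illustration, so swap in whatever your own source and destination specs contain.

```python
import subprocess
import tempfile
from pathlib import Path

# Step 1: the same content you would put in the YAML files for the CLI.
# Versions, table names and the connection string are placeholders.
CONFIG = """\
kind: source
spec:
  name: aws
  path: cloudquery/aws
  version: "vX.Y.Z"  # placeholder: pin a real plugin version
  tables: ["aws_s3_buckets"]
  destinations: ["postgresql"]
---
kind: destination
spec:
  name: postgresql
  path: cloudquery/postgresql
  version: "vX.Y.Z"  # placeholder: pin a real plugin version
  spec:
    connection_string: "${PG_CONNECTION_STRING}"
"""


def run_sync(config: str, run=subprocess.run):
    """Write config to a temporary YAML file and run `cloudquery sync` on it.

    `run` is injectable (hypothetical convenience) so the flow can be
    exercised without the CLI installed.
    """
    with tempfile.TemporaryDirectory() as tmp:
        # Step 2: write the string to a temporary YAML file.
        spec_path = Path(tmp) / "spec.yml"
        spec_path.write_text(config)
        # Step 3: point the CLI at that file; check=True raises on a non-zero exit.
        return run(["cloudquery", "sync", str(spec_path)], check=True)
```

Calling `run_sync(CONFIG)` would then start a real sync, provided the `cloudquery` binary is on the PATH and AWS credentials are available in the environment.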

Does that mean wherever this code is running, you will need to install CloudQuery beforehand? And that the primary/main way to run a cloudquery sync on a supported target is via the CLI, and programmatic support is based on running the same flow described above?

yevgenyp · June 5, 2024, 10:34pm

Yeah, exactly. Even if we were to have a native Python SDK, you would need to download CloudQuery beforehand, as CloudQuery is not a Python library and the Python binding would just call the cloudquery process under the hood anyway.
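
Since the binary has to be present wherever the code runs, it can help to fail early with a clear message. A small sketch, assuming the executable is named `cloudquery` and is expected on the PATH (`find_cloudquery` is a hypothetical helper, not part of any CloudQuery package):

```python
import shutil


def find_cloudquery(binary: str = "cloudquery") -> str:
    """Return the full path to the CloudQuery CLI, or raise if it is missing."""
    path = shutil.which(binary)
    if path is None:
        raise RuntimeError(
            f"{binary!r} was not found on PATH; install the CloudQuery CLI first"
        )
    return path
```

Running this check before kicking off a sync turns a missing install into an explicit error instead of a `FileNotFoundError` from the subprocess call.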

balanced-tick · June 5, 2024, 10:37pm

Assuming you mean *would need to download CloudQuery before, right? And yeah definitely, makes sense! Just wanted to make sure I didn’t misinterpret anything, appreciate the help!!
