Support for streaming in source plugins

Hey everyone,
I’m evaluating implementations for extracting configuration and metadata from the three main cloud service providers into an internal storage system we have.
We expect some of the tables to be very large, so we are considering extracting them in batches or even streaming rows.
I saw CQ has great support for streaming destination plugins such as Kafka, but I couldn’t find information about support on the source side.
Do source plugins first load all results into memory and then stream them, or do they stream intermediate batches/rows directly to the destination?
Thanks a lot!
Itay.

Hi @Itay_Waisman, source plugins stream results; they don’t load everything into memory.
We apply batching on both the source side and the destination side to balance performance and memory consumption.
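Roughly, the source side follows this pattern (a minimal sketch, not the exact SDK API; `fetchPage`, `streamRows`, and the `row` type are illustrative names I made up for this example):

```go
package main

import "context"

// row stands in for one record of a large cloud-provider table.
type row struct{ ID string }

// fetchPage is a hypothetical stand-in for a provider's paginated list API:
// it returns one page of rows plus a token for the next page.
func fetchPage(ctx context.Context, token string) (rows []row, next string, err error) {
	// ... call the provider API for a single page ...
	return nil, "", nil
}

// streamRows pushes each page's rows onto the channel as soon as the page
// arrives, so only one page is ever held in memory at a time.
func streamRows(ctx context.Context, res chan<- any) error {
	token := ""
	for {
		rows, next, err := fetchPage(ctx, token)
		if err != nil {
			return err
		}
		for _, r := range rows {
			select {
			case res <- r:
			case <-ctx.Done():
				return ctx.Err()
			}
		}
		if next == "" {
			return nil
		}
		token = next
	}
}

func main() {
	res := make(chan any)
	go func() {
		defer close(res)
		_ = streamRows(context.Background(), res)
	}()
	for range res {
		// each streamed row would be handed to the destination batcher here
	}
}
```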

You can read more about source-side batching in One change to optimize them all: a story about batching on the source side | CloudQuery Blog. For destinations, batching is destination specific; you can see the Kafka defaults in cloudquery/plugins/destination/kafka/client/spec/spec.go at 9d420c96278fe08c35b37a1afb915baf271dee59 · cloudquery/cloudquery · GitHub
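To give a rough idea of the kind of batching knobs a destination spec exposes, here is an illustrative sketch only; the field names and defaults below are assumptions, and the linked spec.go has the real ones:

```go
package main

import (
	"fmt"
	"time"
)

// Spec sketches typical destination batching options (assumed names, not the
// actual Kafka spec): flush a batch when it reaches a row count, a byte size,
// or a timeout, whichever comes first.
type Spec struct {
	BatchSize      int           `json:"batch_size,omitempty"`       // max rows per batch (assumed)
	BatchSizeBytes int           `json:"batch_size_bytes,omitempty"` // max bytes per batch (assumed)
	BatchTimeout   time.Duration `json:"batch_timeout,omitempty"`    // flush interval (assumed)
}

func main() {
	// Hypothetical defaults for illustration; check the linked spec.go for real values.
	s := Spec{BatchSize: 10000, BatchSizeBytes: 5 << 20, BatchTimeout: 20 * time.Second}
	fmt.Printf("%+v\n", s)
}
```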

Please let me know if you have any further questions.
