Broken AWS link in CloudQuery guide affecting access to resources

The AWS link in this guide seems to be broken: CloudQuery Blog on Open Source CSPM.

Hi :wave:, it seems to work for me. What do you see broken on your end?

We are fixing the broken link to the AWS Compliance Transformation. If you notice any others, please let us know so we can fix them as well!

Yeah, sorry, it was late, so my “AWS link was broken” in an AWS guide wasn’t much help lol. It was this link:

https://hub.cloudquery.io/addons/transformation/cloudquery/aws-compliance-premium/

but it is now working.
I’ve been following a bunch of these guides, and while this was the first one with something actually broken, a few of them get you 90% of the way there and then, if you aren’t a seasoned developer, drop you flat with no known next steps. I assume this is on purpose for some reason, but if not and you want more feedback on the guides, let me know. If it is intentional, I’ll pass over it.
The main example is the Go Source Plugin guide.

I don’t think it’s intentional. Part of the challenge is that we also depend on external tools such as dbt and other transformation tools, which adds to the learning curve. But yes, definitely, please share where the rough edges in the documentation are!

I understand the external dependency issues with guides, so I’ve ignored those; these are strictly CloudQuery issues. For example, the Go scaffold that’s currently out seems to be ahead of the documentation for the Go Source Plugin guide. I’m weak with Go and trying to learn it, and since it’s the only supported source/destination option right now, I went with it. But once you get past the version differences and figure those out, the end of the guide assumes you know enough that the client.Client section needs no explanation, and that’s where I got stuck.
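In case it helps anyone else who gets stuck there, this is roughly the shape of the client.Client piece as I understand it. Treat it as a sketch, not the official scaffold output: the exact interfaces vary by plugin-sdk version, and the Spec fields here are hypothetical.

// client/client.go - a sketch of the client.Client piece the guide glosses over.
// Exact interfaces vary by plugin-sdk version; names below are illustrative.
package client

import "github.com/rs/zerolog"

// Spec is the plugin-specific part of the source config (hypothetical field).
type Spec struct {
    Endpoint string `json:"endpoint"`
}

// Client carries whatever state your table resolvers need at sync time:
// a logger, the parsed spec, and any API client for the service you pull from.
type Client struct {
    Logger zerolog.Logger
    Spec   Spec
}

// ID labels log lines for this client; recent plugin-sdk versions expect it
// via the schema.ClientMeta interface.
func (c *Client) ID() string {
    return "my-source"
}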

FYI, this tool has me hooked thinking of all the ways it can be used and what I could possibly automate, so I have been using it every day for a couple of weeks now. Mainly, I’ve been trying to use Kestra and CloudQuery together to automate and update our internal documentation, so I’ll document what works and what doesn’t as I go.

Watching it create 1,100+ tables in a DB off an Azure & AWS sync was kind of fun, lol. That said, I have found performance limitations with Postgres on certain vendors. For example, Supabase can only handle about 7 entries per second, Azure Database for PostgreSQL Flexible Server hits about 20-30/s, but an on-prem Postgres setup hits 200+ entries per second, so there’s a drastic difference in export time.
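For anyone who wants to reproduce the comparison, here’s a rough standalone check of single-row insert throughput. It’s a sketch, not what CloudQuery does internally; the DSN is a placeholder, and it assumes a scratch table created with CREATE TABLE latency_test (v int).

// latencycheck.go - rough single-row insert throughput check against Postgres.
// Run with the pgx dependency fetched: go get github.com/jackc/pgx/v5
package main

import (
    "context"
    "fmt"
    "time"

    "github.com/jackc/pgx/v5"
)

func main() {
    ctx := context.Background()
    // Placeholder DSN; point it at the Postgres target you want to test.
    conn, err := pgx.Connect(ctx, "postgres://user:pass@host:5432/db")
    if err != nil {
        panic(err)
    }
    defer conn.Close(ctx)

    // One row per round trip, a worst case for a remote database:
    // throughput is then bounded by network latency, not server speed.
    const n = 500
    start := time.Now()
    for i := 0; i < n; i++ {
        if _, err := conn.Exec(ctx, "INSERT INTO latency_test (v) VALUES ($1)", i); err != nil {
            panic(err)
        }
    }
    elapsed := time.Since(start)
    fmt.Printf("%d inserts in %s (%.0f rows/s)\n", n, elapsed, float64(n)/elapsed.Seconds())
}

At a 30 ms round trip that loop caps out around 33 rows/s no matter how fast the server is, which is in the same ballpark as the managed-Postgres numbers above.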

Would this mainly be due to network latency, do you think? I’m wondering whether the difference would shrink if I tried from an Azure VM, or if it’s a limitation of their services.

Can you give a bit more detail on what you mean by performance limitations? What are those performance hits, and in which scenarios?

Yes, so when you’re doing a sync, the counter labeled resources/hr actually appears to show resources per second. I needed a quick screenshot, so that’s the number I grabbed; I assume it counts entries written into the DB, and that’s what hits limits on the different platforms. Does that make more sense?

I’m not sure it’s the DB; it could be memory or concurrency options, depending on your machine and on how many accounts or how big an environment you have. You can find more information about this here.
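As one example, the source spec has a top-level concurrency option you can tune. A minimal sketch, with illustrative version and values:

kind: source
spec:
  name: azure
  path: cloudquery/azure
  version: "v9.0.0"  # illustrative; pin to your actual version
  tables: ["*"]
  destinations: ["postgresql"]
  # Lower this on memory-constrained machines or when the destination
  # can't keep up; higher values sync faster but use more memory.
  concurrency: 10000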

If possible, I suggest doing a call next week (anything starting Tuesday should work). You can schedule it here. This will help us understand the requirements better, and we can loop in our sales engineer for support so you can get CloudQuery up and running faster.

The tests were all done from the same laptop, with the only difference being the location of the Postgres DB. But a meeting wouldn’t hurt. Please send an invite to adam.witt@iaawg.com for any time Tuesday.

It can be network latency. In production, it is usually better to have CloudQuery and the database be close to each other.
Sounds great! Which timezone are you in?

So, lmao, once you get your setup tuned and cranking, MS slaps you and says stop:

RESPONSE 429: 429 Too Many Requests
ERROR CODE: TooManyRequests
--------------------------------------------------------------------------------
{
  "error": {
    "code": "TooManyRequests",
    "message": "The request is being throttled as the limit has been reached for operation type - Read_ObservationWindow_00:05:00. For more information, see - https://aka.ms/srpthrottlinglimits"
  }
}

That’s an hour…


Needless to say, if you use * for the tables at the tenant level and you own 15 subscriptions, you get rekt trying to do it all at once lol.
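For now, the workaround I’m trying is to scope each run down instead of hitting the whole tenant at once: narrower table globs and a subset of subscriptions per run. The Azure plugin spec takes a subscriptions list, as far as I can tell; the version, globs, and IDs below are placeholders.

kind: source
spec:
  name: azure
  path: cloudquery/azure
  version: "v9.0.0"  # placeholder
  tables: ["azure_compute_*", "azure_storage_*"]  # instead of "*"
  destinations: ["postgresql"]
  spec:
    # A few subscriptions per run keeps each sync under the
    # Read_ObservationWindow throttling limits.
    subscriptions:
      - "00000000-0000-0000-0000-000000000000"
      - "11111111-1111-1111-1111-111111111111"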

I’ll also add @ben to the thread, but I think we can help with optimizing some of that. We’ll also weigh in at tomorrow’s meeting.