Hi! I’m loading CQ data in S3 and then trying to query the tables in Athena, but I’ve been running into schema issues with the tags column. Was there a fix for this issue: Discord Link?
In your config file, are you setting the athena property equal to true?
More information about that property can be found here: CloudQuery S3 Destination Overview
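For reference, a minimal sketch of where that property sits in an S3 destination config. The bucket, region, path, and version values below are placeholders rather than values from this thread, and the registry line is an assumption that depends on the CLI version in use:

```yaml
kind: destination
spec:
  name: "s3"
  path: "cloudquery/s3"
  registry: "cloudquery"   # assumption: older CLI versions used a different registry value
  version: "vX.Y.Z"        # placeholder: pin the plugin version you actually run
  write_mode: "append"
  spec:
    bucket: "my-cq-bucket"                   # placeholder bucket name
    region: "us-east-1"                      # placeholder region
    path: "cq/{{TABLE}}/{{UUID}}.parquet"    # placeholder object key template
    format: "parquet"
    athena: true   # asks the plugin to write JSON column keys in an Athena-compatible form
```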
Yes, I am. The specific error I am getting is:
HIVE_INVALID_METADATA: Glue table aws_ec2_instances column tags has invalid data type struct<>
Are you able to redact and then share the tags that are attached to the EC2 instances so we can investigate the issue further?
I don’t think I can share that information, but the tag keys contain other characters such as ., -, /, and :, and the values could have $ and {} characters. Could that be the issue?
Yes, that could be the issue. Would you mind opening up an issue here: GitHub - CloudQuery: New Issue and including some example tags that you are using so that we can try and reproduce the issue?
The tags can be made-up words, but just want to have as close to the tags that are causing issues as possible to be able to investigate.
If I set up PostgreSQL DB as the destination instead of S3, do you think I’ll still have this issue?
It shouldn’t have any problems.
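If you want to try that, a PostgreSQL destination config could look roughly like the sketch below. The version is a placeholder, the registry line is an assumption that depends on the CLI version, and PG_CONNECTION_STRING is a hypothetical environment variable name chosen for illustration:

```yaml
kind: destination
spec:
  name: "postgresql"
  path: "cloudquery/postgresql"
  registry: "cloudquery"   # assumption: depends on the CLI version in use
  version: "vX.Y.Z"        # placeholder: pin the plugin version you actually run
  spec:
    # Hypothetical environment variable; keeps credentials out of the config file.
    connection_string: "${PG_CONNECTION_STRING}"
```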
Submitted the issue: https://github.com/cloudquery/cloudquery/issues/14877
Hi @valued-snapper - Thank you for opening up that issue! We found the source of the bug and fixed it in v4.8.3 of the S3 destination plugin. In order for queries to work properly, you will have to delete all of the historical data that exists; otherwise, Athena will continue to have issues parsing the data.
Let us know if you run into any other issues!