Help with defining multiple data types for a CloudQuery source plugin column

I am building a source plugin using the Python SDK. One of the fields in the response from the REST API can have one of many different types: string, integer, JSON object, datetime. How can I tell CloudQuery to accept all of these types for this column?

I have already tried string, binary, and JSON but each generates errors.

Hi Dan,

I see in the Typeform plugin we have an example of using a JSONType column.

What errors are you seeing when you try to use JSON?

Here are a few examples of errors:

json.decoder.JSONDecodeError: Extra data: line 1 column 5 (char 4)
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

I think that error is coming from the SDK code where the data is being converted into a JSON scalar.
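To illustrate where those two messages can come from, here is a minimal sketch using only Python's standard json module. The helper name parse_error and the sample values are assumptions for illustration, not data from your API or code from the SDK.

```python
import json
from typing import Optional


def parse_error(text: str) -> Optional[str]:
    """Return the JSONDecodeError message for `text`, or None if it parses."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return str(exc)
    return None


# A bare date string: "2023" parses as a number, the rest is "extra data".
print(parse_error("2023-01-01"))  # Extra data: line 1 column 5 (char 4)

# A bare word is not a JSON value at all.
print(parse_error("hello"))  # Expecting value: line 1 column 1 (char 0)

# The same word, JSON-encoded first, parses cleanly.
print(parse_error(json.dumps("hello")))  # None
```

In other words, a raw string is only accepted if it is already a complete JSON document, which is why plain strings and dates fail while JSON objects pass.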

Assuming the Column for the field that can have many different types is set to JSONType, I think you can force the value into valid JSON by calling json.dumps(field) in the table resolver. For example, borrowing the code from the Typeform plugin, you would do something like this:

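import json
from typing import Any, Generator

# TableResolver and Client are imported the same way as in the Typeform plugin.

class FormsResolver(TableResolver):
    def __init__(self, table=None) -> None:
        super().__init__(table=table)

    def resolve(self, client: Client, parent_resource) -> Generator[Any, None, None]:
        for form in client.client.list_forms():
            # Encode the mixed-type value as a JSON string before it reaches the JSON scalar.
            form['jsoncolumn'] = json.dumps(form['jsoncolumn'])  # <--- new code
            yield form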

This can (and probably should be) fixed in the SDK, but it would be good to see if this approach works for you in this case.

Thanks! That is working. I needed to make a small change because some of the values are valid JSON and some are not. Here is what I used:

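try:
    # Values that already parse as JSON are passed through unchanged.
    json.loads(item['field_maybe_json'])
except (json.JSONDecodeError, TypeError):
    # Everything else (plain strings, numbers, dicts) is JSON-encoded.
    item['field_maybe_json'] = json.dumps(item['field_maybe_json'])
yield item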
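For anyone who wants to reuse this, the same idea can be pulled into a small standalone helper. This is only a sketch: the name to_json_text is hypothetical, and the default=str fallback for values such as datetime objects is an assumption that goes beyond the snippet above.

```python
import json
from typing import Any


def to_json_text(value: Any) -> str:
    """Return `value` as a string that json.loads() will accept."""
    if isinstance(value, str):
        try:
            # Already a valid JSON document: keep it as-is.
            json.loads(value)
        except json.JSONDecodeError:
            # Plain text (including date strings): encode as a JSON string.
            return json.dumps(value)
        return value
    # Non-strings (int, dict, list, ...) are encoded directly; default=str
    # is an assumed fallback for types json cannot serialize, e.g. datetime.
    return json.dumps(value, default=str)


print(to_json_text('{"a": 1}'))    # {"a": 1}
print(to_json_text("hello"))       # "hello"
print(to_json_text(42))            # 42
```

One caveat with this approach: a string such as "123" is itself valid JSON, so it is passed through and will be read back as the number 123 rather than as text.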