Error on second incremental sync with CloudQuery: invalid memory address or nil pointer

Hi @mariano, I’m getting the below error.

2024-07-04T08:53:52Z ERR failed JSON schema validation for spec error="jsonschema: '/table_options/aws_inspector2_findings' does not validate with https://github.com/cloudquery/cloudquery/plugins/source/aws/client/spec/spec#/$ref/properties/table_options/oneOf/0/$ref/properties/aws_inspector2_findings/oneOf/0/$ref/additionalProperties: additionalProperties 'list_findings' not allowed" invocation-id=xxxxxxxxxx module=aws-src

My configuration:

kind: source
spec:
  # Source spec section
  name: aws_xxxxxxxxx
  path: cloudquery/aws
  version: "v24.1.0"
  tables: ["aws_inspector2_findings"]
  destinations: ["postgresql"]
  backend_options:
    table_name: "cq_state_aws"
    connection: "@@plugins.postgresql.connection"
  spec:
    table_options:
      aws_inspector2_findings:
        list_findings:
          - filter_criteria:
              criterion:
                field: "severity"
                gt: 8

Can you let me know what is the correct filter_criteria I should use?

hi @funny-whale,

I tried with the following config and it seems to work, so I think this is what you’ll need:

kind: source
spec:
  # Source spec section
  name: aws
  path: cloudquery/aws
  version: "v27.5.0"
  tables: ["aws_inspector2_findings"]
  destinations: ["postgresql"]
  backend_options:
    table_name: "cq_state_aws"
    connection: "@@plugins.postgresql.connection"
  spec:
    table_options:
      aws_inspector2_findings:
        list_findings:
          - filter_criteria:
              severity:
                - value: "10"
                  comparison: "PREFIX"
                - value: "9"
                  comparison: "PREFIX"
                - value: "8"
                  comparison: "PREFIX"

As you can see, there are multiple comparisons because the AWS API is a bit limited in terms of what you can do (it doesn’t have a gt option).

That said, I couldn’t reproduce your error, so I would also recommend upgrading to the latest AWS plugin version (v27.5.0) and CLI version (5.24.0) to make sure you have all the relevant fixes.

@herman, thanks for the reply. Can you please let me know what this block does?

table_options:
  aws_inspector2_findings:
    list_findings:
      - filter_criteria:
          severity:
            - value: "10"
              comparison: "PREFIX"
            - value: "9"
              comparison: "PREFIX"
            - value: "8"
              comparison: "PREFIX"

If you follow the docs on filter_criteria (link), you’ll see that the strategy you used for filtering (i.e. using criterion and gt) doesn’t really exist.

This configuration is a similar attempt to what you tried to achieve, but using valid configuration according to the docs.

Note that there’s also a VendorSeverity property that is filtered in the same way as the Severity property. I’m not 100% sure which one is the correct one you’re looking for.

Hey @mariano, I’m unable to get data using the above config. I made sure the table was empty before running. I then ran the sync with the YAML file, and no data was recorded in the aws_inspector2_findings table. I also don’t see a record for this run in the cq_state_aws table. Can anyone help me with this?

Hi @funny-whale, we’ll be happy to help. Can you share the config that you used for the last run? Are there findings of level 8 or higher that you can see in the AWS console?

yeah we can see them.
Config file

kind: source
spec:
  # Source spec section
  name: aws
  path: cloudquery/aws
  version: "v27.5.0"
  tables: ["aws_inspector2_findings"]
  destinations: ["postgresql"]
  backend_options:
    table_name: "cq_state_aws"
    connection: "@@plugins.postgresql.connection"
  spec:
    table_options:
      aws_inspector2_findings:
        list_findings:
          - filter_criteria:
              severity:
                - value: "10"
                  comparison: "PREFIX"
                - value: "9"
                  comparison: "PREFIX"
                - value: "8"
                  comparison: "PREFIX"

great, thanks, let me dig into that and get back to you

@funny-whale If it’s a quick experiment (since there are no returned rows), could you try changing severity to vendor_severity? It could be that this is the field that is being set for those vulnerabilities. This would show up in the console where you see them.
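As a rough sketch of that experiment, the table_options block might look like the following. This assumes the key is spelled vendor_severity as suggested above and that it accepts the same value/comparison pairs as severity; the value shown is only a placeholder for whatever the console displays.

```yaml
table_options:
  aws_inspector2_findings:
    list_findings:
      - filter_criteria:
          # Assumption: vendor_severity takes the same list of
          # value/comparison entries as severity does.
          vendor_severity:
            - value: "8"          # placeholder value, adjust to what the console shows
              comparison: "PREFIX"
```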

I just realized that the config I shared earlier wasn’t quite right. It would have to be like this:

kind: source
spec:
  # Source spec section
  name: aws
  path: cloudquery/aws
  version: "v27.5.0"
  tables: ["aws_inspector2_findings"]
  destinations: ["postgresql"]
  backend_options:
    table_name: "cq_state_aws"
    connection: "@@plugins.postgresql.connection"
  spec:
    table_options:
      aws_inspector2_findings:
        list_findings:
          - filter_criteria:
              severity:
                - value: "10"
                  comparison: "PREFIX"
          - filter_criteria:
              severity:
                - value: "9"
                  comparison: "PREFIX"
          - filter_criteria:
              severity:
                - value: "8"
                  comparison: "PREFIX"

That way the filter criteria are combined with OR, not AND. Could you try rerunning with this config?

Yeah sure I can try that

Hi @ben, the config you sent works fine if I specify only one filter, but it doesn’t work if I want both HIGH and CRITICAL findings. If I filter only HIGH findings, it works fine.

Can you let me know how I can apply an OR operator here?

source.yml

kind: source
spec:
  # Source spec section
  name: aws_xxxxx
  path: cloudquery/aws
  version: "v27.5.0"
  tables: ["aws_inspector2_findings"]
  destinations: ["postgresql"]
  backend_options:
    table_name: "cq_state_aws"
    connection: "@@plugins.postgresql.connection"
  spec:
    regions: ["us-east-1"]
    table_options:
      aws_inspector2_findings:
        list_findings:
          - filter_criteria:
              severity:
                - comparison: EQUALS
                  value: CRITICAL
                - comparison: EQUALS
                  value: HIGH

destination.yml

kind: destination
spec:
  name: postgresql
  path: "cloudquery/postgresql"
  version: "v8.2.4"
  write_mode: "overwrite-delete-stale"
  migrate_mode: "forced"
  spec:
    connection_string: "xxxxxxxxxxxxxxxxx"

What is the behavior you are seeing when you specify both?

Hey @ben, I’m using the below source.yml file. Now, I’m trying to sync only critical findings, and I can see that not all findings are being synced, even for one filter.

kind: source
spec:
  # Source spec section
  name: aws_xxxxxxxx
  path: cloudquery/aws
  version: "v27.5.0"
  tables: ["aws_inspector2_findings"]
  destinations: ["postgresql"]
  backend_options:
    table_name: "cq_state_aws"
    connection: "@@plugins.postgresql.connection"
  spec:
    regions: ["us-east-1"]
    table_options:
      aws_inspector2_findings:
        list_findings:
          - filter_criteria:
              severity:
                - comparison: EQUALS
                  value: CRITICAL

Only 4 critical findings are synced, but I can see that there are 122 findings in the console. It also seems to apply only the last filter given, as shown below:

If I give it like this,

severity:
  - comparison: EQUALS
    value: CRITICAL
  - comparison: EQUALS
    value: HIGH

I’m getting only HIGH findings synced.

If I specify as below,

severity:
  - comparison: EQUALS
    value: HIGH
  - comparison: EQUALS
    value: CRITICAL

I am getting only CRITICAL findings.

NOTE: Also, I can see that if I change the table_options, the old data is not being refreshed (deleted) from the table.

@funny-whale Thanks, we will take a look. Can you comment out the backend_options config section for this test and re-run a sync? Stale data will not be deleted when incremental syncing is being used, and also every sync will only fetch data since the last timestamp stored in the state table, so that might be why you’re only seeing a subset of the rows.
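For this test, the source.yml might look roughly like the sketch below. It is the config posted above with only the backend_options section commented out, so the sync should run as a regular, non-incremental sync; everything else is assumed unchanged.

```yaml
kind: source
spec:
  # Source spec section
  name: aws_xxxxxxxx
  path: cloudquery/aws
  version: "v27.5.0"
  tables: ["aws_inspector2_findings"]
  destinations: ["postgresql"]
  # backend_options commented out for this test, so no sync state is read or stored
  # backend_options:
  #   table_name: "cq_state_aws"
  #   connection: "@@plugins.postgresql.connection"
  spec:
    regions: ["us-east-1"]
    table_options:
      aws_inspector2_findings:
        list_findings:
          - filter_criteria:
              severity:
                - comparison: EQUALS
                  value: CRITICAL
```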

If you could rerun and let us know the results, that would be really useful. It sounds like there might also be a problem with the primary key uniqueness on our side; we’ll have to look into that.

Hi @herman,

Yeah, this is working by removing backend_options config.

Like,

severity:
  - comparison: EQUALS
    value: HIGH
  - comparison: EQUALS
    value: CRITICAL

This works if I remove backend_options config. But why is it not working with incremental sync?

Is this the only incremental table you are using? If it is, can you try truncating the contents of the cq_state_aws table and rerun the syncs?

Also, for tables that are using incremental sync, CloudQuery will not delete any of the old data because incremental syncs only grab the data that has been added since the last sync was completed.

Yeah, we want to sync only the aws_inspector2_findings table. But I’ve learned that aws_inspector2_findings is a special case and that it will refresh the DB when there is a change in table_options.

Incremental syncs use a hash of the table_options to determine if there was a previous sync. So if you change the table_options, it will sync all of the data the first time, and only in subsequent syncs will it query for just those records that have been added.

Hi @ben,

When I use table_options together with incremental sync for the aws_inspector2_findings table, with filters in table_options such as only HIGH severity findings, I get all the findings regardless of the filter.

source.yml

kind: source
spec:
  name: aws_xxxxxx
  path: cloudquery/aws
  version: "v27.7.0"
  tables: ["aws_inspector2_findings"]
  destinations: ["postgresql"]
  spec:
    regions: ["us-east-1"]
    table_options:
      aws_inspector2_findings:
        list_findings:
          - filter_criteria:
            severity:
              - comparison: EQUALS
                value: HIGH
  backend_options:
    table_name: "cq_state_aws"
    connection: "@@plugins.postgresql.connection"

destination.yml

kind: destination
spec:
  name: postgresql
  path: "cloudquery/postgresql"
  version: "v8.2.5"
  write_mode: "overwrite-delete-stale"
  migrate_mode: "forced"
  spec:
    connection_string: "xxxxxxxxxxxxxxxxxxx"

Could you help me understand why the filter isn’t working?

Hey :wave: I think your YAML might be incorrectly indented, and that’s why it’s being ignored :thinking:

I noticed this in your snippet

          - filter_criteria:
            severity:

If you look at previous examples of this configuration within this Discord thread, you can see severity should be indented like this:

          - filter_criteria:
              severity:
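In YAML, a key at the same indentation as filter_criteria is parsed as a sibling of filter_criteria in the same list item, which leaves filter_criteria itself empty. For reference, here is a sketch of your source.yml with only that indentation changed, assuming everything else stays as you posted it:

```yaml
kind: source
spec:
  name: aws_xxxxxx
  path: cloudquery/aws
  version: "v27.7.0"
  tables: ["aws_inspector2_findings"]
  destinations: ["postgresql"]
  spec:
    regions: ["us-east-1"]
    table_options:
      aws_inspector2_findings:
        list_findings:
          - filter_criteria:
              # severity is now nested under filter_criteria
              # instead of sitting next to it
              severity:
                - comparison: EQUALS
                  value: HIGH
  backend_options:
    table_name: "cq_state_aws"
    connection: "@@plugins.postgresql.connection"
```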