CloudQuery AWS Pricing Plugin Bug: Missing OnDemand Terms for ~1.7% of EC2 Products

# CloudQuery AWS Pricing Plugin Bug: Missing OnDemand Terms for ~1.7% of EC2 Products

## Summary

The CloudQuery `awspricing` plugin (v4.7.21) fails to import OnDemand pricing terms for approximately 26,392 EC2 products (1.7% of all EC2 products), despite these terms being present in AWS’s official Pricing API. This results in NULL prices in the `awspricing_service_terms` table for products that have valid pricing data.

## Environment

- **CloudQuery Version**: 6.30.0

- **Plugin**: `cloudquery/awspricing` v4.7.21

- **Destination**: DuckDB v6.2.3

- **Configuration**: Standard sync with `concurrency: 100`

## Reproduction Steps

1. Run CloudQuery sync with the awspricing plugin:

kind: source

spec:

  name: “awspricing”

  path: “cloudquery/awspricing”

  version: “v4.7.21”

  tables: [“*”]

  destinations: [“duckdb”]

  spec:

    concurrency: 100

2. Query for products missing OnDemand terms:

SELECT

COUNT(*) as products_with_terms,

(SELECT COUNT(*) FROM awspricing_service_products

WHERE json_extract_string(attributes, ‘servicecode’) = ‘AmazonEC2’) - COUNT(*) as products_without_terms

FROM awspricing_service_products p

JOIN awspricing_service_terms t ON p.sku = t.sku AND t.type = ‘OnDemand’

WHERE json_extract_string(p.attributes, ‘servicecode’) = ‘AmazonEC2’;

**Result**: 1,571,963 products with terms, **26,392 products without terms**

## Specific Examples: NAT Gateway Pricing

Three NAT Gateway SKUs are affected:

| SKU | Region | Usage Type | Product Exists | Terms Exist in AWS API | Terms in CloudQuery DB |

|-----|--------|------------|----------------|----------------------|----------------------|

| `KFNSSCHWD43WZK2R` | eu-west-1 | EU-NatGateway-Hours | :white_check_mark: Yes | :white_check_mark: Yes ($0.048/hr) | :cross_mark: **NULL** |

| `AVQ7WERG9GYTK6BD` | eu-west-1 | EU-NatGateway-Prvd-Gbps | :white_check_mark: Yes | :white_check_mark: Yes ($1.164/Gbps-hr) | :cross_mark: **NULL** |

| `CV5ZUBAHD2XVMVJN` | il-central-1 | ILC1-NatGateway-Hours | :white_check_mark: Yes | :white_check_mark: Yes ($0.0504/hr) | :cross_mark: **NULL** |

### Verification in AWS Pricing API

I verified these SKUs exist with valid OnDemand terms in the raw AWS Pricing API response at:```

https://pricing.us-east-1.amazonaws.com/offers/v1.0/aws/AmazonEC2/current/index.json

Example from the JSON (SKU: `CV5ZUBAHD2XVMVJN`):

{

“terms”: {

“OnDemand”: {

“CV5ZUBAHD2XVMVJN”: {

“CV5ZUBAHD2XVMVJN.JRTCKXETXF”: {

“offerTermCode”: “JRTCKXETXF”,

“sku”: “CV5ZUBAHD2XVMVJN”,

“effectiveDate”: “2025-10-01T00:00:00Z”,

“priceDimensions”: {

“CV5ZUBAHD2XVMVJN.JRTCKXETXF.6YS6EN2CT7”: {

“rateCode”: “CV5ZUBAHD2XVMVJN.JRTCKXETXF.6YS6EN2CT7”,

“description”: “$0.0504 per NAT gateway Hour”,

“unit”: “Hrs”,

“pricePerUnit”: {

“USD”: “0.0504000000”

          }

        }

      }

    }

  }

}

}

}

## Affected Product Families (EC2 only)

Compute Instance:               22,573 products missing terms

Compute Instance (bare metal):   2,277 products missing terms

Dedicated Host:                  1,514 products missing terms

System Operation:                    8 products missing terms

Storage:                             4 products missing terms

CPU Credits:                         4 products missing terms

IP Address:                          3 products missing terms

NAT Gateway:                         3 products missing terms

Fee:                                 2 products missing terms

## Expected Behavior

All products in `awspricing_service_products` that have OnDemand pricing in AWS’s API should have corresponding entries in `awspricing_service_terms` with `type = ‘OnDemand’`.

## Actual Behavior

26,392 EC2 products (1.7%) have records in `awspricing_service_products` but no corresponding OnDemand terms in `awspricing_service_terms`, despite these terms existing in the AWS Pricing API.

## Impact

Users querying pricing data will get incomplete results and NULL values for affected products, leading to:

- Incorrect cost calculations

- Missing pricing data in reports

- Failed joins between products and terms tables

## Additional Notes

- The issue appears to affect specific products across multiple product families

- Products DO exist in the `awspricing_service_products` table with correct metadata

- The OnDemand terms data IS available in AWS’s API

- The plugin successfully imports 98.3% of OnDemand terms, suggesting a parsing edge case

Would appreciate any insights into what might cause certain OnDemand terms to be skipped during import. Happy to provide additional debugging information if needed.

2 Likes

Hi Corey and welcome :waving_hand: We’ll try and reproduce this on our end and let you know if we need additional information. Appreciate the detailed report, super helpful

Hi @quinnypig , a quick update we’re still looking at this (got a bit held back because of AWS being down yesterday and not being able to sync the pricing data).

What I can recommend for now is using write_mode: append to ensure data is not de-duplicated at the destination, as that might be the cause (haven’t confirmed that’s the root cause yet).

I would also point out that to correctly count products without terms you would need to account for the Reserved term type in the query, so it should look similar to the query below:

SELECT COUNT(*)
FROM awspricing_service_products p
WHERE json_extract_string(p.attributes, '$.servicecode') = 'AmazonEC2'
  AND NOT EXISTS (
        SELECT 1
        FROM awspricing_service_terms t
        WHERE t.sku = p.sku
          AND t.type = 'OnDemand'
      )
  AND NOT EXISTS (
        SELECT 1
        FROM awspricing_service_terms t
        WHERE t.sku = p.sku
          AND t.type = 'Reserved'
      );

With write_mode: append the full destination configuration should be similar to the one below

kind: source

spec:

  name: “awspricing”

  path: “cloudquery/awspricing”

  version: “v4.7.21”

  tables: [“*”]

  destinations: [“duckdb”]

  write_mode: "append"

  spec:

    concurrency: 100
1 Like

Hi @quinnypig, we believe we found the root cause of the issue.
It’s a bug in our DuckDB destination not handling conflicts correctly.
A fix for this issue is available in the latest version of the plugin v6.2.4.

An example fixed config should look like the one below

kind: destination
spec:
  name: duckdb
  path: cloudquery/duckdb
  registry: cloudquery
  version: "v6.2.4"
  send_sync_summary: true
  spec:
    connection_string: ./database.db

Please let me know if you have further issues with missing data

1 Like