Adding cluster name metadata to k8s source data for GCS and BigQuery queries

I’m testing out the k8s source. I’ve run it against ~4 of our clusters (we have about 25 in total). Is there a way to add metadata like the cluster name to the synced data? I send it to GCS and load it into BigQuery with transfer jobs, but I can’t distinguish the data from different clusters in my queries.

My config is:

---
kind: source
spec:
  name: k8s
  path: cloudquery/k8s
  registry: cloudquery
  version: "v5.2.6"
  tables: ["*"]
  destinations: ["gcs_k8s"]
---
kind: destination
spec:
  name: "gcs_k8s"
  path: "cloudquery/gcs"
  registry: "cloudquery"
  version: "v3.4.12"
  spec:
    bucket: "cloudquery"
    path: "k8s/${cluster_shortname}"
    format: "parquet"
    no_rotate: true

Hey :wave:.
One thing you could do is set the source’s `name` to the cluster name; it will then be available in a column called _cq_source_name.

Got a snippet I can steal?
I see what you mean, just not sure how to change it to your suggestion.

---
kind: source
spec:
  name: your_cluster_name  # this value ends up in the _cq_source_name column
  path: cloudquery/k8s
  registry: cloudquery
  version: "v5.2.6"
  tables: ["*"]
  destinations: ["gcs_k8s"]
---
kind: destination
spec:
  name: "gcs_k8s"
  path: "cloudquery/gcs"
  registry: "cloudquery"
  version: "v3.4.12"
  spec:
    bucket: "cloudquery"
    path: "k8s/${cluster_shortname}"
    format: "parquet"
    no_rotate: true
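
For rolling this out to all 25 clusters, one possible variation (a sketch, not tested against this setup) is to reuse the `${cluster_shortname}` variable that the destination path already references. This assumes `cluster_shortname` is an environment variable exported before each sync, and that it is expanded in the source block the same way as in the destination block:

```yaml
kind: source
spec:
  # Assumption: cluster_shortname is set in the environment for each run,
  # as it already appears to be for the destination path.
  name: "${cluster_shortname}"
  path: cloudquery/k8s
  registry: cloudquery
  version: "v5.2.6"
  tables: ["*"]
  destinations: ["gcs_k8s"]
```

With this, a single config file could serve every cluster, and each run should write rows whose _cq_source_name matches the cluster it was run against, so BigQuery queries can filter or group on that column.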

ah. that’s easy!
one sec
nvm lol

Thanks! That was super quick. Now that this is generally working, I’ll be able to write the design review next week and hopefully get it up and running for all 25 clusters. :tada:

Nice :partying_face: Keep us posted, curious to learn how it goes!