Help needed with building projects using the JS SDK

Has anyone built much with the JS SDK?

Hi :wave:

You can take a look at an existing plugin here.

If you share your code, we can have a look too.

If you have a dependent table, its resolver will be called for each parent item, so you can expect them to be a bit slow. The SDK itself doesn’t do any API calls; only the plugin code does.
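To make the cost of dependent tables concrete, here is a minimal sketch that counts resolver calls. The `TableDef` shape and the `syncTable` walker are hypothetical simplifications for illustration, not the SDK's real types or scheduler, and rows are collected in an array only to keep the sketch short.

```typescript
// Hypothetical, simplified shapes; the real SDK types differ.
type Row = Record<string, unknown>;
type Resolver = (parent: Row | null, emit: (row: Row) => void) => Promise<void>;

interface TableDef {
  name: string;
  resolver: Resolver;
  relations?: TableDef[];
}

// Walks a table tree: each child resolver runs once per row its parent emitted.
async function syncTable(
  table: TableDef,
  parent: Row | null,
  calls: Map<string, number>,
): Promise<void> {
  calls.set(table.name, (calls.get(table.name) ?? 0) + 1);
  const rows: Row[] = [];
  await table.resolver(parent, (row) => {
    rows.push(row);
  });
  for (const row of rows) {
    for (const child of table.relations ?? []) {
      await syncTable(child, row, calls);
    }
  }
}

const orgs: TableDef = {
  name: "organizations",
  resolver: async (_parent, emit) => {
    ["acme", "globex", "initech"].forEach((login) => emit({ login }));
  },
  relations: [
    {
      name: "organization_members",
      // Stands in for one API call per parent organization.
      resolver: async (parent, emit) => {
        emit({ org: parent?.login, member: "someone" });
      },
    },
  ],
};

async function countResolverCalls(): Promise<Map<string, number>> {
  const calls = new Map<string, number>();
  await syncTable(orgs, null, calls);
  return calls;
}
```

With three parent rows, the child resolver is invoked three times, so any API call inside it is multiplied by the number of parents.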

Hi @erez, thanks! Yes, I’m referencing that quite a bit.

Just trying to figure out ways to make it all more efficient. I’ve been holding the results in memory and returning them when the functions are called multiple times. Is that a good practice?

Where are you trying to source information from? You should not hold things in memory, but stream them.

Maybe a simpler example is in the MemDB plugin here (we use it internally in the SDK for testing).

So in the table resolver, you would usually have a REST API call, then stream the results.
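A rough sketch of that pattern, assuming a cursor-paginated REST API: each page is written to the stream as soon as it arrives instead of being accumulated first. `RowStream`, `Page`, `FetchPage`, and `fakeFetch` are hypothetical stand-ins, not SDK types.

```typescript
// Hypothetical stand-ins: `RowStream` mimics the `stream` argument a resolver
// receives, and `FetchPage` mimics one paginated REST call.
interface RowStream<T> {
  write(item: T): void;
}

interface Page<T> {
  items: T[];
  nextCursor: string | null;
}

type FetchPage<T> = (cursor: string | null) => Promise<Page<T>>;

// Writes every item to the stream page by page; only one page is held in
// memory at a time. Returns the number of items written.
async function streamAllPages<T>(
  fetchPage: FetchPage<T>,
  stream: RowStream<T>,
): Promise<number> {
  let cursor: string | null = null;
  let written = 0;
  do {
    const page: Page<T> = await fetchPage(cursor);
    for (const item of page.items) {
      stream.write(item);
      written += 1;
    }
    cursor = page.nextCursor;
  } while (cursor !== null);
  return written;
}

// A fake two-page API used only to exercise the sketch.
const fakeFetch: FetchPage<string> = async (cursor) =>
  cursor === null
    ? { items: ["a", "b"], nextCursor: "p2" }
    : { items: ["c"], nextCursor: null };
```

Inside a table resolver, the body would then be roughly `await streamAllPages(fetchPage, stream)`, with `fetchPage` wrapping the real API client.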

So…I have Table A which has a list, then Table B, C, D, E which all need to iterate through themselves based on that list.

In the Airtable plugin, those calls just happen downstream of each other: it fetches the bases first, then uses pMap over each base to call the downstream functions:

const bases = await getBases(apiKey, endpointUrl);
logger.info(`done discovering Airtable bases. Found ${bases.length} bases`);

const allTables = await pMap(
  bases,
  async ({ id: baseId, name: baseName }) => {
    logger.info(`discovering tables from Airtable base '(${baseId}) ${baseName}'`);
    const tables = await getBaseTables(apiKey, endpointUrl, baseId);
    logger.info(
      `done discovering tables from Airtable base '(${baseId}) ${baseName}'. Found ${tables.length} tables`,
    );
    return { baseId, baseName, tables };
  },
  {
    concurrency,
  },
);

That’s what I mean when I say in memory.

I think Airtable might not be the best example as it doesn’t have any relations. I list all tables dynamically, so there is no static list of tables and relations.

Maybe you can share some code on how the tables are defined? You can see a table definition with relations in this link.

Ah, interesting.
We’re building some custom stuff with GitHub—some things your plugin doesn’t handle—but mostly it’s a learning experience.

So @erez, I have a somewhat related question…

I’m noticing my data is only “streaming” to the destination after all of my tables have been processed. Am I doing something wrong?
Or is that how it is meant to work?
I would have thought it would push it table by table as it goes.

Destinations usually batch data writes to improve performance. You can modify the setting per destination; for example, see the PostgreSQL spec for the batch_size, batch_size_bytes, and batch_timeout options.

You can set batch_size: 1, and tables should get written immediately.
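To illustrate why rows can appear only at the end of a sync, here is a toy batching writer. It is not how any destination plugin is actually implemented; `BatchWriter` is a hypothetical class that flushes on row count alone, whereas real destinations also consider options such as batch_size_bytes and batch_timeout.

```typescript
// Toy model of a batching destination writer (hypothetical, count-based only).
class BatchWriter<T> {
  private buffer: T[] = [];
  private readonly batchSize: number;
  // Each entry is one batch that was "written" to the destination.
  readonly flushed: T[][] = [];

  constructor(batchSize: number) {
    this.batchSize = batchSize;
  }

  // Buffers a row and flushes once the buffer reaches the batch size.
  write(row: T): void {
    this.buffer.push(row);
    if (this.buffer.length >= this.batchSize) {
      this.flush();
    }
  }

  // Writes whatever is buffered, for example at the end of a sync.
  flush(): void {
    if (this.buffer.length === 0) {
      return;
    }
    this.flushed.push(this.buffer);
    this.buffer = [];
  }
}
```

With a large batch size and a small sync, nothing is flushed until the final `flush()`, which matches the behavior described above. With a batch size of 1, every `write` is flushed immediately, at the cost of many small writes.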

Ah, perfect!
Thanks!
I’m good with it; was just worried I was doing something wrong. :slightly_smiling_face:

If the data is getting to the destination, you’re doing great! :rocket:

Ha… thanks, one other question. Is it possible to access the syncOptions from within the NewClientFunction?

The table filter is only applied when it comes to actually syncing with the destination; all my code that fetches the data still fires, wasting bandwidth and API limits. :slightly_smiling_face:

So you should do the fetching inside each table resolver, not when initializing the plugin.

NewClientFunction usually only initializes the client that fetches the data (e.g. a REST API client), and checks that the authentication works.

In each table resolver, you would use that client instance to fetch that specific table data.
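As a sketch of that split, assuming a hypothetical `ApiClient` and table shape (none of these names come from the SDK): the client factory performs no data requests, and each resolver fetches only when its table is actually synced, so filtered-out tables cost nothing.

```typescript
// Hypothetical client and table shapes, for illustration only.
interface ApiClient {
  get(path: string): Promise<unknown[]>;
}

interface TableSketch {
  name: string;
  resolver: (client: ApiClient, write: (row: unknown) => void) => Promise<void>;
}

// Plays the role of the client initializer: it only builds the client.
// `requestLog` records every request so the effect is observable.
function newClient(requestLog: string[]): ApiClient {
  return {
    get: async (path) => {
      requestLog.push(path);
      return [{ path }];
    },
  };
}

const sketchTables: TableSketch[] = [
  {
    name: "organizations",
    resolver: async (client, write) => {
      (await client.get("/orgs")).forEach((row) => write(row));
    },
  },
  {
    name: "repositories",
    resolver: async (client, write) => {
      (await client.get("/repos")).forEach((row) => write(row));
    },
  },
];

// Runs only the resolvers of the selected tables.
async function syncSelected(selected: string[], requestLog: string[]): Promise<unknown[]> {
  const client = newClient(requestLog);
  const rows: unknown[] = [];
  for (const table of sketchTables) {
    if (!selected.includes(table.name)) {
      continue;
    }
    await table.resolver(client, (row) => {
      rows.push(row);
    });
  }
  return rows;
}
```

Syncing only `organizations` issues a single request to `/orgs`; the `/repos` request never fires because its resolver is never called.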

Please let me know if that makes sense.

Gotcha
Yes, it does
Or at least… I think it does.

Sorry to keep bugging you @erez. I’m trying to do as you suggested with a child table under the “relations” array.

But I can’t see how to access the parent row; it’s a parameter of the resolver, but it just comes back as null.

I assume I’m missing something silly.

For example:

const tables: Table[] = [
  createTable({
    name: `${config.tablePrefix}_organizations`,
    description: `Table for list of Github organizations`,
    columns: getOrganizationColumns(),
    resolver: async (clientMeta, parent, stream) => {
      logger.info('Fetching Data from Github for organizations List');
      const records = await getOrganizationList(logger, config, apiClient);
      records.forEach((record) => stream.write(record));
      // stream.end();
    },
    relations: [
      createTable({
        name: `${config.tablePrefix}_organizations_members`,
        description: `Table for list of Github Org Members`,
        columns: getOrgMembersColumns(),
        resolver: async (clientMeta, parent, stream) => {
          logger.info('Fetching Data from Github for organizations Members List');
          // `parent` is expected to be the organization row, but it is null here,
          // so an empty string is passed as a placeholder.
          //@ts-ignore
          const records = await getMembersList(logger, config, '', apiClient);
          records.forEach((record) => stream.write(record));
        },
      }),
    ],
  }),
];

In the second table, the parent variable is null. I’m not sure if I’m reading this correctly, but maybe it’s always null?

Link to code

@assuring-dogfish Let me take a look. This could be a bug.

Ok, thanks. I wondered if that might be the case.

Hi @assuring-dogfish, can you try with SDK version v0.1.14? It has a fix for this issue here.
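Once the parent is populated, a child resolver can derive its request from the parent row. The sketch below uses hypothetical types (`ParentRow` with an `item` field, `MemberSink`, and an injected `listMembers` function); how the real SDK exposes the parent's data should be checked against its type definitions.

```typescript
// Hypothetical shapes; check the SDK's own types for the real parent object.
interface Org {
  login: string;
}

interface Member {
  org: string;
  user: string;
}

interface ParentRow<T> {
  item: T;
}

interface MemberSink {
  write(row: Member): void;
}

// Child resolver sketch: fetches members for the one organization it was
// called with, and fails loudly if no parent was passed.
async function resolveMembers(
  parent: ParentRow<Org> | null,
  stream: MemberSink,
  listMembers: (org: string) => Promise<string[]>,
): Promise<void> {
  if (parent === null) {
    throw new Error("organization members need a parent organization row");
  }
  const org = parent.item.login;
  for (const user of await listMembers(org)) {
    stream.write({ org, user });
  }
}
```

Throwing on a null parent, rather than falling back to an empty string, makes a missing parent visible immediately instead of producing a request for the wrong organization.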

@erez That did it!!! Thanks for the quick turnaround. I guess there’s not a ton of adoption of the JS SDK, huh? Surprised to be the one to find that.

We’ve seen a few people build plugins with it successfully. They might not have had relations, though.

Well, anyway, thanks for the help.

Thank you for finding the bug! Please let me know if you have any other feedback.