Build a workflow to return all results for a cursor-based paginated API (Hubspot)

I need some assistance to see if we can build a workflow around returning all of the results from Hubspot's cursor paginated API.
The end goal is to hopefully return all of the data from the API and then load that data into the Retool built-in Dashboard.

This was useful, but not sure if it will work on workflows.
I also need some help with building the recursive steps for the Hubspot API data return.

Hey @Jushy_jag :slightly_smiling_face:

Building something in workflows is certainly possible!

Function blocks are super helpful because they provide the ability to make reusable requests that accept similar parameters, similar to how query.trigger({additionalScope: {}}) works. The syntax and UI for them are a bit different so I'd recommend reading the docs.

To give an example I'm going to use the public GraphQL API from anilist.co and we're just going to fetch a list of characters. The API accepts a page variable that we can use to mock the cursor.

This is what the function block looks like!

Note that it has one cursor parameter defined at the top which can be used in the query. The 1 there is just a default value for testing, it will actually get passed values from a JavaScript block:

Here we call paginatedApi as an async function, similar to how we might call query.trigger() except that instead of passing parameters as part of and additionalScope object they get passed directly to the function in the correct order.

Outside of that things function as they otherwise would! The API has a total of 5000 characters so the pageLimit shouldn't be reached but its there just for good measure :sweat_smile:

You can try importing the workflow and play around with it yourself, let me know if that helps or you have any questions!
getting all paginated results.json

Smart idea here. Looping over a static number that can be set in a variable works well.
One issue I am running into, which is probably specific to how the Hubspot API works, is that the cursor returned after every API is very large. From running some test API calls from my account, I have come to realize that the cursor could be as large as 3 million. Clearly I cannot loop through 3 millions times, that will take forever. Is there any other way around this? Maybe by calling it recursively and passing each cursor to the next API call.
Another issue is that when I loop through with the way you have shared, I tend to get a lot of duplicate data in the return. This issue is probably because the cursor does not increase linearly.

@Jushy_jag that's a great callout and something I could've clarified a bit more here. Typically with cursor-based pagination, the next cursor will be returned in the API response itself. Your code might look something more like this:

const pageLimit = 200;
let totalRecords = [];
let cursor;

for(let page = 1; page <= pageLimit; page++){
  const pageData = await paginatedApi(page);
  const newRecords = pageData.data.Page.characters;
  cursor = pageData.data.Page.pageInfo.nextCursor;
  
  totalRecords = totalRecords.concat(newRecords);
  
  if(!cursor) break;
}

return totalRecords;

Presumably, the call to your API would be different as well, will a more explicit reference to a cursor. Different APIs handle cursor-based pagination slightly differently so there's typically some additional customization that needs to be done. Good questions to ask are

  • How does my API expect and return cursors?
  • How can I tell if there's a next page?

And then build your answers into the JS logic.

I think your point about the cursor not being linear is likely spot on too. Cursors tend to be associated with the last record that has been fetched as opposed to a particular page, one of the nice things about them is that they can work independently of page size.

Do you expect to be processing 3 million rows? Do you maybe have an example response from your API that we could look at for the correct syntax?

Anyone have some example code for an API that uses Limit Offset? I'm trying to build a workflow that will call an API and return all results the API takes various parameters including fromDate, toDate, limit, and start.

More specifically I have an API that returns the following: {"data":{"success":true,"message":"Success","response":{"start":1,"limit":1000,"returned":1000,"total":2954,"data":[{}]}}}

So I need to loop through until data.response.returned = 0

Hey @kirkholmes!

Curious to know if something like this works for you:

const pageLimit = 200;
let totalRecords = []; 

for(let page = 1; page <= pageLimit; page++){ 
  const pageData = await paginatedApi(page);
  const newRecords = pageData.data;
  const totalCount = pageData.total;
  totalRecords = totalRecords.concat(newRecords);

  if(totalRecords.length => totalCount) break; 
} 

return totalRecords;