Can you clear unnecessary data to prevent timeouts in Workflows?

Hi there,

I have a workflow that needs to perform logic on Stripe data in aggregate. To get the data I need, I have to make auto-paginated calls (to fetch all our data) to, among others, the /customers, /invoices, and /products endpoints. Together they return a lot of data; the workflow runs very slowly and is prohibitively unresponsive when I try to modify it in the browser.

From the bulky data returned by each endpoint, I really only need a few fields per item, so I immediately process the data into a list of items with only the fields I need. This list is then about 200 items long, each with about 15 fields, which seems plenty small. However, running any components downstream is still cripplingly slow. Is there a way to avoid storing the large datasets initially received from the Stripe queries in memory (like a transform on the data that only stores the result) to make this less clunky? The filter component doesn't work because I need all the items, just a subset of the fields in each.
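For reference, the fetch-and-slim step I have in mind looks roughly like this as a standalone script (a sketch using Stripe's Node SDK auto-pagination; the field choices are just examples):

```typescript
import Stripe from "stripe";

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY ?? "");

// Example slim record; the real fields differ, these are placeholders.
type SlimCustomer = { id: string; email: string | null; created: number };

async function fetchSlimCustomers(): Promise<SlimCustomer[]> {
  const slim: SlimCustomer[] = [];
  // Stripe's async iterator auto-paginates lazily: each raw page can be
  // garbage-collected once the needed fields are copied out. In a Workflows
  // block, by contrast, the full response of each query block gets stored.
  for await (const customer of stripe.customers.list({ limit: 100 })) {
    slim.push({ id: customer.id, email: customer.email, created: customer.created });
  }
  return slim;
}
```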

Alternatively, maybe you could suggest a pattern where I can use pagination in Workflows to fetch smaller datasets up front. I don't mind doing that, but I need to be able to aggregate the results of the whole workflow at the end, and it seems like even if I were to fetch, say, all customers in batches of 10, I'd still end up with the data from all the customers in memory at the end of the loop (see the sketch below for the kind of cursor loop I mean).
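For illustration, here is that cursor loop using Stripe's real starting_after parameter; the running invoice total is a made-up example of an aggregate that could be folded in per batch so the raw items never accumulate. Whether my end-of-workflow aggregation can be computed incrementally like this is exactly the open question:

```typescript
import Stripe from "stripe";

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY ?? "");

// Fold each page into a running aggregate (here, total amount paid across
// all invoices) so raw items are dropped between iterations.
async function totalInvoiceRevenue(): Promise<number> {
  let total = 0;
  let startingAfter: string | undefined;
  let hasMore = true;
  while (hasMore) {
    const params: Stripe.InvoiceListParams = { limit: 100 };
    if (startingAfter) params.starting_after = startingAfter;
    const page = await stripe.invoices.list(params);
    for (const invoice of page.data) {
      total += invoice.amount_paid; // keep only the aggregate, not the items
    }
    hasMore = page.has_more;
    startingAfter = page.data[page.data.length - 1]?.id;
  }
  return total;
}
```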

Any help appreciated, thanks!

I'm not sure about clearing the data.

Send the data subset to a new workflow? Note that Retool workflow pricing recently changed to a runs-based model, and this pattern would increase your run count, so you might want to review the pricing first.

Or process the data in a script locally, then send just the slimmed-down subset to a Retool workflow webhook for further analysis?
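Roughly like this (a hedged sketch: the trigger URL and X-Workflow-Api-Key header are placeholders to copy from your workflow's webhook trigger settings, and the payload shape is invented):

```typescript
// Hypothetical local script: slim the Stripe export to just the needed
// fields, then POST the subset to the workflow's webhook trigger.
const slimCustomers = [
  { id: "cus_123", email: "a@example.com", totalSpend: 4200 }, // example shape
];

const response = await fetch(
  "https://api.retool.com/v1/workflows/<workflow-id>/startTrigger",
  {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "X-Workflow-Api-Key": process.env.RETOOL_WORKFLOW_KEY ?? "",
    },
    body: JSON.stringify({ customers: slimCustomers }),
  }
);
console.log(response.status); // a 2xx status means the run was accepted
```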