Workflow Memory Leak: Increased runtimes and memory usage

I'm running into an issue where my workflow's memory usage and its runtime increase with each scheduled run. Only the last JS code block and the loop block, where I'm sending emails and updating records, are increasing in runtime. All other blocks take only milliseconds.
[image: screenshot]

I've been tasked with sending approx. 1.7 million emails to our users. It's a general policy update that everyone legally needs to receive.

The first pattern I saw was a JS code block whose runtime increased with each run, adding a couple of seconds each time. This JS code block ran a loop that called the Resend batch email API endpoint, sending an email object with HTML markup, 50 email addresses, a subject, and a from field, and then called a workflow function to batch update records that had been inserted into a Postgres table. Here's the current code block:

const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
const timestamp = moment().format();

const results = [];
for (const [index, value] of create100EmailObjects.data.entries()) {
  // Trigger the email sending function and wait for it to complete
  const result = await sendEmailLoopBlock_lambda.trigger();

  // create objects to batch update the current chunk of emails going out
  const emailsForQuery = value.map((email) => ({
    email: email,
    status: "sent",
    sent_at: timestamp,
    error_message: "none",
  }));

  await updateUserRecords(emailsForQuery);

  results.push(result);

  await delay(150);
}

// Return all the results once the loop is done
return results;

This is all in a loop block currently, with the lambda being a REST API request to Resend's batch email endpoint. Initially this all happened in a single JS code block; I then broke the process out into separate blocks, not knowing whether that would remove the memory issue. That did not work either.
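
One observation about the loop above: each chunk is recorded as "sent" with error_message "none" regardless of what the trigger returned, and an error thrown by the trigger would end the loop early. Below is a minimal sketch of per-chunk error handling. It assumes a failed send surfaces as a rejected promise, which is not confirmed for the lambda trigger in this thread. The names sendBatches, sendBatch, and updateRecords are hypothetical; the two callbacks stand in for sendEmailLoopBlock_lambda.trigger and updateUserRecords so the sketch can run outside a workflow.

```javascript
// Hypothetical stand-ins: `sendBatch` and `updateRecords` are injected so this
// sketch runs anywhere. In the workflow they would wrap the lambda trigger and
// the batch-update function from the original post.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function sendBatches(batches, sendBatch, updateRecords, delayMs = 150) {
  // Keep only counters instead of accumulating every API response.
  const summary = { sent: 0, failed: 0 };

  for (const emails of batches) {
    let status = "sent";
    let errorMessage = "none";

    try {
      // Assumption: a failed request rejects; adapt if it resolves with an error field.
      await sendBatch(emails);
    } catch (err) {
      status = "failed";
      errorMessage = err instanceof Error ? err.message : String(err);
    }

    // Record the real outcome for this chunk, whether it succeeded or not.
    const sentAt = new Date().toISOString();
    await updateRecords(
      emails.map((email) => ({
        email,
        status,
        sent_at: sentAt,
        error_message: errorMessage,
      }))
    );

    summary[status] += emails.length;
    await delay(delayMs);
  }

  return summary;
}
```

With this shape, a failed chunk is written back as "failed" along with its error message, so a later run can pick it up again, and the block returns a small summary object.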

The first blocks in the workflow query the 1.7 million emails from a Redshift table, while another block runs simultaneously to query the PostgreSQL table that holds records of the emails that have already been sent. I filter out the users who have already received the email, and then, in another JS code block, I create a large array of email arrays from 4900 emails. Each array has 49 emails that go into the bcc field of an email object sent to the Resend batch request. I'm looping through those 100 objects because I want to handle errors that may happen during sending.
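
For readers following along, the filter-and-chunk step described above can be sketched roughly as below. This is not the code from the workflow; filterUnsent, chunk, and buildBatches are hypothetical names, addresses are compared by exact string match, and the 4900 and 49 defaults are taken from the description.

```javascript
// Return the addresses in `allEmails` that do not appear in `sentEmails`.
// A Set gives constant-time lookups, which matters with ~1.7 million rows.
function filterUnsent(allEmails, sentEmails) {
  const sent = new Set(sentEmails);
  return allEmails.filter((email) => !sent.has(email));
}

// Split `items` into consecutive arrays of at most `size` elements.
function chunk(items, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Take the first `limit` unsent addresses and group them for the bcc field.
function buildBatches(allEmails, sentEmails, limit = 4900, perBatch = 49) {
  const unsent = filterUnsent(allEmails, sentEmails).slice(0, limit);
  return chunk(unsent, perBatch);
}
```

With the defaults, a run that has at least 4900 unsent addresses yields 100 arrays of 49 addresses each, matching the 100 objects mentioned above.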

I've tested different numbers of emails per run: 49, 490, 980, 1960, and up to 4900. There is a slow increase in the memory used with each run, per the screenshot, no matter how I refactor the workflow itself.

Has anyone run into something similar? Am I just handling large amounts of data incorrectly?

Any input is appreciated! Thank you!

Hey @micriver - that's a lot of emails!

I spent some time digging into this and have some insights to share. First, the ~86MB usage that you see here is actually a rough measure of data transferred to and from your workflow and not of memory. This comes from a time past when our pricing model for workflows was based on ingress/egress and not on the number of runs. It seems natural that there would be some degree of variation between runs, so I don't believe there is any underlying issue there.

It's much harder to say what might be causing an observed increase in runtime for that particular block. In your testing, how many times have you let the workflow execute and how much has the runtime changed? Are you able to share that data?
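
One possible way to gather that data is to time each step inside the loop and return a summary per run. The sketch below is only an illustration: timed and summarize are hypothetical helpers, not part of the workflow runtime, and it uses Date.now() on the assumption that nothing more precise is needed.

```javascript
// Run an async step and record how long it took, even if it throws.
async function timed(label, fn, timings) {
  const start = Date.now();
  try {
    return await fn();
  } finally {
    timings.push({ label, ms: Date.now() - start });
  }
}

// Summarize the recorded durations for one label so runs can be compared.
function summarize(timings, label) {
  const ms = timings.filter((t) => t.label === label).map((t) => t.ms);
  if (ms.length === 0) {
    return { label, count: 0, totalMs: 0, meanMs: 0, maxMs: 0 };
  }
  const totalMs = ms.reduce((a, b) => a + b, 0);
  return {
    label,
    count: ms.length,
    totalMs,
    meanMs: totalMs / ms.length,
    maxMs: Math.max(...ms),
  };
}

// Inside the loop from the original post, usage might look like:
//   const timings = [];
//   await timed("send", () => sendEmailLoopBlock_lambda.trigger(), timings);
//   await timed("update", () => updateUserRecords(emailsForQuery), timings);
//   ...and after the loop:
//   return [summarize(timings, "send"), summarize(timings, "update")];
```

Comparing the "send" and "update" summaries across scheduled runs would show which of the two steps is getting slower.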