Workflow step stopped functioning properly

Hi,
I have a workflow that checks for new records in a Postgres DB and appends them to a Google Sheet. It then checks for updated records and updates them in the same Google Sheet if needed.

Over the last few weeks I noticed that the query that pulls new records stops working at some point: it first returns null and later returns zero results. I have to re-pull the data and re-activate the workflow, and this happens every 1 to 2 days or so.
The workflow runs every 5 minutes (as the Google Sheet needs to stay in sync), and I take the last successful run as the time marker to check against the updated_at and created_at times of the records.
The log doesn't show anything. In fact, when I click on the select resource query in the log, the "inputs" tab is empty, which is weird in itself.

Has anyone run into this issue?

Thanks,

Hi @taluk,

Thanks for reaching out! I don't know that I've seen this specific issue before :thinking:

Could you share a screenshot or export of the Workflow?

Do you have a JavaScript block that fetches workflowContext.lastSuccessfulRun, or is workflowContext.lastSuccessfulRun in the startTrigger? (Or is it referenced directly in the Retool Database block?)

Also, does this workflow have a webhook return block?

Hi @Tess ,

Thanks for responding.
Please find the workflow export attached:
Orders export to GS (1).json (41.9 KB)

I have a JS block that sets the fetch interval based on workflowContext.lastSuccessfulRun.
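
For readers following along, here is a minimal sketch of what such a block might compute. It is not the code from the attached export: the helper name `computeFetchWindow`, the fallback window, and the overlap are all hypothetical, and the type of workflowContext.lastSuccessfulRun (ISO string, epoch milliseconds, or null) is assumed rather than confirmed.

```javascript
// Hypothetical helper: builds the time window for the "new/updated records" query.
// `lastSuccessfulRun` stands in for workflowContext.lastSuccessfulRun; its exact
// type is an assumption, so null, undefined, and unparsable values are all handled.
function computeFetchWindow(lastSuccessfulRun, now = new Date(), opts = {}) {
  const { fallbackMinutes = 60, overlapSeconds = 30 } = opts;

  // new Date(null) would silently become 1970-01-01, so treat null explicitly.
  const parsed =
    lastSuccessfulRun == null ? NaN : new Date(lastSuccessfulRun).getTime();

  const startMs = Number.isNaN(parsed)
    ? now.getTime() - fallbackMinutes * 60 * 1000 // no usable marker: look back a fixed window
    : parsed - overlapSeconds * 1000; // small overlap to tolerate clock skew

  return {
    since: new Date(startMs).toISOString(),
    until: now.toISOString(),
  };
}
```

The overlap means a row can be fetched twice, so this only makes sense if the append step skips rows that are already in the sheet.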

I attached two screenshots from the failure we had today. In the first you can see that an if block complains about "Cannot read properties of null", but in the second you can see that the input looks OK.
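
As an illustration of the kind of guard that avoids this error, here is a small sketch that separates "the query returned nothing usable" from "the query returned zero rows" before a branch reads the result. The helper names are hypothetical, and the assumption that the query result arrives as an array of row objects may not match the actual block output.

```javascript
// Hypothetical helper: normalizes a query result before a branch block reads it.
// Assumes a healthy result is an array of row objects; anything else (null,
// undefined, an unexpected object) is reported as not ok instead of throwing.
function normalizeRows(queryResult) {
  if (!Array.isArray(queryResult)) {
    return { ok: false, rows: [] };
  }
  return { ok: true, rows: queryResult };
}

// Safe to call on a null result: never reads a property of null.
function hasNewRecords(queryResult) {
  return normalizeRows(queryResult).rows.length > 0;
}
```

Keeping the `ok` flag separate from the row count lets the workflow fail loudly on a null result rather than treating it as "nothing new".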

We did find that the error reproduces at times when the main table (the base for the view we're selecting from) is being bulk-inserted into from another screen.
But even locking wouldn't explain why this is the error we're getting, or why the workflow doesn't recover after the lock is released.


Hi @taluk,

Apologies for the delay, but I wanted to check in here!

I haven't gotten other reports of this exact issue, but we do have a project in our backlog that I am tracking to add more error information to global error handlers (for troubleshooting) :crossed_fingers:

I imported your JSON export into our internal instance to look for a fix, but the issue didn't reproduce (even when a bulk insert was running at the same time), which makes it tricky to find one.

I checked our recent logs and I see a couple of "Failure" logs for this workflow - are those attributed to this same issue or are they now unrelated? Recent logs of this issue still happening might help move this forward :thinking:

Thanks @Tess , appreciate the update.
We ended up using a DB function that returns a subset of the orders based on created_at, which, unlike the view in the original workflow, doesn't run into a lock. The issue hasn't happened since, and the remaining errors are valid timeouts that happen from time to time, mainly when there are too many updates to the Google Sheet.