Workflow Error but Workflow still "Pending"

So I have a workflow (A) that calls another workflow (B). This morning (about an hour ago) A failed at the step that called B. It failed after 2 minutes 10 seconds, and when A was re-run, B failed after 2 minutes 36 seconds.

So a couple of oddities here:

  1. The step in A should have timed out (& failed) after 10s - why the additional 2+ minutes?
  2. Both invocations of B are still "Pending", showing a duration of 0 ms in the logs, with no other info to help analyse what has gone wrong. There are no logs at all.

How do I move forwards from here please?!?

BTW, with yesterday being a Bank Holiday here in the UK, I would have expected this workflow to have nothing to actually process, and therefore to complete very quickly once all the checks confirmed that.

Hi @mawdo81, this is a bug that our team is looking into :confused:

They're actively working on a fix, and we're hoping to have it ship within a week or so. I'll reach back out when I have an update.


Hi @Tess - did this ship? I have a workflow sitting in "Pending" right now, and I'm not sure what can be done to force it to complete (erroring out is fine at this point...).

Hi @jg80

I just checked in, and it sounds like the fix has actually already shipped. When did this workflow get triggered? Could you share some screenshots? The fix only prevents this issue from happening going forward, so any workflows triggered before the fix will remain stuck in the pending state.

It was triggered last night - it threw back an "internal server error", but that was it. The logs don't show anything.

I'm actually refactoring and changing up the approach, so it's not any sort of critical show-stopper. It would just be good to know if I can do something to avoid the issue in the future (though perhaps the shipped fix will do that for me...).

@Tess what version has this shipped in? We're seeing this as well. We're self-hosted on version 3.52.1

Hi all,

I'm triggering a workflow in a Retool app via a webhook and also from another workflow, and I'm getting this 'Pending' error too. The behaviour I'm observing is that when it's triggered from the app I see a far lower success rate than when it's triggered from the other workflow, but it's still not perfect. I have also used the 'Import workflow' feature in the app and see different behaviour from that too; it seems to return 'internal error' a lot faster.
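
For reference, the app's call to the webhook looks roughly like this (URL, key, and payload here are placeholders, not my real values - they come from the workflow's webhook trigger settings):

```typescript
// Rough sketch of the call the app makes; the URL, header, and payload
// below are placeholders - copy yours from the workflow's trigger settings.
const WEBHOOK_URL =
  "https://api.retool.com/v1/workflows/<workflow-id>/startTrigger";
const API_KEY = "<workflow-api-key>";

async function triggerWorkflow(payload: Record<string, unknown>): Promise<void> {
  const res = await fetch(WEBHOOK_URL, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // Header name as shown in my trigger settings; your setup may pass
      // the key differently (e.g. as a query parameter).
      "X-Workflow-Api-Key": API_KEY,
    },
    body: JSON.stringify(payload),
  });
  // Log status and body so an "internal error" response gets captured
  // alongside the run that ends up "Pending".
  console.log(res.status, await res.text());
}
```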

Not sure if the fix Tess mentioned back in May has worked, but if someone could point me in the right direction as to how to debug this, that would be great. Thanks in advance!

Hi there,

If you both can send me the workflow run IDs where you're seeing the pending state, I can try to dig into it a bit further!

So after coming back to it after the weekend, all of the 'Pending' runs have turned into failed but don't show any logs. I'll provide you with a few different run IDs, which I'm copying from the URL when clicking on a failed run in the run history.

  1. 557e39ae-f865-4d8b-9543-d4d117536a3f
    This was triggered by another workflow and failed in between many, many successful runs of the workflow; there are about 7 failed runs out of over 100 runs that were triggered by a single run of the other workflow.

  2. dbdc67be-1091-4c59-82c7-6f99beef0bf2
    377bcfb8-505b-47c3-aabc-20a683f2a9ec
    85cc4f6c-c691-49e5-a082-26ddf9e437fd

These are the first, middle and last runs from when the workflow was triggered from an app via a webhook. They were all showing 'Pending' for several days. I have it triggered by a webhook because the workflow app integration, at the time of programming, was not sending webhook parameters correctly.

  3. 41f97cb3-89d8-4cb8-836c-be1316d8a210
    f082189d-ae41-4956-afe9-23e587ca8755

I have also set up the workflow to run via the 'Import workflow' integration in the app, and at the time of testing it returned 'internal error' straight away, in comparison to triggering it via the webhook. Above are the now-failed (previously pending for at least 8 hours) run IDs.

I hope this is clear, please let me know what you think and if you need any information from me at all, THANK YOU! :slight_smile:

Hi @Matthew_Nicholas, thanks! It's clear :slight_smile: I have shared this info with our team internally, and I will let you know if they have any further insights or follow-up questions.


Hi Tess, any news?

Hi @Matthew_Nicholas,

I looked through the logs & reached out to our team internally. Unfortunately, I don't have much of an update to share yet :disappointed: I also see in our logs that each of these runs has been marked as failed. Our team shared that the pending state is pretty rare and has most often been caused by a webhook workflow where the webhook return block errors out.

Are you using a return block in the SERP KW Search workflow? Are you triggering any APIs that have rate limiting? Triggering over 100 runs from a single run of the other workflow could potentially be causing rate limiting :thinking:
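
If rate limiting does turn out to be the culprit, one option is to throttle the fan-out in the parent workflow's code block. Here's a rough sketch only - the batch size, delay, and trigger callback are placeholders to adapt to your own setup:

```typescript
// Sketch only: throttle how fast the parent workflow fires child runs,
// so a single parent run doesn't hit downstream rate limits with 100+
// near-simultaneous calls.
const BATCH_SIZE = 10; // child runs fired per batch (placeholder)
const DELAY_MS = 2000; // pause between batches (placeholder)

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

async function fanOutWithThrottle(
  rows: unknown[],
  triggerChildRun: (row: unknown) => Promise<unknown>,
): Promise<void> {
  for (let i = 0; i < rows.length; i += BATCH_SIZE) {
    const batch = rows.slice(i, i + BATCH_SIZE);
    // Fire one batch in parallel, then wait before starting the next.
    await Promise.all(batch.map((row) => triggerChildRun(row)));
    await sleep(DELAY_MS);
  }
}
```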

If you can share an export of the workflow I can try to keep digging into specifics. Has the pending issue continued to persist?

Does the JSON input change when you trigger it from an app, webhook, or other workflow?

Also, hi @steve.troxell.fbg :slightly_smiling_face: :wave: I saw you were able to connect with our team internally for your specific instance :+1:

@Tess I'm having a similar problem in Retool Cloud. Workflow ID 5c9ec2f8-4e94-4cad-83b0-145697d36818. After adding a response block and triggering the workflow from an app twice, I now have two runs that have not left the "Pending" state for some time. I am also unable to cancel these runs from the Retool UI.

I can avoid this issue by deleting the response block, but I'm concerned that my organization might be billed for these indefinitely-pending workflow operations. Is there any way to clear these out on your end?

Hi @gkahen,

Thanks for chiming in. I see on your other thread that it looks like they terminated after around 12 hours. Darren and I chatted with our Workflows engineering team about this. Initially, they weren't able to find any known bugs related to this pending issue or the missing error logs, but if it's something we can reproduce or if it's happening more often, definitely let us know!