So I have a workflow (A) that calls another workflow (B). This morning (about an hour ago) A failed at the step that called B. It failed after 2:10 minutes and then when A was re-run, B failed after 2:36 minutes.
So a couple of oddities here:
The step in A should have timed out (& failed) after 10s - why the additional 2+ minutes?
Both invocations of B are still "Pending", showing a duration of 0ms in the logs and no other info to help analyse what has gone wrong. There are no logs at all.
BTW with yesterday being a Bank Holiday here in the UK, I would have expected this workflow to have nothing to actually process and therefore complete very quickly once all the checks came back as such.
Hi @Tess - did this ship? I have a workflow sitting in "Pending" right now, and not sure what can be done to force it to complete (error out is fine at this point...).
I just checked in and it sounds like the fix actually already shipped. When did this workflow get triggered? Could you share some screenshots? The fix will only resolve this issue from happening moving forward, so any workflows triggered before the fix will be stuck in the pending state.
I'm actually refactoring and changing up the approach, so not any sort of a critical show-stopper. Just would be good to know if I can do something to avoid the issue in the future (though perhaps the shipped fix will do that for me...).
I'm triggering a workflow from a Retool app via a webhook and also from another workflow, and I'm also getting this 'Pending' error. The behaviour I'm observing is that when it is triggered from the app I see a far lower success rate than when it is triggered from the other workflow, but even that is not perfect. I have also used the 'Import workflow' feature in the app and see different behaviour from that too: it seems to return 'internal error' a lot faster.
I'm not sure whether the fix Tess mentioned back in May has worked, but if someone could point me in the right direction as to how to debug this, that would be great. Thanks in advance!
So after coming back to it after the weekend, all of the 'Pending' runs have turned into failed runs, but they don't show any logs. I'll provide you with a few different run IDs, which I'm copying from the URL when clicking on the failed run in the run history.
557e39ae-f865-4d8b-9543-d4d117536a3f
This was triggered by another workflow and failed in between many, many successful runs of the workflow. There are about 7 failed runs in over 100 runs that were triggered by a single run of the other workflow.
These are the first, middle and last runs when the workflow was triggered from an app via a webhook. They were all showing pending for several days. I trigger it by webhook because, at the time I built this, the workflow app integration was not sending webhook parameters correctly.
I have also set up the workflow to be run via the 'import workflow' integration in the app, and at the time of testing it returned 'internal error' straight away, in contrast to triggering it via webhook. Above are the run IDs of the now failed runs (previously pending for at least 8 hours).
I hope this is clear, please let me know what you think and if you need any information from me at all, THANK YOU!
Hi @Matthew_Nicholas Thanks! It's clear. I have shared this info with our team internally, and I will let you know if they have any further insights or follow-up questions.
I looked through the logs & reached out to our team internally. Unfortunately, I don't have much of an update to share yet. I also see in our logs that each of these runs has been marked as failed. Our team shared that the pending state is pretty rare and, most of the time, has been caused by a webhook workflow whose webhook return block errors out.
Are you using a return block in the SERP KW Search workflow? Are you triggering any APIs that have rate limiting? Triggering over 100 runs from a single run of the other workflow could potentially be causing rate limiting.
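If rate limiting does turn out to be the cause, one way to test that theory is to space out the triggers and back off when the API pushes back. Below is a minimal Python sketch of that idea, not Retool's own mechanism. It assumes the rate-limited API signals throttling with HTTP 429 (yours may signal it differently), and `trigger_with_backoff`, `send`, and the delay values are hypothetical names and numbers chosen for illustration.

```python
import time
from typing import Any, Callable, Dict, Iterable, List, Tuple

RATE_LIMITED = 429  # assumption: the API answers "Too Many Requests" when throttling


def trigger_with_backoff(
    payloads: Iterable[Dict[str, Any]],
    send: Callable[[Dict[str, Any]], int],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    pause_between: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Tuple[Dict[str, Any], int]]:
    """Send payloads one at a time, retrying rate-limited calls with exponential backoff.

    `send` is a hypothetical callable that performs the actual trigger
    (for example an HTTP POST to the webhook URL) and returns a status code.
    """
    results = []
    for payload in payloads:
        status = None
        for attempt in range(max_attempts):
            status = send(payload)
            if status != RATE_LIMITED:
                break
            # Wait 0.5s, 1s, 2s, ... before retrying, but not after the last attempt.
            if attempt < max_attempts - 1:
                sleep(base_delay * (2 ** attempt))
        results.append((payload, status))
        # Small fixed pause so the triggers are not fired in a single burst.
        sleep(pause_between)
    return results
```

If the failed runs disappear once the triggers are spaced out like this, that would point towards rate limiting rather than the pending bug itself.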
If you can share an export of the workflow I can try to keep digging into specifics. Has the pending issue continued to persist?
Does the JSON input change when you trigger it from an app, webhook, or other workflow?
Also, hi @steve.troxell.fbg I saw you were able to connect with our team internally for your specific instance
@Tess I'm having a similar problem in Retool Cloud. Workflow ID 5c9ec2f8-4e94-4cad-83b0-145697d36818. After adding a response block and triggering the workflow from an app twice, I now have two runs that have not left the "Pending" state for some time. I am also unable to cancel these runs from the Retool UI.
I can avoid this issue by deleting the response block, but I'm concerned that my organization might be billed for these indefinitely-pending workflow operations. Is there any way to clear these out on your end?
Thanks for chiming in. I see on your other thread that it looks like they terminated after roughly 12 hours. Darren and I chatted with our Workflows engineering team about this. Initially, they weren't able to find any known bugs related to this pending issue or the missing error logs, but if it's something we can reproduce or if it's happening more often, definitely let us know!