Sudden Workflow Network Errors Out of Nowhere for REST APIs

Out of nowhere, our workflows are throwing network timeout errors for HTTP REST API POST requests. We know there is nothing wrong with our API endpoints: we have tested them thoroughly, and we test them again after every error. In fact, our logs don't show any attempt from Retool to make the POST request. So Retool reports "network timeout at " and throws an error, but it isn't actually making any request. When we run the workflow manually, it works fine and there are no network errors. The error only happens when the workflow is triggered via a webhook, and again, Retool doesn't make any request. It seems the request step just hangs in the workflow and does nothing. This started this morning, after we had already run these webhooks many times.

Hey @ddsgadget - thanks for reaching out! Just for some context, are you on a cloud or self-hosted instance? If the latter, which version?

We are on the cloud

Ok great - thanks for confirming. Can you share the id of an affected workflow?

Quick update here, @ddsgadget - looking at the logs, I see a handful of workflow queries that timed out when trying to POST to your token endpoint, the most recent of which was a little less than two hours ago. Does that align with what you experienced?

How are you authenticating that particular endpoint? Is it sitting behind an SSH bastion? You can try increasing the timeout on that particular query to see if it makes a difference. The other possibility is that the credentials for your endpoint have been rotated and need to be updated on the Retool side.
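For anyone testing the same idea outside Retool, here is a minimal sketch of a POST with an explicit timeout and a few retries with exponential backoff. It is not how Retool issues requests internally. The function name `post_with_retry`, its parameters, and the example URL are all hypothetical. Retrying a POST is only safe when the endpoint tolerates duplicate requests.

```python
import json
import socket
import time
import urllib.error
import urllib.request


def post_with_retry(url, payload, timeout_s=10.0, attempts=3, backoff_s=1.0,
                    opener=urllib.request.urlopen, sleep=time.sleep):
    """POST JSON with an explicit timeout, retrying only on network-level failures.

    `opener` and `sleep` are injectable so the retry logic can be exercised
    without touching the network. All names here are illustrative assumptions.
    """
    body = json.dumps(payload).encode("utf-8")
    last_error = None
    for attempt in range(1, attempts + 1):
        request = urllib.request.Request(
            url,
            data=body,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        try:
            with opener(request, timeout=timeout_s) as response:
                return response.status, response.read()
        except urllib.error.HTTPError:
            # The server answered with an error status; that is not a
            # connectivity problem, so do not retry it here.
            raise
        except (urllib.error.URLError, socket.timeout, TimeoutError) as exc:
            # Connection failure or timeout: wait, then try again.
            last_error = exc
            if attempt < attempts:
                sleep(backoff_s * 2 ** (attempt - 1))
    raise last_error


# Example call (hypothetical URL, not the endpoint from this thread):
# status, body = post_with_retry("https://example.invalid/token", {"grant": "demo"})
```

If a request succeeds on the second or third attempt, that points toward an intermittent network path rather than a problem with the endpoint itself.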

First of all, please delete our endpoint from this thread. This is private! Second, yes, that does align. Finally, this is not an issue with our endpoint or a timeout on our end. This is a Retool issue. There is absolutely nothing wrong with our endpoint. There is no reason for a timeout, because the actual API endpoint returns in milliseconds. In fact, it runs fine when triggered manually, and when Retool claims it failed, no request was actually sent to the endpoint. The endpoint also works fine on every platform other than Retool. This API runs on a serverless platform that hasn't gone down in years and never has any issues. What happens is that Retool is just not sending any request at all. I've seen this bug mentioned in other topics: Retool claims a timeout in a request step but doesn't send any request. Finally, it works now on Retool without any issues, and it had worked fine for weeks until this morning. So there is something up on your end.

We also use the exact same request in other workflows and never have any issues. Suddenly, out of nowhere, this particular workflow periodically hangs on that step and never sends out any request. As mentioned, it is working again now, without our changing a thing. But there is certainly some sort of bug here. I'm not even sure how to pinpoint the issue, because there aren't any logs. Retool doesn't send any requests.
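One generic way to narrow down a "request never arrived" problem is to tag every outbound request with a unique ID and log it on the sending side, so that a missing entry in the endpoint's logs can be matched to a specific attempt. The sketch below is illustrative only. The helper name `build_traced_request` and the logger name are hypothetical, `X-Request-Id` is a common convention rather than a standard header, and it assumes the endpoint records that header.

```python
import json
import logging
import urllib.request
import uuid

logger = logging.getLogger("outbound")  # hypothetical logger name


def build_traced_request(url, payload, request_id=None):
    """Build a JSON POST request carrying a correlation ID header.

    Returns the prepared request and the ID, so the caller can log the ID
    again when the response (or a timeout) comes back.
    """
    request_id = request_id or str(uuid.uuid4())
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the receiving service logs this header.
            "X-Request-Id": request_id,
        },
        method="POST",
    )
    logger.info("prepared POST to %s request_id=%s", url, request_id)
    return request, request_id
```

If the sender logs an ID that never shows up on the receiving side, the request was lost before reaching the endpoint. If no ID is logged at all, the step never got as far as building the request.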

Would you suggest moving this request to a separate workflow and calling it from the current workflow, with the response sent back? Would that help debug the exact issue? I'm thinking that might be a better approach anyway, because we use this same API endpoint in a lot of workflows.

Updated - thanks for calling that out! I'm glad to hear it's working now and will continue to diagnose the root cause with our team here. :thinking:

Your suggestion could definitely help to isolate the issue, if that is the one query that Retool seemingly decides not to run. As you mention, it would also reduce duplicate code.

Do you mind sharing where you are based? We have experienced intermittent networking infrastructure issues in certain parts of the world - specifically the UK - and I'd be curious to know if this could be a manifestation of that.

Don't hesitate to update this thread if you experience the same timeout errors in the future and we'll jump right on it!

I'm on the East Coast of the US. A network infrastructure issue certainly seems plausible and is probably the cause of this: the request must be getting dropped somewhere on your network, since it never reaches us even though Retool says it was sent. Also, I am going to move the request into a resource, as that seems easier than creating a new workflow. Is that the recommended approach, or is it better to create a separate workflow?

I'd say move the logic into a dedicated resource! :white_check_mark:

OK. Will create a resource. And for sure this is a network issue. That makes most sense from my perspective. Somewhere along the way the actual request is not sent out to the endpoint.

BTW, really amazing work with workflows. This is by far my favorite part of Retool now and solves so many annoying coding issues we had to deal with in the past.

We appreciate the feedback! It's been great to roll them out for customers and to see such a wide variety of use cases.