Running Method: same behavior whether I use node main.js directly or run it through pm2.
Network: To my knowledge, there's no proxy or firewall intentionally blocking outbound HTTP/S traffic.
Traceroute results:
I ran traceroutes to all the relevant IPs; they all stall partway through the route (most end with only * responses).
Retool connectivity: The script is still able to communicate with Retool normally outside of the RPC error, which makes me question if it's truly a connectivity issue.
Since it still works, I think you're right here.
That looks like an error from loopWithBackoff, which could make sense. If every iteration tries to execute some RPC function and fails, you get the error, but instead of bubbling up it's ignored; then it waits (the "with backoff" part) and tries again. If that's the case, it may not actually fail until loopWithBackoff hits its retry limit, which seems to rarely happen since it keeps trying and eventually goes through. I'd be curious to know whether the errors become less frequent if you increase the backoff amount (double or triple it, just so you can see a difference in frequency even if it's small).
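To make the theory concrete, here's a minimal sketch of what a retry loop like that might look like. This is not the actual retoolrpc source: the function name `loopWithBackoffSketch`, the doubling factor, the delay cap, and the retry limit are all assumptions for illustration.

```javascript
// Hypothetical sketch of a retry loop with backoff; NOT the real retoolrpc code.
async function loopWithBackoffSketch(task, options = {}) {
  const {
    initialDelayMs = 50, // stands in for CONNECTION_ERROR_INITIAL_TIMEOUT_MS
    maxDelayMs = 5000,   // assumed cap on the delay
    maxAttempts = 10,    // assumed retry limit
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
    log = console.error,
  } = options;

  let delayMs = initialDelayMs;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task(attempt);
    } catch (err) {
      // The failure is logged but not rethrown, so the caller never sees it.
      log(`attempt ${attempt} failed: ${err.message}`);
      if (attempt === maxAttempts) break;
      await sleep(delayMs);
      // Assumed policy: double the delay after each failure, up to the cap.
      delayMs = Math.min(delayMs * 2, maxDelayMs);
    }
  }
  // Only reached once every attempt has failed.
  throw new Error(`gave up after ${maxAttempts} attempts`);
}
```

With a loop shaped like this, a burst of 503s would show up as logged errors while the call as a whole still succeeds, which would match what you're seeing.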
You're spot on. I think your theory about loopWithBackoff is exactly what's happening here. I bumped CONNECTION_ERROR_INITIAL_TIMEOUT_MS from 50 to 150 just to test the waters, and I'm definitely seeing a noticeable reduction in how often the error shows up. It's still happening, but not nearly as frequently, which lines up with the idea that it's retrying and eventually succeeding behind the scenes.
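For a rough sense of why a larger initial timeout means fewer logged errors, here's a small back-of-the-envelope sketch. It assumes the delay doubles after each failure up to a cap and that each attempt itself takes no time; the helper name `retriesWithinWindow` and the 1000 ms cap are made up for illustration, not taken from the library.

```javascript
// Hypothetical helper: counts how many retries fit inside an outage of
// `windowMs`, assuming the delay doubles after each failure up to a cap.
function retriesWithinWindow(initialDelayMs, windowMs, maxDelayMs = 1000) {
  let elapsedMs = 0;
  let delayMs = initialDelayMs;
  let retries = 0;
  while (elapsedMs + delayMs <= windowMs) {
    elapsedMs += delayMs; // wait out the current delay
    retries += 1;         // then retry (and, during the outage, log an error)
    delayMs = Math.min(delayMs * 2, maxDelayMs);
  }
  return retries;
}

console.log(retriesWithinWindow(50, 1000));  // 4
console.log(retriesWithinWindow(150, 1000)); // 2
```

Under those assumptions, a one-second outage produces about half as many failed attempts with a 150 ms initial timeout as with 50 ms.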
Looks like you can ignore the error then, but let's see if we can think of a way to get the RetoolRPC library to ignore 503 errors. I'll have to look into it a bit, but I think ideally we would get a reference to the underlying web request library and add an interceptor to ignore 503s.
Hi @bobthebear thank you so much for looking into this!!!
When you said that you would like to "get a reference to the underlying web request library", is that something on the Retool side that I could help find? Or is it specific to the deployment setup?
both should allow us to customize how we respond to certain status codes since not all APIs implement those codes correctly, or even use the right code to begin with
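As a rough idea of what customizing the response to a status code could look like, here's a sketch that wraps a fetch-style function and treats 503 as "nothing to do right now" rather than an error. Whether RetoolRPC exposes its request client this way is something I haven't confirmed; `withIgnoredStatuses` and the shape of its return value are hypothetical.

```javascript
// Hypothetical wrapper; `fetchLike` is any function with the fetch(url, init) shape.
// This does not use or modify any real RetoolRPC API.
function withIgnoredStatuses(fetchLike, ignoredStatuses = [503]) {
  return async function wrappedFetch(url, init) {
    const response = await fetchLike(url, init);
    if (ignoredStatuses.includes(response.status)) {
      // Swallow the status quietly instead of treating it as a failure.
      return { ignored: true, status: response.status, response };
    }
    if (!response.ok) {
      // Any other non-2xx status is still surfaced as an error.
      throw new Error(`request failed with status ${response.status}`);
    }
    return { ignored: false, status: response.status, response };
  };
}
```

The caller can then check `ignored` and skip logging for those responses, while other failures still throw as before.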