Intermittent / Irregular 120 Second Query Timeouts

Hi, we use Retool Cloud to host a data entry production app connected to Azure SQL Database.

Normally, app speed & performance is satisfactory; however, every so often during the day, a query that normally takes just a couple of seconds to run will suddenly time out at the 120 second default. Then, just as quickly, the query runtime will revert 'back to normal.'

In the example here, you can see that I am clicking a button to move from 1 company profile to the next, and each time I click, the 'StatusRefresh' query executes to retrieve data for that next profile.

I checked our Azure Database Performance logs and everything looks completely normal at the moment this timeout occurred, so it would appear to be something on the Retool side that triggered the timeout at that specific moment (12:48PM EST 4-15-24).

The issue has been occurring for quite a while, although it has become less frequent as you all have made DB performance improvements; still, the unpredictability of when it may 'all of a sudden' occur remains a source of frustration.

I would be happy to provide any further information you need via direct message.

We recently provided a JSON export of our app to your team for another issue we reported (See ticket # 75994204217675 to obtain the JSON export if needed).

Thank you!

I've seen similar issues with our RDS SQL Server connection.
Some queries will randomly hit the 120 second timeout when they usually take only a couple seconds. It's like you said though, this has become less of an issue as more updates have rolled out.

We've gotten around it in some of our more critical apps by having an event handler on failure call a JS Query that re-triggers the first query.

Hello @bwdsl and @matthewej!

Thank you both for bringing up this issue. Other users have noticed it as well, and our team has been working on reducing the frequency of these slow, failing queries.

It's most likely not an issue on your end if the Azure DB performance logs aren't showing any patterns, but I will clone your app to see if I can reproduce the issue and triage further. I am guessing it is a Retool bug, as we have been making progress on reducing the frequency of these events.

As a workaround in the meantime, inside a query's Advanced settings, under the Timeout section, you can use 'Timeout after' to set a time limit for the query, so that it fails if it hasn't succeeded within the designated amount of time.

Combined with the workaround @matthewej mentioned, this is the best way to hedge your bets against these unfortunate random slow queries while we work on fixing this.

You can use an event handler on failure to call a JS query that re-triggers the first query, with 'Timeout after' set to a shorter time limit so you aren't waiting for the max 120 seconds to hit :sweat_smile:
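To make that concrete, here is a rough sketch of what the retrying JS query could look like. `retryOnFailure` is a made-up helper name, not part of Retool, and the sketch assumes that the promise returned by `StatusRefresh.trigger()` rejects when the query fails or hits its 'Timeout after' limit. If that is not how it behaves in your app, the same loop can be driven from the query's on-failure event handler instead. Capping the number of attempts keeps a genuinely broken query from retrying forever.

```javascript
// Hypothetical helper (not a Retool API): re-runs an async action until it
// succeeds or the attempt limit is reached, then rethrows the last error.
async function retryOnFailure(run, maxAttempts = 3, delayMs = 0) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // Resolve with the first successful result.
      return await run(attempt);
    } catch (err) {
      lastError = err;
      // Optionally pause before the next attempt.
      if (attempt < maxAttempts && delayMs > 0) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

// Assumed usage inside a Retool JS query, where StatusRefresh is the query
// from the original post with a short 'Timeout after' value:
// return retryOnFailure(() => StatusRefresh.trigger(), 3, 500);
```

With a 10 second 'Timeout after' and three attempts, the worst case is roughly 30 seconds plus the pauses, rather than a single 120 second wait.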

I will be sure to tag this in our internal team to let them know and to clone/reproduce from your app for testing as well!

+1 on this. I'm also seeing the same on some queries that should take <1s to complete. But mine only last 10 seconds before timing out, even though I've set the timeout value to 120s.

Two behaviors that seem to be consistent in causing this for me:

  1. Lots of queries running simultaneously - i.e. 4 or more triggered at the same time, such as on launch.
  2. The same query being run multiple times in a short period (within a minute). This causes a problem for data updates that I want to see reflected in the UI instantly.

Thanks for letting us know @jleem!

If you want a timeout sooner than ten seconds, I would recommend setting that in the query's Advanced settings, as outlined below.

Also! I have some tips on how you can handle those two cases that cause less optimal query performance.

Try to stagger queries to reduce how many are being run simultaneously. For a page launch, I would recommend chaining queries one after another.

Trigger the first query on launch, then trigger each following query from the previous query's "On Success" handler, inside of the "Event Handlers" dropdown in the query. This might cause a slight delay if any one query gets stuck, but if the queries behave fairly well on their own, it will prevent a logjam from our system being overloaded with too many simultaneous calls. And for a failure mid-chain, you can use an on-failure handler to re-run the same query.
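If you would rather keep the chain in one place than spread it across several "On Success" handlers, a single JS query can run the steps one after another. This is only a sketch: `runInSequence` is a hypothetical helper, `queryA`, `queryB`, and `queryC` are placeholder query names, and it assumes each `trigger()` call returns a promise that rejects when the query fails.

```javascript
// Hypothetical helper (not a Retool API): runs async steps strictly one at a
// time, retrying a failed step a limited number of times before giving up.
async function runInSequence(steps, maxAttemptsPerStep = 2) {
  const results = [];
  for (const step of steps) {
    let attempt = 0;
    for (;;) {
      attempt++;
      try {
        // The next step does not start until this one has finished.
        results.push(await step());
        break;
      } catch (err) {
        // Mid-chain failure: re-run the same step, or stop once out of attempts.
        if (attempt >= maxAttemptsPerStep) throw err;
      }
    }
  }
  return results;
}

// Assumed usage in a JS query that runs on page load (placeholder names):
// return runInSequence([
//   () => queryA.trigger(),
//   () => queryB.trigger(),
//   () => queryC.trigger(),
// ]);
```

The trade-off is the one described above: total load time becomes the sum of the individual query times, in exchange for never having more than one of these queries in flight at once.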

Also look into pagination if you have a table with a large number of rows. We have an option for server-side pagination as well, for increased query speed, as outlined in the docs here.

For making queries multiple times in a short period of time, could I ask you more about the data source and the use case? It looks like our data streaming from a Postgres resource is currently in beta, but if you follow this forum there is a link to get into our beta testing program!

Additionally, if not all of the data you are requesting in rapid succession is changing, you can cache the data as outlined in our forum post here!
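For anyone unfamiliar with the idea, the sketch below shows the general principle behind time-limited caching. It is not Retool's built-in query caching, which is configured in the query settings; `withTtlCache` and its parameters are hypothetical names used only for illustration. Repeated requests for the same key within the time window reuse the earlier result instead of hitting the database again.

```javascript
// Hypothetical helper (illustration only): wraps a fetch function so that
// repeated calls with the same key within ttlMs reuse the earlier result.
// The clock is injectable so the behavior is easy to check.
function withTtlCache(fetchFn, ttlMs, now = () => Date.now()) {
  const cache = new Map(); // key -> { value, expiresAt }
  return function cached(key) {
    const hit = cache.get(key);
    if (hit && hit.expiresAt > now()) return hit.value;
    const value = fetchFn(key);
    cache.set(key, { value, expiresAt: now() + ttlMs });
    // If fetchFn returned a promise that later rejects, drop the entry so a
    // failed request is not served from the cache.
    Promise.resolve(value).catch(() => {
      const entry = cache.get(key);
      if (entry && entry.value === value) cache.delete(key);
    });
    return value;
  };
}
```

The obvious cost is staleness: anything that must show up in the UI immediately after an update should bypass the cache or use a very short window.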