PostgreSQL database has run out of available connection slots

We would appreciate any guidelines or suggestions on what we could improve to prevent this error.
Background:

  • Our self-hosted Retool instance is returning 503 errors which, according to the logs, are caused by the PostgreSQL database running out of available connection slots.
  • The database connection log shows an average of around 30 connections, with huge spikes up to our limit of 90.
  • We have only recently started building on Retool, so there are at most around 4 users at a time.
  • Our apps connect to Redshift; we transform the results to JSON and use JSON to SQL to manipulate the result set from Redshift (see the sketch after this list).
  • I have also set the Redshift queries to cache for 23 hours.
  • Our DevOps team is looking into using variables to "minimize the connection exhaustion issue".
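
For reference, this is roughly how one of those JSON to SQL queries looks. It's only a sketch: `getOrders` is a placeholder name for one of our cached Redshift queries, and the column names are made up.

```sql
-- Sketch: aggregate the already-cached Redshift result inside Retool
-- instead of issuing another query against Redshift.
-- `getOrders` is a placeholder for one of our Redshift queries.
SELECT region,
       SUM(order_total) AS total_sales,
       COUNT(*)         AS order_count
FROM {{ getOrders.data }}   -- references the cached result set
GROUP BY region
ORDER BY total_sales DESC
```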

Is there anything we have missed that could help us prevent these spikes in open connections?

Another oddity: we have a new instance of Retool with no apps or users, yet it shows 15 to 17 open connections. Why would that be?
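
For what it's worth, a query like the one below against pg_stat_activity (standard PostgreSQL) should show what is holding those connections, grouped by client and state - I haven't dug into the output in detail yet:

```sql
-- Show who is holding connections on the application database,
-- grouped by client and state (active, idle, idle in transaction, ...).
SELECT application_name,
       state,
       COUNT(*) AS connections
FROM pg_stat_activity
GROUP BY application_name, state
ORDER BY connections DESC;
```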

Thanks in advance.

Hey @Lisat - thanks for reaching out and for your patience.

To start, what version of Retool are you currently running? And it would be great to get a little more insight into the nature of this particular database. Is it the primary application database or a resource that you've connected to through Retool?

Hi @Darren
Some background on our Retool on-prem instance

  • version 3.75.19
  • Running on a db.t3.micro instance, which has 2 vCPUs and 1 GiB of memory
  • Redshift is a resource that we are connecting to.

The app that we are building

  • Has around 7 Redshift queries that return aggregated data with caching switched on
  • We then use JSON to SQL to do further aggregation for charts and tables so that we don't hit Redshift again.
  • We get inconsistent 503 errors
  • We also get a spinning wheel when trying to save updates.

I am not sure whether it's a resource issue or whether we need to further refine the Redshift queries that pull the data needed for the app.

Any guidance would be really appreciated.

It does feel like you might be bottlenecked by available RAM, but I'm not necessarily confident that it's responsible for your pooling/connection issues. That said, upgrading to something like a t3.medium instance might alleviate any responsiveness issues that you're seeing.

I'm still trying to understand the core of the issue, though. :thinking: Are you running out of connections to Redshift or to Retool's application database? Both can probably be addressed by tweaking the right environment variables - such as the ones mentioned here - but we need to first identify the precise nature of the problem.
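
One quick way to narrow it down is to compare current usage against the configured limit on each side. Something like the sketch below against the PostgreSQL application database (standard views and settings), plus Redshift's STV_SESSIONS system table on the Redshift side, should tell us which one is actually hitting its ceiling:

```sql
-- On the PostgreSQL application database: current connections vs. the limit.
SELECT (SELECT COUNT(*) FROM pg_stat_activity)  AS current_connections,
       (SELECT setting::int FROM pg_settings
         WHERE name = 'max_connections')        AS max_connections;

-- On Redshift: active sessions per user (STV_SESSIONS is a Redshift system table).
SELECT user_name, COUNT(*) AS sessions
FROM stv_sessions
GROUP BY user_name
ORDER BY sessions DESC;
```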

Hi Darren,
Thanks for this!
Our DevOps team has now upgraded our instance from db.t3.micro to db.t3.small, so we are hoping this helps.

We also figured out that if we reduce the amount of data we return to Retool from Redshift, performance stabilises. It seems we need to be more intentional about just how much data we return to Retool, especially when caching the result set: with the smaller data set we no longer get the spinning wheel when saving.
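
Concretely, the kind of change this meant on our side (a sketch only; the table and column names are made up): select only the columns the app needs, bound the date range, and aggregate down to the granularity the charts actually display rather than returning wide, row-level results.

```sql
-- Before (hypothetical): wide, row-level result set returned to Retool.
-- SELECT * FROM sales.orders;

-- After: only the needed columns, a bounded date range, and aggregation
-- down to the granularity the charts actually use.
SELECT region,
       DATE_TRUNC('day', order_date) AS order_day,
       SUM(order_total)              AS total_sales
FROM sales.orders
WHERE order_date >= DATEADD(day, -30, CURRENT_DATE)
GROUP BY region, DATE_TRUNC('day', order_date)
ORDER BY order_day, region;
```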

Thanks for bearing with me on this setup and my questions. It's a steep learning curve at the moment for everyone involved to ensure we have an environment that gives a good user experience.

My pleasure! I'll tentatively mark your most recent post as the "Solution", but don't hesitate to reach out if you have any additional questions. :+1: