We would appreciate any guidelines or suggestions on what we could improve to prevent this error. Background:
Our self-hosted Retool is running into a 503 error which, according to the logs, is caused by the PostgreSQL database running out of available connection slots.
The database connection log shows an average connection count of around 30, then huge spikes up to our limit of 90.
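For reference, we've been watching the count with a standard Postgres catalog query along these lines (nothing Retool-specific in it):

```sql
-- Compare current open connections against the configured ceiling
SELECT count(*)                           AS open_connections,
       current_setting('max_connections') AS connection_limit
FROM pg_stat_activity;
```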
We have only recently started building on Retool, so we have at most around four users at a time.
Our apps connect to Redshift; we transform the results to JSON and then use Retool's Query JSON with SQL to manipulate the result set.
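To illustrate the pattern (the query name redshift_orders is made up; formatDataAsArray is Retool's helper for turning column-oriented results into rows):

```sql
-- Query JSON with SQL: reshape the already-fetched Redshift result in the app,
-- without opening another database connection
SELECT id, status, amount
FROM {{ formatDataAsArray(redshift_orders.data) }}
WHERE status = 'active'
```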
I have also set the Redshift queries to cache for 23 hours.
Our DevOps team are looking into using environment variables to "minimize the connection exhaustion issue".
Is there anything we have missed that could help us prevent these spikes in open connections?
Another oddity: we have a new instance of Retool with no apps or users, yet it shows 15 to 17 open connections. Why would that be?
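For reference, a standard pg_stat_activity query along these lines shows what is holding those connections open:

```sql
-- Break down open connections by the application, user and state holding them
SELECT application_name, usename, state, count(*) AS connections
FROM pg_stat_activity
GROUP BY application_name, usename, state
ORDER BY connections DESC;
```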
Hey @Lisat - thanks for reaching out and for your patience.
To start, what version of Retool are you currently running? And it would be great to get a little more insight into the nature of this particular database. Is it the primary application database or a resource that you've connected to through Retool?
It does feel like you might be bottlenecked by available RAM, but I'm not necessarily confident that it's responsible for your pooling/connection issues. That said, upgrading to something like a t3.medium instance might alleviate any responsiveness issues that you're seeing.
I'm still trying to understand the core of the issue, though. Are you running out of connections to Redshift or to Retool's application database? Both can probably be addressed by tweaking the right environment variables - such as the ones mentioned here - but we need to first identify the precise nature of the problem.
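In the meantime, one way to narrow it down is to check each side directly. On Redshift, STV_SESSIONS is a standard system table (pg_stat_activity is the Postgres counterpart):

```sql
-- Redshift: count open sessions per user to see whether Retool
-- (or something else) is holding connections there
SELECT user_name, count(*) AS sessions
FROM stv_sessions
GROUP BY user_name
ORDER BY sessions DESC;
```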
Hi Darren,
Thanks for this!
Our DevOps team have now upgraded our instance from db.t3.micro to db.t3.small - so we are hoping this helps.
We also found that if we reduce the amount of data we return to Retool from Redshift, performance stabilises. So it seems we need to be more intentional about how much data we return to Retool - especially when caching the result set, as with the smaller data set we no longer get the spinning wheel on save.
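As a sketch of what "being more intentional" looks like for us (table and column names below are illustrative, not our real schema):

```sql
-- Before: SELECT * pulled the whole table into the cached result
-- After: return only the columns and time window the app actually uses
SELECT event_id, user_id, event_type, occurred_at
FROM analytics.events
WHERE occurred_at >= DATEADD(day, -30, GETDATE())
LIMIT 10000;
```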
Thanks for bearing with me on this setup and my questions. It's a steep learning curve at the moment for everyone involved to ensure we have an environment that gives a good user experience.