Production Environment Not Working

I'm trying to get the production version of my app to load consistently and, once loaded, to update the page automatically. I first noticed the issue when the production environment would just show up as a blank page, then found progressively more errors as I was finally able to get the app to load. Namely, the issues were:

  • Queries that should have been relatively quick on the page never finishing and never timing out
  • Charts that had been working fine before losing their dataset configuration once default values were added in case the values from the queries weren't present on initial page load
  • Errors with parsing code in shared queries that would only show up in the built-in browser development tools (not in the Retool errors console) when saving changes to those shared queries (this was specifically a call to a REST API with dynamically generated body content)
  • Updates to a form checkbox tree component would register in the UI for just that component itself, but would not trigger the change event to update a global variable (I added a debug test component and it seems like it can't even detect a change in the checkbox tree component itself)
  • Queries not running when manually clicking the Run button in the editor
  • The Retool debug console becoming completely unresponsive once the initial queries on page load trigger. I discovered this by adding a disable_all variable that disables all queries when true and setting it to true by default in production, then testing basic console.log statements in the Retool console before setting that variable to false. Trying to run any code in the Retool console afterward doesn't return any response.
  • Default libraries like Lodash not being recognized on initial page load (lots of '_' is not defined. errors would show up occasionally if it failed to render the app but succeeded on rendering the editor with the debug console)
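For the shared-query parsing errors in the list above, one common cause with dynamically generated REST bodies is assembling the JSON by string concatenation, which breaks as soon as a value contains a quote or is undefined on initial load. Below is a minimal sketch of the alternative: build a plain object and serialize it once. The field names (`userId`, `filters`, `note`) are made up for illustration and are not taken from the actual query.

```typescript
// Hypothetical input shape; the real shared query's fields are not shown in this post.
interface BodyInputs {
  userId: number;
  filters?: string[];
  note?: string;
}

// Build the request body as an object and serialize it once, so quotes,
// newlines, and missing values cannot produce invalid JSON.
function buildRequestBody(inputs: BodyInputs): string {
  const body = {
    userId: inputs.userId,
    filters: inputs.filters ?? [], // default when the value is not loaded yet
    note: inputs.note ?? "",
  };
  return JSON.stringify(body);
}

// A value containing quotes is escaped by JSON.stringify rather than by hand.
const exampleBody = buildRequestBody({ userId: 42, note: 'said "hi"' });
console.log(exampleBody);
```

The same idea applies inside a Retool body field: pass an object through `JSON.stringify` rather than splicing values into a hand-written JSON string.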

These issues effectively render the entire app non-functional in the production environment. However, it works almost exactly as intended in the staging environment, assuming you don't load multiple tabs with the page too quickly.

Regarding reproducing it, I'm not sure when exactly this started or what the fix should be. This will seem a bit scattershot, since I think all of these issues are tied together in some way, but so far I've tried the following to get more information and fix the issues:

  • Manually adding indexes to the database (using Retool Database Postgres) to reduce query time.
  • Adding a delay/throttle to the queries that were not finishing and not timing out in case some dependencies weren't loaded
  • Checking the current number of database connections. I read that the timeout setting only applies to query time once connected to the database, so failing to get ahold of a connection could cause the timeout configuration to be ignored. However, the count was far below the max connections, so it wasn't this.
  • Trying to manually stop infinitely running queries only to find out that you can't.
  • Switched to manual querying for queries dependent on the results of other queries, intending to use watched inputs to automatically trigger updates once the initial state was loaded. The watched inputs just didn't work, and the success event handlers were firing before the dependencies had updated their data state variable.
  • Switched back to automatic querying with some query disable logic.
  • Added a disable_all variable which prevents all queries from running, which still doesn't prevent the app from just refusing to load sometimes in production.
  • Provided default values for several components / queries in case values aren't initialized when run on page load.
  • Ensured as many linting errors are fixed as possible. Some are still appearing, though these are things like "'_' is not defined." and "This query is unused. Delete unused queries to improve performance." I don't want to get rid of some of these queries yet because they may be necessary once the page is working in production again.
  • Using the React Developer Tools to see if the checkbox tree was updating correctly and firing events. It appeared that it was, but its state wasn't being reflected in the debug text component I set up to monitor it, so there may be issues where it's updating the internal React state but the render phase isn't being triggered for some reason.
  • Fixing the REST shared query dynamic body so that it wouldn't throw parsing errors in the browser dev tools.
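For the dependent-query ordering problem in the list above (success handlers firing before the upstream data had updated), one approach is to sequence the queries explicitly and pass the resolved data along, rather than reading shared state. The sketch below assumes each query exposes a `trigger()` method that returns a promise of its data. The `QueryLike` interface and the stub queries `loadAccounts` and `makeLoadOrders` are hypothetical stand-ins so the example runs outside Retool.

```typescript
// Stand-in for a Retool query object; only the part this sketch relies on.
interface QueryLike<T> {
  trigger(): Promise<T>;
}

// Hypothetical stub queries so the sketch runs outside Retool.
const loadAccounts: QueryLike<number[]> = {
  trigger: async () => [1, 2, 3],
};

function makeLoadOrders(accountIds: number[]): QueryLike<string[]> {
  return {
    trigger: async () => accountIds.map((id) => `order-for-${id}`),
  };
}

// Run the dependent query only after the first one has resolved, and hand it
// the resolved data directly instead of reading state that may be stale.
async function loadPage(disableAll: boolean): Promise<string[]> {
  if (disableAll) return []; // same idea as the disable_all variable above
  const accountIds = await loadAccounts.trigger();
  return makeLoadOrders(accountIds).trigger();
}

loadPage(false).then((orders) => console.log(orders));
```

Because the second call only starts after the first `await` completes, there is no window in which the dependent query can see the previous data.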

I've been banging my head against this for about a week or so now, and it feels like 90% of the time I'm fighting with the tool, trying to do something simple only to encounter more issues within the tool itself. If anyone has any insight on what could possibly be the cause of this, let me know. Again, the main problems here were that the app simply wouldn't load at all in production even if the editor would, and that queries that ran fine in staging would suddenly run forever in production. Everything else came up while trying to find and fix the root cause.

Hi @pschall42, welcome to the forum! :wave:

I'm sorry you are experiencing all of these issues. It can be frustrating to put so much work into building an application only to encounter bugs like these. There's a lot to touch on from all the issues you've surfaced but before we approach them individually, I'll share with you what I've seen in the past.

A few weeks ago, a customer came to Office Hours reporting similar issues: queries running on page load despite being set to run manually, queries taking too long to run on page load, the editor slowing down significantly, etc. When we took a closer look at their app, it turned out to be a massive app with a Tabbed Container, where each tab was basically a whole app on its own.

Performance of Retool apps is primarily impacted by the number of code blocks and components. Essentially, every app is a JSON object with many dependencies. The bigger the app, the more (deeply) nested dependencies it needs to update with new changes.
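To make that concrete, here is a toy model of the idea, not Retool's actual internals. It treats the app as a graph in which each component or query lists the things that read from it, and counts how many of them have to be re-evaluated when one value changes. All names in `smallApp` and `bigApp` are invented for the example.

```typescript
// Toy model: each key is a component or query, each value lists what reads from it.
type DependentsGraph = Record<string, string[]>;

// Count everything that must be re-evaluated when `changed` updates,
// following dependents transitively (breadth-first, each node counted once).
function countAffected(graph: DependentsGraph, changed: string): number {
  const seen = new Set<string>();
  const queue: string[] = [changed];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const dependent of graph[current] ?? []) {
      if (!seen.has(dependent)) {
        seen.add(dependent);
        queue.push(dependent);
      }
    }
  }
  return seen.size;
}

// Invented examples: a two-node app and a more deeply nested one.
const smallApp: DependentsGraph = { query1: ["table1"], table1: [] };
const bigApp: DependentsGraph = {
  query1: ["table1", "chart1", "transformer1"],
  transformer1: ["chart2", "text1"],
  table1: ["form1"],
  form1: ["query2"],
  query2: ["chart3"],
};

console.log(countAffected(smallApp, "query1"), countAffected(bigApp, "query1"));
```

In the small app a change to `query1` touches one other node. In the larger one the same change fans out to eight, which is why splitting a very large app tends to help.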

Please share a screenshot of the Performance tab in the Debug console. You can open the console by clicking 'Debug' at the bottom right of the screen: