Retool apps don't load for some users

@FBCRandy - We saw that the 120MB export was made up almost entirely of an array of null values, which we manually stripped before reuploading the result into a test app in your environment. Can you see if that works for you? If so, you can try to export/reimport that into the actual app. Still digging into what led to that structure ballooning like that.

My apps will sometimes load now, though time to first query is usually around 80-90 seconds. Further interactions take about a minute each to complete.

Screenshot 2025-03-24 at 1.39.16 PM

I am seeing some improvement. I went and exported every module. Found a few that were large and went back a revision.

To the Retool team: any way you can automate that? Or potentially solve why they are large and issue updates? I'd like to not forget what I undid on 20 modules.

Still having issues that make Retool totally unusable: some pages load fine, some pages just hang and then die. It seems to be totally random, but some pages seem worse than others.

Is there any root cause or thoughts on this?

Nathan, are you module heavy? If so, check module sizes on JSON export and look for any huge ones. It seems like something is making everything grow massively, which causes a stall before retrieval or a timeout.

Support was able to restore a module for me by shedding its weight, but that still didn't make it work until I did the same thing with another offending module that had ballooned as well.
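
For anyone who wants to script the size check described above, here is a rough Python sketch. It assumes you have already exported your apps/modules as .json files into a single folder; the function name and the folder layout are hypothetical, not anything Retool provides.

```python
import os

def largest_exports(folder, top=5):
    """Return (size_in_bytes, filename) pairs for the biggest .json files in folder."""
    sizes = []
    for name in os.listdir(folder):
        path = os.path.join(folder, name)
        # Only look at regular files that look like JSON exports.
        if name.endswith(".json") and os.path.isfile(path):
            sizes.append((os.path.getsize(path), name))
    # Largest files first.
    return sorted(sizes, reverse=True)[:top]

if __name__ == "__main__":
    # Assumption: the exports sit in the current directory.
    for size, name in largest_exports("."):
        print(f"{size / 1_000_000:8.1f} MB  {name}")
```

Anything that is far larger than its siblings is a candidate for the export/strip/reimport or revert approach mentioned in this thread.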

I am experiencing the same issues with one of my multipage dashboards. While the dashboard in question used to not load at all, I am now able to load it very slowly (about a 30 second load time for a page that used to only take ~3 seconds).

Something I've noticed is that when I am editing the dashboard (which is still unbearably slow and nearly unusable), renaming a query/resource and attempting to run it directly afterwards will result in an error. Once it errors out, it seems to hang for ~10 seconds, and running it once more then returns the same result as before the rename.

Here, I renamed a Retool Database query/resource and received this return:
Screenshot 2025-03-24 175409

In this one, I ran a GraphQL resource. It should be noted that this query does not connect to any database and only communicates with our backend via Cloudflare:
Screenshot 2025-03-24 190145
Screenshot 2025-03-24 190145

I accidentally sent a duplicate image in the message above. The corrected screenshot is attached below.

I don't use modules, just more charts/graphs.

I've managed to fix one page by reverting 500 steps, and then deleting some queries which have mysteriously broken.

Similar error messages to @wyatt.richartz

Error messages in the query creator within the editor:

  • You do not have permission to run this query
  • Query is unable to save
  • Unable to find query

The queries are not long-running, and Retool says it finishes them every time, but maybe it's something related.

Hey folks - To keep you in the loop here after a ton of digging in across teams yesterday, this actually seems to be the same very obscure root cause as (but different symptom of) the source control deployment failures we're currently tracking on our status page. Ultimately, something is causing these app saves to suddenly become incredibly large, which is leading to various issues in Retool (though I can't yet say it's causing all the disparate errors noted in the thread).

Combining these efforts internally, we have some internal repro apps we're picking back up today, and I'll keep this updated as we learn more. I know this is a painful one, and really appreciate the patience as we get closer to what's causing this!

Hey everyone - We're working on sending out a fix this afternoon that should strip those really large arrays of nulls from the app templates whenever the app is next loaded. Still working on how those are getting set in the first place, but this should resolve the various loading/performance issues affected apps suddenly started seeing. Once you see version 3.176 in the help menu, the fix should be included.

Like I mentioned, there are likely several disparate issues being reported here, so I'm sure this won't resolve everything in the thread. But I confirmed with a query of the current ~100 affected apps in our entire DB that most folks posting here are included in that list. If you still see anything off, I'd confirm the app's JSON export is mostly made up of those null values. If not, something else is going on for which we should start a separate thread :pray:

Hey @Filip_Lipinski @dnursten @FBCRandy @chadk35 @jadetools @Nathan_Hall @Matt_Reilly @wyatt.richartz I'm following up to let you all know that version 3.176 is live. This version includes an app migration to remove the large volume of null values that were injected into your App Templates. Once you open your impacted apps in the editor on version 3.176, the erroneous null values should be removed, resolving the resulting performance issues.

If any of you continue to see performance issues with your apps after opening them from the editor on version 3.176, please download and inspect the app JSON for an extremely large array of null values. On the off chance you continue to see these null values, please follow up here with the UUID for your app. If you do not see these null values, please create a new post describing your issue with as much detail as possible. Thanks!
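
If opening a very large export in a text editor is impractical, the inspection above could be approximated with a small Python sketch like the one below. It treats the export as raw text rather than parsed JSON, on the assumption that the nulls may sit inside an encoded string field. The function names and the threshold of eight consecutive nulls are illustrative choices, not part of any Retool tooling.

```python
import re

# Eight or more consecutive "null," tokens; the threshold is an arbitrary assumption.
NULL_RUN = re.compile(r"(?:null,\s*){8,}")

def null_share(text):
    """Fraction of the characters in text that sit inside long runs of nulls."""
    if not text:
        return 0.0
    covered = sum(m.end() - m.start() for m in NULL_RUN.finditer(text))
    return covered / len(text)

def longest_null_run(text):
    """Number of nulls in the longest run found (0 if there is none)."""
    return max((m.group().count("null") for m in NULL_RUN.finditer(text)), default=0)
```

To use it, read the export into a string (for example `text = open("export.json").read()`) and call both functions. A share close to 1.0 suggests the app is affected by the issue in this thread; a share near 0 suggests something else is going on.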

Our engineering team is still investigating the root cause. In the meantime, they've identified some safeguards to prevent the issue from reoccurring and are working to implement those ASAP.

Hi,
I'm having the same issue here.

How can I get updated to 3.176 or 3.178?

Hi all, unfortunately, due to a separate issue, we've had to temporarily revert the cloud version back to 3.174. Our engineers are working diligently on a fix and we'll update you all here as soon as Cloud is once more updated to a version that includes the above-mentioned app migration.

@km_hads I've checked out your application and you appear to be experiencing a separate issue. I followed up with recommendations in your initial post!

Should those of us with previous issues that you've fixed avoid editing until that is released?

Hi Everett

What is the update here? I am still on 3.174 and it is breaking any App that I try to edit.

You can still edit, and if the apps were loaded (by either an editor or an end user) they should still be fine. However, while we're on 3.174 which doesn't have the fix, they could feasibly get in that state again.

If they do while we work on the other issue (realistically, the team needed for the fix is US West), you can unblock yourself by manually setting/unsetting some allowed groups in the app's queries to clear it out. Or you can export, manually remove most of the null values from the file, and reimport in the editor. Here's a bash one-liner that should clean up an export and save it into a new file, in case that's easier.

sed 's/null,null,null,null,null,null,null,null,//g' export.json > stripped.json
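
For anyone without sed available, a rough Python equivalent might look like the sketch below. It applies the same substitution (deleting every group of eight consecutive `null,` tokens) and writes the result to a new file. The function names are hypothetical, and the file names simply mirror the one-liner above.

```python
import re

def strip_null_runs(text):
    """Delete every group of eight consecutive "null," tokens, like the sed command."""
    return re.sub(r"(?:null,){8}", "", text)

def clean_export(src="export.json", dst="stripped.json"):
    """Read src, strip the null groups, and save the result as dst."""
    with open(src, encoding="utf-8") as f:
        text = f.read()
    with open(dst, "w", encoding="utf-8") as f:
        f.write(strip_null_runs(text))
```

Like the sed version, this leaves up to seven `null,` entries behind in each array, so the file stays valid JSON but is not completely free of nulls.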

Hi Justin, I can confirm that this fix works. Thank you :pray:

Hey everyone :wave: Wanted to loop back now that we have the initial fix deployed, and the root cause identified with a fix for it en route.

We've been working on a pretty large migration on what we use to store each app save state in the DB. There was a subtle and overall pretty rare, feature-flagged bug that seems to have only affected the allowedGroupIds query setting when a single group was specified. Instead of storing an array like [groupId] as expected, [null,null,...,null] of size groupId was stored.

So when the internal group ID in our DB was something like 5000000, these saves grew to be incredibly large with millions of these values, and in turn caused several downstream issues for apps with such a configuration (e.g. saving, loading, deploying source control).
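
To get a feel for the scale, here is a small Python illustration of the size difference. It only mimics the symptom described above and is not Retool's actual code; the function names and the compact JSON serialization are assumptions made for the arithmetic.

```python
import json

def expected_value(group_id):
    """What should have been stored: a one-element array holding the ID."""
    return [group_id]

def buggy_value(group_id):
    """What was stored instead, per the description: group_id nulls."""
    return [None] * group_id

def serialized_size(value):
    """Characters needed to store value as compact JSON."""
    return len(json.dumps(value, separators=(",", ":")))

if __name__ == "__main__":
    group_id = 5_000_000
    print(serialized_size(expected_value(group_id)))  # 9
    print(serialized_size(buggy_value(group_id)))     # 25000001
```

With an ID of 5000000, a single affected query setting grows from 9 characters to roughly 25 MB of `null,` text, so a handful of affected queries is enough to account for exports in the 100MB+ range.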

We really appreciate you all flagging as well as your patience as we worked through this one, and will be having some postmortems internally on what went wrong and how we can prevent these kinds of issues in the future! :pray:
