More breaking changes and bugs

There have been more than a few breaking changes and bugs recently. The most recent bug was in setValue() this morning.

We want Retool to be quick in adding components and functionality, but we also need it to be very stable: it is increasingly responsible for the functioning of our organizations, and a seemingly small slip-up can cause significant disruption. My biggest client could not fulfill orders for part of the morning.

I fully understand how difficult a task it is to make sure there are no regression bugs in an evolving platform that is used in so many different ways. And you are growing as an organization with all the challenges that brings. I do not envy you the task!

But this must be done, this balance between progress and conservatism, or else we developers and stakeholders lose trust in the platform.

I love this platform and what you great and talented people have built and are building. This is why I feel a responsibility to hold some feet to the fire here. :fire: :foot: :foot: :fire:

9 Likes

Awesome @bradlymathews!

We are implementing a new product and yesterday we launched it for our first users. I'm in love with Retool and all the possibilities to build something amazing in a short time. However, yesterday we were heavily impacted by some of the latest bugs that have surfaced.

First, something that was working in our tests stopped working. We thought we had broken it, but it was an error in the platform itself (a row doesn't get selected when we click the action buttons in a table row).

Then, during the onboarding of new users, we noticed the problem with setState. It took a few hours before we realized that we didn't actually change anything and that it was Retool itself that broke the functionality.

It's very cool to see that the Retool team is working on improvements, but it's quite worrying to realize that errors come "out of nowhere", eroding our trust, and our customers' trust, in what we are building.

We developers know that bugs happen and that sometimes we only notice them when we bring the code into production. However, the frequency with which we hit these impediments surprised us. We expect the quality bar to rise so that things stop breaking this way. In the meantime, we'll commit to being more present in the community to let you know what we find.

4 Likes

What's going on here is that Retool is doing the equivalent of dynamic linking with components. We say 'we want this table component' and internally Retool serves the latest version of that component. We've done this for a long time on Linux, and within the past several years it's become hugely problematic on that platform. The things your code dynamically links to can and do change out from under you, causing breakages. This has become such a problem on Linux that we've come up with monster tools like Docker to stabilize the underlying system. Because it's such a problem, modern environments like Go, Zig and Rust largely sidestep dynamic linking and by default build one fat binary with their dependencies compiled in. Put that binary on the same platform and it will always run exactly as it did the day it was built (usually; if it was built against libc, that's a different issue).

What we need is the equivalent of building a static binary with Retool. When we generate a release, we want it to run exactly as it did when we created the release. Always. Bugs and all. This means Retool should reference not only the major version of each component but also the exact build of that component, probably a SHA or minor number. That way, we can almost guarantee that a version of an app will continue to run exactly as it did the day it was released.
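To make the idea concrete, here is a minimal Python sketch of pinning exact component builds at release time. Everything in it is an assumption for illustration: the `REGISTRY` contents, the build strings, and the `create_release` and `resolve` helpers are hypothetical and say nothing about how Retool works internally.

```python
from dataclasses import dataclass

# Hypothetical registry: component name -> published builds, oldest first.
REGISTRY = {
    "table": ["2.3.0+a1b2c3", "2.3.1+d4e5f6"],
    "form": ["1.8.0+0a0b0c"],
}


@dataclass(frozen=True)
class Release:
    name: str
    pinned: dict  # component name -> exact build captured at release time


def create_release(name, components):
    """Snapshot the latest build of every component the app uses."""
    return Release(name, {c: REGISTRY[c][-1] for c in components})


def resolve(release, component):
    """Serve the pinned build, ignoring anything published afterwards."""
    return release.pinned[component]


release = create_release("orders-v1", ["table", "form"])

# The platform ships a newer table build after the release was cut.
REGISTRY["table"].append("2.4.0+ffeedd")

print(resolve(release, "table"))  # still the build captured at release time
```

The release keeps resolving to the build it was cut against, while an unpinned lookup of `REGISTRY["table"][-1]` would pick up the new one.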

I understand this is a significant change internally, and it might make the release process more complicated than it is currently. We need this though. There have been too many under-the-hood updates that are frankly dangerous. I'm totally sympathetic that building a web app is complicated and buggy and generally crap to deal with all around. I'm grateful that Retool exists and that I can give it money so that I don't have to deal with that awfulness. We need a static build of releases so that the Retool team can press forward without having to worry so much about breaking a bunch of things at once.

7 Likes

Flippen fantastic feedback from all of the above. Retool. Please. You need to get this level of stability or we will all have to leave and be sad.

3 Likes

@Rory I believe that the bones for this might already be in place. Here is a recent post from @Chris-Thompson about a beta version of the Table component:

Hey all! I just wanted to give you an update here on this. Our engineers were able to release a feature for this behind a feature flag. If anyone would like for us to toggle this on for your org, feel free to write in through support chat and we would be happy to add you to the beta!

They have done this for me previously as well. If they have a way of flagging a specific version to certain users, that is a good sign they have at least some of the plumbing in place to allow version locking.

1 Like

I hit two bugs with the Form component last night: validation not always working, and the form submit event firing twice. Both were fixed quickly.

And then there was the Temporary State folder issue today which others reported and was apparently also fixed quickly.

The bug extermination crew is on it, but let's give these guys a rest, eh?

1 Like

Hi all,

I'm Snir, head of engineering at Retool. Really appreciate everyone's support in using Retool and in engaging on this thread. While version pinning (which the static binary approximates) is something we definitely want to think more about, operationalizing that for our cloud (since version pinning would be per instance and coupled to our datastore) is difficult for us at present. That said, it's certainly a topic of consideration.

More pressing, we recognize that the rate of bugs making their way to production is creating a lot of pain here, and all of our teams are investing heavily this quarter to improve code coverage, improve automation, and invest in our core frameworks to build out parity analysis as we deploy updates. We've rotated a massive part of our engineering team onto quality and stability work, as well as improving overall app load times. Specifically on performance, we’re rewriting our core frameworks, including how we sandbox.

This will always be a challenging tension for Retool, as we're building a software development platform that can support so many permutations of apps. That said, we're committed to building a stable and reliable platform that our customers can trust. I'm optimistic that our efforts this quarter will yield noticeable improvements, but please do let me know if this persists as a recurring issue for you, so we can continue to identify and address the root causes.

7 Likes

This will always be a challenging tension for Retool, as we're building a software development platform that can support so many permutations of apps.

It's not so many, it's infinite, and a losing battle. There will always be bugs in obscure, untested combinations of components and queries. We want something where, upon generating a release, it runs whatever front end pipeline is used internally for building the app, generates one giant JavaScript ball with everything in it, and puts it on a CDN. Then we can use that JavaScript file forever and won't be affected by any front end updates. That removes at least half the pain of infinite permutations. This of course excludes back end changes, though issues there are usually far easier to catch and fix.

1 Like

This week I was hit with a breaking change. Child components of a form that do not have a form data key used to be ignored when tying the form to a GUI SQL update. Sometime this week, a change went out to production on Retool that broke this behavior. Now child components with a blank form data key are erroneously included in the constructed SQL as blank column names, producing an invalid SQL query and breaking all existing forms. This has, again, led me to scramble to debug different aspects of the system, since I have not updated the affected applications in weeks.
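For anyone trying to picture the difference, here is a rough Python sketch of the old behavior described above: children with a blank data key are dropped before the UPDATE is built. This is not Retool's code; `build_update`, its arguments, and the `%s` placeholder style are all assumptions made for illustration.

```python
def build_update(table, children, key_column, key_value):
    """Build a parameterized UPDATE from (form_data_key, value) pairs.

    Children whose data key is None, empty, or whitespace are skipped,
    so they never turn into blank column names in the SQL.
    Table and column names are interpolated directly, so this sketch
    assumes they come from a trusted source.
    """
    fields = [(k.strip(), v) for k, v in children if k and k.strip()]
    if not fields:
        raise ValueError("no form fields with a data key")
    assignments = ", ".join(f"{col} = %s" for col, _ in fields)
    sql = f"UPDATE {table} SET {assignments} WHERE {key_column} = %s"
    params = [v for _, v in fields] + [key_value]
    return sql, params


children = [("status", "shipped"), ("", "ignored"), ("note", "ok")]
sql, params = build_update("orders", children, "id", 7)
print(sql)     # UPDATE orders SET status = %s, note = %s WHERE id = %s
print(params)  # ['shipped', 'ok', 7]
```

Without the filter on `fields`, the blank key would come out as ` = %s` in the SET clause, which is the kind of invalid query described above.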

e: At the very least, can we have some sort of event log of when things are pushed to production? Then we can correlate the timing of new issues with possible Retool deployments. It doesn't even have to be a change log, just something like 'deployed new version xxx.yy.zzz on YYYY-mm-dd HH:mm'.
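With even that much, correlating an incident with a deployment becomes a few lines of scripting. A minimal Python sketch, assuming log lines in the format suggested above; the version numbers and timestamps are made up, and `parse_log` and `last_deploy_before` are hypothetical helpers, not an existing Retool feature.

```python
from bisect import bisect_right
from datetime import datetime

STAMP_FORMAT = "%Y-%m-%d %H:%M"


def parse_log(lines):
    """Parse 'deployed new version X on YYYY-mm-dd HH:MM' lines.

    Returns (timestamp, version) tuples sorted by time.
    """
    entries = []
    for line in lines:
        head, stamp = line.rsplit(" on ", 1)
        version = head.split()[-1]
        entries.append((datetime.strptime(stamp, STAMP_FORMAT), version))
    return sorted(entries)


def last_deploy_before(entries, incident):
    """Return the most recent deployment at or before the incident, if any."""
    times = [t for t, _ in entries]
    i = bisect_right(times, incident)
    return entries[i - 1] if i else None


# Made-up sample data in the suggested format.
log = [
    "deployed new version 3.14.2 on 2023-05-02 09:15",
    "deployed new version 3.14.1 on 2023-05-01 17:40",
]
entries = parse_log(log)
print(last_deploy_before(entries, datetime(2023, 5, 2, 10, 0)))
```

If a form that has not been edited in weeks starts failing shortly after the deployment this returns, that is a strong hint to check with support before debugging your own app.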

3 Likes

Yes! I am really surprised that a released version of an app can pick up bugs from changes to the Retool system itself. I feel like a release is supposed to be an encapsulation of a point in time and stay that way forever. But look at me. I'm still just getting a lot out of the free subscription, so I can't exactly complain! Still, it feels weird to be happily working on an app, decide to do a browser refresh, and suddenly everything is different! Even a Retool database query with an enabled transformer is not returning results the way it was one second before I refreshed my browser. That said, I love you, Retool! This platform is just the most incredibly useful computer tool I have ever used.