Hi, David, CEO @ Retool here. Thank you for your feedback! We did a bit of digging, and it looks like there have indeed been multiple breaking changes on Retool Cloud over the past few weeks. I agree this is unacceptable, and we are now having a serious discussion internally about what we will do to ensure this happens much less frequently in the future.
@snir, our head of engineering, will follow up in this thread with next steps once we've formulated a plan. Thank you, again, for raising this, and for using Retool!
Another helpful thing would be to get a list of all of the recent breaking changes, so we can proactively go through our apps and address them before our users encounter them. It would also give us a head start on debugging what might otherwise have seemed to be ghost bugs in our apps.
On-prem has also had breaking regressions, but there are at least ways to roll back, and the team has been responsive to issues and worked towards fixes. I do also think it’s hard to find a place to voice them or see what’s being investigated internally (is this bug just me, or are others facing it as well? More comms like bradly mentioned would be much appreciated). One thing we’ve done now is set up a separate Retool QA cluster, which we test Retool updates on first, then upgrade prod if it looks roughly OK (it doesn’t catch everything, but it helps). Maybe that’s possible in cloud as well. It’s a bit annoying that you have to fully resync all apps from one env to another, but that’s an approach that could be simplified as well. On cloud, customers could use this synced environment to test against and know what’s coming (though maybe this slows down your internal release velocity).
Hi all, thanks again for the feedback and for being Retool customers. As David said, reliability and stability are critically important for us. We’ve investigated this particular incident in depth, and discovered that we didn’t test our logic against a broad enough range of expected outputs. We’ve put guardrails in place to make sure something like this doesn’t happen in the future, via an improved test suite and expanded coverage in staging.
Separately, we’re also setting reliability and stability goals, and have made them a key result for our Q4 OKRs. Ensuring Retool is always stable is going to be a core investment for us going forward.
As the head of engineering, I must also note that Retool is a fairly complex platform with a lot of surface area. There are a lot of permutations of problems that can occur, and it is not easy to automatically test (and then debug) every conceivable use case. That said, we are making immediate improvements to our test coverage, staging environments, and programmatic observability, and expect to get to several more follow-up action items in the near future. Thank you for the feedback!
I think that clear and rapid communication is important here as well, and that’s something we’ll be focusing on too. We’re a fairly lean team still (27 engineers total!) but we want the community to hear from us, both when we’re shipping new features and meeting your growing needs, and when we let you down, so we can resolve it quickly.
As David said, we’re really sorry for the downtime, recognize this is not acceptable, and appreciate you all as customers and users. If you have any questions, comments, or concerns, feel free to follow up here, or reach out directly (I’m at firstname.lastname@example.org; David is at email@example.com).
Any updates on this? I run a consulting firm and use Retool for various clients. When a breaking change happens, it makes me and my team look unprofessional and harms our business. Not to mention the massive amount of downtime it causes our clients. The number and frequency of breaking changes is completely unacceptable. I'm facing one today, on 9/9/2021.
I get that things are moving quickly and you're managing a massive project with infinite possibilities. I honestly don't expect you to test for every option. All I'm asking for is the ability to choose what version of Retool I'm using! Allow me to update my clients' Retool accounts over the weekend and thus personally test out the changes in my apps. I can't keep getting surprise complaints from clients when something I know was working yesterday suddenly stops working today due to random Retool bugs in the modules and app interactions.
I'd rather be behind a few versions but have stable, working apps than have apps breaking on me every few days for factors completely outside my control.
Retool is an amazing product and makes developing internal tools extremely efficient and clean, but if this isn't addressed soon we may have to consider going the more labor-intensive route of custom building our internal tools without the use of a platform like this...
I understand there's a significant cost to maintaining various LTS versions of your cloud, but if the alternative is instability then it has to be something that gets some serious consideration.
I love the direction the product is going, and each new feature added is very powerful, but the constant forced platform updates have to stop. I'll update my accounts to the newer versions ASAP, but it has to be at a time when I can properly test the changes without causing downtime to the business.
I hate to have to report that little has changed. In the past ~month, there have been 3 silent breaking changes that have dramatically affected our users' ability to use the Retool apps they rely on. By "silent" I mean no release notes before or after the changes were made, and by "breaking" I mean that fairly basic, critical functionality our users relied on began failing.
Unfortunately I don't recommend Retool nearly as quickly when talking at the water cooler/meetups/wherever anymore, solely because of the constant breaking changes.