First of all I appreciate that the Workflow feature is in beta.
Main problem I'm having is that my workflows seem to succeed happily when I run each step in succession but as soon as I leave it to the scheduler, the jobs fail and the logs don't tell me what happened.
This isn't just happening on one workflow but rather all of my workflows. The pattern is that they all fail when calling an external API, in this case, ServiceFusion. It's quite possible that the process of calling the external API is different when run by the scheduler rather than by the user but the issue I have with Retool is that the Logs are not doing their job.
I've tried various things like reducing the frequency of the workflow and also reducing the quantity of data it handles. I've not managed to find a clear pattern. Sometimes I will get the workflow to work consistently for hours, only to log back in the next day to find it failed again. Not sure if this is some sort of authentication issue or whether it's just that some of the data is throwing an error.
So yeah... please can someone offer some guidance as to how to see the actual problem. The only real error I have to go on is one that appears in the console but it's clearly a Retool thing and nothing to do with my code.
GET /api/resources/76c46546-d8b6-4381-8760-9d74ecb66fa7/schema?environment=production&action=workflows 400
Uncaught (in promise) Error: cannot fetch schema
at Generator.next ()
at a (app.4478f322cdc0edb68fb8.js:2:2826946)
Have you tried using a global error handler in your workflows? @stefancvrkotic made a good example in this thread.
I'd also appreciate it if workflows documented this in the logs, but if they are prioritising other features, use the workaround above.
I do have one but currently all it does is write to the console (which doesn't seem to happen) so I could beef this up a bit, thanks for the suggestion.
Logging in today I noticed this in the console, again not sure if it's related...
fs.js:4 POST /api/refreshtoken net::ERR_NETWORK_CHANGED
fs.js:4 Uncaught (in promise) TypeError: Failed to fetch
Implemented the Slack error reporting and tested it to make sure it works.
Still no error.
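For anyone else wiring this up, here is a minimal sketch of the kind of message an error handler could send to Slack. The `failure` object and its field names (`workflowName`, `blockName`, `startedAt`, `error.status`, `error.body`) are my own assumptions for illustration, not something Retool provides in this shape. The only external fact relied on is that Slack incoming webhooks accept a JSON body with a `text` field.

```javascript
// Sketch only: the shape of `failure` is hypothetical, not a Retool API.
function buildSlackErrorPayload(failure) {
  const { workflowName, blockName, error, startedAt } = failure;
  const lines = [
    `Workflow failed: ${workflowName}`,
    `Block: ${blockName}`,
    `Started: ${startedAt}`,
    `Message: ${error && error.message ? error.message : String(error)}`,
  ];
  // HTTP errors often carry a status and a response body. Include them when
  // present, because the bare message is rarely enough to diagnose anything.
  if (error && error.status !== undefined) {
    lines.push(`HTTP status: ${error.status}`);
  }
  if (error && error.body !== undefined) {
    const body =
      typeof error.body === "string" ? error.body : JSON.stringify(error.body);
    lines.push(`Response body: ${body.slice(0, 500)}`); // cap the length
  }
  // Slack incoming webhooks accept a JSON object with a `text` field.
  return { text: lines.join("\n") };
}

const examplePayload = buildSlackErrorPayload({
  workflowName: "sync-jobs",
  blockName: "getJobs",
  startedAt: "2023-07-11T13:55:03Z",
  error: { message: "Request failed", status: 401, body: { detail: "token expired" } },
});
console.log(examplePayload.text);
```

The payload would then be POSTed to the webhook URL from whatever block handles the failure. Keeping the builder separate from the sending step makes it easy to test without touching Slack.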
This definitely looks like a bug. I’ve filed it and added a couple of tickets to it already, and will add this one as well. Hopefully we can get this addressed soon 🤞 In the meantime, I’ll ask about any timelines we can share!
As a quick sanity check, might this be an issue of Workflows running on the server side when running automatically vs running on the client side when manually run? Does this auth work consistently when manually run vs when scheduled/automatic?
Sadly not, I've yet to establish any reliable way of replicating the issue, it's intermittent.
I can't be sure without actually seeing the error message, but I'm all but convinced the error is not in Retool at all: it's the third-party API refusing or timing out, and Retool is not handling the error properly / visibly.
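If that theory is right, it helps to log what kind of failure the external call produced rather than a generic "failed". Below is a rough sketch of a classifier based on standard HTTP status semantics. The function name, the input fields (`status`, `elapsedMs`, `timeoutMs`) and the category labels are all made up for illustration; how you obtain the status and timing depends on how the API block is called.

```javascript
// Hypothetical helper: turns the outcome of an external API call into a
// coarse category that can be written to a log or a Slack message.
function classifyApiFailure({ status, elapsedMs, timeoutMs }) {
  if (status === undefined || status === null) {
    // No HTTP response at all: the request either ran out of time
    // or failed at the network level.
    return elapsedMs >= timeoutMs ? "timeout" : "network_error";
  }
  if (status === 401 || status === 403) return "auth"; // token expired or rejected
  if (status === 429) return "rate_limited"; // too many requests
  if (status >= 500) return "upstream_error"; // the third party is failing
  if (status >= 400) return "bad_request_or_data"; // likely something in the payload
  return "ok";
}

console.log(classifyApiFailure({ status: 401, elapsedMs: 120, timeoutMs: 10000 })); // prints: auth
console.log(classifyApiFailure({ status: null, elapsedMs: 10000, timeoutMs: 10000 })); // prints: timeout
console.log(classifyApiFailure({ status: 422, elapsedMs: 300, timeoutMs: 10000 })); // prints: bad_request_or_data
```

Even this coarse split would separate an authentication problem from a data problem, which is exactly the distinction the logs are not giving right now.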
Understood. Well, do let us know if anything changes from the Retool side!
Am following along with this thread closely.
I am also having a similar issue where when scheduled, the workflow is failing but when individually running the components in order it works completely fine with no issues.
Some context to the workflow if it helps:
- Similar to @Ross_JWHI, I am pulling from an API. The workflow runs once a day, grabs the transactions that occurred the day before, and writes them to a Postgres DB
- The error that comes up indicates that ALL transactions are duplicates when writing to our DB for storage, even though only a fraction of the records are being written that day.
- The fact that the workflow works when manually executing each element but fails when scheduled indicates to me that it may be out of my control
- The workflow was working with no issues a week ago, but it now consistently fails each day.
Happy to offer more information if it helps the investigation.
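One way to narrow down a "everything is a duplicate" error is to compare the fetched rows against the IDs already stored before attempting the write, and log the counts. This is only a sketch under assumptions: the `id` field name and the idea that the existing IDs can be queried up front are mine, not details from the workflow described above.

```javascript
// Hypothetical helper: splits a fetched batch into rows that are new and rows
// whose ID is already known (either stored already or repeated in the batch).
function splitNewAndDuplicate(fetched, existingIds, idField = "id") {
  const seen = new Set(existingIds);
  const fresh = [];
  const duplicates = [];
  for (const row of fetched) {
    const key = row[idField];
    if (seen.has(key)) {
      duplicates.push(row);
    } else {
      seen.add(key); // also catches repeats within the same batch
      fresh.push(row);
    }
  }
  return { fresh, duplicates };
}

const fetchedRows = [{ id: "t1" }, { id: "t2" }, { id: "t2" }];
const split = splitNewAndDuplicate(fetchedRows, ["t1"]);
console.log(
  `fetched=${fetchedRows.length} new=${split.fresh.length} duplicates=${split.duplicates.length}`
); // prints: fetched=3 new=1 duplicates=2
```

If the scheduled run reports every row as a duplicate while the manual run does not, that would suggest the two runs are fetching different data (for example, a different date range) rather than a problem with the write itself. Separately, Postgres supports `INSERT ... ON CONFLICT DO NOTHING`, which makes a repeated write harmless when the table has a unique key.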
I just wanted to check back in here and let y'all know that we've done some work on Workflows to merge the behavior between Run and Preview! Hopefully the behaviors should be more aligned now. Please do let me know if you have any feedback!
Upvoting the general theme here: observability in workflows is non-existent. In the spirit of implementing my own log (because workflow logging does not capture my console.log statements) .... I thought I would add this bit of data to my log lines:
Makes sense right? If workflows have an ID I should capture it so there is at least a possibility of correlating my log back to the workflow log. YaY! Where did I get this idea? Right there in the workflow logs. The very first two lines:
[Tue July 11th 2023 13:55:03.867 pm] Starting run for workflow: 17b54976-59f2-484f-849d-969cd9407fab
[Tue July 11th 2023 13:55:03.867 pm] Workflow Run Id: 7c502940-b924-4d9f-a247-ff35414478d7
Oh but wait!
17b54976-59f2-484f-849d-969cd9407fab is the ID of the workflow itself. I need the ID for the distinct run I'm logging. No problem...it's right there:
Workflow Run Id: 7c502940-b924-4d9f-a247-ff35414478d7
The "Run Id". Excellent! Except for one detail. It's not in the context:
Where is it? It's not hiding inside things like "lastRun" which has timestamp stuff. It's not in currentRun.
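Until the run ID is exposed, a workaround is to mint your own per-run tag at the start of the workflow and stamp it on every log line. The sketch below is an assumption-laden illustration: `makeRunLogger`, `runTag` and the `workflowId` argument are names I made up, and the tag is a locally generated value that does not match Retool's own "Workflow Run Id". It only lets you group your own lines from one run and line them up with the workflow log by timestamp.

```javascript
// Hypothetical helper: creates a logger that stamps every line with the
// workflow ID (passed in by the caller) and a locally generated run tag.
function makeRunLogger(workflowId, sink = console.log) {
  // Timestamp plus a random suffix: unique enough to tell runs apart,
  // but NOT the same value as the run ID shown in the workflow logs.
  const runTag = `${Date.now().toString(36)}-${Math.random().toString(36).slice(2, 10)}`;
  const log = (message, extra = {}) => {
    const line = JSON.stringify({
      ts: new Date().toISOString(),
      workflowId,
      runTag,
      message,
      ...extra,
    });
    sink(line); // by default this is console.log
    return line;
  };
  return { runTag, log };
}

const runLogger = makeRunLogger("17b54976-59f2-484f-849d-969cd9407fab");
runLogger.log("fetch started");
runLogger.log("fetch finished", { rows: 42 });
```

The `sink` parameter is there so the same lines can be sent somewhere durable (a table, a webhook) instead of the console.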
This reminds me of the context object in apps which given this information:
Makes "production" available at runtime but not the thing you really need....the actual version tag.
@Tess @victoria Pro Tip: Make observability a priority and a ton of technical support load will disappear, because developers will be able to find problems themselves. Anything you do to enhance observability will produce windfall ROI. Furthermore, if the software is horrible and buggy (I don't recommend it), observability is even more important. Developers will be able to help find bugs faster and their reports will be more meaningful and valuable to you.
Wow that was quick. I notice a huge improvement in the workflow logs today. They include console log statements from within the workflow code! YaY! An epic improvement! Thank-you.