I have a workflow with some fairly large DynamoDB queries. When I run the workflow in the workflow builder, the entire process takes ~85s, with one particular query taking ~25s.
I saved, deployed, and then triggered it from my Retool app. When triggered from the app, the same query takes almost 10 minutes.
I compared the run from the builder to the run from the app: both process the same amount of data, and there isn't any difference in the JSON returned.
I was able to trigger it once from the app and it ran quickly, but I can't seem to repeat it. I ran it twice from the app and it timed out, but the third time it ran within 90 seconds. The fourth and fifth times it timed out, so it seems intermittent.
Also, thank you for the thorough detail in your follow-up posts. Seeing the timing discrepancies, I wonder how your loop nodes are constructed and processed. Even though they certainly should behave the same whether run manually or triggered, there seems to be something in this step that we can at least investigate.
Its input is an array of objects where each object has a pkey property. It's being passed ~2k objects, which I understand is large, but the loop step runs in ~30s when run in the builder (and the one time it ran successfully when triggered from the app).
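To make the shape of that loop input concrete, here is a minimal Python sketch that splits a ~2k-element array of `pkey` objects into smaller batches. The `loop_input` data and the `chunk` helper are hypothetical, not taken from the actual workflow. The batch size of 100 matches the per-request key limit of DynamoDB's BatchGetItem, which would only be relevant if each loop iteration is a single-item lookup by full key; Query calls can't be batched this way.

```python
from typing import Iterator

# DynamoDB's BatchGetItem accepts at most 100 keys per request.
BATCH_SIZE = 100


def chunk(items: list[dict], size: int = BATCH_SIZE) -> Iterator[list[dict]]:
    """Yield consecutive slices of `items`, each with at most `size` entries."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical input shaped like the loop's array: ~2k objects with a pkey.
loop_input = [{"pkey": f"user#{i}"} for i in range(2000)]

batches = list(chunk(loop_input))
print(len(batches), len(batches[0]), len(batches[-1]))  # prints: 20 100 100
```

With batches like these, the loop would run 20 iterations instead of 2,000, which might make timing differences between builder and app runs easier to spot.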
If helpful, below are the IDs for the workflow runs:
Successful run triggered from app:
Starting run for workflow: e60acc18-7fa1-4849-9bb4-d106eae30001 in environment production
Workflow Run Id: f287bf48-c82b-4f7a-8eef-d82bdfa0257d
Successful run triggered from builder:
Starting run for workflow: e60acc18-7fa1-4849-9bb4-d106eae30001 in environment production
Workflow Run Id: 9694f2bc-5d18-47e8-8e9d-385999249b78
Failed run triggered from app:
Starting run for workflow: e60acc18-7fa1-4849-9bb4-d106eae30001 in environment production
Workflow Run Id: 7ef17519-c06f-4ec1-831a-05f5a0fd88d3
The other odd thing is that this loop block runs essentially the same query, just with a different skey. It runs consistently >30s in all runs, including the failed ones, and it's processing twice as many records.
I duplicated the workflow and triggered it from the app, and on the first run it ran as desired. However, I can't replicate this: new runs triggered from the app are still timing out.
I feel like I'm breaking something. It's just so weird that the queries take different amounts of time intermittently. The getFTUX query is processing the same number of records as getPaymentInfo, which in the last run only took 11s. No real rush, lmk if I can help by providing any more info, thanks again!
Thanks for the extra follow-up -- Is it possible your DynamoDB resource is throttling connections? I'm not sure of your specific configuration here but it does seem to be related to the frequency and timing of your attempts.
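One rough way to test the throttling theory is to wrap the query in a retry with exponential backoff and record how many retries each run needs. The sketch below is plain Python under stated assumptions: `ThrottledError`, `call_with_backoff`, and `fake_query` are hypothetical names for illustration, not part of Retool or any AWS SDK.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling error raised by a database client."""


def call_with_backoff(fn, max_attempts=5, base_delay=0.1, max_delay=5.0,
                      sleep=time.sleep):
    """Call fn(); on ThrottledError, wait with exponential backoff and retry.

    Returns (result, retries) so the retry count can be logged per run.
    Re-raises ThrottledError once max_attempts calls have all failed.
    """
    for attempt in range(max_attempts):
        try:
            return fn(), attempt
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise
            # Cap the exponential delay, then pick a random wait up to it
            # ("full jitter") so parallel callers don't retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))


# Demo with a fake query that is throttled twice and then succeeds.
calls = {"n": 0}


def fake_query():
    calls["n"] += 1
    if calls["n"] <= 2:
        raise ThrottledError("simulated throttle")
    return ["item"]


result, retries = call_with_backoff(fake_query, sleep=lambda s: None)
print(result, retries)  # prints: ['item'] 2
```

If app-triggered runs showed a non-zero retry count while builder runs showed zero, that would point toward throttling; zero retries in both would point elsewhere.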
I did a little searching (very light, non-extensive) around throttled DynamoDB and came across a Stack Overflow thread that discusses how your DB resource configuration might need extra provisioning:
Thanks @pyrrho! I checked with our AWS admins and they said we have on-demand scaling enabled and shouldn't be throttling, but it makes sense that this might be contributing.
I'm kind of hesitant to think this is the case, though, since every run I've ever done from the builder runs quickly (and I've done many). It only has this issue when triggered from the app. lmk if you have any more thoughts, thanks!
Very strange! Best I can offer at this point is tagging some friendly Retool reps that I often see assisting with the more technical/behind-the-scenes know-how: @Tess @Darren @AbbeyHernandez
I changed the workflow to use a much smaller dataset just so I could continue building without being blocked by the larger queries, but the same behavior is happening.
It always runs in ~17s in the builder. I can trigger it from the app sometimes, and it runs as expected, but intermittently it still takes around 10 minutes and the workflow times out. There seems to be an issue triggering a workflow from an app in general, unrelated to the amount of data being processed.
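To compare where the time goes between a builder run and an app-triggered run, per-step durations can be logged from inside a code block. This is a minimal plain-Python sketch, not a Retool API: the `timed` helper and `timings` dict are hypothetical, and the `time.sleep` calls are placeholders for the real getPaymentInfo and getFTUX queries.

```python
import time
from contextlib import contextmanager

# Collected durations in seconds, keyed by step name.
timings = {}


@contextmanager
def timed(step_name):
    """Record the wall-clock duration of the enclosed block in `timings`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the step raises, so failed steps are timed too.
        timings[step_name] = time.perf_counter() - start


with timed("getPaymentInfo"):
    time.sleep(0.01)  # placeholder for the real query

with timed("getFTUX"):
    time.sleep(0.02)  # placeholder for the real query

for name, seconds in timings.items():
    print(f"{name}: {seconds:.3f}s")
```

Logging these numbers on every run would show whether the extra minutes come from one specific step or are spread across the whole workflow.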
I don't know if any updates have been made, but since yesterday the queries are now running as expected every time when triggered from the app, no changes on our end.
If I use the play button to open user view, it still times out for me, but it's working for others on the team in user view, so I think this is pretty much resolved. Thanks for your help @pyrrho!