Hi Team,
I'm using self-hosted Retool. The problem is that I have to query a huge Snowflake table with anywhere from 40k to 200k rows and 36 columns.
I have written JS code to fetch the data in chunks/batches (roughly along the lines of the sketch at the end of this post).
The aim is to fetch the data in batches of 2000 rows (flexible on batch size), convert each batch to a base64-encoded CSV, and upload it to S3 as a multipart upload.
But I'm unable to get this working.
Problems:
The entire workflow sometimes fails by skipping all blocks, without any proper failure logs (pretty annoying).
At other times, the workflow gives a 413 Payload Too Large error.
I'm unable to understand why the workflow is failing without any proper logs. Moreover, even the event handler is not recording any error for this case.
For the second error: if the query is batched, why is it giving a payload error? The loop seems to wait before continuing to the next iteration, so why do I hit this error with such a small batch size (2000)?
The JS block has a timeout of 40 minutes (2,400,000 ms), and so does the function getDataFromMaster.
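Conceptually, the JS block's loop looks something like this. This is a simplified sketch rather than the exact workflow code; it assumes getDataFromMaster takes a limit and an offset and returns an array of row objects, and that Buffer is available in the workflow's JS runtime.

```javascript
// Simplified sketch of the batching loop (not the exact workflow code).
const BATCH_SIZE = 2000; // flexible
const parts = [];
let offset = 0;

while (true) {
  // getDataFromMaster is assumed to accept (limit, offset) and return an array of row objects.
  const rows = await getDataFromMaster(BATCH_SIZE, offset);
  if (!rows || rows.length === 0) break;

  // Convert this batch to CSV, then base64-encode it
  // (Buffer is assumed to be available in the workflow's JS runtime).
  const esc = (v) => `"${String(v ?? '').replace(/"/g, '""')}"`;
  const header = Object.keys(rows[0]).map(esc).join(',');
  const body = rows.map((r) => Object.values(r).map(esc).join(',')).join('\n');
  parts.push(Buffer.from(`${header}\n${body}`).toString('base64'));

  offset += rows.length;
}

// The parts would then go to S3 as a multipart upload (not implemented yet).
return parts;
```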
Can you confirm that you're not actually attempting to upload anything as of yet? Your screenshots make it look like you haven't implemented that, but the error message you're getting in that last image mentions "upload". I want to make sure we're isolating any errors before bringing S3 into the picture.
Assuming that the batched fetches are at fault, you might need to update the CLIENT_MAX_BODY_SIZE environment variable in your https-portal container. Depending on your deployment, it's possible that you'll also need to update this value in your load balancer and/or ingress controller!
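For reference, on the standard docker-compose deployment that value lives on the https-portal service; a quick sketch (the 100M value below is purely illustrative; pick whatever ceiling makes sense for your payloads):

```yaml
# docker-compose.yml excerpt (illustrative value only)
https-portal:
  environment:
    CLIENT_MAX_BODY_SIZE: 100M  # maps to nginx's client_max_body_size
```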
Let us know if you have any follow-up questions or run into any additional issues!
Hi @Darren, thanks for the reply.
The S3 uploader is not implemented in this workflow.
Isn't a batch size of 2000 small? Doesn't each iteration get rid of the data it is dealing with once it finishes?
Is the data accumulating across iterations? We noticed we were only able to fetch data up to 80k rows by setting the total manually to 80k; beyond that it failed.
We did update CLIENT_MAX_BODY_SIZE, but I'm not sure whether it has also been updated in the ingress and the load balancer.
I will confirm with the DevOps team.
Good to know - that definitely helps to narrow down the list of potential issues.
Default limitations in your load balancer or ingress controller can be fairly harsh, so it's not super surprising that you'd run into issues even with the batching that you're doing. I think the underlying nginx client_max_body_size defaults to 1M, for example.
You bring up a good point about the data, too. The same thought crossed my mind yesterday. By default a single workflow can utilize up to 258M of memory, I believe, and I don't think data is being cleaned up between iterations. I'd be surprised if 80,000 rows represented that much data, though. There is an environment variable you can adjust to provision your workflow with more memory.
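If the data is in fact accumulating, one thing worth trying while you wait on the DevOps confirmation is to hand each batch off as soon as it's built and drop the reference before the next fetch, rather than collecting everything into one array. Here's a rough sketch, where uploadBatch is a placeholder for whatever block will eventually push the CSV part to S3, and getDataFromMaster takes (limit, offset) as before:

```javascript
// Rough sketch: process each batch as it arrives instead of accumulating all of them.
// uploadBatch is a placeholder for the eventual S3 upload step.
const esc = (v) => `"${String(v ?? '').replace(/"/g, '""')}"`;
const toBase64Csv = (rows) =>
  Buffer.from(
    [Object.keys(rows[0]).map(esc).join(','), ...rows.map((r) => Object.values(r).map(esc).join(','))].join('\n')
  ).toString('base64');

const BATCH_SIZE = 2000;
let offset = 0;
let partNumber = 1;

while (true) {
  const rows = await getDataFromMaster(BATCH_SIZE, offset);
  if (!rows || rows.length === 0) break;

  // Only this one batch is held in memory; nothing is appended to a growing array.
  await uploadBatch(partNumber, toBase64Csv(rows));

  offset += rows.length;
  partNumber += 1;
}

return { partsUploaded: partNumber - 1 };
```

That way the memory footprint stays roughly constant no matter how many total rows the table has.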