Bulk upsert to PostgreSQL

Hi there,
We're planning to use Retool Workflows to bulk upsert data from another source into PostgreSQL.

Our only concern is the following statement from the documentation:

If it is a unique key, Retool will iterate through the array and, for each member, merge the new values with the existing database record

Does anyone know how this works behind the scenes?
Does Retool actually iterate over the array and perform an INSERT+ON CONFLICT statement for each member individually, or does it build one large INSERT+ON CONFLICT statement covering all members at once?

Judging by how long it takes, it seems like Retool does it one by one, which isn't great for us.

If so, does anyone have a workaround? Ideally we'd perform one large insert.
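
For reference, here is a hypothetical sketch of the kind of single multi-row statement we're hoping for, written as if issued directly with node-postgres. This is not Retool's actual implementation; the "users" table and its columns are made up for illustration.

```js
// Hypothetical sketch: one multi-row upsert statement issued via node-postgres.
// Table name and columns are illustrative only.
const { Client } = require("pg");

async function bulkUpsert(rows) {
  const client = new Client(); // connection details come from the environment
  await client.connect();

  // Build one VALUES list with numbered placeholders: ($1, $2), ($3, $4), ...
  const values = [];
  const placeholders = rows.map((row, i) => {
    values.push(row.id, row.name);
    return `($${i * 2 + 1}, $${i * 2 + 2})`;
  });

  // A single statement covers every row; PostgreSQL applies ON CONFLICT per row.
  await client.query(
    `INSERT INTO users (id, name)
     VALUES ${placeholders.join(", ")}
     ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name`,
    values
  );

  await client.end();
}
```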

Hello, the docs say bulk update runs in a single transaction, but they don't explain bulk upsert.

I just tested it, and it seems bulk upsert works in a single transaction too.
This means you can ensure the integrity of your data.

The table I tested with included at least one correct row alongside some incorrect ones. After I ran the query, I found that none of the records were inserted.
Then I deleted the incorrect rows and ran the query again, and the rows were added.
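
If you want to reproduce this kind of check outside Retool, here is a minimal sketch with node-postgres, assuming a hypothetical "items" table whose "name" column is NOT NULL (the table and constraint are assumptions for illustration, not my actual test data).

```js
// Minimal atomicity check: one bad row makes the whole multi-row statement fail,
// so the valid row is not inserted either. Table and constraint are assumed.
const { Client } = require("pg");

async function atomicityTest() {
  const client = new Client();
  await client.connect();

  try {
    // The second row violates the NOT NULL constraint on "name",
    // so the whole statement fails and neither row is inserted.
    await client.query(
      `INSERT INTO items (id, name)
       VALUES ($1, $2), ($3, $4)
       ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name`,
      [1, "valid row", 2, null]
    );
  } catch (err) {
    const { rows } = await client.query(
      "SELECT count(*) FROM items WHERE id = $1",
      [1]
    );
    console.log(rows[0].count); // prints "0": the valid row was not inserted either
  } finally {
    await client.end();
  }
}
```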

Hope this can help.

Hey Anson, thanks!
I had a weird experience with this though.
I tried to upsert 2,000 rows into an empty table, so in practice the operation performed is a plain insert of 2,000 rows.

It took over 120 seconds and the query timed out in the Retool Workflow; however, the 2,000 rows were indeed inserted into the database.

I have two issues with this, which raised the question of how the bulk upsert feature works:

  1. An upsert of 2,000 rows taking over 120 seconds seems implausibly slow, which made me think that Retool iterates over each row.
  2. As far as Retool knows, the workflow failed, yet the rows were inserted, which creates a transparency issue.

This seems like a network issue.
Which type of Retool are you using - Cloud or self-hosted?
Where is your Postgres hosted? USA? Asia? China?
Where are you browsing Retool from? USA? Asia? China?
Are you using a VPN?

Retool Cloud,
PostgreSQL hosted in the USA (AWS),
Browsing Retool from the EU, no VPN.

Edit:
I just tried to upsert 1,000 rows multiple times. It timed out every time, but every time the rows were inserted into the database (after the timeout).

It feels like Retool "gives up", but the transaction still completes in the database.

It makes no sense that an upsert of 1,000 rows takes over 120 seconds. When using a Loop to upsert those 1,000 rows one by one, it takes around 5-10 seconds to complete.

This seems like odd, buggy behavior; failed queries should either roll back properly or report success, since, as you mentioned, that failure message can be misleading. Are you noticing any errors in the upload when you do the loop? Are there any rows that don't get inserted? Also, do you happen to know roughly the minimum row count at which the query starts to time out?

I am experiencing the same issue, but only during a "run" (scheduled or manual). A bulk upsert of ~1,600 records times out after 120 seconds, but the records are in fact upserted. If the step is triggered manually on its own, it completes without a timeout or error.

Hi Aaron,

The workaround we ended up using was to create the upsert operation as a function (in the left navigation bar) instead of a workflow step.

Then, with JavaScript in a workflow step, we did the following (using 10,000 rows as an example):

  1. Broke the 10,000 rows down into 20 batches of 500 rows each
  2. Fired 5 batches at once in parallel using the upsert function we created, waiting for the slowest of them to finish
  3. Kept firing groups of 5 batches until all of them were done

We found 500 rows per batch to be the "sweet spot" for us when balancing database load, performance, and Retool workflow efficiency, but you can change it as you wish.

This basically performs 20 smaller upsert operations instead of 1 huge one, and for us it resolved multiple issues we'd had with this bulk upsert.
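
Here's a rough sketch of that JavaScript step. `upsertBatch` stands in for the upsert function we created in the left navigation bar, and `previousStep` for whichever workflow step produces the rows; the names and sizes are illustrative, not exact code from our workflow.

```js
// Rough sketch of the batching step. "upsertBatch" is the upsert function
// created in the left navigation bar; "previousStep" is whatever step
// produced the rows. Both names are illustrative.
const BATCH_SIZE = 500;  // rows per upsert call
const CONCURRENCY = 5;   // batches fired in parallel

const rows = previousStep.data;

// Split the full row set (e.g. 10,000 rows) into batches of 500.
const batches = [];
for (let i = 0; i < rows.length; i += BATCH_SIZE) {
  batches.push(rows.slice(i, i + BATCH_SIZE));
}

// Fire groups of 5 batches at a time, waiting for the slowest batch
// in each group before starting the next group.
for (let i = 0; i < batches.length; i += CONCURRENCY) {
  const group = batches.slice(i, i + CONCURRENCY);
  await Promise.all(group.map((batch) => upsertBatch(batch)));
}

return { batches: batches.length };
```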

Hope this helps