Inserting document into vector via resource query results in timeout

Hi @kent,

Question:
In this case, would it be best to create a Vector document for each PDF? Or upsert all PDFs into one Vector document?

My case:
I have about 2000 rows with 40 columns (containing employee lifecycle info such as onboarding and offboarding).

  1. I tried to use the workflow Upsert document action, but it kept timing out.
  2. I tried uploading manually, but got a "failed to fetch" error in the web UI.

Thank you @kent, we were actually looking at this as well, so perfect timing.

Unfortunately, we did some tests this afternoon and we're facing the same timeout issue as @Tugi_Baasansuren when inserting a document into a vector.

Thanks!
Jerome

Hey @Jeje

I was able to insert each row as its own document. I used a Retool workflow to loop over each row:
2000 rows into 2000 documents.

According to my tests so far, the query run time is definitely shorter and the results seem a bit more accurate, but I can't confirm that yet.
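A minimal sketch of the row-per-document idea, in case it helps anyone landing here. The `rows_to_documents` helper, the `name`/`content` fields, and the sample columns are hypothetical and only show the shape of the loop; the actual insert would still be done by the Vector action inside the workflow.

```python
import json

def rows_to_documents(rows, name_key="id"):
    """Turn each table row into its own small document (name + text content)."""
    documents = []
    for index, row in enumerate(rows):
        # Fall back to the row's position when the (hypothetical) key is missing.
        name = f"row-{row.get(name_key, index)}"
        # Serialize the whole row so every column travels with the document.
        content = json.dumps(row, sort_keys=True, default=str)
        documents.append({"name": name, "content": content})
    return documents

# Made-up sample rows standing in for the employee lifecycle table.
rows = [
    {"id": 1, "employee": "A. Example", "stage": "onboarding"},
    {"id": 2, "employee": "B. Example", "stage": "offboarding"},
]
docs = rows_to_documents(rows)
```

Each entry in `docs` is small enough to be processed on its own, which is the point of splitting: one slow or failed row does not hold up the rest.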

Hey @Tugi_Baasansuren and @Jeje! I broke out these posts into a separate topic just to keep everything as organized and searchable as possible.

As it sounds like you've discovered, processing larger documents can take a bit of time and potentially cause queries to time out. It's recommended that you either configure a longer timeout via the query settings or break your data up into discrete documents.

One other thing to keep in mind - we enforce a rate limit on the OpenAI embeddings API that is used to create these vectors. I believe it's 1000 calls per hour, but I can double check that.
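As a rough sketch of what such a limit would mean for the 2000-row case above, here is the arithmetic in Python. It assumes the unconfirmed figure of 1000 calls per hour and, hypothetically, one embeddings call per document; larger documents that get chunked may need more calls.

```python
import math

def min_seconds_between_calls(calls_per_hour):
    """Even spacing between calls that stays within an hourly limit."""
    return 3600 / calls_per_hour

def hours_needed(total_calls, calls_per_hour):
    """Whole hours needed to make total_calls without exceeding the limit."""
    return math.ceil(total_calls / calls_per_hour)

# Assumed values: 1000 calls/hour (unconfirmed) and 2000 single-call documents.
spacing = min_seconds_between_calls(1000)
duration = hours_needed(2000, 1000)
```

Under those assumptions, `spacing` comes out at about 3.6 seconds between inserts and `duration` at 2 hours for the full set, so a loop over every row is better run in paced batches than all at once.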

Thank you @Tugi_Baasansuren @Darren

Fantastic, we followed that approach and got the success message after inserting a single row into the vector as a test. Now the next question is: how do you successfully list documents inserted that way?

When we go to the vector itself we can see the document fine, but when we refresh the app and try to query the document for that vector, this is the error message we get:

  • message:"Missing values for parameters in: source"

Sorry - I am not getting it. What do you mean by this?

Are you using Import Workflow, which inserts the PDF into the app?
And do you want to get a Success/Failure notification in the app on how the workflow went?

Hi @Tugi_Baasansuren thank you for your help, much appreciated!

  1. Our understanding is that we first need to use Retool AI to convert our PDF to text, so we convert the document and load the text into a table in the Retool Postgres database.

  2. In the workflow, we load that text from the table and run the query to insert it into the vector. We get a confirmation message that the insert worked.

  3. We look at our vector and can see that the document was uploaded successfully.

  4. We go back to our main app and use the "List Document" feature to give external users an interface for seeing which files have been uploaded, or deleting them if needed. This is where we get an error message. (The odd thing is that the error message has changed since this morning; it now says "Missing values for parameters in: vectorActionDynamicNamespace".)

Thank you!

Interesting - thanks for the detailed writeup, @Jeje. :thinking:

It looks like you've done everything correctly and I would expect the List documents action to work as expected. Everything behaves normally in my own testing. When you view the uploaded document via the UI, do you see that its content has been chunked correctly?

No problem @Darren

I think so yes, I'll just put the content of that vector (when looking at the content manually) below and redact the text part as I've used an internal doc. Do you see something wrong in the document structure?

[{"id":1,"text":"redacted","client_name":"test","file_name":"test.pdf"}]

Was the Vector name different before? I see it is called "Test" now.

I don't know, but I'd try changing the name of the Vector to something more specific. I'm saying this because I am assuming that "Test" might already be a defined namespace or ID on the Retool side. :man_shrugging:

We deleted and created a few vectors, but just in case I renamed the vector to something else... and the old error from this morning is back! :flushed:

Ah bummer.
You tried the Vector document insert by workflow and the UI upload, and both give errors, huh?

Curious to know how you converted the PDF into text. Could the issue be there?

"Missing values for parameters in: source": I read this as some parameter being required but either missing or incorrectly formatted.

I guess to test this:

  1. You could convert the PDF into Text from your computer with Acrobat or your available tool.
  2. Upload the converted text file into your Vector and see if the Vector action can list it.

Yes bummer indeed!

Yes we tried both workflow and the UI upload too.

Yes, that's what we thought too, since the Retool internal functions doing the conversion could have been the problem. However, we ran a simple manual test by adding text directly to our Retool Database, and the result is exactly the same. All the steps we followed are attached below for the team to review:

Add a simple test row with dummy content:

Load that via the workflow. We tried both upsert and insert, and also tried multiple document names just in case, including the one stored in the database:

Full view of the chunks this time:

Same error when loading the document:

Maybe something was added recently and we need a way to add a source parameter somewhere :thinking:

That is odd. Not sure what exactly is happening on your end.

I just tested:

  1. Manually copied and pasted exactly what you have in your content chunks
  2. Uploaded it to a Vector doc

[{"id":1, "text":"This is my test text" , "client_name": "test" , "file_name":"uploaded_file.pdf"}]

I can: :man_shrugging:

  • List the document
  • Get document chunks
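For what it's worth, the two samples pasted in this thread can be compared for shape with a few lines of Python. This is only a sanity check that both parse as JSON and carry the same keys; it makes no claim about what Retool itself requires, and the variable and function names are made up for the example.

```python
import json

# The two content samples quoted earlier in this thread.
sample_erroring = '[{"id":1,"text":"redacted","client_name":"test","file_name":"test.pdf"}]'
sample_working = '[{"id":1, "text":"This is my test text" , "client_name": "test" , "file_name":"uploaded_file.pdf"}]'

def key_sets(raw):
    """Parse a JSON array of objects and return the sorted keys of each entry."""
    return [sorted(entry.keys()) for entry in json.loads(raw)]

# True when both samples have the same number of entries with the same keys.
same_shape = key_sets(sample_erroring) == key_sets(sample_working)
```

If `same_shape` is true, the document structure is unlikely to be what separates the two setups.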

Really strange, thanks so much for checking on your end @Tugi_Baasansuren

Is there something we could do as a next step, @Darren, to resolve this issue? Maybe our org has a setting that is interfering with our vectors?

Super weird, deleting the app and creating a new one to redo this from zero fixed the issue! I can now list documents! Thanks again @Darren @Tugi_Baasansuren
