- My goal: Upload documents to the vector store to help AI Chat understand my database schema.
- Issue: Uploaded documents are converted to only one chunk and therefore not very useful for AI.
- Steps I've taken to troubleshoot: Different document formats, recreating vector store.
- Additional info: Cloud
I feel like I'm missing something. Shouldn't uploaded documents be chunked automatically?
Hello Steve, and welcome to Retool Community!
The Retool-managed Vectors DB does indeed chunk the text. Although this happens automatically, it depends on the document's formatting (plain text documents and/or clearly structured articles with headings and paragraphs work best).
Retool AI models use embeddings to capture the meaning of, and the connections between, blocks (chunks) of text.
In simple terms, the model decides which parts of the text can be grouped together (e.g., into paragraphs), each chunk is embedded into the DB separately, and you can then use those chunks later to improve your prompt.
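To illustrate the general idea (this is just a sketch, not Retool's actual implementation; `embed` is a placeholder for whatever embedding model the platform calls behind the scenes):

```typescript
// Illustrative sketch of paragraph-based chunking and per-chunk embedding.
// This is NOT Retool's actual implementation; `embed` is a placeholder for
// whatever embedding model the platform actually calls.
async function embed(text: string): Promise<number[]> {
  // Placeholder: a real implementation would call an embedding model here.
  return [text.length % 100, 0, 0];
}

async function chunkAndEmbed(doc: string): Promise<{ text: string; vector: number[] }[]> {
  // Split on blank lines so each heading/paragraph group becomes its own chunk.
  const chunks = doc
    .split(/\n\s*\n/)
    .map((chunk) => chunk.trim())
    .filter((chunk) => chunk.length > 0);

  // Embed each chunk separately so it can be retrieved on its own later.
  return Promise.all(chunks.map(async (text) => ({ text, vector: await embed(text) })));
}
```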
As you can see in this screenshot, I uploaded a simple text document and it was inserted as multiple chunks (I used the "Get Document Chunks" query to return the document's chunks).
Give this a try and let me know!
Hi Infinitybht, and thank you for your response.
I uploaded documents in a number of different formats (including plain .txt) and also used "Get Document Chunks" to check whether the data was actually being chunked. Unfortunately, it always returned a single chunk.
My documents contained either:
- Blocks of SQL, each with a short explanation above the query and line breaks between sections.
- SQL table definitions, including the table name and its fields.
I eventually pivoted and used the UpsertDoc function to create a new doc for each block, which was a painful endeavor, but my LLM is now working well with the granular chunks.
Any idea why this data would not chunk automatically? Here is a small example of the content:
Table: account_owners
- id: bigint unsigned
- client_id: bigint unsigned
- user_id: bigint unsigned
Table: accounts
- id: bigint unsigned
- cmt_id: bigint unsigned
- name: varchar(255)
- created_at: timestamp
- updated_at: timestamp
- deleted_at: timestamp
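In case it helps anyone hitting the same thing, the splitting I ended up doing was roughly the following (a minimal sketch only; `upsertDoc` is a stand-in helper, not Retool's literal API):

```typescript
// Rough sketch of splitting the schema text into one document per table.
// `upsertDoc` is a stand-in helper, not Retool's actual Upsert Document API.
async function upsertDoc(name: string, body: string): Promise<void> {
  // Placeholder: in practice this would trigger the vector store's upsert query.
  console.log(`Upserting "${name}" (${body.length} chars)`);
}

async function upsertSchemaByTable(schemaText: string): Promise<void> {
  // Each "Table: <name>" heading starts a new block, so split on those boundaries.
  const blocks = schemaText
    .split(/\n(?=Table: )/)
    .map((b) => b.trim())
    .filter((b) => b.length > 0);

  for (const block of blocks) {
    // Use the table name from the first line as the document name.
    const name = block.split("\n")[0].replace("Table: ", "").trim();
    await upsertDoc(name, block);
  }
}
```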
Welcome to the community, @Steve7771! Thanks for reaching out. And thanks for providing some additional context, @Infinitybht. 
This stumped me for a bit, as the behavior you're describing certainly didn't align with the chunking algorithm as I last saw it. After talking to the team, though, I have some additional context. We updated the chunking algorithm relatively recently to take advantage of changes to the specific embedding models that we use. Previously, chunks were limited to 1-2k total characters, but they are now capped at roughly 30 KB, which is in line with the maximum allowed by our models. A 2-3 MB file, for example, will be chunked into ~80 pieces.
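For a rough sense of that math (assuming a flat ~30 KB cap; real chunk boundaries also follow the document's structure, so this is only an estimate):

```typescript
// Back-of-the-envelope estimate only; actual chunking also respects document
// structure, so real counts will vary.
const CHUNK_CAP_BYTES = 30 * 1024;

function estimateChunkCount(fileSizeBytes: number): number {
  return Math.ceil(fileSizeBytes / CHUNK_CAP_BYTES);
}

// A 2.5 MB file works out to roughly 86 chunks, in the ~80-piece ballpark.
console.log(estimateChunkCount(2.5 * 1024 * 1024));
```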
It's important to note that these changes shouldn't have any effect on the efficacy of your semantic searches! We did, however, find and fix an unrelated issue last week that specifically impacted documents with a small number of chunks. If you noticed any degradation in your RAG queries, it's possible that that issue was the actual culprit.
If you can, I'd recommend re-vectorizing your schema documents to verify that everything is working as expected. 