Long delay in AI response when using vectors

I've been using Retool for a few weeks, especially the AI actions. I've noticed that when I use a vector as a source of information, the response from the Chat AI component is quite slow (around 10 seconds).

I use the streaming option so the response is generated as it is being processed, but it still takes a long time before the first character is generated and starts being typed.
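To be concrete about what I mean by the delay: here is a minimal sketch of how I think about "time to first character", using the OpenAI Node SDK directly. This is just an illustration, not how Retool's AI actions are actually invoked, and the model name is only an example:

```ts
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Measure how long a streamed response takes to yield its first visible character.
async function timeToFirstToken(question: string): Promise<number> {
  const start = Date.now();
  const stream = await client.chat.completions.create({
    model: "gpt-4o-mini", // example model, not necessarily what Retool uses
    messages: [{ role: "user", content: question }],
    stream: true,
  });
  for await (const chunk of stream) {
    if (chunk.choices[0]?.delta?.content) {
      return Date.now() - start; // first character arrived
    }
  }
  return Date.now() - start; // stream ended without content
}

console.log(`TTFT: ${await timeToFirstToken("What is a vector store?")} ms`);
```

Without vectors that number is small; with them it is the ~10 seconds I described.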

When I don't use vectors, the response is much faster, around 1-2 seconds.

Is this normal? Has anyone managed to fix it somehow? Are there plans for improvements?


Hello @exxscher,

Apologies for the slowness; we are definitely looking to improve the performance of our AI tools.

I would imagine that there are some processes going on behind the scenes for vectors that take extra time to get the data to the LLM for output generation.
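As a rough mental model (this is not our actual implementation, and every name below is made up for illustration), a vector-backed query typically has to do something like this before any token can stream back:

```ts
// Hypothetical retrieval-augmented-generation steps; none of these names are Retool's real internals.
interface VectorStore {
  embed(text: string): Promise<number[]>;                    // round trip 1: embed the question
  search(vector: number[], topK: number): Promise<string[]>; // round trip 2: similarity search
}

async function answerWithVector(
  store: VectorStore,
  streamLlm: (prompt: string) => AsyncIterable<string>,
  question: string,
): Promise<AsyncIterable<string>> {
  const queryVector = await store.embed(question);   // extra latency before the LLM is even called
  const chunks = await store.search(queryVector, 5); // fetch the most relevant chunks

  // The retrieved chunks also inflate the prompt, so the model spends longer
  // in prefill before emitting its first token.
  const prompt = `Use this context:\n${chunks.join("\n---\n")}\n\nQuestion: ${question}`;
  return streamLlm(prompt);
}
```

If those steps run serially, they would add a fairly fixed amount of time on top of the normal response.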

How much data is in the vectors you are using?

@Jack_T, the amount of data in my vector is minimal, about 2 pages in a PDF file. The slow response occurs regardless of the vector's size; I tested with a much larger one, and in all cases the response takes around 10 to 12 seconds.

I also tested on a new application, and with different AI models.

I have been testing on different days for about 2 weeks now, to see whether it was a temporary problem.

Thank you for the extensive testing!

That is odd; we definitely do not want users to have to wait that long. Are you self-hosted or on the cloud?

What temperature do you have the query set to? Could you share a screenshot of your query setup?

I just ran some tests with a 13 MB PDF vector on the history of France, and it seemed to work within a second or two using GPT-4o-mini and a temperature of 1.

This was my setup. If you have any steps I can follow to reproduce the slow response, that would be very helpful for sharing with our engineering team so we can fix this :saluting_face:

Or if you could DM me a video of the behavior as well that might provide more clues to me and the team!

Hi, sorry for the delay, it was a busy week.

I just recorded a video, where I did the following steps:

  1. I generated a web application.
  2. I used the Chat AI component.
  3. I configured the query with a vector.

Video:

Test:

  1. I asked a question in the Chat AI, and the answer took 13 seconds to start generating in the interface.
  2. I removed the vector from the configuration, and the answer started generating after 2 seconds (a rough way to script this comparison is sketched just below).
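In case it is useful, here is a rough harness for the same A/B comparison outside of Retool, again using the OpenAI Node SDK directly. The `fakeContext` string is just a stand-in for whatever the vector lookup would inject; this does not touch Retool's actual vector path:

```ts
import OpenAI from "openai";

const client = new OpenAI();

// Time until the first streamed character arrives for a given prompt.
async function firstTokenMs(content: string): Promise<number> {
  const start = Date.now();
  const stream = await client.chat.completions.create({
    model: "gpt-4o-mini", // example model
    messages: [{ role: "user", content }],
    stream: true,
  });
  for await (const chunk of stream) {
    if (chunk.choices[0]?.delta?.content) break; // first visible character
  }
  return Date.now() - start;
}

const question = "Summarize the key points of the document.";
const fakeContext = "...a couple of pages of text, standing in for retrieved chunks...";

console.log("without context:", await firstTokenMs(question), "ms");
console.log("with context:   ", await firstTokenMs(`${fakeContext}\n\n${question}`), "ms");
```

Note that this only captures the larger-prompt effect; the extra retrieval round trips would add on top of it.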

Observations:

  • When using larger or smaller vectors, the response is just as slow.
  • When not using vectors, the response is usually quite fast, even with much more complex questions.

Greetings!

Hello @exxscher!

No worries about the delay. Thank you for providing the video and detailed documentation; it helps us a lot.

I filed a ticket for our engineering team to look into vector performance.

I was told that, unfortunately, this time delay is currently expected for vectors :smiling_face_with_tear: We are looking to improve it as soon as possible; it seems to be a bottleneck in how our app interfaces with vectors, regardless of their size.

For now, if your main concern is response time, I would recommend avoiding vectors :melting_face:. Use vectors when you are OK with a slower response and need that response to rely on specific data stored in a vector that a user/LLM would not otherwise have access to.