I am exploring Retool AI on an on-premise Retool deployment.
I create a Vector from a URL, like:
Then insert an https URL, e.g. https://thenodeinstitute.org and click 'Fetch':
I then get this:
...and after a few seconds, this:
Any clue why this happens?
I have been using the new Retool AI and would like to add a vector DB of data from a URL. The URL is added fine, but the crawl never finishes. The machine's firewall doesn't detect any HTTPS traffic either.
I've looked at the Docker Compose logs and can see the job start, but there are no other log entries or errors.
Am I missing anything? Retool has external access to the internet etc. I've tried different URLs.
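For anyone wanting to double-check the "external access" part, here is a minimal sketch of an outbound HTTPS check. It assumes Python 3 is available on the machine (or in the container) where Retool runs; the function name `check_url` and its `opener` parameter are hypothetical and only there for illustration. It is not part of Retool and does not reproduce what the crawler does internally.

```python
import urllib.error
import urllib.request


def check_url(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return (ok, detail) describing whether `url` could be fetched.

    `opener` is a hypothetical hook (defaults to urllib's urlopen) so the
    function can be exercised without real network access.
    """
    request = urllib.request.Request(
        url, headers={"User-Agent": "connectivity-check"}
    )
    try:
        with opener(request, timeout=timeout) as response:
            # The server was reached and answered with a success status.
            return True, f"HTTP {response.status}"
    except urllib.error.HTTPError as exc:
        # The server was reached but answered with an error status.
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, refused connection, timeout, TLS problem, etc.
        return False, f"connection failed: {exc}"
```

Calling `check_url("https://thenodeinstitute.org")` from the same host or container as Retool returns a pair such as `(True, "HTTP 200")` when the request goes through. A "connection failed" result would point at networking (proxy, DNS, firewall) instead of the crawler itself.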
@Glen_Baars do you know if Retool Workflow is needed for the crawler to work?
I'm not sure. I have Workflows installed and working.
The only difference is that mine never moves from "We're crawling the URL's"
Maybe @francis12 could help?
Hello @Glen_Baars and @theidleguest,
We're currently aware of an issue with the on-premise URL crawler that causes vector creation (URL-based only) to fail. We are working on a fix and hope to push a new on-premise version as soon as possible! I will keep this thread updated.
This error also seems to exist with the cloud version of the URL crawler. Any updates on the fix?
@jpmin Just checked cloud and it still seems to be working. Was there a particular URL that was giving you issues?
@joeBumbaca Thanks for the reply! It is this one:
(The issue occurs with all URLs from the S3 bucket.)
I let users upload PDFs to an S3 bucket and then want to add those URLs to a vector.
Hey @joeBumbaca @francis12 the issue is still present in the on-premise version. Any updates?
@theidleguest No updates yet, will ping this thread as soon as we have any. Thanks
Hello @joeBumbaca, there's an error being thrown while trying to fetch the URL