I’m using GPT-4o to extract text from product labels (the images are uploaded to Retool Storage), but I’m wondering which LLM is fastest for this task. Does anyone have experience with this?
I read online that an OCR model might be better suited for this task and perhaps faster. But implementing AI is so much easier with Retool.
Any suggestions? Right now it takes more than 6 seconds to process one image.
At the moment I’m using Retool Storage, which might also cause delays. I would switch to twicpics.com if it helped, but unfortunately it is only faster once the image has been cached after the first load.
By the way, TwicPics is very easy to set up and has a free tier of up to 3 GB. I recommend it if you are using Retool Storage: an instant CDN with zero effort.
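One way to tell whether the 6 seconds come from fetching the image or from the model call is to time each stage separately. The sketch below is only an illustration: `time_stages`, `fetch_image`, and `extract_text` are hypothetical names, and the two stage functions are stand-ins that sleep instead of calling Retool Storage or an LLM.

```python
import time


def time_stages(stages, initial=None):
    """Run callables in order, feeding each the previous result.

    Returns (final_result, {stage_name: seconds}).
    """
    timings = {}
    value = initial
    for name, fn in stages:
        start = time.perf_counter()
        value = fn(value)
        timings[name] = time.perf_counter() - start
    return value, timings


# Hypothetical stand-ins: replace with the real image download and model call.
def fetch_image(url):
    time.sleep(0.05)  # pretend network latency
    return b"fake-image-bytes"


def extract_text(image_bytes):
    time.sleep(0.10)  # pretend model latency
    return "NET WT 500 g"


if __name__ == "__main__":
    text, timings = time_stages(
        [("fetch", fetch_image), ("extract", extract_text)],
        initial="https://example.com/label.jpg",
    )
    for name, seconds in timings.items():
        print(f"{name}: {seconds:.3f}s")
```

If the fetch stage turns out to be a small share of the total, a CDN in front of the storage is unlikely to help much, and the model choice matters more.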
Hey @Steven_W! We've had great success with AWS Rekognition.
It's fast and accurate for scene text, and it compares especially well with the other OCR and scene-text models I've tried when the text takes up only a small part of the input image.
I found that CLIP4STR had state-of-the-art accuracy as of a month or two ago, but it struggled with small text in images beyond 2 MP; cropping helped enormously.
Did you get AWS Rekognition working in Retool? It’s not available as a resource or LLM option, right? Did you implement it from scratch, or did you use it for another project outside Retool?
No problem, and yeah, it will likely be a call to the Rekognition JS SDK or HTTP API from within Retool, though I haven't tried boto3 in Retool yet. We have a simple Lambda using Python and boto3.
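A boto3 Lambda of the kind described could look roughly like the sketch below. It is not the poster's actual function: the event shape (`image_base64`), the helper names, and the 80% confidence cutoff are assumptions. The `detect_text` call and the `TextDetections` fields (`DetectedText`, `Type`, `Confidence`) are part of the Rekognition API.

```python
import base64


def extract_lines(response, min_confidence=80.0):
    """Keep only LINE-level detections above the confidence cutoff."""
    return [
        d["DetectedText"]
        for d in response.get("TextDetections", [])
        if d.get("Type") == "LINE" and d.get("Confidence", 0.0) >= min_confidence
    ]


def detect_label_text(image_bytes, client=None, min_confidence=80.0):
    """Send raw image bytes to Rekognition DetectText and return the text lines."""
    if client is None:
        import boto3  # imported lazily so the helper can be tested with a stub client

        client = boto3.client("rekognition")
    response = client.detect_text(Image={"Bytes": image_bytes})
    return extract_lines(response, min_confidence)


def lambda_handler(event, context):
    # Assumed event shape: {"image_base64": "<base64-encoded JPEG or PNG>"}
    image_bytes = base64.b64decode(event["image_base64"])
    return {"lines": detect_label_text(image_bytes)}
```

From Retool, a function like this could be reached through whatever HTTP endpoint fronts the Lambda, with the app sending the base64-encoded image in the request body.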
Try https://freeparser.net. It’s a free website that combines OCR and AI to extract structured data from PDFs, images, and other document types. It works well for receipts, invoices, and similar documents, and you get some free credits to test it out without logging in.
I’ve used it for extracting text from scanned documents, and it does a pretty solid job of identifying key information. You just upload your file and it processes everything for you, including auto-detecting the data fields. The batch upload and extraction feature is my favorite: you can upload multiple files and get the results in the same data schema.