The new Agents functionality seems promising, and I have a use case in mind: processing documents.
I came across an example of document processing from an agency that works with tools like Retool, Appsmith, etc. A short video of how it works is shown here. I'm not sure whether it was built with Retool, but it seems possible to me, as the components look similar.
Would it be possible to create something like this in Retool, and how would I need to use the bounding box together with a table and some AI?
My idea is to use an AI query or Agent to extract data from an uploaded PDF file. But as the extracted data might be wrong or come from the wrong place, I'm looking for a solution to extract data from specific areas. Any ideas how something like this could be achieved?
That's right, the video you linked is indeed a Retool application!
We had to get pretty creative with the existing bounding box component, using a bit of JS and CSS magic to make everything work just right. But the goal you're describing should be feasible!
If the documents you are processing all share the same format, you can tackle this more efficiently by mapping the locations of the fields in a template and sending those locations along with the document for OCR processing, e.g. using a third-party API such as ABBYY.
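To make the template idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `FieldRegion` type, the field names, the coordinates, and the shape of the resulting payload are all hypothetical, and they do not reflect the actual request format of ABBYY or any other OCR API. The idea is simply to store each field's box as fractions of the page size, so that the same template works at any scan resolution, and to convert to pixels just before sending.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldRegion:
    """One field in a document template (hypothetical structure).

    Coordinates are fractions of the page size (0.0 to 1.0), measured
    from the top-left corner, so they are independent of resolution.
    """
    name: str
    page: int
    x: float
    y: float
    width: float
    height: float


def to_pixels(region: FieldRegion, page_width_px: int, page_height_px: int) -> dict:
    """Convert a normalized region into a pixel rectangle for one page."""
    return {
        "field": region.name,
        "page": region.page,
        "left": round(region.x * page_width_px),
        "top": round(region.y * page_height_px),
        "right": round((region.x + region.width) * page_width_px),
        "bottom": round((region.y + region.height) * page_height_px),
    }


# Example template for an invoice layout (made-up fields and positions).
INVOICE_TEMPLATE = [
    FieldRegion("invoice_number", page=1, x=0.70, y=0.05, width=0.25, height=0.04),
    FieldRegion("total_amount", page=1, x=0.70, y=0.85, width=0.25, height=0.05),
]


def build_region_payload(template, page_width_px: int, page_height_px: int) -> list:
    """Build the list of pixel regions to send along with the document."""
    return [to_pixels(region, page_width_px, page_height_px) for region in template]


if __name__ == "__main__":
    for rect in build_region_payload(INVOICE_TEMPLATE, 1000, 2000):
        print(rect)
```

The list returned by `build_region_payload` would then need to be translated into whatever region format your chosen OCR service expects.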
Alternatively, you can use Agents to achieve this, both for documents that share a format and for documents in different formats. A great feature of Agents is that you can build in evaluations and processes for human intervention, e.g. in your case, when the extracted data is wrong.
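As a rough illustration of the human-intervention step, the sketch below splits extracted fields into ones that can be accepted automatically and ones that should go to a person for review. This is not the Retool Agents API. It assumes, hypothetically, that each extracted field comes back as a dict with `field`, `value`, and a `confidence` score between 0 and 1, and the threshold and validation patterns are made-up values you would tune for your own documents.

```python
import re

# Hypothetical cutoff: anything below this goes to a human reviewer.
REVIEW_THRESHOLD = 0.85

# Hypothetical per-field format checks; fields without an entry are
# judged on confidence alone.
VALIDATORS = {
    "invoice_number": re.compile(r"[A-Z0-9-]{4,}"),
    "total_amount": re.compile(r"\d+(\.\d{2})?"),
}


def needs_review(field: str, value: str, confidence: float) -> bool:
    """Return True if a person should check this extracted value."""
    if confidence < REVIEW_THRESHOLD:
        return True
    pattern = VALIDATORS.get(field)
    # Flag values that do not match the expected format for the field.
    return pattern is not None and pattern.fullmatch(value) is None


def split_for_review(extracted: list) -> tuple:
    """Split extracted items into (accepted, flagged_for_review)."""
    accepted, flagged = [], []
    for item in extracted:
        if needs_review(item["field"], item["value"], item["confidence"]):
            flagged.append(item)
        else:
            accepted.append(item)
    return accepted, flagged


if __name__ == "__main__":
    sample = [
        {"field": "invoice_number", "value": "INV-1001", "confidence": 0.97},
        {"field": "total_amount", "value": "12,40", "confidence": 0.95},
    ]
    accepted, flagged = split_for_review(sample)
    print("accepted:", accepted)
    print("needs review:", flagged)
```

In an app, the flagged items could be shown in a table next to the bounding box view so that a reviewer can correct the value against the highlighted area of the document.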