Bounding box use

The new Agents functionality seems promising, and I have a use case in mind: processing documents.
I've stumbled upon an example of document processing from an agency that works with tools like Retool, Appsmith, etc. A short video of how it works is linked below. I'm not sure whether it was built with Retool, but it seems possible to me, as the components look similar.

The video is this one: https://cdn.prod.website-files.com/644b7d98dd3b6a02da39c63b%2F673cde6af77731bbdba2a4f2_TemplateEditorCompressed-transcode.mp4
(Credits to https://www.sixthgeneration.io/)

Would it be possible to create something like this in Retool, and how would I need to use the bounding box together with a table and some AI?

My idea is to use an AI query or Agent to extract data from an uploaded PDF file. But as the extracted data might be wrong or come from the wrong place, I'm looking for a way to extract data from specific areas. Any ideas how this could be achieved?

@avr, I wasn't aware that you are on this forum. I just saw your other post, Convert documents to structured data with Retool Agents.

Did you build the other document processing tool also in Retool?

Hi @mbruijnpff :wave:

That's right, the video you linked is indeed a Retool application!

We had to get pretty creative with the existing bounding box component, using a bit of JS and CSS magic to make everything work just right. But the goal you're describing should be feasible!

If you're interested in our lessons learned using the bounding box component, you can read more about it in this post: Digitising comic books with bounding boxes.

Good luck!

Hi @mbruijnpff,

If the documents you are processing all share the same format, you can tackle this more efficiently by mapping the location of each field in a template and sending those locations along with the document for OCR processing, e.g. using a third-party API such as ABBYY.
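
To make the template idea more concrete, here is a minimal sketch in plain Python. It assumes the OCR step returns words with normalized bounding boxes (0 to 1, origin at the top left). The `Box`, `Word`, and `extract_fields` names, the template, and the sample data are all made up for illustration and do not reflect ABBYY's or Retool's actual response format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Box:
    """Rectangle in normalized page coordinates (0..1), origin at top-left."""
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


@dataclass(frozen=True)
class Word:
    """One OCR token with its bounding box (hypothetical shape)."""
    text: str
    box: Box


def extract_fields(template: Dict[str, Box], words: List[Word]) -> Dict[str, str]:
    """For each template region, join the words whose centre lies inside it."""
    result = {}
    for field, region in template.items():
        hits = [
            w for w in words
            if region.contains(w.box.x + w.box.width / 2,
                               w.box.y + w.box.height / 2)
        ]
        # Approximate reading order: top to bottom, then left to right.
        hits.sort(key=lambda w: (round(w.box.y, 2), w.box.x))
        result[field] = " ".join(w.text for w in hits)
    return result


# Made-up template for one invoice layout, plus made-up OCR output.
INVOICE_TEMPLATE = {
    "invoice_number": Box(0.70, 0.05, 0.25, 0.05),
    "total": Box(0.70, 0.85, 0.25, 0.05),
}
SAMPLE_WORDS = [
    Word("INV-001", Box(0.72, 0.06, 0.10, 0.02)),
    Word("Total:", Box(0.55, 0.86, 0.08, 0.02)),
    Word("149.00", Box(0.75, 0.86, 0.08, 0.02)),
    Word("EUR", Box(0.85, 0.86, 0.04, 0.02)),
]

fields = extract_fields(INVOICE_TEMPLATE, SAMPLE_WORDS)
```

Because the regions are stored as fractions of the page rather than pixels, the same template should keep working if the documents are scanned at different resolutions, as long as the layout itself stays the same.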

Alternatively, you can use Agents to achieve this, both for documents that share a format and for documents in different formats. A great feature of Agents is that you can build in evaluations and processes for human intervention, e.g. in your case, when the extracted data is wrong.
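
As a rough illustration of the human intervention idea, the sketch below flags extracted fields that are empty or fail a simple format check, so that only those need a person to look at them. This is plain Python, not the Retool Agents evaluation feature itself, and the field names and rules in `CHECKS` are hypothetical.

```python
import re
from typing import Callable, Dict, List

# Hypothetical per-field checks; real rules depend on your documents.
CHECKS: Dict[str, Callable[[str], bool]] = {
    "invoice_number": lambda v: re.fullmatch(r"INV-\d+", v) is not None,
    "total": lambda v: re.fullmatch(r"\d+\.\d{2}", v) is not None,
}


def fields_needing_review(extracted: Dict[str, str]) -> List[str]:
    """Return the names of fields that are missing or fail their check."""
    flagged = []
    for field, check in CHECKS.items():
        value = extracted.get(field, "").strip()
        if not value or not check(value):
            flagged.append(field)
    return flagged
```

Anything returned by `fields_needing_review` could then be shown to a reviewer next to the source document, while the remaining fields pass through automatically.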

Here is a demo that shows human intervention in a Retool Agent - Expense Management Agent Demo
