OpenAI Vision - Support for Multiple Images and URLs vs. Base64

Hi, the 'Generate text from image' action in a Retool AI workflow block has 2 limitations:

  1. The image must be provided as base64 rather than a URL
  2. Only one image may be provided

Neither of these two constraints apply to Open AI's underlying vision platform, in fact, per their documentation, providing image URLs is preferred.

It would be nice to relax these constraints and allow a more native pass-through to the Open AI platform. This could also include adding support for the recommended resolution flag, which allows the developer to specify low resolution (cheaper) vision processing.

1 Like

Hello @fenix!

Thank you for bringing this to our attention. I agree we should improve the AI workflow block for a more native pass-through, as these are not limitations on the Open AI platform.

I also like the idea for a resolution flag for specifying image quality to give further control over vision processing.

I can make a feature request for both of these and keep you updated on any news I hear!