Cannot pass multiple images to GPT-4-Vision

  • Goal: I'm trying to get GPT-4-Vision to evaluate multiple images and select the best one.

  • Steps: I'm using the AI Action "Generate text from image". I can get it working with a single base64 string, but I cannot pass multiple base64 strings in the same field, and there is no option to add a second field (which is what would conform with the OpenAI docs; see the example below).

When I pass multiple images in, I get the error: Image content must be a Base64 encoded string.
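
For reference, the OpenAI docs show multiple images being passed as separate image_url entries inside a single message, so the request body looks roughly like this (the prompt text and base64 placeholders are just illustrative):

{
  model: "gpt-4-vision-preview",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Which of these images is best?" },
        { type: "image_url", image_url: { url: "data:image/jpeg;base64,<BASE64_IMAGE_1>" } },
        { type: "image_url", image_url: { url: "data:image/jpeg;base64,<BASE64_IMAGE_2>" } }
      ]
    }
  ]
}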

Hi @Cal

You need to call the AI Action multiple times, once for each image's data.

Here's a quick example:

Suppose you have a fileButton1 component you can use to select a bunch of images.

In the Button you want to use to trigger the AI Action, put this JavaScript code:

const results = [];
for (const img of fileButton1.value) {
  const imgdata = img.base64Data;
  // Trigger the AI Action once per image, passing the base64 data via additionalScope
  const text = await qAIVision.trigger({
    additionalScope: { imgdata }
  });
  results.push(text);
}
return results;

Here qAIVision is the name of the AI Action query configured as you described; the only change is that in the Image content field you put the parameter {{ imgdata }}.

Basically, the example above calls the AI Action once for each image and returns the generated text.
It's a very basic example; you can extend it according to your use case.

Hope this helps.

Thanks for your response!

I want to call the AI Action once for all the images, not once per image, because they need to be evaluated together so that the model can select one of them.

The AI Vision action generates text from an image.

If you want to call the action once, you have to pass a single image, which means building one image on the fly by compositing all of your images into it.
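
If you go that route, here's a very rough sketch of stitching the selected images side by side into one base64 string. It assumes the Canvas API is available wherever this code runs and that the images are JPEGs; adjust as needed:

function loadImage(base64) {
  // Turn a raw base64 string back into an HTMLImageElement
  return new Promise((resolve, reject) => {
    const img = new Image();
    img.onload = () => resolve(img);
    img.onerror = reject;
    img.src = 'data:image/jpeg;base64,' + base64;
  });
}

async function composeImages(base64Images) {
  const images = await Promise.all(base64Images.map(loadImage));

  // Canvas wide enough for all images side by side, tall enough for the tallest one
  const canvas = document.createElement('canvas');
  canvas.width = images.reduce((w, img) => w + img.width, 0);
  canvas.height = Math.max(...images.map((img) => img.height));

  const ctx = canvas.getContext('2d');
  let x = 0;
  for (const img of images) {
    ctx.drawImage(img, x, 0);
    x += img.width;
  }

  // Drop the "data:image/jpeg;base64," prefix to get a plain base64 string again
  return canvas.toDataURL('image/jpeg').split(',')[1];
}

You'd then pass the composed base64 string to the AI Action the same way as a single image.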

But if you want to make a selection, say picking one image among many according to the user's request, it's better to work with text, IMHO.

So, if I were you, I'd convert all the images to text as I showed before, then, with another AI Action, I'd instruct the model to choose the right "text", so I can pick up the image linked to that text.
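
Something like this, as a rough sketch: qAIChoose here is a hypothetical second, text-only AI Action whose prompt reads a {{ descriptions }} parameter and returns which description is the best one:

const descriptions = [];
for (const img of fileButton1.value) {
  const imgdata = img.base64Data;
  // Step 1: describe each image individually with the vision action
  const text = await qAIVision.trigger({ additionalScope: { imgdata } });
  descriptions.push(text);
}

// Step 2: ask the text-only AI Action to pick the best description
const choice = await qAIChoose.trigger({
  additionalScope: { descriptions: JSON.stringify(descriptions) }
});

// 'choice' can then be mapped back to the corresponding image in fileButton1.value
return choice;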

Hope this helps.