Stable Diffusion 3 API Image2Image Not Working

I'm trying to build a proof-of-concept web app in Retool that does text-to-image and image-to-image generation with Stable Diffusion 3.

I'm working from the Stability AI REST API documentation on their website: Stability AI - Developer Platform

I have been able to get text-to-image generation working. However, image-to-image generation is the part that I cannot get working.

To keep it concise, I am stuck with two main errors.

  1. When I upload the image to Retool, the data is stored as a base64 string (base64Data). The Stable Diffusion API expects binary input instead. When I pass the base64 string into the API's image field, the error output is:

{
  "errors": [
    "image: input not instance of Blob"
  ],
  "id": "000000000000000065e60b40b87fcedc",
  "name": "bad_request"
}

  2. According to the API documentation, the image input should be a string in binary format. However, when I convert the base64Data into binary format, I get a 400 status error stating that payloads cannot be larger than 10MiB in size.

=================================
string

The image to use as the starting point for the generation.

Supported formats:

  • jpeg
  • png
  • webp

Supported dimensions:

  • Every side must be at least 64 pixels

Important: This parameter is only valid for image-to-image requests.
==================================

I'm completely stuck on how to proceed with image-to-image generation with Stable Diffusion 3. Apologies if this is a simple problem, but my background is in hardware, not software. Any help is deeply appreciated. Thank you.

Hey @bluc - welcome to the community!

This sounds like an interesting use case. What is the size of the original image that you're uploading? The 10MiB limit is enforced by the Stable Diffusion 3 API endpoint, so there isn't anything that can be done on the Retool end if the original image is too large.
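One quick way to check the size question is to estimate how many bytes the base64 string decodes to and compare that against the 10MiB figure from the error message. The sketch below is only an illustration: `decodedByteLength` and `fitsInPayload` are hypothetical helper names, and it assumes the limit is 10 * 1024 * 1024 bytes. The limit applies to the whole payload, so the other request fields take up a little of that budget too.

```typescript
// Assumed limit, taken from the "10MiB" figure in the 400 error message.
const MAX_PAYLOAD_BYTES = 10 * 1024 * 1024;

// Hypothetical helper: number of bytes a base64 string decodes to.
function decodedByteLength(base64: string): number {
  // Drop an optional "data:image/png;base64," style prefix and any whitespace.
  const data = base64.replace(/^data:[^,]*,/, "").replace(/\s/g, "");
  // Each trailing "=" means one fewer byte in the final 3-byte group.
  const padding = data.endsWith("==") ? 2 : data.endsWith("=") ? 1 : 0;
  return Math.floor((data.length * 3) / 4) - padding;
}

// Hypothetical helper: true when the decoded image alone is under the limit.
function fitsInPayload(base64: string): boolean {
  return decodedByteLength(base64) < MAX_PAYLOAD_BYTES;
}
```

Base64 text is roughly a third larger than the bytes it encodes, so sending the encoded string rather than the decoded bytes can push an image over the limit that would otherwise fit.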

You should be able to more efficiently package the binary data by converting it to a true blob, as outlined here. Let me know if you have any issues implementing that!
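As a rough idea of what that conversion involves, here is a minimal TypeScript sketch, not necessarily the exact approach from the linked guide. It decodes the base64 string into raw bytes, wraps them in a Blob, and attaches the Blob to a multipart form. The `image` field name comes from the error message above. The `prompt`, `mode`, and `strength` fields, the `0.6` value, and the `input.png` file name are assumptions for illustration, so please check them against the Stability AI documentation. `base64ToBlob` and `buildImageToImageForm` are hypothetical helper names.

```typescript
// Decode a base64 string (without any "data:...;base64," prefix) into a Blob.
function base64ToBlob(base64: string, mimeType: string): Blob {
  const binary = atob(base64); // one character per decoded byte
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return new Blob([bytes], { type: mimeType });
}

// Build the multipart body. Field names other than "image" are assumptions.
function buildImageToImageForm(image: Blob, prompt: string): FormData {
  const form = new FormData();
  form.append("image", image, "input.png"); // hypothetical file name
  form.append("prompt", prompt);
  form.append("mode", "image-to-image"); // assumed field and value
  form.append("strength", "0.6"); // assumed field, illustrative value
  return form;
}
```

With these in place, the form would be built as `buildImageToImageForm(base64ToBlob(base64Data, "image/png"), "your prompt")` and sent as the request body, so the API receives an actual Blob rather than a string.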