I'm building a proof-of-concept web app in Retool that does text-to-image and image-to-image generation with Stable Diffusion 3.
I am working from the Stability AI REST API documentation on their website: Stability AI - Developer Platform
I have text-to-image generation working, but I cannot get image-to-image generation to work.
In short, I am stuck on two main errors.
- When I upload the image in Retool, the file data is exposed as base64Data, a base64-encoded string. The Stability AI API expects the image field to be a string in binary format. When I pass base64Data into the image field of the API request, the error response is:
{
  "errors": [
    "image: input not instance of Blob"
  ],
  "id": "000000000000000065e60b40b87fcedc",
  "name": "bad_request"
}
- According to the API documentation, the image input should be a string in binary format. However, when I convert base64Data to binary, I get a 400 status error stating that payloads cannot be larger than 10MiB.
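To make the size problem concrete, here is a minimal Python sketch (run outside Retool, so only an illustration, not my actual app code). It decodes a base64 string into raw bytes and compares that with a text string of 0 and 1 characters, which is one way a "binary" conversion could go wrong. The helper names and the placeholder data are hypothetical, and I am assuming the 10MiB limit applies to the bytes actually sent.

```python
import base64

# 10 MiB limit quoted in the 400 error (assumed to apply to the bytes sent).
MAX_PAYLOAD_BYTES = 10 * 1024 * 1024


def decode_base64_image(base64_data):
    """Decode a base64 string (optionally a data URL) into raw bytes."""
    # Strip a "data:image/png;base64," style prefix if one is present.
    if base64_data.startswith("data:") and "," in base64_data:
        base64_data = base64_data.split(",", 1)[1]
    return base64.b64decode(base64_data)


def as_bit_string(raw):
    """Render bytes as text made of '0'/'1' characters (8 characters per byte)."""
    return "".join(f"{byte:08b}" for byte in raw)


# Placeholder bytes standing in for a real image file.
sample_raw = bytes(range(256)) * 4
sample_b64 = base64.b64encode(sample_raw).decode("ascii")

decoded = decode_base64_image(sample_b64)
print("base64 text length:", len(sample_b64))
print("raw byte length:   ", len(decoded))
print("bit-string length: ", len(as_bit_string(decoded)))
print("under 10 MiB:      ", len(decoded) <= MAX_PAYLOAD_BYTES)
```

If my conversion is producing something like the bit string rather than the raw bytes, the payload would be eight times the size of the original file, which might explain the 10MiB error even for a modest image.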
Excerpt from the API documentation for the image parameter:
=================================
string
The image to use as the starting point for the generation.
Supported formats:
- jpeg
- png
- webp
Supported dimensions:
- Every side must be at least 64 pixels
Important: This parameter is only valid for image-to-image requests.
==================================
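One reading of the "input not instance of Blob" error together with this excerpt is that the endpoint wants the image as a file part of a multipart/form-data request rather than as a text field. The Python sketch below (standard library only) builds such a request without sending it. The endpoint URL, the field names other than image, the placeholder key, and the helper name build_multipart are my assumptions from reading the documentation and would need to be checked against it.

```python
import urllib.request
import uuid


def build_multipart(fields, image_bytes, filename="input.png", mime="image/png"):
    """Build a multipart/form-data body with text fields plus one 'image' file part."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text fields (prompt, mode, and so on).
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode("utf-8")
        )
    # The image goes in as raw bytes with a filename, not as base64 text.
    header = (
        f'--{boundary}\r\nContent-Disposition: form-data; name="image"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    parts.append(header + image_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


# Assumed values; verify against the Stability AI documentation.
API_URL = "https://api.stability.ai/v2beta/stable-image/generate/sd3"
API_KEY = "sk-PLACEHOLDER"  # hypothetical key, not a real credential

fields = {"prompt": "a lighthouse at dusk", "mode": "image-to-image", "strength": "0.6"}
image_bytes = b"\x89PNG placeholder bytes"  # stand-in for the decoded upload

body, content_type = build_multipart(fields, image_bytes)
request = urllib.request.Request(
    API_URL,
    data=body,
    method="POST",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Accept": "image/*",
        "Content-Type": content_type,
    },
)
# urllib.request.urlopen(request) would send it; left out so the sketch stays offline.
print(request.get_method(), len(body), content_type.split(";")[0])
```

In Retool I would expect the equivalent to be a form-data body where the image entry is a file or Blob rather than a string, but I have not confirmed how to set that up.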
I'm completely stuck on how to proceed with image-to-image generation in Stable Diffusion 3. Apologies if this is a simple problem; my background is in hardware, not software. Any help is deeply appreciated. Thank you.