- Goal: Pass an image URL to Google Vision to perform OCR.
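One way to meet this goal without holding the image in local memory is to let the Vision service fetch the URL itself. The sketch below assumes the `google-cloud-vision` Python client and a URL that Google's servers can reach publicly. `ocr_image_by_uri` and `build_uri_image` are hypothetical names, and the `build_image` parameter is injectable only so the function can be exercised without the library or network access.

```python
def build_uri_image(image_url):
    """Build a Vision image that points at a remote URL instead of carrying bytes."""
    # Imported lazily so this module still loads where the client is not installed.
    from google.cloud import vision

    return vision.Image(source=vision.ImageSource(image_uri=image_url))


def ocr_image_by_uri(image_url, client, build_image=build_uri_image):
    """Return the text detected at image_url, or "No text found."."""
    # `client` is assumed to be a vision.ImageAnnotatorClient (or a test double).
    response = client.text_detection(image=build_image(image_url))
    if response.error.message:
        raise RuntimeError(f"Google Vision API Error: {response.error.message}")
    annotations = response.text_annotations
    # The first annotation holds the full detected text block.
    return annotations[0].description if annotations else "No text found."
```

With the real client this would be called as `ocr_image_by_uri(url, vision.ImageAnnotatorClient())`. No requests, Pillow, or base64 step runs locally in this variant, so none of those buffers count against the memory limit. Whether the service can actually fetch a given URL depends on the host serving it.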
- Steps:
  1. Tried resizing the images from 2048x2048 to 128x128.
  2. Tried streaming the images through io.BytesIO().
  3. Converted the class-based code to function-based code to make it more memory efficient.

  Despite these changes, the run still fails with: "The code terminated for an unknown reason. Potentially, the memory limit 256 Mibs was breached."
- Details:
import base64
import io

import requests
from google.cloud import vision
from PIL import Image

# Assumed: a module-level client created once and reused across calls.
vision_client = vision.ImageAnnotatorClient()


def ocr_image(image_url):
    """
    Extract text from an image using Google Vision API
    """
    # Fetch the image from the URL
    response = requests.get(image_url)
    image_bytes = io.BytesIO(response.content)
    # Open and convert the image to ensure proper format
    image = Image.open(image_bytes).convert("RGB")
    buffered = io.BytesIO()
    image.save(buffered, format="JPEG")
    # Base64 encode without adding any headers or additional formatting
    base64_image = base64.b64encode(buffered.getvalue()).decode('utf-8')
    # Create an Image object for Google Vision
    image_for_vision = vision.Image(content=base64_image)
    # Perform text detection
    response = vision_client.text_detection(image=image_for_vision)
    if response.error.message:
        raise Exception(f"Google Vision API Error: {response.error.message}")
    # Return the detected text (just the description of the first result)
    return response.text_annotations[0].description if response.text_annotations else "No text found."
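To judge whether the image buffers alone could plausibly reach the 256 MiB limit, the copies held by the function above can be estimated with simple arithmetic. This is a rough sketch: the 3 bytes per pixel follows from the RGB mode and the 4/3 growth from base64, but the 1 MiB download and JPEG sizes are hypothetical placeholders, and interpreter overhead and library imports are not counted.

```python
MIB = 1024 * 1024


def decoded_rgb_bytes(width, height):
    """Bytes for an uncompressed 8-bit RGB bitmap (3 bytes per pixel)."""
    return width * height * 3


def base64_len(n_bytes):
    """Length of the padded base64 encoding of n_bytes bytes."""
    return 4 * ((n_bytes + 2) // 3)


def peak_estimate(width, height, download_bytes, jpeg_bytes):
    """Upper-bound sum of the buffers in ocr_image, assuming none is freed early."""
    b64 = base64_len(jpeg_bytes)
    return (
        download_bytes * 2                  # response.content, plus BytesIO counted as a copy
        + decoded_rgb_bytes(width, height)  # bitmap held after convert("RGB")
        + jpeg_bytes * 2                    # re-encoded JPEG buffer, plus getvalue() result
        + b64 * 2                           # base64 bytes, plus the decoded str
    )


if __name__ == "__main__":
    # Hypothetical sizes: 1 MiB downloaded file and 1 MiB re-encoded JPEG.
    for side in (2048, 128):
        total = peak_estimate(side, side, download_bytes=MIB, jpeg_bytes=MIB)
        bitmap = decoded_rgb_bytes(side, side)
        print(f"{side}x{side}: bitmap {bitmap / MIB:.2f} MiB, "
              f"estimated total {total / MIB:.2f} MiB")
```

A 2048x2048 RGB bitmap is 12 MiB, so under these assumptions the per-image buffers stay well below 256 MiB. That suggests it may also be worth checking what else the process holds, such as imported libraries or images accumulated across calls.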
- Screenshots: