[bug] python workflows lose logs on some errors

trz-justin-dev · September 3, 2024, 2:42am

i haven't figured out exactly what types of errors prevent print() statements from appearing in Retool workflow logs after a run.

a workaround is simply to put crude debugging breaks throughout your code and home in on which line breaks. for me, the culprits seemed to be subprocess.run() and related types of operations.

just documenting here because it's become really annoying, and either i'm doing something wrong or many other people are also experiencing this.

Darren · September 9, 2024, 5:19pm

Hey @trz-justin-dev!

I assume you're primarily asking about Python code blocks. Do you have an example block that errors out without logging? It would be useful to have a starting point for testing.

trz-justin-dev · September 13, 2024, 7:42pm

hey @Darren ! thanks for your reply.

sorry to dump this ugly code on you -- it was very much experimental ChatGPT code to see if I could do unit tests inside Retool Workflows, and also to get the HTML of pages in our app to send email alerts.

here is code which shows zero log output on error. I believe it's line 64, where the command returns a value but i don't assign it to any variable:

subprocess.run(['ar', 'x', dest_path], cwd=BASE_DIR, check=True)

Erroneous codeblock1.py

import os
import requests
import subprocess
import tarfile
from selenium import webdriver
from selenium.webdriver.firefox.service import Service
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.common.by import By  # needed for By.ID / By.XPATH in take_screenshot
from PIL import Image
from io import BytesIO
import time
import base64

# Static Values
BASE_DIR = "/tmp"
LIB_DIR = os.path.join(BASE_DIR, "lib")
BIN_DIR = os.path.join(BASE_DIR, "bin")
GECKODRIVER_DIR = os.path.join(BASE_DIR, "geckodriver")
FIREFOX_URL = "https://download.mozilla.org/?product=firefox-latest&os=linux64&lang=en-US"
FIREFOX_TAR_PATH = os.path.join(BASE_DIR, "firefox-latest.tar.bz2")
FIREFOX_EXTRACT_PATH = os.path.join(BASE_DIR, "firefox")
LIBRARIES = {
    "gtk3": "https://rpmfind.net/linux/centos-stream/9-stream/AppStream/x86_64/os/Packages/gtk3-3.24.31-5.el9.x86_64.rpm",
    "libdbusmenu": "https://rpmfind.net/linux/epel/9/Everything/x86_64/Packages/l/libdbusmenu-16.04.0-19.el9.x86_64.rpm",
    "libdbusmenu-jsonloader": "https://rpmfind.net/linux/epel/9/Everything/x86_64/Packages/l/libdbusmenu-jsonloader-16.04.0-19.el9.x86_64.rpm",
}
URL = "http://your-retool-app-url"
USERNAME = "your-username"
PASSWORD = "your-password"

# Ensure the directories exist
os.makedirs(LIB_DIR, exist_ok=True)
os.makedirs(BIN_DIR, exist_ok=True)

def print_system_info():
    """Prints debug information about the OS and Python environment."""
    print("===== SYSTEM INFORMATION =====")
    print(f"Operating System: {os.uname().sysname}")
    print(f"OS Version: {os.uname().release}")
    print(f"Machine: {os.uname().machine}")
    print(f"Processor: {os.uname().machine}")
    print(f"Current Working Directory: {os.getcwd()}")
    print(f"PATH: {os.environ.get('PATH')}")
    print(f"LD_LIBRARY_PATH: {os.environ.get('LD_LIBRARY_PATH', '')}")
    print("===== END OF SYSTEM INFORMATION =====")

def download_file(url, dest):
    """Utility function to download files."""
    response = requests.get(url, stream=True)
    if response.status_code == 200:
        with open(dest, 'wb') as f:
            for chunk in response.iter_content(1024):
                f.write(chunk)
    else:
        raise Exception(f"Failed to download {url}")

def install_lib(lib_name, url):
    """Download and extract required libraries using ar and tar."""
    dest_path = os.path.join(BASE_DIR, lib_name + ".rpm")
    print(f"Downloading {lib_name} from {url}...")
    download_file(url, dest_path)

    # Extract the RPM using ar
    print(f"Extracting {lib_name}...")
    subprocess.run(['ar', 'x', dest_path], cwd=BASE_DIR, check=True)

    # Extract the data.tar.xz from the RPM
    subprocess.run(['tar', '-xf', 'data.tar.xz', '-C', LIB_DIR], cwd=BASE_DIR, check=True)

    # Clean up
    os.remove(dest_path)
    os.remove(os.path.join(BASE_DIR, "data.tar.xz"))

def install_dependencies():
    """Install all the required libraries."""
    for lib, url in LIBRARIES.items():
        install_lib(lib, url)

    # Update LD_LIBRARY_PATH to include the extracted libraries
    os.environ["LD_LIBRARY_PATH"] = f"{LIB_DIR}/usr/lib64:{LIB_DIR}/usr/lib:{os.environ.get('LD_LIBRARY_PATH', '')}"

    # Verify that libraries are accessible
    print("Verifying library installations...")
    try:
        subprocess.run(['ldd', '--version'], check=True)
        print("Libraries successfully installed and accessible.")
    except subprocess.CalledProcessError:
        raise Exception("Failed to verify library installations.")

def install_firefox():
    """Installs Firefox in BASE_DIR."""
    print("Installing Firefox...")

    print(f"Downloading Firefox from: {FIREFOX_URL}")
    download_file(FIREFOX_URL, FIREFOX_TAR_PATH)

    print("Download complete. Extracting Firefox...")
    os.makedirs(FIREFOX_EXTRACT_PATH, exist_ok=True)
    with tarfile.open(FIREFOX_TAR_PATH, "r:bz2") as tar:
        tar.extractall(path=FIREFOX_EXTRACT_PATH)

    # Use the extracted Firefox binary directly from BASE_DIR
    firefox_bin_path = os.path.join(FIREFOX_EXTRACT_PATH, 'firefox', 'firefox')
    print(f"Firefox installed successfully in {FIREFOX_EXTRACT_PATH}.")

    return firefox_bin_path

def verify_firefox_installation(firefox_bin_path):
    """Verifies that Firefox is installed correctly by running it in headless mode."""
    print("Verifying Firefox installation...")
    # --version makes Firefox print its version and exit instead of staying open
    subprocess.run([firefox_bin_path, '-headless', '--version'], check=True)
    print("Firefox is installed and functioning correctly.")

def setup_selenium(firefox_bin_path, geckodriver_path):
    """Sets up Selenium with Firefox and geckodriver."""
    print("Setting up Selenium with Firefox...")

    # Add GECKODRIVER_DIR to PATH
    # os.environ["PATH"] += os.pathsep + os.path.dirname(geckodriver_path)
    # print(f"Updated PATH: {os.environ.get('PATH')}")

    options = Options()
    options.add_argument("-headless")  # Run in headless mode (no GUI); options.headless is deprecated in Selenium 4
    options.binary_location = firefox_bin_path

    service = Service(geckodriver_path)
    driver = webdriver.Firefox(service=service, options=options)
    
    print("Selenium setup complete.")
    return driver

def take_screenshot(driver, url, username, password):
    """Logs into the Retool app and takes a screenshot, returning it as a base64 string."""
    print(f"Navigating to URL: {url}")
    driver.get(url)
    
    print("Filling in login credentials...")
    driver.find_element(By.ID, ':r0:-form-item').send_keys(username)
    driver.find_element(By.ID, ':r1:-form-item').send_keys(password)
    
    print("Locating the sign-in button by type='submit'...")
    try:
        submit_button = driver.find_element(By.XPATH, "//button[@type='submit']")
        print("Sign-in button found.")
    except Exception as e:
        print(f"Could not find the sign-in button: {e}")
        return None

    print("Clicking the sign-in button...")
    submit_button.click()

    print("Waiting for the page to load...")
    time.sleep(5)  # Adjust the sleep time as needed for your app to load completely

    print("Taking screenshot...")
    screenshot = driver.get_screenshot_as_png()

    print("Converting screenshot to base64...")
    buffered = BytesIO()
    image = Image.open(BytesIO(screenshot))
    image.save(buffered, format="PNG")
    base64_image = base64.b64encode(buffered.getvalue()).decode('utf-8')

    print("Screenshot successfully taken and encoded.")
    return base64_image

# Start of script execution
print("Starting the process...")
print_system_info()

# Install required libraries and Firefox
install_dependencies()
firefox_bin_path = install_firefox()

# Verify Firefox installation by running it in headless mode
verify_firefox_installation(firefox_bin_path)

# Assuming geckodriver is pre-installed and available in GECKODRIVER_DIR
geckodriver_path = os.path.join(GECKODRIVER_DIR, "geckodriver")

# Set up Selenium with the installed Firefox and geckodriver
driver = setup_selenium(firefox_bin_path, geckodriver_path)

try:
    base64_screenshot = take_screenshot(driver, URL, USERNAME, PASSWORD)
    if base64_screenshot:
        print("Process complete. Base64 data generated.")
        print(base64_screenshot)
    else:
        print("Process failed to complete.")
except Exception as e:
    print(f"An error occurred: {e}")
finally:
    driver.quit()
    print("Driver quit, process ended.")

here is code which outputs print() statements. I believe this is because I assigned the result of the command to a variable:

    result = subprocess.run(command, shell=True, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)

Successful logging codeblock1.py

import os
import subprocess
import requests
from selenium import webdriver
from selenium.webdriver.firefox.service import Service
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.common.by import By  # needed for By.ID / By.XPATH in take_screenshot
from PIL import Image
from io import BytesIO
import time
import base64

# Static Values
BASE_DIR = "/tmp"
GECKODRIVER_DIR = os.path.join(BASE_DIR, "geckodriver")
URL = "http://your-retool-app-url"
USERNAME = "your-username"
PASSWORD = "your-password"

# Ensure the directories exist
os.makedirs(GECKODRIVER_DIR, exist_ok=True)

def print_system_info():
    """Prints debug information about the OS and Python environment."""
    print("===== SYSTEM INFORMATION =====")
    print(f"Operating System: {os.uname().sysname}")
    print(f"OS Version: {os.uname().release}")
    print(f"Machine: {os.uname().machine}")
    print(f"Processor: {os.uname().machine}")
    print(f"Current Working Directory: {os.getcwd()}")
    print(f"PATH: {os.environ.get('PATH')}")
    print(f"LD_LIBRARY_PATH: {os.environ.get('LD_LIBRARY_PATH', '')}")
    print("===== END OF SYSTEM INFORMATION =====")

def run_command(command):
    """Run a system command and print stdout and stderr."""
    # try:
    result = subprocess.run(command, shell=True, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
    print(f"Command: {command}")
    print(f"stdout: {result.stdout}")
    print(f"stderr: {result.stderr}")
    # except subprocess.CalledProcessError as e:
        # print(f"Command failed: {e}")
        # print(f"stdout: {e.stdout}")
        # print(f"stderr: {e.stderr}")
        # raise

def install_firefox():
    """Installs Firefox using apt."""
    print("Updating package lists...")
    # run_command("apt-get update")
    # raise

    print("Installing Firefox and dependencies...")
    run_command("apt-get install -y firefox")

    print("Verifying Firefox installation...")
    run_command("firefox --headless --version")
    print("Firefox is installed and functioning correctly.")

def setup_selenium(geckodriver_path):
    """Sets up Selenium with Firefox and geckodriver."""
    print("Setting up Selenium with Firefox...")

    # Add GECKODRIVER_DIR to PATH
    # os.environ["PATH"] += os.pathsep + os.path.dirname(geckodriver_path)
    # print(f"Updated PATH: {os.environ.get('PATH')}")

    options = Options()
    options.add_argument("-headless")  # Run in headless mode (no GUI); options.headless is deprecated in Selenium 4

    service = Service(geckodriver_path)
    driver = webdriver.Firefox(service=service, options=options)
    
    print("Selenium setup complete.")
    return driver

def take_screenshot(driver, url, username, password):
    """Logs into the Retool app and takes a screenshot, returning it as a base64 string."""
    print(f"Navigating to URL: {url}")
    driver.get(url)
    
    print("Filling in login credentials...")
    driver.find_element(By.ID, ':r0:-form-item').send_keys(username)
    driver.find_element(By.ID, ':r1:-form-item').send_keys(password)
    
    print("Locating the sign-in button by type='submit'...")
    try:
        submit_button = driver.find_element(By.XPATH, "//button[@type='submit']")
        print("Sign-in button found.")
    except Exception as e:
        print(f"Could not find the sign-in button: {e}")
        return None

    print("Clicking the sign-in button...")
    submit_button.click()

    print("Waiting for the page to load...")
    time.sleep(5)  # Adjust the sleep time as needed for your app to load completely

    print("Taking screenshot...")
    screenshot = driver.get_screenshot_as_png()

    print("Converting screenshot to base64...")
    buffered = BytesIO()
    image = Image.open(BytesIO(screenshot))
    image.save(buffered, format="PNG")
    base64_image = base64.b64encode(buffered.getvalue()).decode('utf-8')

    print("Screenshot successfully taken and encoded.")
    return base64_image

# Start of script execution
print("Starting the process...")
print_system_info()

# Install Firefox
install_firefox()

# Assuming geckodriver is pre-installed and available in GECKODRIVER_DIR
geckodriver_path = os.path.join(GECKODRIVER_DIR, "geckodriver")

# Set up Selenium with the installed Firefox and geckodriver
driver = setup_selenium(geckodriver_path)

try:
    base64_screenshot = take_screenshot(driver, URL, USERNAME, PASSWORD)
    if base64_screenshot:
        print("Process complete. Base64 data generated.")
        print(base64_screenshot)
    else:
        print("Process failed to complete.")
except Exception as e:
    print(f"An error occurred: {e}")
finally:
    driver.quit()
    print("Driver quit, process ended.")

Obviously writing good code won't produce this "missing logs" bug, but alas...
Bad habits formed by writing python inside a Jupyter notebook I think
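
One concrete difference between the two listings is that the second captures the child's stdout and stderr and prints them from Python, while the first lets the child write to the inherited streams. A hedged sketch of a wrapper that does the capturing and still prints something when the command fails; run_logged is a hypothetical name, and whether this changes what the workflow run history shows is not established in this thread.

```python
import subprocess
import sys

def run_logged(args, **kwargs):
    """Run a command, capture its output, and print it whether or not it fails."""
    try:
        result = subprocess.run(
            args, check=True, capture_output=True, text=True, **kwargs
        )
    except subprocess.CalledProcessError as e:
        # Non-zero exit: the captured output is attached to the exception
        print(f"Command failed ({e.returncode}): {args}", flush=True)
        print(f"stdout: {e.stdout}", flush=True)
        print(f"stderr: {e.stderr}", flush=True)
        raise
    except OSError as e:
        # The command could not be started, e.g. the binary is not installed
        print(f"Could not start {args}: {e}", flush=True)
        raise
    print(f"stdout: {result.stdout}", flush=True)
    print(f"stderr: {result.stderr}", flush=True)
    return result

# Example with a child process that prints and then exits non-zero
try:
    run_logged([sys.executable, "-c", "import sys; print('hi'); sys.exit(3)"])
except subprocess.CalledProcessError:
    print("handled the failure; the earlier prints are still emitted")
```

In the first listing this would replace the bare calls, for example run_logged(['ar', 'x', dest_path], cwd=BASE_DIR).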

Darren · September 16, 2024, 5:42pm

Nice - this is the selenium setup you referenced the other day! We can't always write perfect code, sadly. If that were the case, we wouldn't need unit tests. My guess is that stdout is being overwritten and not properly reset when hitting certain errors.

I'll verify that I can reproduce this and then pass it on to the dev team!
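
If the stdout guess above is right, one defensive pattern is to route uncaught exceptions to stdout and flush both streams before the block exits. This is only a sketch under that assumption: run_with_flush and main are hypothetical names, and nothing here is confirmed Retool behavior.

```python
import sys
import traceback

def main():
    print("Starting the process...")
    raise RuntimeError("simulated failure")  # stand-in for a failing step

def run_with_flush(fn):
    """Call fn, print any uncaught exception to stdout, and always flush both streams."""
    try:
        fn()
        return True
    except Exception:
        # Send the traceback to stdout in case only stdout is collected (an assumption)
        traceback.print_exc(file=sys.stdout)
        return False
    finally:
        sys.stdout.flush()
        sys.stderr.flush()

ok = run_with_flush(main)
```

Note that this swallows the exception and returns False; add a bare raise in the except branch if the block should still be marked as failed.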

Darren · September 16, 2024, 6:02pm

Maybe I'm doing something wrong, but the erroneous code block seems to successfully log output for me.

trz-justin-dev · September 16, 2024, 8:44pm

ah good to know -- so the block logs will show, just not the workflow run logs:

Darren · September 16, 2024, 9:34pm

Ah I wasn't even looking at the logs in the workflow run history - got it. But yeah, running the workflow block seems to output normally. I'll still go ahead and report that for you!

trz-justin-dev · September 19, 2024, 5:04pm

thank you @Darren !

i’ll lean on block logs more in the future — and try to write better code

lauren.gus · October 4, 2024, 8:30pm

Hey @trz-justin-dev ! It looks like you don't have the info logs toggled on in your workflow's run history - once you toggle that on, you should see those logs.

trz-justin-dev · October 22, 2024, 5:12am

woops. thanks @lauren.gus !