Goal: I built an app to track all of my job applications. I created a workflow to do the following:
1. Run a SQL query that pulls all the data I need for a supervised regression analysis in Python
2. Use a Python code block to run the analysis and output the probability of each application turning into an offer
3. Insert the output of the Python code into the Retool database
4. Use that data to rate each application's probability of success in a Retool table
Here is the SQL query that pulls the data I need:
WITH InteractionHistory AS (
SELECT
i.application_id,
j.company_name,
j.field,
j.position,
j.salary_low,
j.salary_high,
j.favourite,
i.interaction_number,
i.status,
COUNT(i.id) OVER (PARTITION BY i.application_id) AS total_interactions,
MAX(i.interaction_number) OVER (PARTITION BY i.application_id) AS latest_interaction_number,
MAX(i.date) OVER (PARTITION BY i.application_id) AS last_interaction_date
FROM interactions i
JOIN job_search_02 j ON i.application_id = j.id
)
SELECT
application_id,
company_name,
field,
position,
salary_low,
salary_high,
favourite,
total_interactions,
latest_interaction_number,
last_interaction_date,
-- Count how many times an application moved past "Applied"
COUNT(CASE WHEN status IN ('Moving to the next step', 'Waiting for an update') THEN 1 END) AS total_interviews,
-- Count how many times an application reached final stages but didn't get an offer
COUNT(CASE WHEN status = 'No offer' THEN 1 END) AS total_final_rounds,
-- Track rejections (may help in ML analysis)
COUNT(CASE WHEN status = 'Rejected' THEN 1 END) AS total_rejections
FROM InteractionHistory
GROUP BY
    application_id, company_name, field, position, salary_low, salary_high,
    favourite, total_interactions, latest_interaction_number, last_interaction_date;
Here is the sample output:
{"data":[{"application_id":109,"company_name":"Knowledgehook","field":"Ed-tech","position":"Lead Product Manager, Educators","salary_low":null,"salary_high":null,"favourite":false,"total_interactions":"2","latest_interaction_number":1,"last_interaction_date":"2025-01-31","total_interviews":"0","total_final_rounds":"0","total_rejections":"0"},{"application_id":56,"company_name":"Xplor","field":"Fintech","position":"VP Product","salary_low":null,"salary_high":null,"favourite":false,"total_interactions":"2","latest_interaction_number":1,"last_interaction_date":"2024-12-02","total_interviews":"0","total_final_rounds":"0","total_rejections":"1"},
This is the Python code block that does the analysis:
import pandas as pd
import numpy as np
import json
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
# Convert SQL query result to DataFrame
df = pd.DataFrame(fetchData["data"])
### ✅ Step 1: Handle Missing Data ###
df['field'] = df['field'].fillna("Unknown")
df['position'] = df['position'].fillna("Unknown")
df['salary_low'] = pd.to_numeric(df['salary_low'], errors='coerce')
df['salary_high'] = pd.to_numeric(df['salary_high'], errors='coerce')
df['salary_low'] = df['salary_low'].fillna(df['salary_low'].median())
df['salary_high'] = df['salary_high'].fillna(df['salary_high'].median())
df['favourite'] = df['favourite'].astype(int)
# ✅ Convert interaction counts to integers (fix TypeError)
df['total_interactions'] = pd.to_numeric(df['total_interactions'], errors='coerce').fillna(0).astype(int)
df['total_interviews'] = pd.to_numeric(df['total_interviews'], errors='coerce').fillna(0).astype(int)
df['total_final_rounds'] = pd.to_numeric(df['total_final_rounds'], errors='coerce').fillna(0).astype(int)
### ✅ Step 2: Define Target Variables ###
df['interview_likelihood'] = (df['total_interviews'] > 0).astype(int) # Likelihood of getting an interview
df['offer_likelihood'] = (df['total_final_rounds'] > 0).astype(int) # Likelihood of getting an offer
### ✅ Step 3: Convert Categorical Fields ###
df = pd.get_dummies(df, columns=['company_name', 'field', 'position'], drop_first=True)
### ✅ Step 4: Define Features ###
X = df[['latest_interaction_number', 'salary_low', 'salary_high', 'favourite', 'total_interactions', 'total_interviews', 'total_final_rounds']]
### ✅ Step 5: Train Separate Models for Each Prediction ###
# Train Interview Prediction Model
X_train, X_test, y_train, y_test = train_test_split(X, df['interview_likelihood'], test_size=0.2, random_state=42)
interview_model = LogisticRegression()
interview_model.fit(X_train, y_train)
df['interview_probability'] = interview_model.predict_proba(X)[:, 1]
# Train Offer Prediction Model
X_train, X_test, y_train, y_test = train_test_split(X, df['offer_likelihood'], test_size=0.2, random_state=42)
offer_model = LogisticRegression()
offer_model.fit(X_train, y_train)
df['offer_probability'] = offer_model.predict_proba(X)[:, 1]
### ✅ Step 6: Return Proper JSON Object ###
output_json = df[['application_id', 'interview_probability', 'offer_probability']].to_dict(orient="records")
# 🔥 FIX: Explicitly return a JSON object, NOT a string
return {"data": output_json}
Here is the sample output:
{"data":{"data":[{"application_id":109,"interview_probability":0.03438953636701784,"offer_probability":0.01923380571479206},{"application_id":56,"interview_probability":0.03438953636701784,"offer_probability":0.01923380571479206},{"application_id":12,"interview_probability":0.12723304594296977,"offer_probability":0.0521071338641757},
Here is my SQL query to insert the output into the Retool DB, but it doesn't work:
WITH input_data AS (
SELECT jsonb_array_elements('{{ pythonBlock.output.data }}'::jsonb) AS row
)
INSERT INTO job_probability (application_id, interview_probability, offer_probability)
SELECT
(row->>'application_id')::int,
(row->>'interview_probability')::float,
(row->>'offer_probability')::float
FROM input_data
ON CONFLICT (application_id)
DO UPDATE SET
interview_probability = EXCLUDED.interview_probability,
offer_probability = EXCLUDED.offer_probability;
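As far as I understand, jsonb_array_elements needs the '...'::jsonb cast to produce a top-level JSON array, and I suspect {{ pythonBlock.output.data }} is handing it the object shown above instead (it may also need a JSON.stringify(...) around it, since {{ }} interpolates a JS value rather than a JSON string; that part is an assumption on my end). This is roughly the string I'd expect the cast to need (sketch in Python):

import json

# The shape jsonb_array_elements can iterate over: a top-level JSON array
rows = [{"application_id": 109,
         "interview_probability": 0.03438953636701784,
         "offer_probability": 0.01923380571479206}]
print(json.dumps(rows))  # '[{"application_id": 109, ...}]', a JSON array

# By contrast, {"data": [...]} parses as a JSON *object*, which Postgres
# refuses to extract array elements from.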
What am I doing wrong? Can anyone guide me?