Workflow runs have empty data and logs in the run history

When running a Retool workflow, neither data nor logs are persisted in the run history. Evaluating blocks individually works fine. As depicted in the image below, the block code1 evaluates successfully and emits the data defined. When running the entire workflow by clicking the Run button in the upper right corner, the run succeeds, but the run history does not show any data or logs.

I have looked into these related topics, validated our configuration, and both upgraded and downgraded our Retool deployment.

Our deployment configuration:
Kubernetes v1.27
Retool Helm Chart v6.0.11
As hinted in one of the related topics, I've reproduced the issue with these Retool versions: 3.14.5, 3.14.22, 3.26.4
License: Retool Team
Database: Postgres
Workflow and Code Executor are enabled
Temporal: Self-managed external cluster with Temporal Cloud

More detailed chart values
    config:
      licenseKeySecretName: retool
      licenseKeySecretKey: LICENSE_KEY
      encryptionKeySecretName: retool
      encryptionKeySecretKey: ENCRYPTION_KEY
      jwtSecretSecretName: retool
      jwtSecretSecretKey: JWT_SECRET
      postgresql:
        host: <...>
        port: 5432
        db: retool
        user: retool
        ssl_enabled: true
        passwordSecretName: retool
        passwordSecretKey: POSTGRES_PASSWORD
    image:
      repository: "tryretool/backend"
      tag: "3.26.4"
    postgresql:
      enabled: false
    persistentVolumeClaim:
      enabled: true
      accessModes:
        - ReadWriteOnce
      size: 15Gi
      storageClass: "our-storage-retain"
    codeExecutor:
      enabled: true
      image:
        repository: tryretool/code-executor-service
        tag: "3.26.4"
      replicaCount: 1
    workflows:
      enabled: true
      temporal:
        enabled: true
        host: <...>-staging.ceorx.tmprl.cloud
        port: 7233
        namespace: <...>-staging.ceorx
        sslEnabled: true
        sslCert: ...
        sslKey: from-secret
        sslKeySecretName: retool
        sslKeySecretKey: TEMPORAL_CLIENT_SECRET_KEY

Hey @FinviaBot - There is some retention logic in place, but the logs should definitely persist for a bit. I'd be curious what your browser's network dev tools show as the response to the workflowRun/getLog endpoint when you load a given run's history.

If that's empty, I'd want to check the workflow_run table for that run ID. A non-null blobDataDeletedAt column would mean the run's data was purged, and we could then check the container logs to see if anything jumps out around when/why that is occurring.

Actually looks like this should be configurable in settings -> advanced!

Thank you for getting back to me! Our Workflows Data Retention is configured exactly as shown in your screenshot.

The call to /api/workflowRun/getLog?runId=e6d35043-d80c-458a-9674-99ab2ed7691a returns {"logs":[],"status":"SUCCESS"}.

And SELECT * FROM workflow_run WHERE id = 'e6d35043-d80c-458a-9674-99ab2ed7691a' yields:

[
  {
    "id": "e6d35043-d80c-458a-9674-99ab2ed7691a",
    "workflowId": "d99c3e7e-aad5-47ff-8bc1-c0a92be8cad6",
    "status": "SUCCESS",
    "logFile": null,
    "createdAt": "2024-02-01 16:43:16.152+00",
    "updatedAt": "2024-02-01 16:43:18.939+00",
    "createdBy": 1,
    "inputDataSizeBytes": "0",
    "outputDataSizeBytes": "0",
    "completedAt": "2024-02-01 16:43:18.379+00",
    "workflowSaveId": "31615c14-d127-407a-a797-6ae3ffb1cfc1",
    "triggerType": "manual",
    "blobDataDeletedAt": null,
    "triggerId": null,
    "environmentId": "d8136d4c-b99f-4d97-b87d-9b8d03e2d0b6",
    "callingRetoolEvent": null
  }
]
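For anyone triaging the same symptom, here is a minimal Python sketch of how a row like the one above could be read. It relies on the hint from earlier in the thread that a non-null blobDataDeletedAt means the run's data was purged. Treating a null logFile plus zero input/output sizes as "nothing recorded" is my own assumption, and `classify_run` is a hypothetical helper, not part of Retool.

```python
from typing import Any, Mapping


def classify_run(row: Mapping[str, Any]) -> str:
    """Rough triage of a workflow_run row (column names as in the row above)."""
    # Per the thread: blobDataDeletedAt is set when retention purges the run data.
    if row.get("blobDataDeletedAt") is not None:
        return "purged"
    # Assumption: null logFile and zero sizes mean nothing was stored for the run.
    no_log_file = row.get("logFile") is None
    sizes = (row.get("inputDataSizeBytes"), row.get("outputDataSizeBytes"))
    no_data = all(int(size or 0) == 0 for size in sizes)
    if no_log_file and no_data:
        return "no_data_recorded"
    return "data_present"


# Relevant columns copied from the row above (sizes arrive as strings).
row = {
    "id": "e6d35043-d80c-458a-9674-99ab2ed7691a",
    "status": "SUCCESS",
    "logFile": None,
    "inputDataSizeBytes": "0",
    "outputDataSizeBytes": "0",
    "blobDataDeletedAt": None,
}

print(classify_run(row))
```

For this row the helper lands in the "no_data_recorded" branch, which points away from retention and toward the data never being stored.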

No prob! Well hmm... what about that example's run logs in workflow_run_logs? I think if it was purged based on retention, we should see something like the below in the UI. Also curious if anything jumps out in the container logs before writing to that table, if it's empty.

The workflow_run_logs table is empty; there is no data in it at all. Your screenshot accurately represents what I see in our UI...

Container logs are not suspicious. The logs attached are filtered for the run id from above.

retool-code-executor

retool {"jobId":"6963de6a-2481-4a0b-ba94-239766702319","level":"info","message":"Running the workflow","timestamp":"2024-02-01T16:43:17.045Z","workflowId":"d99c3e7e-aad5-47ff-8bc1-c0a92be8cad6","workflowRunId":"e6d35043-d80c-458a-9674-99ab2ed7691a"}
retool {"jobId":"6963de6a-2481-4a0b-ba94-239766702319","level":"info","message":"Successfully ran workflow","timestamp":"2024-02-01T16:43:18.303Z","workflowId":"d99c3e7e-aad5-47ff-8bc1-c0a92be8cad6","workflowRunId":"e6d35043-d80c-458a-9674-99ab2ed7691a"}

retool-workflow-worker

retool {"activityId":"1","activityType":"setWorkflowRunInProgressV2","attempt":1,"isLocal":false,"label":"activity","level":"info","message":"Activity started - setWorkflowRunInProgressV2","namespace":"retool-staging.ceorx","taskQueue":"workflows","taskToken":"<redacted>","temporalWorkflowId":"manual-workflow-execution-e6d35043-d80c-458a-9674-99ab2ed7691a","temporalWorkflowRunId":"1600341b-07a7-4914-8793-9c7324ba0882","timestamp":"2024-02-01T16:43:16.514Z","workflowType":"RunWorkflowOnpremV2"}
retool {"activityId":"1","activityType":"setWorkflowRunInProgressV2","attempt":1,"durationMs":405,"isLocal":false,"label":"activity","level":"info","message":"Activity completed - setWorkflowRunInProgressV2","namespace":"retool-staging.ceorx","taskQueue":"workflows","taskToken":"<redacted>","temporalWorkflowId":"manual-workflow-execution-e6d35043-d80c-458a-9674-99ab2ed7691a","temporalWorkflowRunId":"1600341b-07a7-4914-8793-9c7324ba0882","timestamp":"2024-02-01T16:43:16.920Z","workflowType":"RunWorkflowOnpremV2"}
retool {"activityId":"2","activityType":"runBlocksMinimalPayloadVm2","attempt":1,"isLocal":false,"label":"activity","level":"info","message":"Activity started - runBlocksMinimalPayloadVm2","namespace":"retool-staging.ceorx","taskQueue":"workflows","taskToken":"<redacted>","temporalWorkflowId":"manual-workflow-execution-e6d35043-d80c-458a-9674-99ab2ed7691a","temporalWorkflowRunId":"1600341b-07a7-4914-8793-9c7324ba0882","timestamp":"2024-02-01T16:43:16.982Z","workflowType":"RunWorkflowOnpremV2"}
retool {"activityId":"2","activityType":"runBlocksMinimalPayloadVm2","attempt":1,"durationMs":1334,"isLocal":false,"label":"activity","level":"info","message":"Activity completed - runBlocksMinimalPayloadVm2","namespace":"retool-staging.ceorx","taskQueue":"workflows","taskToken":"<redacted>","temporalWorkflowId":"manual-workflow-execution-e6d35043-d80c-458a-9674-99ab2ed7691a","temporalWorkflowRunId":"1600341b-07a7-4914-8793-9c7324ba0882","timestamp":"2024-02-01T16:43:18.317Z","workflowType":"RunWorkflowOnpremV2"}
retool {"activityId":"3","activityType":"finalizeWorkflowRunV2","attempt":1,"isLocal":false,"label":"activity","level":"info","message":"Activity started - finalizeWorkflowRunV2","namespace":"retool-staging.ceorx","taskQueue":"workflows","taskToken":"<redacted>","temporalWorkflowId":"manual-workflow-execution-e6d35043-d80c-458a-9674-99ab2ed7691a","temporalWorkflowRunId":"1600341b-07a7-4914-8793-9c7324ba0882","timestamp":"2024-02-01T16:43:18.378Z","workflowType":"RunWorkflowOnpremV2"}
retool {"activityId":"3","activityType":"finalizeWorkflowRunV2","attempt":1,"durationMs":344,"isLocal":false,"label":"activity","level":"info","message":"Activity completed - finalizeWorkflowRunV2","namespace":"retool-staging.ceorx","taskQueue":"workflows","taskToken":"<redacted>","temporalWorkflowId":"manual-workflow-execution-e6d35043-d80c-458a-9674-99ab2ed7691a","temporalWorkflowRunId":"1600341b-07a7-4914-8793-9c7324ba0882","timestamp":"2024-02-01T16:43:18.723Z","workflowType":"RunWorkflowOnpremV2"}
retool {"activityId":"1","activityType":"updateWorkflowUsageAggregateV2","attempt":1,"isLocal":false,"label":"activity","level":"info","message":"Activity started - updateWorkflowUsageAggregateV2","namespace":"retool-staging.ceorx","taskQueue":"workflows","taskToken":"<redacted>","temporalWorkflowId":"after-workflow-execution-e6d35043-d80c-458a-9674-99ab2ed7691a","temporalWorkflowRunId":"bfd9c7a2-d0d5-476b-804f-4ad53f26209d","timestamp":"2024-02-01T16:43:18.887Z","workflowType":"ExecuteAfterWorkflowTasksV2"}
retool {"activityId":"1","activityType":"updateWorkflowUsageAggregateV2","attempt":1,"durationMs":58,"isLocal":false,"label":"activity","level":"info","message":"Activity completed - updateWorkflowUsageAggregateV2","namespace":"retool-staging.ceorx","taskQueue":"workflows","taskToken":"<redacted>","temporalWorkflowId":"after-workflow-execution-e6d35043-d80c-458a-9674-99ab2ed7691a","temporalWorkflowRunId":"bfd9c7a2-d0d5-476b-804f-4ad53f26209d","timestamp":"2024-02-01T16:43:18.945Z","workflowType":"ExecuteAfterWorkflowTasksV2"}
retool {"activityId":"2","activityType":"emitWorkflowExecutionEventsV2","attempt":1,"isLocal":false,"label":"activity","level":"info","message":"Activity started - emitWorkflowExecutionEventsV2","namespace":"retool-staging.ceorx","taskQueue":"workflows","taskToken":"<redacted>","temporalWorkflowId":"after-workflow-execution-e6d35043-d80c-458a-9674-99ab2ed7691a","temporalWorkflowRunId":"bfd9c7a2-d0d5-476b-804f-4ad53f26209d","timestamp":"2024-02-01T16:43:19.078Z","workflowType":"ExecuteAfterWorkflowTasksV2"}
retool {"activityId":"2","activityType":"emitWorkflowExecutionEventsV2","attempt":1,"durationMs":23,"isLocal":false,"label":"activity","level":"info","message":"Activity completed - emitWorkflowExecutionEventsV2","namespace":"retool-staging.ceorx","taskQueue":"workflows","taskToken":"<redacted>","temporalWorkflowId":"after-workflow-execution-e6d35043-d80c-458a-9674-99ab2ed7691a","temporalWorkflowRunId":"bfd9c7a2-d0d5-476b-804f-4ad53f26209d","timestamp":"2024-02-01T16:43:19.102Z","workflowType":"ExecuteAfterWorkflowTasksV2"}
retool {"activityId":"3","activityType":"deleteBlobAndLogDataIfNoRetentionPolicy","attempt":1,"isLocal":false,"label":"activity","level":"info","message":"Activity started - deleteBlobAndLogDataIfNoRetentionPolicy","namespace":"retool-staging.ceorx","taskQueue":"workflows","taskToken":"<redacted>","temporalWorkflowId":"after-workflow-execution-e6d35043-d80c-458a-9674-99ab2ed7691a","temporalWorkflowRunId":"bfd9c7a2-d0d5-476b-804f-4ad53f26209d","timestamp":"2024-02-01T16:43:19.317Z","workflowType":"ExecuteAfterWorkflowTasksV2"}
retool {"activityId":"3","activityType":"deleteBlobAndLogDataIfNoRetentionPolicy","attempt":1,"durationMs":10,"isLocal":false,"label":"activity","level":"info","message":"Activity completed - deleteBlobAndLogDataIfNoRetentionPolicy","namespace":"retool-staging.ceorx","taskQueue":"workflows","taskToken":"<redacted>","temporalWorkflowId":"after-workflow-execution-e6d35043-d80c-458a-9674-99ab2ed7691a","temporalWorkflowRunId":"bfd9c7a2-d0d5-476b-804f-4ad53f26209d","timestamp":"2024-02-01T16:43:19.327Z","workflowType":"ExecuteAfterWorkflowTasksV2"}
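Since these lines are hard to scan by eye, here is a small Python sketch that pulls the activity name and duration out of each "Activity completed" line. It assumes only what is visible above: every line is a `retool ` prefix followed by a JSON object, and completed activities carry a `durationMs` field. `activity_durations` is a hypothetical helper, and the sample lines are trimmed copies of the log lines above.

```python
import json
from typing import Iterable, List, Tuple


def activity_durations(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (activityType, durationMs) for each completed-activity line."""
    results = []
    for line in lines:
        start = line.find("{")  # skip the "retool " container prefix
        if start == -1:
            continue
        try:
            record = json.loads(line[start:])
        except json.JSONDecodeError:
            continue  # not a JSON log line
        # Only "Activity completed" entries include durationMs.
        if "activityType" in record and "durationMs" in record:
            results.append((record["activityType"], record["durationMs"]))
    return results


sample = [
    'retool {"activityType":"finalizeWorkflowRunV2","message":"Activity started - finalizeWorkflowRunV2"}',
    'retool {"activityType":"finalizeWorkflowRunV2","durationMs":344,"message":"Activity completed - finalizeWorkflowRunV2"}',
    'retool {"activityType":"deleteBlobAndLogDataIfNoRetentionPolicy","durationMs":10,"message":"Activity completed - deleteBlobAndLogDataIfNoRetentionPolicy"}',
]

for name, duration_ms in activity_durations(sample):
    print(f"{name}: {duration_ms} ms")
```

Piping the full worker output through something like this makes it easier to spot an activity that is missing or that finishes suspiciously fast.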

Hmmm digging through the code based on the logs, looks like the settings -> advanced value gets stored in organizations.workflowRunRetentionPeriodMins in the Postgres DB. That actually wasn't set for mine until I hit save in the UI even though it appears to already be applied. Can you try that and rerun once it's for sure set there?

I can confirm this behavior. Explicitly updating the workflowRunRetentionPeriodMins via settings -> advanced created that entry in the organizations table. Running a job with that value being set didn't make a difference though:

In the browser, /getLog returned {"logs":[],"status":"SUCCESS"}
The database table workflow_run_logs is still empty
Logs of the Kubernetes pods are not suspicious and the respective row in workflow_run is:

[
  {
    "id": "0439165c-42c5-41ff-9934-d2e57b9b4036",
    "workflowId": "d99c3e7e-aad5-47ff-8bc1-c0a92be8cad6",
    "status": "SUCCESS",
    "logFile": null,
    "createdAt": "2024-02-02 07:49:44.311+00",
    "updatedAt": "2024-02-02 07:49:46.329+00",
    "createdBy": 1,
    "inputDataSizeBytes": "0",
    "outputDataSizeBytes": "0",
    "completedAt": "2024-02-02 07:49:46.021+00",
    "workflowSaveId": "31615c14-d127-407a-a797-6ae3ffb1cfc1",
    "triggerType": "manual",
    "blobDataDeletedAt": null,
    "triggerId": null,
    "environmentId": "d8136d4c-b99f-4d97-b87d-9b8d03e2d0b6",
    "callingRetoolEvent": null
  }
]

Alright, let me check with some other people on the behavior here across versions, as I know it's something we've been changing here not too long ago. You're still on 3.26.4 right? This definitely should work and does in other deployments I've spun up, just a matter of figuring out what's going wrong here.

And just to document my thoughts, the deleteBlobAndLogDataIfNoRetentionPolicy activity looks to always be logged in general. But within that, it should only delete the run's logs if organizations.workflowRunRetentionPeriodMins === 0. However, that same code path would set workflow_run.blobDataDeletedAt when that is executed. So that seems to imply the activity isn't deleting the logs (good), but they're never being set for this run in the first place (bad) :thinking:
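To make that reasoning concrete, here is a toy Python model of the behaviour described above. It is emphatically not Retool's actual code: `Run` and `delete_if_no_retention` are hypothetical names, and the only rule encoded is the one stated in this post (delete only when the retention period is exactly 0, and stamp blobDataDeletedAt whenever deletion happens).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Run:
    logs: List[str]
    blob_data_deleted_at: Optional[datetime] = None


def delete_if_no_retention(run: Run, retention_period_mins: Optional[int]) -> bool:
    """Toy model: purge logs only when the retention period is exactly 0."""
    if retention_period_mins == 0:
        run.logs.clear()
        # The same code path records when the data was deleted.
        run.blob_data_deleted_at = datetime.now(timezone.utc)
        return True
    return False  # unset (None) or positive retention: nothing is deleted


kept = Run(logs=["code1 finished"])
delete_if_no_retention(kept, 60)
print(kept.logs, kept.blob_data_deleted_at)

purged = Run(logs=["code1 finished"])
delete_if_no_retention(purged, 0)
print(purged.logs, purged.blob_data_deleted_at is not None)
```

Under this model, the state seen in the thread (no logs, blobDataDeletedAt still null) cannot come out of the deletion path, which is why the suspicion shifts to the logs never being written.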

Can you DM a zip of the full logs across the Retool containers that includes an instance of a workflow being triggered?