Upgrade to 3.52.1-stable fails on ECS/Fargate

Hi!

Attempts to deploy 3.52.1-stable on ECS/Fargate fail on the deployment of the 'retool' container with:

June 06, 2024 at 17:08 (UTC+2:00)	sed: can't read ./dist/mobile/*.js: No such file or directory	retool
June 06, 2024 at 17:08 (UTC+2:00)	not untarring the bundle

For reference, my task definition is as follows:

  RetoolTask:
    Type: AWS::ECS::TaskDefinition
    Properties:
      NetworkMode: awsvpc
      Cpu: !Ref RetoolVCpu
      Memory: !Ref RetoolMemory
      Family: "retool"
      TaskRoleArn: !Ref "RetoolTaskRole"
      ExecutionRoleArn: !Ref "RetoolExecutionRole"
      RequiresCompatibilities:
        - FARGATE
      ContainerDefinitions:
        - Name: "retool"
          Essential: "true"
          Image: !Ref "Image"
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-group: !Ref "CloudwatchLogsGroup"
              awslogs-region: !Ref "AWS::Region"
              awslogs-stream-prefix: "SERVICE_RETOOL"
          Environment:
            - Name: NODE_ENV
              Value: production
            - Name: SERVICE_TYPE
              Value: MAIN_BACKEND,DB_CONNECTOR,DB_SSH_CONNECTOR
            - Name: "FORCE_DEPLOYMENT"
              Value: !Ref "Force"
            - Name: POSTGRES_DB
              Value: !If [CreateDatabase, "hammerhead_production", !Ref DBName]
            - Name: POSTGRES_HOST
              Value:
                !If [
                  CreateDatabase,
                  !GetAtt [RetoolRDSInstance, Endpoint.Address],
                  !Ref DBHost,
                ]
            - Name: POSTGRES_SSL_ENABLED
              Value: "true"
            - Name: POSTGRES_PORT
              Value: "5432"
            - Name: POSTGRES_USER
              Value:
                !Join [
                  "",
                  [
                    "{{resolve:secretsmanager:",
                    !Ref RetoolRDSSecret,
                    ":SecretString:username}}",
                  ],
                ]
            - Name: POSTGRES_PASSWORD
              Value:
                !Join [
                  "",
                  [
                    "{{resolve:secretsmanager:",
                    !Ref RetoolRDSSecret,
                    ":SecretString:password}}",
                  ],
                ]
            - Name: JWT_SECRET
              Value:
                !Join [
                  "",
                  [
                    "{{resolve:secretsmanager:",
                    !Ref RetoolJWTSecret,
                    ":SecretString:password}}",
                  ],
                ]
            - Name: ENCRYPTION_KEY
              Value:
                !Join [
                  "",
                  [
                    "{{resolve:secretsmanager:",
                    !Ref RetoolEncryptionKeySecret,
                    ":SecretString:password}}",
                  ],
                ]
            - Name: LICENSE_KEY
              Value: !Ref RetoolLicenceKey
            - Name: SANDBOX_DOMAIN
              Value: !Ref SandboxDomain
            - Name: FORWARDABLE_SAME_DOMAIN_COOKIES_ALLOWLIST
              Value: access_token_cookie,refresh_token_cookie
            - Name: DISABLE_INTERCOM
              Value: "true"
            - Name: ALLOW_SAME_ORIGIN_OPTION
              Value: "true"
            - Name: POSTGRES_SSL_REJECT_UNAUTHORIZED
              Value: "false"

            # # Remove below when serving Retool over https
            # - Name: COOKIE_INSECURE
            #   Value: "true"
          PortMappings:
            - ContainerPort: 3000
              # HostPort: '80'
          Command: ["./docker_scripts/start_api.sh"]

Any clues?

Thanks :wink:

jfp

Hi, any idea what is causing this error? Our upgrade is blocked.

Thanks!

jfp

Hi,

I still have my upgrade issue. I hoped that 3.52.2-stable would help, but the problem persists.

I now understand that the message "sed: can't read ./dist/mobile/*.js: No such file or directory" is a side effect, not the root cause.

Looking for other error messages, I found the following, but nothing stands out to me:

Database migrations are up to date.
RetoolDB credentials not found, skipping setup...
Environment variables:
[long list...]
Code executor (http://localhost:3004) not healthy or unreachable
license check http response code: 200
[Master] Detected 2 cpus, starting 2 workers
[Worker] Worker 63 started and listening on 3001
"kernelOutput": "Command failed: dmesg -T| grep -E -i -m1 -B100 'killed process'\ndmesg: read kernel buffer failed: Operation not permitted\n",
 "message": "[Master] Worker 63 died (code null, signal SIGKILL). 1 workers left",
[Master] replacing worker
./docker_scripts/start_api.sh: line 66: 23 Killed node --openssl-legacy-provider --no-experimental-fetch bundle/main.js

I'm desperate ¯\_(ツ)_/¯

Thanks!

SIGKILL indicates that the Retool worker exceeded its memory allocation and was terminated. There's a callout on our current release page noting that upgrading to v3.52 will increase the memory usage of your Retool deployment. You may need to adjust the memory limits when upgrading.

What version are you upgrading from? Also, could you share your current memory and CPU allocations? That would help us understand your deployment and the urgency/flexibility on your end.

Thanks :pray:

Hi @AbbeyHernandez

Thanks for the tip, you solved my problem :wink:!

On my test instance, I had a rather low memory setting: 2048 MB. I changed this to 4096 MB and the update went fine. The CPU setting was 1024 (1 vCPU in the Fargate context), and I raised it to 2048 as well.

For the record, I was migrating from 3.33.30-stable.

Thanks again!
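
For anyone landing here with the same symptom, the change can be sketched as follows in the template's parameters. This is only an illustration: the task definition above references `RetoolVCpu` and `RetoolMemory`, but the parameter definitions themselves were not posted, so the `Type`, `Description`, and `Default` fields below are assumptions.

```yaml
Parameters:
  # Assumed parameter shape; only the names RetoolVCpu and RetoolMemory
  # come from the task definition posted above.
  RetoolVCpu:
    Type: String
    Description: Fargate CPU units for the Retool task (1024 = 1 vCPU)
    Default: "2048" # previously 1024
  RetoolMemory:
    Type: String
    Description: Fargate task memory in MiB
    Default: "4096" # previously 2048; workers were SIGKILLed at this size on 3.52.x
```

Fargate only accepts certain CPU/memory pairings, so it is worth checking the chosen pair against the AWS documentation before deploying.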

jfp

Fantastic! So glad to hear it :slight_smile:.