Getting On-Premise Deployment to Work with SSL

I am deploying on an EC2 instance in an AWS VPC. Everything works fine if I create an EC2 instance with a public DNS address and no SSL, but I have been unable to put the instance behind a load balancer with an SSL Route 53 endpoint in front: no combination of settings in docker.env and docker-compose.yml seems to work. I am terminating SSL at the load balancer, which forwards HTTP traffic to the EC2 instance on port 3000. Nothing I put in for DOMAINS, BASE_DOMAIN, and COOKIE_INSECURE seems to work. I can log into Retool and navigate around the screens, but if I try (for example) to execute the REST query in the provided Country Search sample, the result is a "Run Failed" status with a message reading "error: Unknown error".
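For reference, the relevant docker.env entries I've been experimenting with look something like this (hostnames and values are illustrative, not my real config):

```
# docker.env (illustrative values)
DOMAINS=retool.example.com -> http://api:3000
BASE_DOMAIN=https://retool.example.com
# I've tried both true and false here
COOKIE_INSECURE=true
```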

Hey there Stephen,

Are you getting any logs in the Retool container that might provide some additional details when these queries fail to run?

I'm not sure if I'm looking in the right place, but what I think are the api and jobs logs seem to show only memory-usage messages; nothing that looks like an error.

Hmm, how are you checking those logs? It sounds like you're seeing what I would expect, but I'm surprised that no errors show up when the queries fail.

Hey Stephen, I have an instance set up with SSL, behind a load balancer, running via docker-compose on an EC2 instance.

Here's an overview of my setup:

On the instance

  • docker.env:
...
DOMAINS=mysubdomain.mydomain.com -> http://api:3000
...
  • docker-compose.yml:
...

https-portal:
    
    ...
    environment:
      STAGE: 'production'

...
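For context, here's a fuller sketch of that service block. Only STAGE is from my actual file; the image name, port mappings, and links are assumed from the standard Retool on-prem docker-compose.yml, so check yours:

```yaml
https-portal:
  image: tryretool/https-portal:latest   # assumed; verify against your compose file
  ports:
    - '80:80'
    - '443:443'
  links:
    - api
  env_file: ./docker.env
  environment:
    STAGE: 'production'
```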

EC2

  • Application load balancer listening on :443 and routing to mytargetgroup
  • mytargetgroup routing traffic to :3000 of the EC2 instance's elastic IP

Route53

  • Hosted zone for mydomain.com
  • A Record mapping mysubdomain.mydomain.com to $myloadbalancerurl.us-east-1.elb.amazonaws.com
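If it helps to compare, that record can be expressed with the AWS CLI roughly like this (the hosted zone ID, alias hosted-zone ID, and DNS names are placeholders):

```shell
aws route53 change-resource-record-sets \
  --hosted-zone-id Z123EXAMPLE \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "mysubdomain.mydomain.com",
        "Type": "A",
        "AliasTarget": {
          "HostedZoneId": "Z35EXAMPLEALB",
          "DNSName": "myloadbalancer.us-east-1.elb.amazonaws.com",
          "EvaluateTargetHealth": false
        }
      }
    }]
  }'
```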

Hopefully that's helpful, let me know if you have any questions.


I run "docker container ls" and then "docker logs -f xxxxxx" for the different containers that are running. And actually, this morning I do see an error, although it is not directly connected to the problems showing up in the web app. The log for the https-portal container shows these messages (which seem to appear at the time the containers launch):

2021/08/25 01:21:01 [emerg] 178#178: bind() to 0.0.0.0:80 failed (98: Address already in use)
nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)

I am taking a look at why port 80 seems to be in use, and will post back when I find out what's going on.
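For anyone following along, these are the checks I'm running (standard Linux and Docker tooling, nothing Retool-specific):

```shell
# What process is listening on port 80 on the host?
sudo ss -ltnp 'sport = :80'
# Is another container already publishing port 80?
docker ps --filter "publish=80"
```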

Thanks so much; good to know that it is possible! This sounds very much like what I am doing. What do you set COOKIE_INSECURE to in your configuration?

Stephen.

Port 80 already being in use is definitely something I'd look into.

I have COOKIE_INSECURE=false

OK, I have the configuration as Kent describes. Looking at the log for https-portal as it starts up, it shows:
Verifying staging.mycompany.com...
Traceback (most recent call last):
  File "/bin/acme_tiny", line 197, in <module>
    main(sys.argv[1:])
  File "/bin/acme_tiny", line 193, in main
    signed_crt = get_crt(args.account_key, args.csr, args.acme_dir, log=LOGGER, CA=args.ca, disable_check=args.disable_check, directory_url=args.directory_url, contact=args.contact)
  File "/bin/acme_tiny", line 149, in get_crt
    raise ValueError("Challenge did not pass for {0}: {1}".format(domain, authorization))
ValueError: Challenge did not pass for staging.mycompany.com: {u'status': u'invalid', u'challenges': [{u'status': u'invalid', u'validationRecord': [{u'url': u'http://staging.mycompany.com/.well-known/acme-challenge/uzVYs8V9eRxsEM2Pwy-byRcNpbfomumHouHfl0rKxJ4', u'hostname': u'staging.mycompany.com', u'addressUsed': u'35.182.69.66', u'port': u'80', u'addressesResolved': [u'35.182.69.66', u'3.97.128.163']}], u'url': u'https://acme-v02.api.letsencrypt.org/acme/chall-v3/25028809470/JfXSyA', u'token': u'uzVYs8V9eRxsEM2Pwy-byRcNpbfomumHouHfl0rKxJ4', u'error': {u'status': 400, u'type': u'urn:ietf:params:acme:error:connection', u'detail': u'Fetching http://staging.mycompany.com/.well-known/acme-challenge/uzVYs8V9eRxsEM2Pwy-byRcNpbfomumHouHfl0rKxJ4: Timeout during connect (likely firewall problem)'}, u'validated': u'2021-08-25T15:43:46Z', u'type': u'http-01'}], u'identifier': {u'type': u'dns', u'value': u'staging.mycompany.com'}, u'expires': u'2021-09-01T15:43:45Z'}

Failed to sign staging.mycompany.com, is DNS set up properly?

Failed to obtain certs for staging.mycompany.com

(not actually "mycompany.com")

Any thoughts? I think the DNS is set up correctly; it is the same as for several other services running in the same AWS account.

Stephen.


Hmm, I have seen that error come up before when the deployment can't access the internet. If you're deploying Retool in a VPC that cannot access the public internet, Let's Encrypt won't be able to perform the challenge necessary to provision a certificate. In that case, you'll need to add your certificates manually.

The EC2 instance does have internet access; I ran "curl https://www.google.com" without a problem. It does seem to have something to do with the Let's Encrypt process not being able to validate our cert, but I'm not sure what the exact problem might be. Should I go down the path of manually adding the certificates even though the server has internet access?

As a test it would be good to try, if you have certs you can use, but I would like to figure out why Let's Encrypt is having trouble signing.

When we say "access to the internet," do we mean that the EC2 instance can reach out to the internet, or that there can also be inbound access FROM the internet? I've been doing some reading about Let's Encrypt and certbot, and it seems that for the process to work, the Let's Encrypt service needs to make an HTTP (not HTTPS) request to the server. In my case (and, I think, in Kent's configuration too) I don't see how this could work: Route 53 only sends 443 traffic to the load balancer, and the load balancer only forwards HTTP traffic to port 3000 on the EC2 instance, so nothing could ever be sent to port 80 on the EC2 instance. How would that Let's Encrypt check ever succeed in this configuration? Does the EC2 instance need a public IP (maybe just temporarily) to get through the required setup?
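As a quick sanity check of that theory, I can probe the challenge path from a machine outside the VPC (the token here is made up; the only question is whether port 80 is reachable at all):

```shell
# A timeout here matches the "Timeout during connect" error in the log above;
# a 404 would at least prove port 80 is reachable from the internet.
curl -v --max-time 10 http://staging.mycompany.com/.well-known/acme-challenge/test-token
```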

Hey @stephenmarsh - forgive me, I wasn't as clear as I could have been in my first post. The instance I was referencing is used to test a lot of different things, and that resulted in me including some information that isn't actually relevant to what you're trying to do.

You don't need the https-portal container at all. When you add a listener to your Application Load Balancer, there is a "Default SSL certificate" setting. You can choose the "From ACM (recommended)" option, and select a certificate or "Request a new ACM certificate". [screenshot attached]

Your setup should otherwise be correct. You should be able to comment out (or delete) the https-portal service in docker-compose.yml and restart the containers with sudo docker-compose up -d --remove-orphans.

This way, we're terminating SSL at the load balancer, and passing traffic to :3000 on your instance, where it will hit Retool.

Well, now I have to apologize to you guys. You were right: the configuration is fine. Since you were so confident that it should be working, I started to sniff around at some other things in our setup, and lo and behold, we have AWS WAF enabled on the load balancer with a bunch of default AWS rules. Any request going back to the Retool API with a URI in the body was getting blocked, so, for example, any time I tried to run a REST query the WAF blocked it.
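In case anyone else hits this, you can check whether a web ACL is attached to the load balancer with the AWS CLI (the load balancer ARN below is a placeholder):

```shell
# Which WAFv2 web ACL (if any) is associated with the ALB?
aws wafv2 get-web-acl-for-resource \
  --resource-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/app/my-alb/abc123
```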

Thanks for your help on this, and sorry again for not being aware of this earlier.

Stephen.
