1. My goal:
Uninterrupted availability of our self-hosted Retool instance, independent of external service outages.
2. Issue:
We repeatedly experience full Retool outages that correlate in time with Cloudflare incidents. Investigating Docker logs, we found that during downtime the Retool container fails to resolve p.tryretool.com via DNS, suggesting our self-hosted instance has a runtime dependency on a Cloudflare-proxied Retool endpoint. We would not expect a self-hosted deployment to be affected by Cloudflare availability.
5. What I've tried so far:
Correlating outage timestamps with Cloudflare incident reports: the timing matches consistently. We have not yet found a configuration option to disable or isolate the dependency on p.tryretool.com.
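To make the correlation less dependent on incident reports alone, a small probe could record DNS resolution results for p.tryretool.com with timestamps. This is a minimal sketch in Python, assuming it runs in the same Docker network as the Retool container so that it uses the same resolver. The function names and the 30-second interval are illustrative choices, not anything Retool provides.

```python
import socket
import time
from datetime import datetime, timezone

HOST = "p.tryretool.com"  # hostname seen failing in the Docker logs


def probe(host, resolver=socket.getaddrinfo):
    """Resolve host once; return (ok, seconds_elapsed, error_text).

    The resolver argument is injectable so the function can be
    exercised without network access.
    """
    start = time.monotonic()
    try:
        resolver(host, 443)
        return True, time.monotonic() - start, ""
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return False, time.monotonic() - start, str(exc)


def format_result(host, ok, elapsed, error):
    """Build one UTC-timestamped line for later correlation."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    status = "ok" if ok else f"FAIL {error}"
    return f"{stamp} {host} {elapsed:.3f}s {status}"


def run(host=HOST, interval_seconds=30):
    """Probe in a loop; the interval is an arbitrary assumption."""
    while True:
        print(format_result(host, *probe(host)), flush=True)
        time.sleep(interval_seconds)
```

Calling `run()` and redirecting the output to a file would give a timeline of resolution failures that can be compared against both the Retool outage windows and the Cloudflare incident timestamps.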
Questions:
What is p.tryretool.com used for? Is this telemetry/analytics (e.g. Segment)?
Is p.tryretool.com proxied through Cloudflare?
Does Retool degrade or block when DNS resolution for p.tryretool.com fails?
Is there a supported way to disable this dependency, e.g. via DISABLE_TELEMETRY=true or similar?
Hi @RaykGlasenapp, thank you for the details and sharing the error message!
p.tryretool.com is Retool's analytics and usage reporting endpoint. Self-hosted instances send usage data here in the background on user actions (page loads, query runs, etc.). It is listed as a required egress target in our self-hosted network requirements (see "Self-hosted Retool egress connections" in the Retool Docs).
A DNS resolution failure for p.tryretool.com causing a full instance outage is unexpected - worth digging into for sure! Could you share the following?
Docker daemon logs: looking for repeated [resolver] failed to query external DNS server errors for p.tryretool.com within a short time window
Retool container logs: looking for ETIMEDOUT, ECONNREFUSED, or socket hang up errors, or a sudden drop in output (indicative of a stalled event loop)
systemd-resolved logs: looking for timeouts or repeated failed upstream queries
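If it helps when going through the logs, the patterns listed above can be counted with a short script. This is a rough sketch in Python, not an official Retool tool; the category names and the `count_matches` and `main` helpers are made up for illustration, and the match strings are simply the ones mentioned in this post.

```python
import re
import sys
from collections import Counter

# Match strings taken from the list above; the keys are illustrative labels.
PATTERNS = {
    "docker_dns": re.compile(r"\[resolver\] failed to query external DNS server"),
    "etimedout": re.compile(r"ETIMEDOUT"),
    "econnrefused": re.compile(r"ECONNREFUSED"),
    "socket_hang_up": re.compile(r"socket hang up"),
}


def count_matches(lines):
    """Count how many lines match each pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def main():
    """Read log lines from stdin and print one count per pattern."""
    counts = count_matches(sys.stdin)
    for name in PATTERNS:
        print(f"{name}: {counts[name]}")
```

With a call to `main()` added at the bottom of the file, log output can be piped into the script. Running it separately on the logs from an outage window and from a quiet period should show whether these errors actually spike during downtime.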
One additional data point from our side: the outages appear to coincide with periods where Cloudflare is experiencing issues.
Since p.tryretool.com is fronted by Cloudflare, we're wondering whether upstream Cloudflare instability could be triggering retry storms, blocked requests, or degraded DNS behavior inside the Retool services.
To clarify our concern: even if analytics delivery fails, we would expect the platform to continue operating normally with telemetry degraded, not for the instance itself to become unavailable.
Are there any known dependencies in the self-hosted stack where failed requests to Cloudflare-backed Retool endpoints can cascade into application instability, resource exhaustion, or request blocking?