[Docker/Traefik] Resolving Intermittent Frontend Loading and 504 Errors Behind a Traefik Reverse Proxy

Traefik routing issue between frontend and backend containers

We encountered a frustrating issue in our Dockerized app environment where the frontend service intermittently failed to load and backend API calls hung or returned 504 Gateway Timeout errors. Our infrastructure consists of:

  • Frontend container (React app)
  • Backend API container (Go server, accessible at /api/...)
  • Database container (PostgreSQL)
  • Traefik reverse proxy managing HTTPS termination and routing

Despite having valid TLS certs and seemingly working routes, users experienced broken pages, incomplete frontend rendering, and stuck fetch requests to /api endpoints.

Initial Clues

An external TLS check looked fine:

curl -v https://app.example.com/api/health

This returned a valid TLS response, confirming the public certs and HTTPS entrypoint were working correctly. Yet, something was still breaking inside the network.

Frontend logs showed:

  • Failed requests to /api/* endpoints
  • Random stuck loading spinners

Traefik logs eventually revealed:

504 Gateway Timeout

This pointed to Traefik being unable to reach the backend container consistently.
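
If you want to reproduce this kind of diagnosis, enabling Traefik's access log and raising the log level makes these failures much easier to spot. The snippet below is a minimal sketch (the file layout and the traefik container name are assumptions on my part; adjust to your setup):

# traefik.yml (static configuration) - sketch, not our full config
log:
  level: DEBUG    # upstream dial errors are logged at debug level
accessLog: {}     # per-request lines including the HTTP status code

# Then grep the proxy's output for gateway errors
docker logs traefik 2>&1 | grep -E "504|Gateway Timeout"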

Root Cause: Misconfigured Docker Networks and Dynamic Subnet Assignment

Upon inspecting the running containers, we noticed a mismatch in network configuration:

  • The frontend, backend, and database were connected to app_network
  • The frontend and backend were also connected to traefik_network
  • Traefik itself was connected only to traefik_network

The traefik_network was explicitly configured with a static subnet (e.g., 172.24.0.0/16), whereas app_network was left to be auto-created without a specified subnet. That meant Docker could assign a different subnet range to app_network every time the app stack was torn down and recreated.
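
You can see this drift directly by inspecting both networks after recreating the stack; the pinned network keeps its subnet while the auto-created one may not (network names here match ours, adjust as needed):

docker network inspect traefik_network --format '{{range .IPAM.Config}}{{.Subnet}}{{end}}'
docker network inspect app_network --format '{{range .IPAM.Config}}{{.Subnet}}{{end}}'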

As a result, depending on the order of container creation and network assignment, some containers ended up on an unexpected subnet, and routing across the two networks became unreliable. Because Traefik sat only on traefik_network, any request to a service that was, at that moment, only reliably reachable via app_network could fail, which is exactly what produced the intermittent frontend load failures and 504 errors.
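
Roughly, the network layout in our compose files looked like this (a trimmed, illustrative sketch; service names are placeholders):

# docker-compose.yml (before) - the problematic layout
services:
  frontend:
    networks: [app_network, traefik_network]
  backend-api:
    networks: [app_network, traefik_network]
  db:
    networks: [app_network]
  traefik:
    networks: [traefik_network]   # Traefik has no view of app_network

networks:
  traefik_network:
    ipam:
      config:
        - subnet: 172.24.0.0/16   # pinned
  app_network: {}                 # no subnet -> Docker picks one on each recreation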

The Fix

The solution was simple and effective: we removed app_network entirely and consolidated all services (frontend, backend, database) under a single, shared traefik_network.
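
In compose terms, the change boiled down to something like this (again a trimmed sketch; we declare traefik_network as external because Traefik runs in its own stack, so adjust if yours lives in the same file):

# docker-compose.yml (after) - every service shares the one pinned network
services:
  frontend:
    networks: [traefik_network]
  backend-api:
    networks: [traefik_network]
  db:
    networks: [traefik_network]

networks:
  traefik_network:
    external: true    # created once, with its static subnet, alongside Traefik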

By doing this:

  • All services were reachable via service name resolution inside the same network
  • No dependency on dynamically assigned subnets or cross-network resolution
  • Traefik could consistently route to backend services without 504s

We also added a router label to explicitly bind the backend API route to the TLS entrypoint:

labels:
  - "traefik.http.routers.backend.entrypoints=websecure"

After these changes, the intermittent issues completely disappeared.

Why This Works

Traefik uses Docker’s internal DNS and network discovery to resolve service names. But this only works reliably if the reverse proxy and services are attached to the same network. Any mismatch or reliance on cross-network bridging (especially when Docker assigns new subnets on restart) introduces instability.
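
A related safeguard, if it fits your setup, is to tell Traefik's Docker provider explicitly which network to use when it builds backend URLs, so it never picks an address from a network it cannot route to. A minimal static-config sketch:

# traefik.yml (static configuration) - pin the provider to the shared network
providers:
  docker:
    network: traefik_network
    exposedByDefault: false   # only route to containers that opt in via labels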

By using a single shared network (traefik_network) for all services, we ensured:

  • Consistent container DNS resolution
  • Predictable routing
  • Elimination of network-related race conditions

Key Takeaways

  • Intermittent frontend issues are often caused by unreliable internal routing — not frontend code
  • A 504 Gateway Timeout from Traefik is a clear signal that it cannot reach the backend (or the backend is not responding in time)
  • Always simplify your networking: one shared network is more reliable than multiple bridges
  • Define entrypoints explicitly for clarity in routing behavior

Final Confirmation

After applying the fixes, both external and internal checks passed:

# From outside
curl -v https://app.example.com/api/health

# From inside Traefik
docker exec -it traefik curl http://backend-api:8080/api/health

Both returned successful responses with minimal latency, confirming stable routing and TLS termination.


If this post helped you solve a problem or gave you new insights, please upvote it and share your experience in the comments below. Your comments can help others who may be facing similar challenges. Thank you!