[Docker/Traefik] Resolving Intermittent Frontend Loading and 504 Errors Behind a Traefik Reverse Proxy

Traefik routing issue between frontend and backend containers

We encountered a frustrating issue in our Dockerized app environment where the frontend service intermittently failed to load and backend API calls hung or returned 504 Gateway Timeout errors. Our infrastructure consists of:

  • Frontend container (React app)
  • Backend API container (Go server, accessible at /api/...)
  • Database container (PostgreSQL)
  • Traefik reverse proxy managing HTTPS termination and routing

Despite having valid TLS certs and seemingly working routes, users experienced broken pages, incomplete frontend rendering, and stuck fetch requests to /api endpoints.

Initial Clues

An external TLS check looked fine:

curl -v https://app.example.com/api/health

This returned a valid TLS response, confirming the public certs and HTTPS entrypoint were working correctly. Yet, something was still breaking inside the network.

Frontend logs showed:

  • Failed requests to /api/* endpoints
  • Random stuck loading spinners

Traefik logs eventually revealed:

504 Gateway Timeout

This pointed to Traefik being unable to reach the backend container consistently.
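
If you want to reproduce this kind of diagnosis, enabling Traefik's access log and raising the log level makes these failures much easier to spot. The snippet below is a minimal sketch (the file layout and the traefik container name are assumptions on my part; adjust to your setup):

# traefik.yml (static configuration) - sketch, not our full config
log:
  level: DEBUG    # upstream dial errors are logged at debug level
accessLog: {}     # per-request lines including the HTTP status code

# Then grep the proxy's output for gateway errors
docker logs traefik 2>&1 | grep -E "504|Gateway Timeout"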

Root Cause: Misconfigured Docker Networks and Dynamic Subnet Assignment

Upon inspecting the running containers, we noticed a mismatch in network configuration:

  • The frontend, backend, and database were connected to app_network
  • The frontend and backend were also connected to traefik_network
  • Traefik itself was connected only to traefik_network

The traefik_network was explicitly configured with a static subnet (e.g., 172.24.0.0/16), whereas app_network was left to be auto-created without a specified subnet. That meant Docker could assign a different subnet range to app_network every time the app stack was torn down and recreated.
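
You can see this drift directly by inspecting both networks after recreating the stack; the pinned network keeps its subnet while the auto-created one may not (network names here match ours, adjust as needed):

docker network inspect traefik_network --format '{{range .IPAM.Config}}{{.Subnet}}{{end}}'
docker network inspect app_network --format '{{range .IPAM.Config}}{{.Subnet}}{{end}}'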

As a result, depending on the order of container creation and network assignment, some containers ended up on an unexpected subnet, and routing across the two networks became unreliable. Because Traefik sat only on traefik_network, any request to a service that was, at that moment, only reliably reachable via app_network could fail, which is exactly what produced the intermittent frontend load failures and 504 errors.
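
Roughly, the network layout in our compose files looked like this (a trimmed, illustrative sketch; service names are placeholders):

# docker-compose.yml (before) - the problematic layout
services:
  frontend:
    networks: [app_network, traefik_network]
  backend-api:
    networks: [app_network, traefik_network]
  db:
    networks: [app_network]
  traefik:
    networks: [traefik_network]   # Traefik has no view of app_network

networks:
  traefik_network:
    ipam:
      config:
        - subnet: 172.24.0.0/16   # pinned
  app_network: {}                 # no subnet -> Docker picks one on each recreation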

The Fix

The solution was simple and effective: we removed app_network entirely and consolidated all services (frontend, backend, database) under a single, shared traefik_network.
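
In compose terms, the change boiled down to something like this (again a trimmed sketch; we declare traefik_network as external because Traefik runs in its own stack, so adjust if yours lives in the same file):

# docker-compose.yml (after) - every service shares the one pinned network
services:
  frontend:
    networks: [traefik_network]
  backend-api:
    networks: [traefik_network]
  db:
    networks: [traefik_network]

networks:
  traefik_network:
    external: true    # created once, with its static subnet, alongside Traefik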

By doing this:

  • All services were reachable via service name resolution inside the same network
  • No dependency on dynamically assigned subnets or cross-network resolution
  • Traefik could consistently route to backend services without 504s

We also added a router label to explicitly bind the backend API route to the TLS entrypoint:

labels:
  - "traefik.http.routers.backend.entrypoints=websecure"

After these changes, the intermittent issues completely disappeared.

Why This Works

Traefik uses Docker’s internal DNS and network discovery to resolve service names. But this only works reliably if the reverse proxy and services are attached to the same network. Any mismatch or reliance on cross-network bridging (especially when Docker assigns new subnets on restart) introduces instability.
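
A related safeguard, if it fits your setup, is to tell Traefik's Docker provider explicitly which network to use when it builds backend URLs, so it never picks an address from a network it cannot route to. A minimal static-config sketch:

# traefik.yml (static configuration) - pin the provider to the shared network
providers:
  docker:
    network: traefik_network
    exposedByDefault: false   # only route to containers that opt in via labels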

By using a single shared network (traefik_network) for all services, we ensured:

  • Consistent container DNS resolution
  • Predictable routing
  • Elimination of network-related race conditions

Key Takeaways

  • Intermittent frontend issues are often caused by unreliable internal routing — not frontend code
  • A 504 Gateway Timeout from Traefik is a clear signal that it cannot reach the backend (or the backend is not responding in time)
  • Always simplify your networking: one shared network is more reliable than multiple bridges
  • Define entrypoints explicitly for clarity in routing behavior

Final Confirmation

After applying the fixes, both external and internal checks passed:

# From outside
curl -v https://app.example.com/api/health

# From inside Traefik
docker exec -it traefik curl http://backend-api:8080/api/health

Both returned successful responses with minimal latency, confirming stable routing and TLS termination.


If this post helped you solve a problem or gave you new insights, please upvote it and share your experience in the comments below. Your comments can help others who may be facing similar challenges. Thank you!