
A Network Edge Philosophy

Published on: 2025-04-07

By: Ian McCutcheon (draft refined with assistance from Google AI)

Routing Deep, NATing Late: A Network Edge Philosophy

My IT career started back in 1993, and I've seen the network edge evolve dramatically. I remember the early days of access lists blocking a few ports, shifting eventually to stateful firewalls meticulously allowing only specific traffic; the default posture flipped entirely, from permit to deny. Recently, I've been reflecting on how we handle edge security today, especially with firewalls managing complex NAT configurations, DMZs, and intricate routing.

Over the years, I've worn many hats: router guy, WAN guy, LAN guy, firewall guy, load balancer guy – often several at once. One constant in technology is surprise; solving the unexpected problems is probably what keeps many of us engaged.

Through this journey, I've developed a personal set of guiding principles for designing the network edge – that critical point where the internet, untrusted networks, and our securely delivered services converge. These are my internal compass; reality, of course, always finds ways to surprise us. Still, I find these principles valuable:

My Core Principles for the Network Edge:

  1. Leverage Natural Routing as Deep as Possible.
  2. Perform NAT Only at the Last Possible Moment.
  3. Keep NAT Simple ("Lazy NAT").

(A quick note: My focus here is primarily on traffic originating from untrusted networks like the internet towards internal services. Internal NAT scenarios might benefit from these ideas, but I haven't applied as much thought there, so it's outside the scope of this post.)

1. Deep Routing: Let Routers Route

What do I mean by "deep routing"? Simply put: allow the original destination IP address to persist as far into your network infrastructure as logically possible.

I know some network veterans might balk at the idea of public IP addresses appearing in internal routing tables or DMZs. Perhaps there's wisdom there, or maybe it's a holdover from times when organizations had vast public /16s assigned to all internal systems, NAT wasn't as seamless, and stateful firewalls were less mature. My take is that we shouldn't automatically dismiss routing public IPs past the edge firewall if it simplifies the overall design, provided robust security policies are enforced regardless.

Consider a typical DMZ setup with firewalls protecting load balancers which front your application servers. Deep routing means your edge routers and firewalls route the incoming traffic using its original public destination IP all the way to the load balancer. The load balancer, sitting in the DMZ (protected by firewalls), becomes the natural point to handle the Destination NAT (DNAT), translating the public IP to the private IP of the selected backend server. You've used routing effectively right up to the point where translation is unavoidable.
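To make that path concrete, here is a minimal, purely illustrative Python sketch of the flow described above. The addresses, device functions, and backend pool are all hypothetical; the point is simply that the public destination IP survives the router and the firewall untouched, and only the load balancer rewrites it.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str

# Hypothetical addressing: a public VIP routed all the way to the DMZ
# load balancer, which fronts private backend servers.
PUBLIC_VIP = "203.0.113.10"
BACKEND_POOL = ["10.0.20.11", "10.0.20.12"]

def edge_router(pkt: Packet) -> Packet:
    # Deep routing: forward on the original public destination, no NAT.
    return pkt

def edge_firewall(pkt: Packet) -> Packet:
    # Security policy enforcement only (allow/deny); still no translation.
    assert pkt.dst_ip == PUBLIC_VIP, "drop anything not explicitly allowed"
    return pkt

def load_balancer(pkt: Packet, rr_index: int = 0) -> Packet:
    # DNAT happens only here, at the last point where it is unavoidable:
    # the public VIP is translated to a private backend from the pool.
    backend = BACKEND_POOL[rr_index % len(BACKEND_POOL)]
    return replace(pkt, dst_ip=backend)

inbound = Packet(src_ip="198.51.100.7", dst_ip=PUBLIC_VIP)
delivered = load_balancer(edge_firewall(edge_router(inbound)))
print(delivered)  # Packet(src_ip='198.51.100.7', dst_ip='10.0.20.11')
```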

2. Late NAT: Translate Only When Necessary

This principle flows directly from Deep Routing. If you're routing deep, you are inherently delaying NAT. In our DMZ example, the load balancer is the "last possible moment" in the path where you must perform DNAT because the backend servers use private addresses.

Why delay? It keeps the configuration on upstream devices (routers, edge firewalls) simpler. Their job is routing and security policy enforcement, not complex address translation that can be handled more appropriately downstream.

(Could you route public IPs even further, maybe directly to servers? Yes, in some specific scenarios – like one case I recall where public IPs landed directly on backend servers to satisfy a tricky vendor requirement. I'd call that "uncommon depth," facilitated by robust surrounding security. But for most typical web service deployments, the load balancer is the practical and logical place to draw the line for DNAT.)

3. Lazy NAT: Keep Translation Simple

Okay, so Deep Routing and Late NAT help avoid unnecessary NAT. But what about when NAT is required? This is where "Lazy NAT" comes in. It's about the how, not just the when and where.

Modern firewalls are incredibly capable NAT devices. I know this well. My point isn't that they can't handle complex translation; it's that, with deep routing and late NAT in place, they usually don't need to take on that complexity at the edge.

Handling Outbound Traffic (Source NAT)

While the focus so far has been on inbound traffic and DNAT, what about outbound connections initiated from your internal networks? My approach here complements the inbound strategy: let the firewalls handle outbound Source NAT (SNAT).

The edge firewall is the natural choke point for outbound traffic leaving your trusted environment. Applying SNAT here (often PAT, to conserve public IPs) provides policy control, hides internal addressing, and aligns with the firewall's role as the security boundary. The "Lazy NAT" principle still applies: keep the SNAT policy as simple as possible, and rely on the main firewall rules for actual traffic enforcement.
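As a rough sketch of what the edge firewall is doing for outbound connections, the Python below models a bare-bones PAT table: every internal connection gets its own port on a single shared public address, so the policy itself stays trivially simple. The addresses and port range are made up for illustration.

```python
import itertools

# Hypothetical public address used for outbound PAT at the edge firewall.
PUBLIC_SNAT_IP = "203.0.113.1"
_ports = itertools.count(49152)   # ephemeral ports handed out in order
_pat_table: dict[tuple[str, int], tuple[str, int]] = {}

def snat_outbound(src_ip: str, src_port: int) -> tuple[str, int]:
    """Translate an internal (ip, port) pair to the shared public address.

    Many internal hosts share one public IP; the allocated port keeps
    their connections distinguishable for return traffic.
    """
    key = (src_ip, src_port)
    if key not in _pat_table:
        _pat_table[key] = (PUBLIC_SNAT_IP, next(_ports))
    return _pat_table[key]

def reverse_lookup(public_port: int) -> tuple[str, int] | None:
    """Map a reply arriving on the public IP back to the internal host."""
    for inside, (_, port) in _pat_table.items():
        if port == public_port:
            return inside
    return None

print(snat_outbound("10.0.5.23", 51000))   # ('203.0.113.1', 49152)
print(snat_outbound("10.0.5.24", 51000))   # ('203.0.113.1', 49153)
print(reverse_lookup(49153))               # ('10.0.5.24', 51000)
```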

A Note on Load Balancer Deployment: Routed vs. Proxy Mode

How you deploy your load balancers significantly impacts your ability to follow the Deep Routing principle effectively.

When possible, design for inline/routed load balancer deployments, where the load balancer sits in the forwarding path (often as the default gateway for its backend servers) and only has to perform DNAT, leaving the client's source IP intact. Proxy (one-arm) deployments, by contrast, generally require the load balancer to also SNAT traffic to its own address so that return traffic flows back through it. The routed approach aligns best with leveraging natural routing and minimizing unnecessary NAT complexity.
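A small sketch makes the trade-off visible. Both hypothetical modes below DNAT the VIP to a backend; the proxy-mode path additionally SNATs to the load balancer's own address, which is what forces return traffic back through it and hides the real client IP from the servers. All addresses are invented for illustration.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str

# Hypothetical addresses for illustration only.
VIP, BACKEND, LB_SELF_IP = "203.0.113.10", "10.0.20.11", "10.0.20.2"

def lb_routed_mode(pkt: Packet) -> Packet:
    # Inline/routed: DNAT only. The backend still sees the real client
    # source IP, and replies pass back through the LB because it sits
    # in the data path (e.g. as the backends' default gateway).
    return replace(pkt, dst_ip=BACKEND)

def lb_proxy_mode(pkt: Packet) -> Packet:
    # Proxy/one-arm: DNAT plus mandatory SNAT to the LB's own address,
    # so the backend's reply returns to the LB instead of bypassing it.
    # The cost: the client IP is hidden from the servers unless it is
    # re-added at another layer (e.g. an X-Forwarded-For header).
    return replace(pkt, src_ip=LB_SELF_IP, dst_ip=BACKEND)

client = Packet(src_ip="198.51.100.7", dst_ip=VIP)
print(lb_routed_mode(client))  # src 198.51.100.7 preserved
print(lb_proxy_mode(client))   # src rewritten to 10.0.20.2
```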

Why This Philosophy?

Why advocate for this approach? Three main reasons:

  1. I Abhor Technical Debt: Unnecessary complexity (like mandatory LB SNAT in proxy mode, or convoluted bi-directional NAT rules, which often signal design issues) introduced today becomes a maintenance headache tomorrow.
  2. Challenge Legacy Assumptions: Just because something was considered wise or necessary in the past doesn't mean it holds true today. Technology evolves, and our designs should reflect current capabilities.
  3. Operational Simplicity: Aligning function with device (routers/firewalls route and enforce policy, LBs balance and optionally NAT when required) makes troubleshooting and changes more straightforward.

By letting routers route, deploying load balancers intelligently, and performing NAT only when and as simply as needed, we can build more manageable, resilient, and operationally efficient network edges. Security policy remains paramount, but it doesn't need to be entangled with unnecessarily complex address translation schemes.


Glossary of Terms

For readers who might appreciate a quick definition of some terms used above: