Sunday, September 28, 2014

An AWS VPN: Part III, Routing

The last two posts in this series (part I, IPSec and part II, L2TP) covered enough to get an L2TP/IPSec VPN connection up, to the point where arbitrary traffic can be exchanged between the client and the server.  But there's still a missing feature.  Remember this picture from part II?

If we assigned a third network for the VPN, how does the client know that the protected network is even back there to send traffic to?  It's not the remote end of the VPN link, so traffic for it will get routed via the gateway and fail at some point.

The answer is that someone conveniently threw in an extra hack for us: the VPN client sends a DHCPINFORM message over the L2TP connection, and the server just has to respond to it with a few vital options.


The normal DHCP protocol involves Discover, Offer, Request, Acknowledge, and possibly much later, Release.  Each of these pretty much flows into the next; it's possible for multiple servers to respond to Discover with an Offer, so in those cases the client gets to pick one to Request specifically.  The Acknowledge serves to tell the client that its Request was okay—the server hasn't given out that address in the meantime, and the server is now aware not to give out the address until the lease expires or the client Releases it.

The Inform message was added to enable clients to ask the server about options they need that weren't sent with the Offer, without disturbing the underlying IP address leases.  It's this informational, non-state-changing message that's piggybacked for further L2TP configuration.
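As a rough illustration of the message types just described, here is a small Python sketch of the DHCP message type codes (option 53 in RFC 2132).  The helper touches_lease and the LEASE_CHANGING set are hypothetical names of mine, and the split between "changes a lease" and "doesn't" is a simplification, not something the protocol spells out in these terms.

```python
from enum import IntEnum


class DhcpMessageType(IntEnum):
    """Values carried in DHCP option 53 (message type)."""
    DISCOVER = 1
    OFFER = 2
    REQUEST = 3
    DECLINE = 4
    ACK = 5
    NAK = 6
    RELEASE = 7
    INFORM = 8


# Simplified assumption: client messages that can alter lease state.
# INFORM is deliberately absent; it only asks for configuration options.
LEASE_CHANGING = {
    DhcpMessageType.REQUEST,
    DhcpMessageType.DECLINE,
    DhcpMessageType.RELEASE,
}


def touches_lease(msg_type: int) -> bool:
    """Return True if this client message type may change a lease."""
    return DhcpMessageType(msg_type) in LEASE_CHANGING


if __name__ == "__main__":
    for t in DhcpMessageType:
        print(f"{t.value}: {t.name:<8} touches lease: {touches_lease(t)}")
```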

Classless Static Routes

There's one DHCP option that will prove particularly useful to us, and it goes by the name of classless static routes.  Many years ago, Microsoft implemented this option inside the space reserved for vendor private extensions.  It proved useful enough to standardize, at which point it got a new number since it wasn't vendor private, and then a bunch of people on the Internet complained that Microsoft "did their own thing."  Yes.  Yes, they did.  In the space that was reserved for that very thing.

The point of the option is that its value is a netblock plus a router to use for reaching it, and the router doesn't have to be on that netblock.  So, we can deliver a message like "send traffic for the protected network via the VPN server" and (as long as that network doesn't conflict with the private network at the client side) traffic for the hosts on the protected network will be sent via the VPN server.
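To make the wire format concrete, here is a minimal Python sketch of how one route is encoded for this option under RFC 3442: one byte of prefix length, then only the significant octets of the destination, then the four octets of the router.  The function name and the addresses in the example are my own assumptions for illustration, not the values from this setup.

```python
import ipaddress


def encode_classless_route(destination: str, router: str) -> bytes:
    """Encode one (netblock, router) pair in RFC 3442 format."""
    net = ipaddress.IPv4Network(destination)
    # Only the octets covered by the prefix are sent: /8 -> 1, /17 -> 3, /0 -> 0.
    significant = (net.prefixlen + 7) // 8
    return (
        bytes([net.prefixlen])
        + net.network_address.packed[:significant]
        + ipaddress.IPv4Address(router).packed
    )


if __name__ == "__main__":
    # Hypothetical example: 10.0.0.0/8 reachable via 192.168.1.1.
    print(encode_classless_route("10.0.0.0/8", "192.168.1.1").hex())
```

Several routes are sent by concatenating these encodings in one option value.  The Microsoft variant (249) carries the same payload under a different option number.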

But, it won't quite work out of the box.

More Packet Games

The VPN server needs to be set up to handle forwarding IP packets, or else the packets will be dropped on the floor.  Most Linux distributions do not expect to be set up as network equipment, so they assume they're only going to see packets to/from themselves, and everything else is an attack.  However, the capability is simply disabled in software, so it's a pretty easy fix.

In /etc/sysctl.conf:
net.ipv4.ip_forward = 1
I neglected to mention this in the first version of part I, but there's something else in this file that makes the ipsec verify command happier.  Still in /etc/sysctl.conf:
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
These two options disable sending or processing of ICMP REDIRECT messages, which are used in the event that multiple routers are visible to a host and it sends traffic to the wrong one.

The sysctl.conf file is only read at boot (or when reloaded with sysctl -p), so we also have to write the new values out to make them take effect immediately:
sudo sysctl -w net.ipv4.ip_forward=1
I like the sysctl form of the command instead of writing files in /proc/sys, because it also prints the actual value used, unlike the echo 1 | sudo tee ... trick.
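If you want to double-check the file afterward, something like the following Python sketch can report which of the settings above are still missing or different.  It is only an assumption-laden helper: the function names are mine, it parses simple "key = value" lines and skips comments, and it does not look at the running kernel's values at all.

```python
# Settings discussed above, as they would appear in /etc/sysctl.conf.
WANTED = {
    "net.ipv4.ip_forward": "1",
    "net.ipv4.conf.default.send_redirects": "0",
    "net.ipv4.conf.default.accept_redirects": "0",
}


def parse_sysctl_conf(text: str) -> dict:
    """Parse 'key = value' lines, ignoring blanks and # or ; comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def missing_settings(text: str, wanted: dict = WANTED) -> dict:
    """Return the wanted settings that are absent or set differently."""
    current = parse_sysctl_conf(text)
    return {k: v for k, v in wanted.items() if current.get(k) != v}


if __name__ == "__main__":
    with open("/etc/sysctl.conf") as f:
        for key, value in missing_settings(f.read()).items():
            print(f"missing or different: {key} = {value}")
```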

Now that we can forward packets, things still aren't quite right.  How do other hosts on the protected network know to route back through the VPN server to reach the client?  They don't.  Normally, I'd change the global routing table at the gateway, but of course I don't have access to the AWS network gear to do that.  And, I don't really want "all of AWS" sending traffic back to my VPN client, just my own resources within.

The disappointing answer is that we need more NAT: the VPN server needs to masquerade outgoing connections.  (This happens POSTROUTING, hence the need to enable forwarding; otherwise the packets can't reach this stage.)
iptables -t nat -A POSTROUTING -o eth0 -s -j MASQUERADE

With the rule in place, we just need to turn on the iptables service with chkconfig iptables on and save the ruleset with service iptables save.  At least as long as Amazon Linux keeps sysvinit.

Handling DHCP

With the actual routing set up, all we need is to get a DHCP server to handle the DHCPINFORM message from the client.  This would be straightforward, except that ISC DHCP—the one in the standard dhcp package—doesn't work.  It won't listen on non-broadcast interfaces.  There are a few threads from years ago, with vague handwavy "well it needs to work some broadcast-specific magic to work!" or "it doesn't support the ppp protocol!" kinds of answers, but apparently nobody's worked on them in the meantime.  It really doesn't work.

dnsmasq, on the other hand, is pretty awesome about this.  It will happily listen and reply on PPP interfaces.

The configuration looks like this (I set up /etc/dnsmasq.d/90-l2tp.conf for this and enabled the includedir directive in /etc/dnsmasq.conf, so that the package manager can stay clear(er) of my changes):
# keep bogus-priv OFF, upstream does know about 10.x.x.x addrs
# 3: router, 6: DNS, 121/249: static routes
We avoid forwarding unqualified hostnames (like "foo") to AWS for resolution, and we avoid answering any DHCP or DNS traffic to us from AWS itself.  Next, we let dnsmasq know it's authoritative for the networks it applies to (with eth0 excluded, that's the loopback and PPP interfaces).  The DHCP range is set up with the special keyword static to disable actually allocating addresses.  Again, they're handled by L2TP already.

Finally, we can declare the options we want to send.  Option number 3 is the router option, which declares the default gateway on a regular DHCP network.  RFC 3442 requires any client that understands classless static routes to ignore the router option when both are sent.  But, I still wanted the fallback.

Option 6 specifies a DNS server to use; this doesn't seem to have an effect in my configuration, but I wish it did.  (I'll come back to that soon.)

The final option pair (121 and 249) are the RFC and Microsoft identifiers for the classless static route option, respectively.  Here, we're specifying that 10.x.x.x is reachable via our own IP.

With all that up and running, we have this structure in effect:

That's actually a lot of moving parts.  It amazes me that it works as well as it does, since very little of it was originally specified with this specific scenario in mind.


I haven't been able to get the client to automatically resolve names over the VPN.  The names of resources I want to access behind AWS are set up with ALIAS records in Route 53, which return the public IP to the Internet—and the private IP inside of AWS.  In other words, it's a classic split-horizon setup.  But, for some reason, the VPN clients seem to be unaware of the possibility that the network they're being connected to might have split-horizon DNS, so they keep resolving all the names publicly.

I can route private-addressed traffic just fine, but I have no solution for getting those private addresses from the DNS names.

The "obvious" solution to this problem, of switching on the "Send All Traffic" option at the client, trades one devil for another: public traffic gets routed to the server just to be bounced back out to the Internet, consuming double AWS bandwidth, adding latency, and potentially limiting throughput.  Also, all open connections drop any time the VPN connects or disconnects.

I'm worried I'm missing something, because it seems like after ten years, we should have some VPN protocol that was designed to be a VPN and gracefully handles this scenario.  But if there is one, I haven't heard of it, and anything that would fit the bill is clearly not included in popular client systems already.


tcpdump was completely useless at this layer.  It can't see the interface before it exists, and by the time tcpdump can start running on it, the DHCPINFORM has already come and gone.

Most helpful was the DHCP server log; second-most-helpful was writing an iptables rule to -j LOG traffic flying by for the server.  That, at least, confirmed the request and reply were connecting properly.

1 comment:

ciciban said...

Re the internal DNS problem: how about using a forwarder in dnsmasq that sends the requests to the internal AWS resolver, usually .2 of the VPC?