As most readers are aware, there are really only two layer 4 protocols that people have to deal with on a daily basis: UDP and TCP. The primary difference is that UDP is a datagram protocol, which means that your packets are just sent off to the far end, whereas TCP is a connection-oriented stream protocol.
There’s actually another way to look at this: session-oriented vs. session-less communications. I’m sure there is research literature on this, and I’ll rewrite this if someone can point me at the appropriate terms. Until then, however, I’m going to somewhat arbitrarily define a session-less communication as one which:
- Contains only two packets: a request, and a response. That is, you send a query packet, the server sends a response, and that’s it.
- Uses transactions capable of being processed independently
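To make that definition concrete, here is a minimal sketch in Python of a two-packet exchange over UDP on the loopback interface. The function names (`serve_one_request`, `sessionless_query`, `demo`) and the `answer:` payload format are made up for illustration; nothing here is a real DNS implementation.

```python
import socket
import threading
from typing import Tuple


def serve_one_request(server_sock: socket.socket) -> None:
    """Answer exactly one datagram, keeping no state between requests."""
    data, client_addr = server_sock.recvfrom(1024)
    server_sock.sendto(b"answer:" + data, client_addr)


def sessionless_query(server_addr: Tuple[str, int], payload: bytes,
                      timeout: float = 2.0) -> bytes:
    """Send one request packet and wait for one response packet."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.settimeout(timeout)
        client.sendto(payload, server_addr)   # packet 1: the request
        response, _ = client.recvfrom(1024)   # packet 2: the response
        return response


def demo() -> bytes:
    """Run a throwaway server and client against each other on 127.0.0.1."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    worker = threading.Thread(target=serve_one_request, args=(server,), daemon=True)
    worker.start()
    try:
        return sessionless_query(server.getsockname(), b"example.com")
    finally:
        worker.join(timeout=2.0)
        server.close()
```

There is no `connect()`, no handshake, and no teardown: the whole transaction is the `sendto` and the `recvfrom`.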
Now, with a TCP socket, when you actually call connect(), your TCP stack sends a SYN packet (“hey, I want a connection”), the destination sends a SYN-ACK back (“hey, I see you want a connection, I want one too”), and then your stack sends an ACK back to the server (“OK, let’s do this”). All that happens transparently, before your application request/response is sent. This means a TCP connection is never session-less as I’ve defined it: you’ve got an absolute minimum of four packets (and a practical minimum of five) in order to do anything useful.
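As a rough tally of those packet counts, here is a small Python sketch. The five-packet case follows directly from the handshake described above. The four-packet case is my assumption about where the "absolute minimum" comes from: the request riding along on the final handshake ACK, which TCP permits but most stacks don't do by default. Connection teardown is ignored, and the function names are purely illustrative.

```python
from typing import List


def tcp_exchange(piggyback_request: bool = False) -> List[str]:
    """List the packets in one TCP request/response, ignoring teardown."""
    packets = ["SYN", "SYN-ACK"]
    if piggyback_request:
        # Assumed best case: the request data is carried on the final ACK.
        packets.append("ACK+request")
    else:
        packets += ["ACK", "request"]
    packets.append("response")
    return packets


def udp_exchange() -> List[str]:
    """List the packets in one session-less UDP request/response."""
    return ["request", "response"]
```

Comparing `len(tcp_exchange())`, `len(tcp_exchange(piggyback_request=True))`, and `len(udp_exchange())` gives the five, four, and two packets discussed above.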
Similarly, no VoIP or IPTV applications, and very few financial multicast applications, are session-less. Even though they’re built on UDP, and so don’t have session-oriented features built into the layer 4 protocol, the applications themselves build some sort of session internally. They need multiple packets to do actual work, so they’re session-oriented, as defined.
Intuitively, applications which use session-less communications are the ideal users for anycast: you send your request, you get your response, and it’s a one-off. If your server dies and there’s a failover, it will be transparent to your application, since each transaction is independent (yes, that’s not strictly true, see the last post).
Unfortunately, only DNS and Kerberos-via-UDP are truly session-less (as opposed to “stateless,” which Kerberos plainly is not, and DNS isn’t in practice due to caching), which means they’re the only things that are considered truly ideal for an anycast application.
DNS, in particular, is nice because of the failure path in most DNS clients’ state diagrams, which looks like this:
- Server is already dead
- Client sends a query
- Client waits for n seconds (20-60 seconds, typically) in order to figure out the server is dead
- Client sends a query to the alternate server
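The failure path above can be sketched in a few lines of Python. This is not how any particular resolver library is implemented; `resolve_with_failover` and the `send_query` callable are hypothetical names, and the transport is injected so that the timeout behaviour can be shown without a real network.

```python
from typing import Callable, Sequence, Tuple


def resolve_with_failover(
    servers: Sequence[str],
    name: str,
    send_query: Callable[[str, str, float], str],
    timeout: float = 20.0,
) -> Tuple[str, float]:
    """Try each server in order.

    Returns the answer plus the seconds spent waiting on dead servers.
    """
    wasted = 0.0
    for server in servers:
        try:
            return send_query(server, name, timeout), wasted
        except TimeoutError:
            # The client only learns a server is dead by waiting out the timeout.
            wasted += timeout
    raise TimeoutError(f"no server answered for {name!r}")


def fake_send_query(server: str, name: str, timeout: float) -> str:
    """Stand-in transport: pretend the first server is dead."""
    if server == "192.0.2.1":
        raise TimeoutError
    return "198.51.100.7"
```

Calling `resolve_with_failover(["192.0.2.1", "192.0.2.2"], "example.com", fake_send_query)` returns the answer from the alternate, but only after a full timeout has been charged against the dead primary.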
Without anycast, any user trying to hit a dead nameserver will incur that massive multi-second penalty, every time, for as long as the nameserver is dead. With anycast, a user is only affected if they happen to send a query in the window between the DNS server failing and the network routing around it, that is, before IP SLA has detected the failure and your IGP has re-converged to send the traffic to your alternate. That window is around 5-10 seconds, depending on how you’ve tuned IP SLA and your IGP re-convergence timers.
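For a back-of-the-envelope feel for the difference, here is a small Python sketch. It assumes a constant query rate aimed at the failed server, which real traffic won't match, and the function name and example numbers are invented for illustration.

```python
def delayed_queries(outage_s: int, queries_per_s: int, exposure_s: int) -> int:
    """Count queries that reach the dead server during an outage.

    exposure_s is how long queries keep being sent to the dead server:
    the whole outage without anycast, or only the detection and
    re-convergence window with anycast.
    """
    return queries_per_s * min(outage_s, exposure_s)


# Hypothetical one-hour outage at 100 queries per second.
without_anycast = delayed_queries(3600, 100, exposure_s=3600)
with_anycast = delayed_queries(3600, 100, exposure_s=10)
```

Under those assumptions, `without_anycast` is 360000 and `with_anycast` is 1000, and the anycast figure stays the same however long the outage lasts.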
So doing DNS via anycast is almost a no-brainer: you allow users to continue doing whatever it is they’re doing unaffected by outages, maintenance, etc. Almost makes you wish certain onerous, US-based, cable-Internet providers could figure it out.
This isn’t to say that you can only use anycast for DNS and Kerberos/UDP. While the definition I gave of session-oriented vs. session-less communications is correct as far as it goes, and demonstrates that anycast is best suited to distributed UDP-based services, that doesn’t mean anycast is only suited to distributed session-less services.
In fact, it’s just as well suited to any service that you’d want geographic mirrors of and that can reasonably tolerate a multi-second failover time. The reason is simple: if a session-oriented service fails, you’re still going to have to discover the failure and re-establish your session, and it’s a lot easier to rely on any given application, framework, or library to do a simplistic reconnect than it is to expect it to let you configure a failover policy.
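A "simplistic reconnect" of that kind might look like the following Python sketch. The client knows only one address (in an anycast deployment, the shared service address) and just retries it; the function name, retry count, and delays are assumptions picked for illustration.

```python
import socket
import time
from typing import Tuple


def connect_with_retry(address: Tuple[str, int], attempts: int = 5,
                       delay_s: float = 1.0, timeout_s: float = 3.0) -> socket.socket:
    """Reconnect to the same address until it answers or attempts run out."""
    last_error = None
    for attempt in range(attempts):
        try:
            return socket.create_connection(address, timeout=timeout_s)
        except OSError as exc:
            # Refused, unreachable, or timed out: wait and try the same address.
            last_error = exc
            if attempt < attempts - 1:
                time.sleep(delay_s)
    raise ConnectionError(
        f"could not reach {address} after {attempts} attempts"
    ) from last_error
```

Note that the client carries no list of alternates and no failover policy. Once the network has re-converged, the next attempt against the same address simply lands on a different mirror.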
Otherwise, some things that are “session-oriented” are simply well suited to anycast: an n-way multimaster LDAP directory, for example, works well with anycast. As does a site-local yum repository (you don’t have to do stupid DNS tricks, or configure each machine with knowledge of the “closest” mirror; you just need to let the network know about the mirrors, and let it handle the routing for you). I’m likely going to look at this for a distributed Cassandra-based project I’m working on as well.
In the next instalment I’ll post the combined configuration from the last few posts in a single place, in case it was hard to follow through all the background and explanation.
There is DCCP besides TCP and UDP :). It is datagram based too.
Thanks for sending that over, I hadn’t heard of it but it seems interesting. Given the uptake of IPv6, I expect to see this usable in production in about 20 years. 😉