Redundant hosting using DNS

Part of the beginner's guide to website hosting tutorial

This is a technique that uses the DNS to provide redundant hosting across two or more servers. It doesn't require any special "management systems", and we have proved it to be an effective protection against server failure. Although we came up with this idea ourselves and have not seen the technique documented elsewhere, we don't claim to have "invented" it, since it seems a logical arrangement once DNS is understood.

Redundancy Schemes: If you have websites or internet services that are critical, or you cannot afford an outage if a server fails for any reason, there are a number of different redundancy schemes available. Having the services available on two geographically and topologically separated servers gives a very high probability that at least one of them is available at any one time, and many solutions address the issue of sending all traffic to one of them should the other fail. Many of these require a third management server, which again introduces a single point of failure and, of course, brings the cost and administration of a third server.

Using the DNS configuration for redundancy: DNS records are held on two or more name servers registered to the domain, and those records point to the servers providing the service (if you are unfamiliar with this then please read Domains and DNS servers). The name servers are handed out to requesting machines in turn and, thus, the servers providing the service share the internet traffic accordingly. To extend this arrangement to provide redundancy, each DNS server for the domain should be on the same physical machine as the services that it points to. In addition, the DNS records on each machine should give the IP address of that same machine and, as such, will be different on each server. In this way, should one of the servers fail, its DNS will fail with it and the requesting machine will ask one of the other name servers for a DNS record. That name server is on a physically different machine, and its DNS records also point to itself. This ensures that a DNS request is only answered by a machine that can also provide the service. Let's look at an example of one of our high traffic sites using this configuration:

www.metric-conversions.org: The name servers for this domain provide the following DNS records:

ns1.metric-conversions.org. 95.172.22.163 [GB]
metric-conversions.org. NS ns1.metric-conversions.org.
metric-conversions.org. NS ns2.metric-conversions.org.
metric-conversions.org. A 95.172.22.163
ns1.metric-conversions.org. A 95.172.22.163
ns2.metric-conversions.org. A 72.55.156.214
www.metric-conversions.org. CNAME metric-conversions.org.

ns2.metric-conversions.org. 72.55.156.214 [CA]
metric-conversions.org. NS ns1.metric-conversions.org.
metric-conversions.org. NS ns2.metric-conversions.org.
metric-conversions.org. A 72.55.156.214
ns1.metric-conversions.org. A 95.172.22.163
ns2.metric-conversions.org. A 72.55.156.214
www.metric-conversions.org. CNAME metric-conversions.org.
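
The key property of the two record sets above is that each machine answers with its own address for the site. As a rough illustration only, the Python sketch below models the two record sets as in-memory dictionaries and checks that property. The names SHARED, ZONES, resolve and every_zone_points_to_itself are hypothetical helpers made up for this example; this is not how a real name server stores or serves its records.

```python
# Illustrative model of the two record sets listed above (not a real DNS server).
# Records common to both machines.
SHARED = {
    ("metric-conversions.org.", "NS"): [
        "ns1.metric-conversions.org.",
        "ns2.metric-conversions.org.",
    ],
    ("ns1.metric-conversions.org.", "A"): ["95.172.22.163"],
    ("ns2.metric-conversions.org.", "A"): ["72.55.156.214"],
    ("www.metric-conversions.org.", "CNAME"): ["metric-conversions.org."],
}

# Each machine holds its own copy; only the A record for the domain differs,
# and it always gives the address of the machine holding that copy.
ZONES = {
    "95.172.22.163": {**SHARED, ("metric-conversions.org.", "A"): ["95.172.22.163"]},
    "72.55.156.214": {**SHARED, ("metric-conversions.org.", "A"): ["72.55.156.214"]},
}


def resolve(zone, name, max_hops=8):
    """Follow CNAME records inside one record set until an A record is found."""
    for _ in range(max_hops):
        if (name, "A") in zone:
            return zone[(name, "A")][0]
        if (name, "CNAME") in zone:
            name = zone[(name, "CNAME")][0]
            continue
        raise KeyError(name)
    raise RuntimeError("CNAME chain too long")


def every_zone_points_to_itself(zones, name):
    """True if each machine's record set resolves `name` to that same machine."""
    return all(resolve(zone, name) == ip for ip, zone in zones.items())


if __name__ == "__main__":
    for ip, zone in ZONES.items():
        print(ip, "answers with", resolve(zone, "www.metric-conversions.org."))
    print(every_zone_points_to_itself(ZONES, "www.metric-conversions.org."))
```

A check like this could be run against the real record sets before deploying them, since a copy that points at the other machine would defeat the redundancy.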

Normally, the two servers share the traffic fairly equally, providing load balancing. In practice, we have found that the nearest name server is generally used, so European visitors tend to get NS1 (in the UK) and North American visitors tend to get NS2 (in Canada). This gives the added benefit of serving from the closest server.

Should the first server go down, one of the following two scenarios will take place when accessing metric-conversions.org:

Scenario A
1) Request for name server, second name server ns2.metric-conversions.org is served
2) www.metric-conversions.org translates to metric-conversions.org which translates to 72.55.156.214
3) Request successfully made to working server.

Scenario B
1) Request for name server, first name server ns1.metric-conversions.org is served
2) ns1.metric-conversions.org fails. Re-request name server for metric-conversions.org, second name server ns2.metric-conversions.org is served
3) www.metric-conversions.org translates to metric-conversions.org which translates to 72.55.156.214
4) Request successfully made to working server.
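
The two scenarios above can be sketched as a small simulation. This is a simplified model under stated assumptions, not a real resolver: the lookup function and the NAME_SERVERS table are hypothetical, each name server is assumed to answer with its own address (as in the records above), and a failed machine is assumed to give no answer at all.

```python
# Simplified failover model: each name server lives on the machine it points to.
NAME_SERVERS = {
    "ns1.metric-conversions.org.": "95.172.22.163",
    "ns2.metric-conversions.org.": "72.55.156.214",
}


def lookup(order, down):
    """Try name servers in the given order, skipping machines that are down.

    Returns the address served and the list of name servers that were tried.
    """
    tried = []
    for ns in order:
        tried.append(ns)
        machine_ip = NAME_SERVERS[ns]
        if machine_ip in down:
            # The machine has failed, so its DNS has failed with it.
            continue
        # A working name server answers with its own address.
        return machine_ip, tried
    raise RuntimeError("no name server reachable")


if __name__ == "__main__":
    failed = {"95.172.22.163"}  # the first server is down

    # Scenario A: the second name server happens to be asked first.
    print(lookup(["ns2.metric-conversions.org.", "ns1.metric-conversions.org."], failed))

    # Scenario B: the first name server is asked, fails, and the second is tried.
    print(lookup(["ns1.metric-conversions.org.", "ns2.metric-conversions.org."], failed))
```

In both orderings the address returned is 72.55.156.214, the working server; the only difference is whether one failed attempt comes first.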

Of course, an ISP which has previously cached the DNS record will still try to access the failed server until the cached record expires, but by setting the TTL (time to live) for the DNS record to a few minutes, the outage in this case is kept to a minimum. More elaborate and complex solutions may have advantages over this method, but for many applications it provides a very simple and adequate way of implementing redundancy with just two servers.
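
To see why a short TTL matters, the sketch below works out how long a cache that already holds the failed server's address keeps handing it out. The function stale_window and the figures used (a 300 second TTL compared with a one day TTL) are illustrative assumptions, not values taken from our configuration, and the model assumes the cache honours the TTL exactly.

```python
def stale_window(ttl_seconds, age_at_failure_seconds):
    """Seconds a cache keeps serving the failed server's address.

    ttl_seconds: TTL of the cached record.
    age_at_failure_seconds: how long the record had been cached when the server failed.
    """
    if ttl_seconds < 0 or age_at_failure_seconds < 0:
        raise ValueError("times must be non-negative")
    # A record older than its TTL has already expired, so nothing stale remains.
    return max(0, ttl_seconds - age_at_failure_seconds)


if __name__ == "__main__":
    # Record cached 100 seconds before the failure.
    print(stale_window(300, 100))    # TTL of five minutes
    print(stale_window(86400, 100))  # TTL of one day
```

The worst case is a record cached just before the failure, where the stale period equals the full TTL, so a TTL of a few minutes bounds the outage for cached visitors to a few minutes.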

Our other tutorials: Making your website fast and Beginner's guide to search engine optimisation.

If you are interested in our hosting packages, please visit our website hosting page.