High Availability (also known as HA) is the capability of a system to remain online regardless of the adverse events that might happen. Availability, then, is the property of a service being reachable when it is needed. As you might guess, availability can be measured as a percentage (from 0% to 100%); the closer you get to 100%, the more expensive such a system is to deploy. In common parlance, when you say a service has three nines of availability, you mean 99.9%.
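To make the "nines" concrete, here is a minimal sketch that converts a number of nines into an availability percentage and a yearly downtime budget. The function names and the 365-day year are my own assumptions for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day year


def availability_percent(nines: int) -> float:
    """Availability as a percentage for a given number of nines (3 -> 99.9)."""
    return 100.0 * (1.0 - 10.0 ** -nines)


def downtime_minutes_per_year(nines: int) -> float:
    """Downtime budget per year, in minutes, implied by that availability."""
    return MINUTES_PER_YEAR * 10.0 ** -nines


if __name__ == "__main__":
    for nines in range(1, 6):
        print(f"{nines} nines: {availability_percent(nines):.3f}% available, "
              f"about {downtime_minutes_per_year(nines):.1f} minutes of downtime per year")
```

Under these assumptions, three nines leaves a budget of 525.6 minutes (roughly 8.76 hours) of downtime per year, and each extra nine divides that budget by ten.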
As a security consultant holding the CISSP certification, I have not found any hard definition that states how many nines you need in order to claim high availability. In my experience, people start calling it high availability at three nines or better. But this is only a feeling.
"The cloud" is a very gray term that means someone else's computer. When we speak about servers, we think of VPSes in the cloud. Names such as DigitalOcean or Vultr jump to my mind right away. If you want an inexpensive, reliable, highly available system with some load balancing, this article will help you understand how it works.
Different Techniques for High Availability
High Availability can be achieved by applying countermeasures at different layers of the network model. In my experience, these are the usual ones:
- Dynamic routing: by using protocols such as BGP or RIP, an autonomous system can be relocated to a secondary data centre. As you see, this applies to network blocks.
- Floating IP: two servers monitor each other at short intervals. Both servers share a single IP, called the floating IP. When one server fails, the other takes over the IP and starts receiving all the traffic. Both servers should share all their state to avoid breaking sessions, for example HTTP sessions or SIP sessions. Both servers must be plugged into the same broadcast domain (that is, the same VLAN). As you see, this technique applies to the whole server.
- Smart DNS: this is the technique I will talk about in this article. In short, the DNS server decides how to answer each request. Depending on some network factors, it can resolve to IP-1 for one endpoint but to IP-2 for another. As you can see, this technique applies to domain names, not servers. I will talk more about this below.
- Native Support for the Protocol: some protocols such as SMTP, SIP and XMPP are built with fault tolerance in mind. Protocols of this kind need nothing more than a good implementation of their specification.
Native Support for the Protocol
Protocols such as SMTP, SIP and XMPP are perfect examples of this. SMTP is a special case, as it uses MX records instead of SRV records. However, all three have a fallback mechanism built in. SRV records are an ideal option if the protocol you are using supports them.
You should always keep in mind that SRV records won't give you High Availability automatically. The servers within the farm should be able to take over each other's work, so your servers should share information among themselves.
In the VoIP world, SRV records are heavily used to accomplish High Availability. However, the decision to honour them rests purely with the endpoint. I have found that some IP phones honour SRV records perfectly and others simply don't. SRV records can provide fallback capabilities; however, they are static by nature, and the server with the highest priority will get all the load until it fails.
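To illustrate what honouring SRV records means on the endpoint side, here is a simplified sketch of the ordering described in RFC 2782: targets with the lowest priority number are tried first, and targets sharing a priority are picked at random in proportion to their weight. The hostnames, the `SrvRecord` type and the `try_connect` callback are hypothetical, and a real client would obtain the records from a DNS lookup instead of a hard-coded list.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class SrvRecord:
    priority: int  # lower number = tried first
    weight: int    # relative share among records with the same priority
    port: int
    target: str


def order_srv(records: List[SrvRecord]) -> List[SrvRecord]:
    """Return the records in the order a client should try them."""
    ordered: List[SrvRecord] = []
    for priority in sorted({r.priority for r in records}):
        group = [r for r in records if r.priority == priority]
        while group:
            if sum(r.weight for r in group) == 0:
                pick = random.choice(group)  # only zero weights left: pick uniformly
            else:
                pick = random.choices(group, weights=[r.weight for r in group], k=1)[0]
            ordered.append(pick)
            group.remove(pick)
    return ordered


def first_reachable(records: List[SrvRecord],
                    try_connect: Callable[[SrvRecord], bool]) -> Optional[SrvRecord]:
    """Walk the ordered list and return the first target that accepts a connection."""
    for record in order_srv(records):
        if try_connect(record):
            return record
    return None


# Hypothetical farm: two primaries sharing the load 60/40, plus one backup.
records = [
    SrvRecord(10, 60, 5060, "sip1.example.com"),
    SrvRecord(10, 40, 5060, "sip2.example.com"),
    SrvRecord(20, 0, 5060, "backup.example.com"),
]
```

With this data, the backup is only contacted once both priority-10 targets have failed, which is the static fallback behaviour described above.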
Obtaining High Availability through Smart DNS Technique
DNS is the protocol responsible for translating hostnames into numeric IP addresses. Nowadays, it is very rare to find a server that doesn't use DNS. If you want fault tolerance, using DNS instead of hard-coding IP addresses is the correct approach. Not to mention that it is easier to remember www.inside-out.xyz than 18.104.22.168 for IP version 4 or fe80::f816:3eff:fe33:ea02 for IP version 6.
The Smart DNS server should be capable of making a real-time decision before delivering any record. Helped by other scripts, it could resolve to the IP where latency is lowest or server load is minimal. If a node goes down, the Smart DNS should notice and take the faulty server's IP out of the pool. When the server is back, its IP should be restored to the pool.
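The pool behaviour described above can be sketched as follows. This is only an illustration of the idea, not the algorithm of any particular product: the `Node` and `Pool` names are hypothetical, the addresses come from the documentation range 192.0.2.0/24, and the health and latency figures are assumed to be fed in by external monitoring scripts.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    ip: str
    latency_ms: float = 0.0
    healthy: bool = True


class Pool:
    """Tracks node state and picks the IP to hand out for the next query."""

    def __init__(self, nodes: List[Node]) -> None:
        self.nodes: Dict[str, Node] = {n.ip: n for n in nodes}

    def report(self, ip: str, healthy: bool, latency_ms: Optional[float] = None) -> None:
        """Called by the monitoring scripts after each probe of a node."""
        node = self.nodes[ip]
        node.healthy = healthy
        if latency_ms is not None:
            node.latency_ms = latency_ms

    def answer(self) -> Optional[str]:
        """Return the healthy IP with the lowest latency, or None if all are down."""
        candidates = [n for n in self.nodes.values() if n.healthy]
        if not candidates:
            return None
        return min(candidates, key=lambda n: n.latency_ms).ip


pool = Pool([Node("192.0.2.10", latency_ms=40.0), Node("192.0.2.20", latency_ms=15.0)])
pool.report("192.0.2.20", healthy=False)  # node goes down: removed from answers
fallback_ip = pool.answer()               # the remaining healthy node
pool.report("192.0.2.20", healthy=True)   # node is back: restored to the pool
```

A faulty node is never deleted, only skipped while unhealthy, so restoring it is just another report from the probe.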
The Smart DNS technique should apply to at least A (for IP version 4), AAAA (for IP version 6) and SRV records. The A and AAAA records should have a smaller TTL than SRV records. Because of the fallback built into SRV records, they can have a longer TTL with minimal service disruption.
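To see why the TTL of A and AAAA records matters, here is a rough back-of-the-envelope sketch. It assumes a simple model that is my own, not taken from any product: a node is marked down after a fixed number of consecutive failed probes, and resolvers honour the TTL exactly. All parameter names and values are hypothetical.

```python
def worst_case_stale_seconds(check_interval_s: float,
                             failures_to_mark_down: int,
                             ttl_s: float) -> float:
    """Rough upper bound on how long clients may still be sent to a dead IP."""
    # Time for the monitor to notice the failure...
    detection_s = check_interval_s * failures_to_mark_down
    # ...plus the time a resolver may keep serving the cached, stale answer.
    return detection_s + ttl_s


# Same monitoring, two different TTLs on the A record.
short_ttl = worst_case_stale_seconds(check_interval_s=10, failures_to_mark_down=3, ttl_s=30)
long_ttl = worst_case_stale_seconds(check_interval_s=10, failures_to_mark_down=3, ttl_s=3600)
```

Under this model the TTL quickly dominates the outage window, which is why the address records should carry the short TTL while SRV records, whose clients can fall back on their own, can afford a longer one.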
As always, as with Native Support for the Protocol, your cluster nodes should have a way to share information.
This technique is one of the best options if you are using servers in the cloud and, in my opinion, the best if those servers (VPSes) are hosted by different companies. Because you control neither the Autonomous System nor the physical server, manipulating the DNS is where you need to focus. All you need to do is create a VPN between the nodes (if you don't want to pass data in the clear over the Internet), get a valid domain name and host that zone with any of the existing implementations.
Implementations of a Smart DNS
In my career, I have come across only two implementations of this Smart DNS approach. I don't doubt that there are more out there, but my Google-Fu is low.
- F5 Big-IP: there is not much to say; this is a proprietary appliance you get from F5. It is more than a DNS server, but I am focusing only on that part.
- Low Latency PowerDNS Add-On: this is a piece of software I have developed myself. It is not an appliance, but a plugin that works with PowerDNS to add the smart DNS capabilities. To be fair, at the time of writing my plugin has only two algorithms, while F5 has at least five. But F5, being an appliance, requires access to a data centre, which automatically rules out everyone who wants to use VPSes in the cloud from different cities and different companies.
Interested in having yours?
Don't waste time; contact me. There is always a way to figure out an affordable solution for you.
Good Luck!