Last week I had the opportunity to attend a session of the four-day class, Red Hat Enterprise Clustering and Storage Management (RH436), held in Toronto. It was a busy week, and that lovely view of the CN Tower above, as seen from my hotel room window, had to suffice for experiencing the city. Fortunately I’ve been here before, looked through the Glass Floor, and generally done the tourist thing. So let’s get down to business.
Purpose Of The Trip
At the radiology practice in which I work, we’ve long relied on Red Hat Enterprise Linux as the operating system underpinning our PACS (Picture Archiving and Communication System), which is the heart of our medical imaging business. For a while, most of the rest of our back-end systems ran atop the Microsoft family of server products, just as our workstations and laptops run Microsoft Windows Professional. But over the last couple of years, the Microsoft-centric approach has gradually started to shift for us, as we build and deploy additional solutions on Linux. (The reasons for this change have a lot to do with the low cost and abundance of various open-source infrastructure technologies as compared to their Microsoft licensed equivalents.) But as we build out and begin to rely on additional applications running on Linux, we have to invest time in making these platforms as reliable and fault-tolerant as possible.
Fault Tolerance, Generally Speaking
The term ‘fault tolerance’ is fairly self-explanatory, though in practice it can cover a substantial amount of ground where technical implementations are concerned. Perhaps it’s best thought of as eliminating single points of failure everywhere we can. At my employer, and perhaps for the majority of businesses our size and larger, there’s already a great deal of fault tolerance underneath any new ‘server’ that we deploy today. For starters, our SAN storage environment includes fault-tolerant RAID disk groups, redundant storage processors, redundant Fibre Channel paths to the storage, redundant power supplies on redundant electrical circuits, etc. Connected to the storage is a chassis containing multiple redundant physical blade servers, all running VMware’s virtualization software, including their High Availability (HA) and Distributed Resource Scheduler (DRS) features. Finally, we create virtual Microsoft and Linux servers on top of all this infrastructure. Those virtual servers get passed around from one physical host to another – seamlessly – as the workload demands, or in the event of a hardware component failure. That’s a lot of redundancy. But what if we want to take this a step further, and implement fault tolerance at the operating system or application level, in this case leveraging Red Hat Enterprise Linux? That is where Red Hat clustering comes into play.
Before we go any further, we should note that Red Hat lists the following prerequisites in their RH436 course textbook: “Students who are senior Linux systems administrators with at least five years of full-time Linux experience, preferably using Red Hat Enterprise Linux,” and “Students should enter the class with current RHCE credentials.” Neither of those applies to me, so what you’re about to read is filtered through the lens of someone who is arguably not in the same league as the intended audience. Then again, we’re all here to learn.
What Red Hat Clustering Is…
In Red Hat parlance, the term ‘clustering’ can refer to multiple scenarios, including simple load-balancing, high-performance computing clusters, and finally, high availability clusters. Today we’ll focus on the last of these, provided by Red Hat’s High Availability Add-On, an extra-cost module that starts at $399/year per 2-processor host. With Red Hat’s HA add-on, we’re able to cluster instances of Apache web server, a file system, an IP address, MySQL, an NFS client or server, an NFS/CIFS file system, OpenLDAP, Oracle 10g, PostgreSQL, Samba, a SAP database, Sybase, Tomcat or a virtual machine. We’re also able to cluster any custom service that launches via an init script, and which returns status appropriately. Generally speaking, a clustered resource will run in an active-passive configuration, with one node holding the resource until it fails, at which time another node will take over.
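To make “returns status appropriately” a little more concrete, here is a minimal Python sketch. It assumes the custom init script follows the LSB convention for the status action, where exit code 0 means running and 3 means stopped. The function names and the script path argument are hypothetical and only for illustration; this is not how the cluster software itself is implemented.

```python
import subprocess

# Exit codes defined by the LSB convention for an init script's "status" action.
LSB_STATUS = {
    0: "running",
    1: "dead, but PID file exists",
    2: "dead, but lock file exists",
    3: "not running",
    4: "status unknown",
}


def describe_status(exit_code: int) -> str:
    """Translate a status exit code into a readable state."""
    return LSB_STATUS.get(exit_code, "reserved or unrecognized code")


def is_healthy(exit_code: int) -> bool:
    """Assumption: only exit code 0 counts as a healthy service."""
    return exit_code == 0


def check_service(script_path: str) -> int:
    """Run '<script> status' and return its exit code (script_path is hypothetical)."""
    return subprocess.run([script_path, "status"]).returncode
```

A script that always exits 0 from its status action, even when the daemon is dead, would hide a failure from whatever is monitoring it, which is why the exit code matters.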
…And What Red Hat HA Clustering Is Not
Less than two weeks prior to the RH436 class, I somehow managed to get through a half-hour phone conversation with a Red Hat engineer without touching on one fundamental requirement of HA that, when later identified, shaped my understanding of Red Hat clustering going forward. So perhaps the following point merits particular attention: Any service clustered via Red Hat’s HA add-on that also uses storage – say Apache or MySQL – requires that the cluster nodes have shared access to block-level storage. Let’s read it again: Red Hat’s HA clustering requires that all nodes have shared access to block-level storage, the type typically provided by an iSCSI or Fibre Channel SAN. Red Hat HA passes control of this shared storage back and forth among nodes as needed, rather than having some built-in facility for replicating a cluster’s user-facing content from one node to another. For this reason and others, we can’t simply create discrete Red Hat servers here and there and combine them into a cluster, with no awareness of, nor regard for, our underlying storage and network infrastructure. Yet before anyone goes dismissing any potential use cases out of hand, remember that like much of life and technology, the full story is always just a bit more complicated.
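The idea of handing one shared volume back and forth, rather than replicating content between nodes, can be pictured with a toy model. This is only a sketch in Python: the class, the function and the node names are hypothetical, and real clusters enforce exclusive access through the cluster software and storage layer rather than a Python object.

```python
class SharedVolume:
    """Toy model of a shared block device that one node holds at a time."""

    def __init__(self, name):
        self.name = name
        self.owner = None  # node currently holding the volume, if any

    def mount(self, node):
        # Refuse a second writer: two nodes writing to the same
        # non-clustered file system would corrupt it.
        if self.owner is not None and self.owner != node:
            raise RuntimeError(f"{self.name} is already held by {self.owner}")
        self.owner = node

    def unmount(self, node):
        if self.owner == node:
            self.owner = None


def relocate(volume, source, target):
    """Hand the volume from one node to another: release first, then take over."""
    volume.unmount(source)
    volume.mount(target)
    return volume.owner
```

Note that there is a single copy of the data throughout; only the owner changes.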
Let’s begin by talking about how we might implement a traditional Red Hat HA cluster. The following steps are vastly oversimplified, as a lot of planning is required around many of these actions prior to execution. We’re not going to get into any command-line detail in today’s discussion, though that would make for an interesting post down the road.
- We’ll begin with between two and sixteen physical or virtual servers running Red Hat Enterprise Linux with the HA add-on license. The physical or virtual servers must support power fencing, a technology that allows a surviving node to keep a failed node from writing to shared storage by shutting the failed node down. This is supported on physical servers by Cisco, Dell, HP, IBM and others, and is also supported on VMware.
- We’ll need one or more shared block-level storage instances accessible to all nodes, though used by only one node at a time. In a traditional cluster, we’d make this available via an iSCSI or Fibre Channel SAN.
- All nodes should be on the same network segment in the same address space, though it’s wise to isolate cluster communication to a separate VLAN from published services. Multicast, IGMP and gratuitous ARP must be supported on these segments, and there should be no traditional layer 3 routing separating one cluster node from another.
- We’d install a web-based cluster management application called Luci on a non-cluster node. We’re not concerned about fault-tolerance of this management tool, as a new one can be spun up at a moment’s notice and pointed at an existing cluster.
- Then we’d install a corresponding agent called Ricci (or likely the more all-encompassing “High Availability” and “Resilient Storage” groups from the Yum repository) on each cluster node, assign passwords, and set the cluster services to start on boot.
- At this point we’d likely log into the Luci web interface, create a cluster, add nodes, set up fencing, set up failover, create shared resources (like an IP address, a file system or an Apache web service) and add those resources to a service group. If that sounds like a lot, it is. We could spend hours or days on this one bullet the first time around.
- Before we declare Mission Accomplished, we’ll want to restart each node in the cluster and test every failover scenario that we can think of. We don’t want to assume that we’ve got a functional cluster without proving it.
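Some of the prerequisites above lend themselves to a quick sanity check before any installation begins. The following Python sketch is one way to do that, under stated assumptions: the function name and the example addresses are hypothetical, and treating “same network segment” as “same IP subnet” is a simplification that says nothing about VLANs, multicast or fencing support.

```python
import ipaddress

MIN_NODES, MAX_NODES = 2, 16  # node count range described above


def validate_plan(node_ips, subnet, luci_ip):
    """Return a list of problems with a planned cluster; empty means none found."""
    problems = []
    network = ipaddress.ip_network(subnet)

    if not MIN_NODES <= len(node_ips) <= MAX_NODES:
        problems.append(
            f"node count {len(node_ips)} is outside {MIN_NODES}-{MAX_NODES}"
        )

    # Every node should live in the same subnet, with no routing in between.
    for ip in node_ips:
        if ipaddress.ip_address(ip) not in network:
            problems.append(f"{ip} is not in {subnet}")

    # Luci is meant to run on a machine that is not part of the cluster.
    if luci_ip in node_ips:
        problems.append("Luci should run on a non-cluster node")

    return problems
```

For example, `validate_plan(["10.0.0.11", "10.0.0.12"], "10.0.0.0/24", "10.0.0.5")` returns an empty list, while moving one node to a different subnet produces a single complaint. A clean result here is no substitute for the failover testing in the last step.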
What About Small Environments Without a SAN?
It’s conceivable that someone might want to cluster Red Hat servers in an environment without a SAN at all. Or perhaps one has a SAN, but they’ve already provisioned the entire thing for use by VMware, and they’d rather not start carving out LUNs to present directly to every new clustered scenario that they deploy. What then? Well, there are certainly free and non-free virtual iSCSI SAN products including FreeNAS, Openfiler and others. Some are offered in several forms including a VMware VMDK file or virtual appliance. They can be installed and sharing iSCSI targets in minutes, where previously we had none. Some virtual iSCSI solutions even offer replication from one instance to another, analogous to an EMC MirrorView or similar. In addition to eliminating yet another single point of failure, SAN replication provides a bit of a segue into what we’re going to talk about next.
What About Geographic Fault Tolerance?
As mentioned early on, at my office we already have several layers of fault tolerance built into our computing environment at our primary data center. When looking into Red Hat HA, our ideal scenario might involve clustering a service or application across two data centers, separated in our case by around 25 miles, 1 Gbit/s of network bandwidth and a 1 ms response time. Can we do it, and what about the shared storage requirement? Fortunately Red Hat supports certain scenarios of Multi-Site Disaster Recovery Clusters and Stretch Clusters. Let’s take a look at a few of the things involved. Be aware that there are other requirements.
- A Stretch Cluster, for instance, requires the data volumes to be replicated via hardware or 3rd-party software so that the nodes at each site have access to a replica.
- Further, a Stretch Cluster must span no more than two sites, and must have the same number of nodes at each location.
- Both sites must share the same logical network, and routing between the two physical sites is not supported. The network must also offer LAN-like latency that is less than or equal to 2 ms.
- In the event of a site failure, human intervention is required to continue cluster operation, since a link failure would prevent the remaining site from initiating fencing.
- Finally, all Stretch Clusters are subject to a Red Hat Architecture Review before they’ll be supported. In fact, an Architecture Review might be a good idea in any cluster deployment, stretch or not.
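The numeric constraints in the list above can be expressed as a small checklist. This Python sketch covers only the three rules that are easy to test mechanically (site count, equal node counts, latency); the function name, site names and node counts are hypothetical, and passing these checks says nothing about replication, routing or the Architecture Review.

```python
MAX_SITES = 2         # a Stretch Cluster may span at most two sites
MAX_LATENCY_MS = 2.0  # LAN-like latency limit from the list above


def check_stretch_cluster(nodes_per_site, latency_ms):
    """Return a list of violated Stretch Cluster rules; empty means none found.

    nodes_per_site maps a site name to its node count.
    """
    problems = []

    if len(nodes_per_site) > MAX_SITES:
        problems.append(f"{len(nodes_per_site)} sites exceeds the limit of {MAX_SITES}")

    # Each location must contribute the same number of nodes.
    if len(set(nodes_per_site.values())) > 1:
        problems.append("sites must have the same number of nodes")

    if latency_ms > MAX_LATENCY_MS:
        problems.append(f"latency {latency_ms} ms exceeds {MAX_LATENCY_MS} ms")

    return problems
```

With the 1 ms link described earlier and a hypothetical two nodes per site, `check_stretch_cluster({"primary": 2, "secondary": 2}, latency_ms=1.0)` returns an empty list.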
While many enterprise computing environments already contain a great deal of fault tolerance these days, the clustering in Red Hat’s High Availability Add-On is one more tool that Systems Administrators may take advantage of as the need dictates. Though generally designed around increasing the availability of enterprise workloads within a single data center, it can be scaled down to use virtual iSCSI storage, or stretched under certain specific circumstances to provide geographic fault tolerance. In today’s 24×7 world, it’s good to have options.