High-availability (HA) clusters are essential for websites and services that require continuous uptime and predictable failover behavior. For operators deploying in Hong Kong and other international locations, choosing the right infrastructure and configuring a resilient cluster can dramatically reduce downtime. This article provides a practical, step-by-step guide to building an HA cluster on a Hong Kong VPS, covering architecture choices, component-level configuration, and operational best practices. It is aimed at webmasters, enterprise IT teams, and developers who want actionable technical detail rather than high-level theory.
Why build an HA cluster on a VPS in Hong Kong?
Hong Kong is a major Asia-Pacific internet hub with excellent connectivity to mainland China, Southeast Asia, and international backbones. Deploying on a Hong Kong Server-based VPS offers low-latency access for regional users while retaining the flexibility to interoperate with other regions such as a US VPS or US Server for disaster recovery or global load distribution. Using VPS instances makes HA deployments cost-effective and agile compared to colocated hardware, while still supporting enterprise-grade redundancy when designed correctly.
Typical HA use cases
- Web and application servers that must maintain session continuity.
- Database clusters requiring automatic failover (e.g., MySQL/MariaDB with DRBD + Pacemaker).
- Load balancers that need a floating VIP to ensure service continuity.
- File services that must stay available across maintenance windows (e.g., GlusterFS).
Core components and principles
A robust HA cluster combines node-level redundancy, reliable failover orchestration, and synchronized data. Key components include:
- Corosync + Pacemaker: Corosync provides cluster messaging and membership; Pacemaker is the resource manager that enforces constraints and performs failover actions.
- Resource agents (OCF, LSB, systemd): Scripts Pacemaker uses to manage services like Apache, MySQL, or virtual IPs (e.g., ocf:heartbeat:IPaddr2).
- VIP (Virtual IP): A floating IP address assigned to the active node so clients see a single endpoint.
- Data replication: Block-level replication with DRBD or distributed storage like GlusterFS for file-level replication.
- Fencing/STONITH: Mechanisms to forcibly isolate a failed node to avoid split-brain scenarios.
Quorum is central: the cluster must know which partition is authoritative. In two-node setups this usually requires a tie-breaker, such as a quorum disk, a third arbiter VM, or corosync's votequorum in two_node mode (ideally backed by a qdevice). Without proper quorum and fencing, split-brain can corrupt data.
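For example, a minimal votequorum stanza in /etc/corosync/corosync.conf for a two-node cluster might look like the following sketch (directive names assume corosync 2.x or later):

quorum {
    provider: corosync_votequorum
    two_node: 1
    # two_node implies wait_for_all, so both nodes must be up
    # the first time the cluster starts
}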
Architecture options for Hong Kong VPS
Choose based on RTO/RPO, cost, and complexity:
- Two-node active-passive (basic): Minimal setup with VIP failover. Suitable for stateless web services. Requires a third-party tie-breaker or configured quorum mechanism.
- Two-node with DRBD active-passive: Adds block replication for databases. Use Pacemaker to manage DRBD resources, filesystem promotion, and the database service.
- Multi-node active-active with load balancing: Use HAProxy for traffic distribution and Keepalived (VRRP) for the VIP, with GlusterFS or a distributed database for shared state.
- Cross-region HA: Hong Kong Server primary with a US Server or US VPS as DR backup or read-only replica for disaster recovery.
Practical step-by-step setup (Ubuntu/CentOS example)
The following outlines a practical two-node HA cluster on Hong Kong VPS instances. Replace package manager commands as appropriate for Ubuntu (apt) or CentOS (yum/dnf).
1. Provisioning and networking
- Deploy two Hong Kong VPS instances (node1, node2). Ensure they have static private IPs for cluster communication and public IPs for client traffic.
- Open the necessary firewall ports: corosync (UDP 5404-5405 by default), pcsd if you use pcs (TCP 2224), and any service ports (HTTP/HTTPS, DB).
- Set hostnames and add /etc/hosts entries on both nodes so each can resolve the other's cluster IP, as in the sketch below.
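A minimal sketch of the name resolution and firewall setup, assuming private addresses 10.0.0.11/10.0.0.12 and firewalld on CentOS (use the ufw or iptables equivalents on Ubuntu):

# /etc/hosts on both nodes (example private addresses)
10.0.0.11  node1
10.0.0.12  node2

# CentOS/firewalld: open cluster and service ports
firewall-cmd --permanent --add-service=high-availability
firewall-cmd --permanent --add-service=http --add-service=https
firewall-cmd --reload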
2. Install cluster stack
- Install the cluster stack: corosync, pacemaker, and pcs on CentOS, or corosync, pacemaker, and crmsh on Ubuntu. Example (Ubuntu): apt update && apt install -y corosync pacemaker crmsh
- Enable and start corosync and pacemaker.
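The CentOS/RHEL-family equivalent, including enabling the daemons, is roughly as follows (the HA packages live in a separate HighAvailability repository whose name varies by release):

# CentOS/RHEL: enable the HA repository first if needed
dnf install -y corosync pacemaker pcs
systemctl enable --now pcsd
systemctl enable --now corosync pacemaker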
3. Configure corosync
- Create /etc/corosync/corosync.conf with the cluster name, transport (udpu, i.e., UDP unicast, since many VPS networks block multicast), and a nodelist containing each node's private IP. If you use an interface section, ensure bindnetaddr matches the private network.
- Secure the cluster with authkeys and consistent time (use NTP/chrony) to avoid membership flaps.
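A trimmed corosync.conf sketch for a two-node unicast (udpu) setup using the example addresses above (verify directive names against your corosync version):

totem {
    version: 2
    cluster_name: hk-ha
    transport: udpu
    crypto_cipher: aes256
    crypto_hash: sha256
}

nodelist {
    node {
        ring0_addr: 10.0.0.11
        nodeid: 1
    }
    node {
        ring0_addr: 10.0.0.12
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}

logging {
    to_syslog: yes
}

Generate the shared key with corosync-keygen on one node, then copy /etc/corosync/authkey to the other node with the same ownership and permissions.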
4. Set up quorum and fencing
- For two-node clusters, configure a tie-breaker: either a lightweight QDevice, a third small VPS acting as arbiter, or configure corosync votequorum with expected_votes and two_node mode.
- Configure STONITH. VPS instances rarely expose IPMI, so you may need an external fencing agent (for example, one driven by your provider's API, or SSH-based fencing such as external/ssh as a last resort). Be cautious: fencing must work precisely when a node is unresponsive, and SSH-based agents often fail in exactly that situation; unreliable fencing invites split-brain.
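Returning to the quorum tie-breaker: with pcs, attaching a third small VPS as a quorum device looks roughly like this (a sketch; "arbiter" is a placeholder hostname):

# On the arbiter VPS
dnf install -y corosync-qnetd
systemctl enable --now corosync-qnetd

# On both cluster nodes
dnf install -y corosync-qdevice

# On one cluster node, register the arbiter as a quorum device
pcs quorum device add model net host=arbiter algorithm=ffsplit

# Verify votes and quorum state
pcs quorum status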
5. Add resources to Pacemaker
- Create a Virtual IP resource: crm configure primitive vip ocf:heartbeat:IPaddr2 params ip="203.0.113.10" cidr_netmask="24" nic="eth0"
- Wrap services as Pacemaker resources (e.g., Apache: ocf:heartbeat:apache).
- Define order and colocation constraints: VIP must be assigned before starting Apache; database must be promoted before allowing dependent services.
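Continuing in crmsh, a sketch of the Apache resource plus the constraints just described (the configfile path is an Ubuntu-style placeholder; ocf:heartbeat:apache needs mod_status enabled for its monitor action):

crm configure primitive web ocf:heartbeat:apache \
    params configfile="/etc/apache2/apache2.conf" \
    op monitor interval="30s"

# Keep Apache on the node holding the VIP, and start the VIP first
crm configure colocation web-with-vip inf: web vip
crm configure order vip-before-web inf: vip web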
6. Configure DRBD for block replication (if using)
- Install drbd-utils and kernel module, configure /etc/drbd.d/mydata.res with device, disk, network sections, and peers’ IPs.
- Initialize metadata, bring DRBD up, and set one node as primary for filesystem creation. Manage promotion/demotion via Pacemaker resource agents so Pacemaker controls which node is primary.
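A minimal /etc/drbd.d/mydata.res sketch (the backing disk /dev/vdb1 and the addresses are placeholders), followed by the one-time initialization:

resource mydata {
    device    /dev/drbd0;
    disk      /dev/vdb1;
    meta-disk internal;
    on node1 {
        address 10.0.0.11:7789;
    }
    on node2 {
        address 10.0.0.12:7789;
    }
}

# On both nodes
drbdadm create-md mydata
drbdadm up mydata

# On node1 only: force the initial sync, then create the filesystem
drbdadm primary --force mydata
mkfs.ext4 /dev/drbd0

Once verified, hand control to Pacemaker via the ocf:linbit:drbd resource agent in a promotable (master/slave) clone, so the cluster rather than the administrator decides which node is primary.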
7. Load balancing (optional)
- For active-active backends, deploy HAProxy as a managed resource in front of them rather than exposing a single server behind the VIP. Use keepalived (VRRP) to float the VIP across multiple load balancers, or let Pacemaker manage HAProxy and the VIP together; a keepalived sketch follows below.
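As an illustration, a keepalived VRRP stanza for floating a VIP between two load balancers (interface name, router ID, password, and addresses are placeholders):

vrrp_instance VI_1 {
    state MASTER            # BACKUP on the second load balancer
    interface eth0
    virtual_router_id 51
    priority 100            # use a lower priority on the backup
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass s3cret
    }
    virtual_ipaddress {
        203.0.113.10/24
    }
}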
8. Testing failover and recovery
- Simulate node failure (e.g., stop Pacemaker or shutdown node) and observe automatic failover of VIP and services to remaining node.
- Test split-brain scenarios by isolating the cluster network and verifying that fencing recovers the cluster correctly.
- Monitor logs (/var/log/messages, /var/log/daemon.log, /var/log/corosync/corosync.log) for membership events and resource actions.
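Typical drills with crmsh (illustrative; use the pcs equivalents if that is your toolchain):

# Graceful failover: drain node1 and watch resources move
crm node standby node1
crm_mon -1              # one-shot cluster status
crm node online node1

# Harder test: kill the cluster stack on the active node
systemctl stop pacemaker corosync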
Operational best practices
To maintain a healthy production HA cluster:
- Monitoring: Integrate with Prometheus/Grafana or Nagios to watch resource states, DRBD sync progress, and network latency.
- Backups: Even with replication, maintain point-in-time backups (logical DB dumps, filesystem snapshots).
- Maintenance windows: Plan rolling upgrades and demote/pause resources before patching nodes.
- Network considerations: Ensure private network throughput between Hong Kong VPS nodes is sufficient for DRBD or filesystem replication. Cross-region replication to a US VPS will involve higher latency and bandwidth costs—use asynchronous replication for geo-DR.
- Security: Harden SSH, use firewalls and VLANs for cluster traffic, and keep authkeys safe.
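A few health checks worth wiring into monitoring or a daily cron job (illustrative; the DRBD resource name mydata matches the earlier example):

crm_mon -1 -r               # cluster state, including stopped resources
drbdadm status mydata       # replication and sync state (DRBD 9; use /proc/drbd on 8.4)
corosync-quorumtool -s      # membership and vote status

# Point-in-time logical backup alongside block replication
mysqldump --single-transaction --all-databases | gzip > /backup/db-$(date +%F).sql.gz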
Comparing Hong Kong Server deployments with US VPS/US Server choices
Choosing between a regional Hong Kong Server and a US VPS/US Server depends on your audience and redundancy needs. Hong Kong deployments offer lower latency for APAC users and may have regulatory advantages for certain markets. A US VPS is useful for global redundancy or for serving North American users. Key trade-offs:
- Latency: Hong Kong Server wins for APAC traffic; US Server is better for NA users. Consider geo-routing or CDN for global reach.
- Bandwidth and peering: Test real-world network performance—Hong Kong often has good international peering for Asia-Pacific routes.
- Disaster recovery: Cross-region replication (Hong Kong ↔ US) improves resilience but introduces latency and higher egress charges.
Summary
Deploying a high-availability cluster on Hong Kong VPS instances is a practical way to achieve resilient, low-latency services for the APAC region. By combining Corosync/Pacemaker for orchestration, DRBD or distributed storage for data consistency, and reliable fencing/quorum mechanisms, you can build an HA architecture that tolerates node failures and maintenance windows. Remember to plan for monitoring, backups, and cross-region strategies when you need global redundancy (for example involving a US VPS or US Server). Start small with an active-passive setup, validate failover thoroughly, then iterate toward more complex multi-node or cross-datacenter topologies as your requirements grow.
For practical VPS choices that support rapid HA deployments in Hong Kong, see the Hong Kong VPS options available at https://server.hk/cloud.php.