
How to Configure Nginx Load Balancing on Hong Kong VPS for High-Traffic Sites (2026)

April 6, 2026

When a single application server reaches its throughput limits — CPU saturation, memory pressure, or connection limits under high concurrent load — the next step is distributing traffic across multiple backend instances using a load balancer. Nginx is the most widely used open-source load balancer for this purpose, and configuring it on a Hong Kong VPS is the standard approach for scaling Asia-Pacific web applications beyond single-server capacity.

This guide covers Nginx load balancing configuration: upstream server pools, load balancing algorithms, health checks, sticky sessions, and SSL termination — the complete setup for a production high-traffic architecture.


Architecture Overview

The architecture this guide implements:

                    Internet (CN2 GIA)
                          │
                    ┌─────▼──────┐
                    │  Nginx LB  │  ← Hong Kong VPS (load balancer)
                    │ Port 80/443│
                    └─────┬──────┘
              ┌───────────┼───────────┐
         ┌────▼────┐ ┌────▼────┐ ┌────▼────┐
         │ App 1   │ │ App 2   │ │ App 3   │
         │:3000    │ │:3001    │ │:3002    │
         └─────────┘ └─────────┘ └─────────┘

The Nginx load balancer can proxy to backend application instances running on the same VPS (different ports) or on separate VPS instances (different IPs). Both configurations use identical Nginx upstream syntax.


Step 1: Install Nginx

sudo apt update
sudo apt install -y nginx
sudo systemctl enable nginx
sudo systemctl start nginx

Step 2: Configure the Upstream Pool

sudo nano /etc/nginx/sites-available/loadbalancer
# Define the upstream server pool
upstream app_backend {
    # Load balancing algorithm — choose one:

    # Round-robin (default) — distributes requests evenly
    # (no directive needed — round-robin is the default)

    # Least connections — sends to the server with fewest active connections
    # (best for requests with variable processing time)
    least_conn;

    # IP hash — routes same client IP to same backend (sticky sessions without cookies)
    # ip_hash;

    # Weighted round-robin — send more traffic to more powerful servers
    # server 127.0.0.1:3000 weight=3;
    # server 127.0.0.1:3001 weight=1;

    # Backend servers, with passive health-check parameters:
    # max_fails: failed attempts before a server is marked unavailable
    # fail_timeout: how long the server stays marked unavailable
    server 127.0.0.1:3000 max_fails=3 fail_timeout=30s;
    server 127.0.0.1:3001 max_fails=3 fail_timeout=30s;
    server 127.0.0.1:3002 max_fails=3 fail_timeout=30s;

    # Mark a server as backup — only used when all primary servers are down
    # server 127.0.0.1:3003 backup;

    # Keep connections open to backends (reduces TCP handshake overhead)
    keepalive 32;
}

server {
    listen 80;
    server_name yourdomain.com www.yourdomain.com;
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl http2;
    server_name yourdomain.com www.yourdomain.com;

    ssl_certificate /etc/letsencrypt/live/yourdomain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/yourdomain.com/privkey.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 1d;

    # Pass real client IP to backends
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header Host $host;

    # Proxy timeouts
    proxy_connect_timeout 10s;
    proxy_send_timeout 60s;
    proxy_read_timeout 60s;

    # Enable keepalive to upstream
    proxy_http_version 1.1;
    proxy_set_header Connection "";

    # Gzip
    gzip on;
    gzip_types text/plain application/json application/javascript text/css;

    # Static files served directly
    location /static/ {
        alias /home/deploy/apps/myapp/static/;
        expires 30d;
    }

    # All dynamic requests proxied to the upstream pool
    location / {
        proxy_pass http://app_backend;
    }
}
sudo ln -s /etc/nginx/sites-available/loadbalancer /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx
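The scheduling difference between the algorithms above can be sketched in a few lines of Python (a simplified model for intuition, not nginx's implementation):

```python
from itertools import cycle

backends = ["127.0.0.1:3000", "127.0.0.1:3001", "127.0.0.1:3002"]

# Round-robin: each request goes to the next backend in order.
rr = cycle(backends)
rr_order = [next(rr) for _ in range(6)]  # 3000, 3001, 3002, 3000, 3001, 3002

# Least connections: each request goes to the backend with the fewest
# in-flight requests, so a busy backend stops receiving new traffic.
active = {"127.0.0.1:3000": 5, "127.0.0.1:3001": 1, "127.0.0.1:3002": 3}

def least_conn(active):
    target = min(active, key=active.get)  # fewest active connections wins
    active[target] += 1                   # this request is now in flight there
    return target

picks = [least_conn(active) for _ in range(3)]  # :3001 until it catches up
```

Round-robin ignores backend load entirely, which is why least_conn is the safer default when request costs vary.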

Step 3: Start Multiple Application Instances

Each backend instance runs on a different port. For a Node.js application using PM2:

nano /home/deploy/apps/myapp/ecosystem.config.js
module.exports = {
  apps: [
    {
      name: 'myapp-1',
      script: './index.js',
      env: { PORT: 3000, NODE_ENV: 'production' }
    },
    {
      name: 'myapp-2',
      script: './index.js',
      env: { PORT: 3001, NODE_ENV: 'production' }
    },
    {
      name: 'myapp-3',
      script: './index.js',
      env: { PORT: 3002, NODE_ENV: 'production' }
    }
  ]
};
pm2 start ecosystem.config.js
pm2 save

For Gunicorn (Python), bind each worker group to a different port or socket:

# Instance 1
gunicorn --bind 127.0.0.1:3000 --workers 2 app:app &

# Instance 2
gunicorn --bind 127.0.0.1:3001 --workers 2 app:app &
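Backgrounding with & does not survive reboots or crashes. A more durable option is a systemd templated unit, where the instance name %i doubles as the port. A sketch assuming the app lives in /home/deploy/apps/myapp with a virtualenv at .venv and a WSGI entry point app:app (adjust all of these to your setup):

```ini
# /etc/systemd/system/myapp@.service
[Unit]
Description=Gunicorn instance of myapp on port %i
After=network.target

[Service]
User=deploy
WorkingDirectory=/home/deploy/apps/myapp
ExecStart=/home/deploy/apps/myapp/.venv/bin/gunicorn --bind 127.0.0.1:%i --workers 2 app:app
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Then start one instance per backend port: sudo systemctl enable --now myapp@3000 myapp@3001 myapp@3002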

Step 4: Cookie-Based Sticky Sessions

Some applications require that the same user always hits the same backend, for in-memory session state or stateful processing. Cookie-based sticky sessions handle this more reliably than IP hash, but note that the sticky cookie directive below is a commercial Nginx Plus feature; open-source Nginx needs a third-party module such as nginx-sticky-module-ng (with slightly different syntax) compiled in:

upstream app_backend {
    # Nginx Plus syntax; open-source builds need a third-party sticky module
    sticky cookie srv_id expires=1h path=/;

    server 127.0.0.1:3000;
    server 127.0.0.1:3001;
    server 127.0.0.1:3002;
}

For applications using Redis for session storage (recommended), sticky sessions are unnecessary — any backend instance can read any user’s session from the shared Redis store, making round-robin or least-connection routing safe.
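For intuition on why ip_hash gives stickiness without cookies: nginx keys the hash on the first three octets of an IPv4 client address, so every client in the same /24 lands on the same backend. A simplified Python illustration of that property (md5 stands in here; nginx's actual hash function differs):

```python
import hashlib

backends = ["127.0.0.1:3000", "127.0.0.1:3001", "127.0.0.1:3002"]

def ip_hash(client_ip, backends):
    # Key on the first three octets only, as nginx's ip_hash does for IPv4,
    # so a whole /24 maps to one backend. Illustrative hash, not nginx's.
    key = ".".join(client_ip.split(".")[:3]).encode()
    digest = int.from_bytes(hashlib.md5(key).digest()[:4], "big")
    return backends[digest % len(backends)]

a = ip_hash("203.0.113.10", backends)
b = ip_hash("203.0.113.99", backends)  # same /24, so same backend
assert a == b
```

The flip side is visible here too: many users behind one NAT gateway share a /24 and pile onto a single backend, which is why Redis-backed sessions are preferred.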


Step 5: Monitor Load Balancer Status

Enable Nginx’s built-in stub_status module to monitor connection counts. Because port 80 redirects to HTTPS in this setup, expose the status page on a localhost-only port instead:

server {
    listen 127.0.0.1:8080;
    location /nginx_status {
        stub_status on;
        allow 127.0.0.1;
        deny all;
    }
}
curl http://127.0.0.1:8080/nginx_status
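The stub_status output is plain text and easy to parse in a monitoring script. A small sketch (the sample numbers are illustrative):

```python
import re

# Example stub_status response (values are illustrative)
SAMPLE = """Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106
"""

def parse_stub_status(text):
    """Parse the four-line stub_status output into a dict of counters."""
    lines = text.strip().splitlines()
    active = int(lines[0].split(":")[1])
    accepts, handled, requests = (int(n) for n in lines[2].split())
    rww = dict(re.findall(r"(Reading|Writing|Waiting): (\d+)", lines[3]))
    return {
        "active": active,
        "accepts": accepts,
        "handled": handled,
        "requests": requests,
        "reading": int(rww["Reading"]),
        "writing": int(rww["Writing"]),
        "waiting": int(rww["Waiting"]),
    }

stats = parse_stub_status(SAMPLE)
print(stats["active"], stats["requests"])
```

A widening gap between accepts and handled indicates connections dropped at the load balancer, usually a sign that worker_connections is too low.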

For comprehensive upstream monitoring, install nginx-module-vts or use Prometheus with nginx-prometheus-exporter for metrics integration.

Monitor which backends are receiving traffic. The default combined log format does not record the upstream address, so use a custom log_format that includes $upstream_addr:

# Watch access logs and show the backend each request was routed to
# (assumes a log_format that writes the field as upstream=<addr>)
tail -f /var/log/nginx/access.log | grep --line-buffered -o 'upstream=[^ ]*'
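A log format that records the upstream address could look like this (the upstream= label and the format name are arbitrary choices):

```nginx
# In the http block of /etc/nginx/nginx.conf:
log_format upstream_log '$remote_addr [$time_local] "$request" '
                        '$status upstream=$upstream_addr '
                        'rt=$request_time urt=$upstream_response_time';

# In the load balancer's server block:
access_log /var/log/nginx/access.log upstream_log;
```

The $upstream_response_time field is worth keeping: comparing it with $request_time separates slow backends from slow client networks.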

Step 6: Tune for High-Concurrency Asia-Pacific Traffic

Edit /etc/nginx/nginx.conf for high-traffic optimisation:

worker_processes auto;
worker_rlimit_nofile 65535;

events {
    worker_connections 4096;
    multi_accept on;
    use epoll;
}

http {
    # Connection keepalive tuning
    keepalive_timeout 65;
    keepalive_requests 1000;

    # Buffer tuning for proxy
    proxy_buffer_size 128k;
    proxy_buffers 4 256k;
    proxy_busy_buffers_size 256k;

    # Rate limiting — prevent abuse
    limit_req_zone $binary_remote_addr zone=api:10m rate=30r/s;
    limit_conn_zone $binary_remote_addr zone=conn:10m;
}
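These numbers imply a rough ceiling on concurrent clients: each proxied request holds two connections on the load balancer, one to the client and one to the upstream, so:

```python
worker_processes = 4        # what "auto" resolves to on a 4 vCPU VPS
worker_connections = 4096

# Each proxied request consumes one client-facing and one
# upstream-facing connection on the load balancer.
max_concurrent_clients = worker_processes * worker_connections // 2
print(max_concurrent_clients)  # 8192

# Total file descriptors needed stays under worker_rlimit_nofile (65535).
fds_needed = worker_processes * worker_connections
```

This is an upper bound, not a benchmark; backend capacity and keepalive behaviour usually dominate in practice.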

Apply rate limiting to API endpoints:

location /api/ {
    limit_req zone=api burst=50 nodelay;
    limit_conn conn 20;
    proxy_pass http://app_backend;
}
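limit_req is a leaky bucket: with rate=30r/s and burst=50 nodelay, a client can fire roughly 50 requests at once, after which excess requests are rejected until the bucket drains at 30 requests per second. A simplified Python model of that behaviour (an approximation; nginx tracks excess at millisecond granularity):

```python
def simulate_limit_req(timestamps, rate, burst):
    """Return True (accepted) / False (rejected, nginx sends 503) per request."""
    excess, last, results = 0.0, None, []
    for t in sorted(timestamps):
        if last is not None:
            excess = max(0.0, excess - (t - last) * rate)  # bucket drains
        if excess + 1 > burst:
            results.append(False)   # over the burst allowance: rejected
        else:
            excess += 1             # accepted requests fill the bucket
            results.append(True)
        last = t
    return results

# 60 simultaneous requests: the first ~50 pass, the rest are rejected.
flood = simulate_limit_req([0.0] * 60, rate=30, burst=50)
# Two seconds later the bucket has drained and requests pass again.
later = simulate_limit_req([0.0] * 60 + [2.0], rate=30, burst=50)
```

Dropping nodelay changes rejections into queued, delayed responses, which smooths bursts at the cost of latency.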

Conclusion

Nginx load balancing on a Hong Kong VPS scales your application horizontally across multiple backend instances — handling traffic volumes that overwhelm single-server deployments while maintaining CN2 GIA low-latency routing for mainland Chinese and Asia-Pacific users. The setup above handles tens of thousands of concurrent connections on a well-provisioned VPS.

For dedicated hardware capable of handling very high concurrent loads without virtualisation overhead, explore Server.HK’s Hong Kong dedicated server plans. For VPS-based scaling, our Hong Kong VPS plans provide the NVMe SSD I/O and CN2 GIA routing to support this architecture from entry-level pricing.


Frequently Asked Questions

What load balancing algorithm should I use for a web application?

For most web applications, least_conn (least connections) is the best default — it automatically routes new requests to the backend with the fewest active connections, preventing any single instance from becoming a bottleneck. Round-robin is adequate for applications where all requests take roughly equal time. IP hash is appropriate only when sticky sessions are required and Redis session sharing is not viable.

How many backend instances should I run on a single Hong Kong VPS?

The practical limit is your VPS’s available CPU and RAM. Each application instance consumes CPU workers and RAM — monitor usage with htop and add instances until CPU saturation is reached during peak load, then scale the VPS vertically or distribute backends across multiple VPS instances. For a 4 vCPU / 8 GB RAM VPS, 4–8 Node.js or Python instances is a typical practical range.
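That sizing can be sanity-checked with simple arithmetic. A sketch with assumed footprints (the per-instance figure is an assumption; measure your own app with pm2 monit or htop):

```python
total_ram_mb = 8192       # 8 GB VPS
reserved_mb = 1536        # OS, Nginx, monitoring (assumption)
per_instance_mb = 600     # one Node.js/Python instance (assumption; measure yours)

# RAM-based ceiling on backend instances
max_instances = (total_ram_mb - reserved_mb) // per_instance_mb
print(max_instances)  # 11 by this estimate; CPU contention usually caps it lower
```

On a 4 vCPU machine the CPU limit, not RAM, typically lands you in the 4 to 8 instance range quoted above.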

Does load balancing work with WebSocket connections?

Yes, with proper Nginx configuration. Add proxy_set_header Upgrade $http_upgrade; and proxy_set_header Connection "upgrade"; to the location block proxying WebSocket traffic. For sticky WebSocket connections (where the same client must maintain the same backend connection), use IP hash or cookie-based sticky sessions.
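That answer translates to a location block like this (the /ws/ path is an example; adjust it to your application's WebSocket endpoint):

```nginx
location /ws/ {
    proxy_pass http://app_backend;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_set_header Host $host;
    # WebSocket connections are long-lived; raise the read timeout
    # so idle but open connections are not cut at 60 seconds.
    proxy_read_timeout 3600s;
}
```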
