Production-proven — bare metal & cloud

High availability infrastructure

Multi-tier HA stacks built for continuous operation — dual HAProxy with automatic failover, scalable PHP-FPM pools across redundant hosts, MariaDB replication with zero-downtime promotion, and a Nomad/Consul/pot orchestration platform for workloads that need to scale dynamically. Designed around your budget and your actual uptime requirements.


The honest conversation first

HA is a spectrum, not a binary

High availability doesn't mean "two of everything running all the time." It means matching the level of redundancy to the cost of downtime — a cost that is different for every organization — and designing around that reality rather than around a textbook topology.

Some clients will not accept idle standby servers. That's a legitimate position — a passive server that sits unused 99.9% of the time has a real cost and a real opportunity cost. For those situations, the architecture shifts toward faster recovery rather than instant failover: robust monitoring, tested runbooks, ZFS snapshots that allow rapid restoration, and deployment automation that can bring a replacement online quickly. Honest about the trade-off, not a compromise sold as something it isn't.

For clients where the cost of downtime exceeds the cost of standby capacity — revenue-critical platforms, applications where every minute offline has a measurable dollar value — proper active/passive HA with automatic failover is the right answer. The designs below reflect both realities.

Cost-constrained

Fast recovery without idle standby

Single primary with comprehensive monitoring, ZFS point-in-time snapshots, tested restoration runbooks, and deployment automation. Downtime on failure, but measured in minutes rather than hours. Honest about the model.

Revenue-critical

Active/passive with automatic failover

Dual HAProxy with CARP or heartbeat elastic IP promotion, multiple PHP-FPM application nodes, MariaDB primary/replica with automated promotion. Failover happens without human intervention. Downtime measured in seconds.


Load balancer tier

Dual HAProxy with automatic failover

Bare metal — CARP virtual IP

On dedicated hardware at GigeNET and similar providers, dual HAProxy nodes use CARP (Common Address Redundancy Protocol) to share a virtual IP address. The primary node owns the VIP and handles all traffic. If the primary fails — process crash, hardware fault, network partition — the secondary detects the absence of CARP advertisements and promotes itself, taking ownership of the VIP within seconds. No external orchestration required, no cloud API calls, no DNS TTL to wait out. The failover is handled entirely at the network layer.
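On FreeBSD, the CARP half of this design is a few lines of rc.conf. A minimal sketch (interface name, VHID, password, and addresses are all placeholders, not a production config):

```
# /boot/loader.conf
carp_load="YES"

# /etc/rc.conf on the primary — em0, VHID 1, and 192.0.2.x are placeholders
ifconfig_em0="inet 192.0.2.11/24"
ifconfig_em0_alias0="inet vhid 1 advskew 0 pass <shared-secret> alias 192.0.2.10/32"

# The secondary is identical except for its own address and a higher advskew,
# so it claims the VIP only when the primary's advertisements stop:
# ifconfig_em0_alias0="inet vhid 1 advskew 100 pass <shared-secret> alias 192.0.2.10/32"
```

The advskew gap is what makes failover deterministic: the lower value holds the VIP for as long as its advertisements keep flowing.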

Both HAProxy nodes maintain identical configuration, deployed and kept in sync by Puppet. A configuration change on one is a Puppet commit — it applies to both simultaneously rather than requiring manual synchronization that drifts over time.

AWS — heartbeat with elastic IP promotion

In AWS environments, CARP isn't available at the network layer. Instead, dual HAProxy nodes run heartbeat to monitor each other. When the primary becomes unavailable, the secondary detects the heartbeat failure and executes an AWS API script that reassigns the Elastic IP address to its own instance. From the user's perspective, traffic to that IP address simply continues to reach a functioning HAProxy node. The failover involves an AWS API call rather than a network-layer event, which adds a few seconds of latency compared to CARP, but it remains fully automatic and requires no manual intervention.
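The takeover the standby performs can be as small as a single AWS CLI call. A sketch of such a script; the allocation ID is a placeholder, and the instance ID is read from EC2 instance metadata:

```sh
#!/bin/sh
# Invoked by heartbeat on the standby when the primary stops responding.
ALLOC_ID="eipalloc-0123456789abcdef0"   # placeholder Elastic IP allocation
SELF=$(fetch -qo - http://169.254.169.254/latest/meta-data/instance-id)

# --allow-reassociation detaches the EIP from the failed primary and
# associates it with this instance in one API call.
aws ec2 associate-address \
    --allocation-id "$ALLOC_ID" \
    --instance-id "$SELF" \
    --allow-reassociation \
  && logger -t ha-failover "EIP ${ALLOC_ID} promoted to ${SELF}"
```

The `logger` call is what feeds the monitoring trail described below: the failover is automatic, and the alert is a record of the event, not a request for action.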

The AWS script handles the IP reassignment atomically — detach from the failed instance, associate with the standby — and writes the event to a monitoring log that generates an alert. The operations team knows a failover happened; they don't have to make it happen.

HAProxy configuration

HAProxy is the right tool for this tier because it provides precise control over every aspect of traffic management: backend health checking, connection limits, request queuing, SSL termination, header manipulation, and access control — all in a single process with deterministic behavior and a configuration file that can be reviewed and audited without vendor tooling.

Health checks are configured to test real request paths rather than TCP handshakes. A backend that accepts a connection but cannot serve a request is removed from rotation — not kept active because it passed a handshake. This is the distinction documented in the PHP-FPM deadlock case study — health checks that test the wrong thing create a false sense of backend availability that HAProxy's proper configuration eliminates.
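In haproxy.cfg terms, the distinction looks roughly like this (backend name, addresses, and the /healthcheck.php endpoint are illustrative, and the check assumes the app nodes serve the health endpoint over HTTP):

```
backend php_app
    balance leastconn
    # Test a real request path, not just whether the port accepts connections:
    option httpchk GET /healthcheck.php
    http-check expect status 200
    default-server inter 2s fall 3 rise 2
    server app1 10.0.0.11:80 check maxconn 40
    server app2 10.0.0.12:80 check maxconn 40
```

A node whose PHP workers are wedged fails the httpchk within fall × inter seconds and is pulled from rotation, even while its TCP handshake still succeeds.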

  • Active/passive CARP — bare metal & dedicated
  • Heartbeat + AWS elastic IP — cloud environments
  • Real request path health checks
  • Backend connection limits & queue management
  • SSL termination & certificate management
  • Stick tables for session affinity where needed
  • ACL-based routing — host, path, header
  • Puppet-managed config — both nodes always in sync

Application tier

Scalable PHP-FPM pools across redundant hosts

HAProxy manages a pool of PHP-FPM application nodes. The pool can be sized to the workload — two nodes for a modest traffic profile, more for higher concurrency — and the nodes can be added or removed without downtime by updating the HAProxy backend configuration.

Dedicated app servers

PHP-FPM-only nodes

For high-traffic workloads, dedicated servers running nothing but PHP-FPM — up to 85 child processes per node, sized to the CPU core count and workload profile. No competing services, no resource contention. Each node is a Puppet-managed clone of the others: same package versions, same pool configuration, same PHP extensions.

Jail-based

PHP-FPM jails across multiple hosts

For cost-efficiency without sacrificing redundancy, PHP-FPM runs in iocage jails distributed across multiple physical hosts. A host failure removes its jails from the HAProxy pool via health check failure; the remaining nodes absorb the traffic. ZFS snapshots mean a failed jail can be restored to a new host rapidly from a known-good state.
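Restoring a failed jail to a replacement host is a ZFS send/receive of the last known-good snapshot. A sketch, assuming snapshots are replicated off-host; pool, dataset, and jail names are placeholders:

```sh
# On the host holding the replicated snapshots:
zfs send -R tank/iocage/jails/php01@known-good | \
    ssh newhost zfs receive -F tank/iocage/jails/php01

# On the replacement host, start the restored jail:
iocage start php01
# HAProxy re-adds the node automatically once its health check passes.
```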

Session handling

Valkey/Redis store or sticky sessions

PHP sessions in a multi-node pool require either a shared session store or session affinity at the load balancer. Valkey/Redis in its own jail provides a shared session store that any application node can read and write — the preferred approach because it allows truly stateless app nodes. Where a shared store isn't feasible, HAProxy stick tables provide sticky sessions with fallback on node failure.
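With the phpredis extension, pointing PHP at the shared store is two php.ini lines; the host and port here are placeholders for the session-store jail:

```
; php.ini excerpt — shared session store via the phpredis session handler
session.save_handler = redis
session.save_path = "tcp://10.0.0.20:6379?timeout=2"
```

Every app node reads and writes the same store, so HAProxy is free to route each request to any healthy backend.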

Pool sizing

Worker counts aligned to hardware

PHP-FPM's pm.max_children is set based on CPU core count, available memory per worker, and the database connection limit — not as a round number or a value copied from a different server. The scheduler latency case study documents what happens when worker counts aren't aligned to the hardware they run on.
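The sizing arithmetic is simple, but it has to use this server's numbers, not someone else's. An illustrative worked example; the memory and worker figures are assumptions, not recommendations:

```
; www.conf excerpt
; RAM available to PHP-FPM / peak RSS per worker, e.g. 16 GB / 190 MB ≈ 85,
; then capped so workers across all app nodes stay under MariaDB max_connections.
pm = static
pm.max_children = 85
pm.max_requests = 500   ; recycle workers to bound per-worker memory growth
```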


Database tier

MariaDB primary/replica with zero-downtime promotion

The database tier uses MariaDB primary/replica replication. The replica provides read distribution during normal operation and a promotion path when the primary needs to be replaced — planned or unplanned.

Replication is configured with GTID-based position tracking, which makes promotion and re-replication reliable — the replica knows exactly where it is relative to the primary at all times, without depending on binary log filename and position that can become ambiguous after a failover. Replication lag is monitored continuously; an alerting threshold ensures that a lagging replica is flagged before it becomes a problem rather than discovered at promotion time.
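Configuring a replica with GTID positioning is a single statement; hostname and credentials are placeholders:

```sql
-- MariaDB replica setup with GTID position tracking
CHANGE MASTER TO
    MASTER_HOST = 'db-primary.example.internal',
    MASTER_USER = 'repl',
    MASTER_PASSWORD = '********',
    MASTER_USE_GTID = slave_pos;
START SLAVE;

-- Lag monitoring reads Seconds_Behind_Master and the GTID positions from:
SHOW SLAVE STATUS;
```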

Promotion procedure follows the sequence documented in the zero-downtime MariaDB migration case study: verify replication is current, verify data consistency with pt-table-checksum, decouple the replica cleanly with RESET SLAVE ALL, promote to read-write, update application configuration via code deployment. The procedure is documented, tested, and timed — not improvised under pressure at 2am.
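In SQL terms, the promotion itself reduces to a few statements once the checks have passed. A sketch of the sequence, not a substitute for the full runbook:

```sql
-- On the replica being promoted, after replication is confirmed current
-- and pt-table-checksum has verified consistency:
STOP SLAVE;
RESET SLAVE ALL;               -- decouple cleanly; forget the old primary
SET GLOBAL read_only = OFF;    -- begin accepting writes
-- The application is then repointed at the new primary via code deployment.
```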

For environments where the database tier itself needs to survive a host failure automatically, a second replica is maintained as a hot standby — current, verified, and ready to promote without a seeding cycle. This adds cost but eliminates the window between primary failure and a promoted replica being ready to accept writes.

  • GTID-based replication — reliable positioning
  • pt-table-checksum verification before promotion
  • Replication lag monitoring with alerting threshold
  • Zero-downtime promotion via code deployment
  • Hot standby replica for instant failover
  • innodb_flush_log_at_trx_commit tuned to workload
  • ZFS SLOG on SSD for synchronous write performance
  • Non-blocking schema changes via pt-osc / gh-ost
  • Connection pool management vs max_connections
  • MariaDB in dedicated iocage jail per host
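The SLOG item above is a one-line pool change; pool and device names here are placeholders:

```sh
# Attach a dedicated SSD log vdev so synchronous InnoDB writes
# land on flash instead of the data vdevs:
zpool add dbpool log gpt/slog0
zpool status dbpool   # the new "logs" section confirms the vdev
```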

Advanced orchestration

Nomad + Consul + pot — dynamic FreeBSD container orchestration

For workloads that need to scale dynamically — adding capacity in response to traffic rather than maintaining a fixed pool — a full container orchestration stack built on FreeBSD-native primitives provides the scheduling, service discovery, and dynamic scaling capabilities that most teams reach for Kubernetes to get, without the complexity overhead or the Linux requirement.

pot is a FreeBSD jail management framework with a higher-level abstraction than iocage, designed specifically for orchestration integration. Nomad is HashiCorp's workload scheduler — it places and manages pot jail workloads across the fleet, handling scheduling, health checking, and rescheduling on node failure. Consul provides service discovery and health checking — every pot jail that Nomad starts registers itself with Consul, and Consul maintains a real-time view of which instances of which services are healthy and available.

The HAProxy integration is the part that makes this a complete system rather than just an orchestration layer. HAProxy is configured to use Consul as its backend source via the Consul Template daemon — when Nomad starts a new pot jail and Consul marks it healthy, Consul Template regenerates the HAProxy configuration and triggers a reload. The new node appears in the load balancer pool automatically, with no manual intervention and no configuration change required on the HAProxy nodes. Scaling up is a Nomad job submission. Scaling down is the same. The load balancer always reflects the current state of the cluster.
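The Consul Template side of that loop is a template plus a reload hook. A sketch; the service name php-fpm and the file paths are illustrative:

```
# /usr/local/etc/haproxy.cfg.ctmpl excerpt — one server line per
# healthy instance Consul knows about:
backend php_app
    balance leastconn{{ range service "php-fpm" }}
    server {{ .Node }}-{{ .Port }} {{ .Address }}:{{ .Port }} check{{ end }}
```

The daemon's template stanza ties the render to the reload:

```
template {
  source      = "/usr/local/etc/haproxy.cfg.ctmpl"
  destination = "/usr/local/etc/haproxy.cfg"
  command     = "service haproxy reload"
}
```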

This architecture has been built and operated in production on FreeBSD. It provides the dynamic scaling behavior that most organizations associate with Kubernetes, on infrastructure that is simpler to operate, cheaper to run, and built on the same FreeBSD foundation as the rest of the stack — without introducing a separate Linux-based orchestration layer into an otherwise FreeBSD environment.


Related reading

Case studies from production HA environments

The following case studies document real problems encountered and solved in production high availability environments — the kind of problems that only appear when systems are running under real load.

Load balancer

PHP-FPM deadlock behind HAProxy

Health checks passing. Workers blocked. Why FastCGI handshake checks aren't enough — and what a proper real-path health check prevents.

Database tier

Zero-downtime MariaDB migration

Multi-terabyte database migrated to new hardware with zero downtime — tarpipe seeding, Percona verification, 2am code-push cutover.

Database tier

InnoDB stall and ZFS SLOG

Periodic query stalls on a low-CPU database. InnoDB flush behavior amplified by ZFS sync writes — diagnosed with ktrace, fixed with flush tuning and a dedicated SSD.

Application tier

Scheduler latency from worker pool oversubscription

100 runnable threads on 16 cores. System felt slow, CPU looked fine. DTrace on the scheduler found threads spending more time waiting to run than running.

Building or improving an HA stack?

Let's talk about what your uptime requirements actually need — and what they don't.

Get in touch
FreeBSD infrastructure →