CentOS Stream (9 or 10 in 2026) inherits Red Hat Enterprise Linux’s robust tuning ecosystem. With kernel 6.12+ in Stream 10, modern hardware support (including x86-64-v3 baseline), and continuous updates, the focus shifts toward balanced throughput, low latency, energy-aware scaling, and workload-specific profiles rather than extreme micro-optimizations.
Performance tuning remains iterative: measure baselines (with tools like perf, sar, tuned-adm profile list, stress-ng, fio, iperf3, or wrk), apply changes, re-measure, and revert if regressions appear. Avoid blanket “max performance” settings — they often hurt latency or stability under mixed loads.
1. Use Tuned Profiles – The Smart Starting Point
tuned is the official dynamic tuning daemon. It applies sysctl, CPU governor, disk elevator, power, and IRQ changes based on workload.
Install & list profiles:
sudo dnf install tuned
sudo tuned-adm list
Recommended 2026 profiles:
- throughput-performance — Best default for general servers (web, API, file serving). Disables power saving, favors throughput-oriented I/O schedulers (mq-deadline/none on modern kernels), and raises network/disk sysctls.
- network-throughput — For high-bandwidth NICs (40/100 Gbps+); emphasizes large buffers and interrupt coalescing.
- latency-performance — For low-jitter needs (databases, real-time apps); pins CPU to performance governor, minimizes C-states.
- virtual-guest / virtual-host — For KVM/VMware guests or hosts.
- balanced — Default; good compromise if unsure.
Activate:
sudo tuned-adm profile throughput-performance
sudo tuned-adm active
Tip: tuned-adm recommend suggests a profile based on the detected environment (bare metal vs. VM, etc.).
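Profiles can also be extended rather than taken wholesale: a custom child profile inherits a stock one and overrides a handful of values. A sketch (the profile name `myserver` and the overridden values are illustrative; `[main]`, `[sysctl]`, and `[vm]` are standard tuned plugin sections):

```
# /etc/tuned/myserver/tuned.conf -- custom profile inheriting throughput-performance
[main]
include=throughput-performance

[sysctl]
net.core.somaxconn=65535

[vm]
transparent_hugepages=never
```

Activate it like any stock profile: `sudo tuned-adm profile myserver`.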
2. CPU & Power Management
Modern servers throttle aggressively to save power. Disable power saving where consistent performance matters:
Set the CPU governor to performance (or use a tuned profile above):

```bash
sudo cpupower frequency-set -g performance   # cpupower ships in kernel-tools on CentOS/RHEL
```

BIOS/UEFI: enable Performance mode, disable deep C-states (below C1), and consider disabling P-state scaling or on-demand turbo if latency consistency matters more than peak clocks.
Disable transparent huge pages if they cause latency spikes (common in databases):

```bash
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
```

Note that THP is not a sysctl, so /etc/sysctl.d/ cannot persist it. Make it permanent by adding transparent_hugepage=never to the kernel command line (e.g. grubby --update-kernel=ALL --args="transparent_hugepage=never") or via a tuned profile that sets it.
3. Memory & VM Tuning
Key sysctls for high-concurrency servers:
Reduce swappiness (avoid swapping under memory pressure):
vm.swappiness = 10 (or 1 for latency-sensitive)
Increase file handles & inotify watches:
fs.file-max = 2097152
fs.inotify.max_user_watches = 524288
TCP memory buffers (for high-bandwidth connections):
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
Overcommit policy (allow more allocations than RAM):
vm.overcommit_memory = 1 (careful with OOM killer)
Apply in /etc/sysctl.d/99-performance.conf, then run sysctl --system.
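Pulled together, the memory and buffer settings above might live in a single drop-in file (the values are the examples from this section; tune them to your RAM and workload):

```
# /etc/sysctl.d/99-performance.conf -- example values from this section
vm.swappiness = 10
fs.file-max = 2097152
fs.inotify.max_user_watches = 524288
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
```

Load it with `sudo sysctl --system` and confirm individual keys with `sysctl <key>`.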
4. Network Stack Optimizations
For 10/25/100 Gbps+ links:
Increase socket backlog & queues:
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 5000
Verify TCP timestamps & window scaling are enabled (both are kernel defaults, but confirm they haven't been disabled):
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_window_scaling = 1
Tune congestion control (BBR is often a good fit for high-bandwidth, high-latency paths in 2026; it requires the tcp_bbr module and is typically paired with the fq qdisc):
net.ipv4.tcp_congestion_control = bbr
Check availability with cat /proc/sys/net/ipv4/tcp_available_congestion_control.
Disable slow-start after idle:
net.ipv4.tcp_slow_start_after_idle = 0
For very high packet rates: increase ring buffers via ethtool -G eth0 rx 4096 tx 4096 (NIC-dependent).
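The buffer maxima in section 3 are usually sized from the bandwidth-delay product (BDP = bandwidth × RTT), the number of bytes the path can hold in flight. A quick sketch, assuming a 25 Gbps link and a 2 ms RTT (substitute your own figures):

```bash
# Bandwidth-delay product: how many bytes the path holds in flight.
rate_bps=$((25 * 1000 * 1000 * 1000))     # link speed: 25 Gbit/s (assumed)
rtt_ms=2                                  # round-trip time in ms (assumed)
bdp_bytes=$(( rate_bps / 8 * rtt_ms / 1000 ))
echo "BDP = ${bdp_bytes} bytes"           # prints: BDP = 6250000 bytes
```

Set rmem_max/wmem_max (and the third field of tcp_rmem/tcp_wmem) at or somewhat above this value; the 16 MiB used above comfortably covers this example.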
5. Disk I/O & Filesystem Tuning
Scheduler: Use mq-deadline or none (the NVMe default) for SSDs; bfq targets desktop interactivity, and cfq no longer exists in modern kernels.
Read-ahead: Increase for sequential workloads:
blockdev --setra 4096 /dev/nvme0n1
XFS/ext4 mount options (in /etc/fstab):
noatime (which implies nodiratime). For SSD TRIM, prefer the periodic fstrim.timer over the discard mount option, which can add write latency. For XFS: inode64 (the default on modern XFS) and allocsize=1m can help on large, streaming-write filesystems.
With hardware RAID, enable the controller's write-back cache (only if battery/flash-backed) and align LVM and filesystem stripes to the RAID geometry.
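Scheduler and read-ahead changes made via echo or blockdev do not survive a reboot; a udev rule is the usual way to persist the scheduler choice. A sketch (the match patterns may need adjusting for your device names):

```
# /etc/udev/rules.d/60-io-scheduler.rules (example)
# NVMe: let the multi-queue path run without an elevator
ACTION=="add|change", KERNEL=="nvme[0-9]n[0-9]", ATTR{queue/scheduler}="none"
# SATA/SAS SSDs (non-rotational): mq-deadline
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="mq-deadline"
```

Reload with `sudo udevadm control --reload` and `sudo udevadm trigger`, then confirm via `cat /sys/block/<dev>/queue/scheduler`.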
6. IRQ & NUMA Awareness
Pin IRQs to specific cores (avoid sharing with application threads):
Use irqbalance (enabled by default) or pin manually with the set_irq_affinity.sh scripts shipped by vendor drivers (stop irqbalance first when pinning manually, or it will undo your assignments).
NUMA: Bind processes to nodes local to NIC/SSD:
numactl --cpunodebind=0 --membind=0 nginx
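Node 0 is not always the right choice: the NIC's local node can be read from sysfs and used for the binding. A sketch (`eth0` and the nginx launch are placeholders for your interface and service):

```bash
nic=eth0                                   # placeholder interface name
# numa_node reads -1 on non-NUMA systems or when the node is unknown
node=$(cat "/sys/class/net/${nic}/device/numa_node" 2>/dev/null || echo -1)
[ "$node" -ge 0 ] || node=0                # fall back to node 0 if unknown
echo "NIC ${nic} is local to NUMA node ${node}"
# sudo numactl --cpunodebind="$node" --membind="$node" nginx
```

Cross-check the topology with `numactl --hardware` before committing a binding.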
7. Monitoring & Validation Tools (2026 Essentials)
- tuned-adm, perf top, bpftrace, bcc-tools (e.g., funccount, biolatency)
- sar, iotop, nethogs, htop with extended metrics
- prometheus-node-exporter + Grafana for long-term trends
- Benchmark: fio (disk), iperf3/netperf (network), stress-ng --cpu --vm (memory/CPU)
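As a starting point for disk baselines, a small fio job file like the following measures 4K random-read performance (sizes and the test path are illustrative; point filename at the filesystem or device under test, and never at one holding live data):

```
; random-read-baseline.fio -- illustrative job, not a definitive benchmark
[global]
ioengine=libaio
direct=1
time_based
runtime=60

[randread]
filename=/mnt/test/fio.dat
rw=randread
bs=4k
iodepth=32
size=4g
```

Run with `fio random-read-baseline.fio` before and after each tuning change so regressions show up immediately.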
Quick Prioritization Table (Workload-Based)
| Workload Type | Top Priorities (in order) | Expected Gain |
|---|---|---|
| High-concurrency web/API | tuned throughput, BBR, large TCP buffers, somaxconn | +30–80% throughput |
| Database (MySQL/PostgreSQL) | latency-performance, no THP, vm.swappiness=1, mq-deadline scheduler | -20–50% p99 latency |
| File/Storage server | network-throughput, large readahead, XFS allocsize | +50% sequential I/O |
| Container/K8s host | virtual-host profile, THP disabled unless workloads use it, IRQ pinning | Better pod density |
| Low-latency (real-time-ish) | latency-performance, C-states disabled, TCP timestamps | Reduced jitter |
Start with a tuned profile, baseline with real workload traffic, apply 2–3 sysctls at a time, and always test rollback. Over-tuning often causes regressions under bursty or mixed loads.