
Commit 702f91d

hunleyd, Copilot, and samrose authored
perf(tuned): activate and configure zswap for postgresql tuned profile (#2077)
* perf: activate and configure zswap for postgresql tuned profile

This change enables zswap with the zsmalloc allocator and zstd compression within the postgresql tuned profile to improve system responsiveness and database performance under memory pressure.

### Technical Overview: What is zswap?

zswap is a Linux kernel feature that provides a compressed write-back cache for swapped pages. Instead of immediately moving pages from RAM to the physical swap device (HDD/SSD) when memory is full, zswap intercepts them, compresses them, and stores them in a dynamically allocated RAM pool. This effectively increases the "perceived" amount of RAM available to the system.

### Why zsmalloc over zbud?

We have selected 'zsmalloc' as the zpool allocator instead of 'zbud' for several reasons:

- Density: zsmalloc is a slab-based allocator that can pack many compressed objects into a single physical page, whereas zbud is hard-coded to a maximum of 2 objects per page (2:1 ratio).
- Efficiency: On modern kernels (Linux 6.3+), zsmalloc supports proper eviction to disk, removing the primary historical advantage of zbud. It provides significantly better memory utilization and higher compression density.

### Why zstd over lz4?

While lz4 offers faster compression/decompression speeds, we have opted for 'zstd' for the following reasons:

- Compression Ratio: zstd provides a substantially higher compression ratio than lz4. In database workloads where memory density is critical, fitting more pages into the zswap pool is more beneficial than the marginal speed gains of lz4.
- Balance: zstd offers a superior balance between CPU overhead and compression efficiency, especially for the 4KB-8KB pages typically handled by the kernel and PostgreSQL.

### Benefits for PostgreSQL Performance

Implementing zswap is particularly beneficial for PostgreSQL workloads:

- Mitigation of Swap Thrashing: PostgreSQL performance degrades exponentially when active buffers are swapped to disk. zswap ensures that these pages stay in RAM (compressed), allowing "major page faults" to be resolved at RAM speeds rather than waiting for disk I/O.
- I/O Latency Stability: By reducing the frequency of physical writes to the swap partition, zswap prevents I/O contention between the kernel's swap subsystem and PostgreSQL's own WAL and data writes.
- Consistent Query Latency: Queries that access less-frequently used data that has been compressed into zswap will return much faster than if they had to be fetched from physical swap, leading to more predictable p99 latencies.

* Update ansible/tasks/setup-tuned.yml

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update ansible/tasks/setup-tuned.yml

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* chore: bump ami version vars

* Update ansible/tasks/setup-tuned.yml

* fix: spacing that GH editor messed up

* refactor: use modprobe module instead

* fix: avoid an error on the GH runners due to missing kernel module

* fix: the runners are also ubuntu, so just ignore if we cannot load this

* refactor(tuned): refactor based on feedback from Sam

* fix(tuned): errant quotes

* Update vars.yml

* chore: bump to test

* fix(tuned): switch from copy to lineinfil to avoid perm issues on the temp dir copy creates

* refactor(tuned): use command instead of lineinfile in case the original file contents change on us

* chore: bump to release

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Sam Rose <samuel@supabase.io>
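To make the `max_pool_percent=10` setting from this commit concrete, here is a rough sizing sketch. The 16 GiB host and the 3:1 compression ratio are illustrative assumptions, not figures from the commit, and `zswap_capacity_mib` is a hypothetical helper. Real ratios depend on the data being swapped, and the estimate ignores allocator metadata overhead.

```python
def zswap_capacity_mib(ram_mib, max_pool_percent, compression_ratio):
    """Estimate the zswap pool cap and how much uncompressed data it could hold.

    ram_mib           -- total system RAM in MiB
    max_pool_percent  -- zswap's max_pool_percent (share of RAM the pool may use)
    compression_ratio -- ASSUMED average ratio (uncompressed size / compressed size)
    """
    pool_cap = ram_mib * max_pool_percent / 100
    uncompressed = pool_cap * compression_ratio
    return pool_cap, uncompressed


if __name__ == "__main__":
    # Hypothetical host: 16 GiB of RAM, 10% pool cap, assumed 3:1 compression.
    pool, raw = zswap_capacity_mib(16 * 1024, 10, 3.0)
    print(f"pool cap: {pool:.1f} MiB, holds roughly {raw:.1f} MiB of swapped pages")
```

Under these assumptions the pool occupies at most a tenth of RAM while absorbing several times that amount of swapped-out data before anything is written to the swap device.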
1 parent 8810dd7 commit 702f91d

3 files changed

Lines changed: 62 additions & 3 deletions

ansible/tasks/setup-tuned.yml

Lines changed: 48 additions & 0 deletions
@@ -99,6 +99,54 @@
   when:
     - ansible_facts['processor'][0] is search("GenuineIntel", ignorecase=True)

+- name: tuned - Enable zswap if swap is present
+  when:
+    - ansible_facts['swaptotal_mb'] > 0
+  block:
+    - name: tuned - Decrease the kernel swappiness
+      become: true
+      community.general.ini_file:
+        create: true
+        group: 'root'
+        mode: '0644'
+        no_extra_spaces: true
+        option: 'vm.swappiness'
+        path: '/etc/tuned/profiles/postgresql/tuned.conf'
+        section: 'sysctl'
+        state: 'present'
+        value: 10
+
+    - name: tuned - Load zstd compressor module
+      become: true
+      community.general.modprobe:
+        name: 'zstd'
+        persistent: 'present'
+        state: 'present'
+
+    - name: tuned - Configure and enable zswap
+      ansible.builtin.shell:
+        cmd: "echo {{ zswap_item['value'] }} > /sys/module/zswap/parameters/{{ zswap_item['param'] }}"
+      loop:
+        - { param: 'compressor', value: 'zstd' }
+        - { param: 'max_pool_percent', value: '10' }
+        - { param: 'zpool', value: 'zsmalloc' }
+        - { param: 'enabled', value: 'Y' }
+      loop_control:
+        loop_var: 'zswap_item'
+
+    - name: tuned - Enable zswap at boot
+      become: true
+      community.general.ini_file:
+        create: true
+        group: 'root'
+        mode: '0644'
+        no_extra_spaces: true
+        option: 'cmdline_zswap'
+        path: '/etc/tuned/profiles/postgresql/tuned.conf'
+        section: 'bootloader'
+        state: 'present'
+        value: 'zswap.enabled=1 zswap.zpool=zsmalloc zswap.compressor=zstd zswap.max_pool_percent=10'
+
 - name: Activate the tuned service
   ansible.builtin.systemd_service:
     daemon_reload: true
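The hunk above sets the same four zswap parameters in two places: once at runtime through `/sys/module/zswap/parameters/` and once in the `cmdline_zswap` bootloader option. A small sketch like the following could confirm that the two stay in agreement if either is edited later. The constants are copied from the diff; the function names are hypothetical, and the sketch assumes that `Y` and `1` are equivalent spellings for the boolean `enabled` parameter.

```python
"""Check that runtime zswap settings and the boot cmdline agree (illustrative)."""

# Values copied from the loop in "tuned - Configure and enable zswap".
RUNTIME_PARAMS = [
    ("compressor", "zstd"),
    ("max_pool_percent", "10"),
    ("zpool", "zsmalloc"),
    ("enabled", "Y"),
]

# Value copied from the cmdline_zswap option in the [bootloader] section.
BOOT_CMDLINE = (
    "zswap.enabled=1 zswap.zpool=zsmalloc "
    "zswap.compressor=zstd zswap.max_pool_percent=10"
)


def normalize(param, value):
    """Map boolean spellings of 'enabled' onto '1'/'0'; leave other params alone."""
    if param == "enabled":
        if value in ("Y", "y", "1"):
            return "1"
        if value in ("N", "n", "0"):
            return "0"
    return value


def parse_zswap_cmdline(cmdline):
    """Extract zswap.<param>=<value> tokens from a kernel command line string."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key.startswith("zswap."):
            params[key[len("zswap."):]] = value
    return params


def find_mismatches(runtime_params, cmdline):
    """Return {param: (runtime_value, boot_value)} for every disagreement."""
    boot = parse_zswap_cmdline(cmdline)
    mismatches = {}
    for param, runtime_value in runtime_params:
        boot_value = boot.get(param)  # None if the boot cmdline omits the param
        if boot_value is None or normalize(param, runtime_value) != normalize(param, boot_value):
            mismatches[param] = (runtime_value, boot_value)
    return mismatches


if __name__ == "__main__":
    problems = find_mismatches(RUNTIME_PARAMS, BOOT_CMDLINE)
    print("consistent" if not problems else f"mismatches: {problems}")
```

Keeping both paths matters because the runtime writes take effect immediately but do not survive a reboot, while the bootloader option only applies from the next boot.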

ansible/vars.yml

Lines changed: 3 additions & 3 deletions
@@ -10,9 +10,9 @@ postgres_major:

 # Full version strings for each major version
 postgres_release:
-  postgresorioledb-17: "17.6.0.050-orioledb"
-  postgres17: "17.6.1.093"
-  postgres15: "15.14.1.093"
+  postgresorioledb-17: "17.6.0.052-orioledb"
+  postgres17: "17.6.1.095"
+  postgres15: "15.14.1.095"

 # Non Postgres Extensions
 pgbouncer_release: 1.25.1
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+# kernel-param baseline for AMI build
+# Checks that critical kernel parameters haven't changed
+kernel-param:
+  vm.page-cluster:
+    value: '3' # kernel default; assert to catch if kernel upgrade changes it
+  kernel.panic:
+    value: '10' # set by setup-system.yml
+  vm.panic_on_oom:
+    value: '1' # set by setup-system.yml
+  vm.swappiness:
+    value: '10' # intentionally lowered from kernel default (60) for DB + zswap
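For a quick manual check of the same baseline on a running host, a sketch along these lines could compare the expected values against `/proc/sys`. This is not the test runner the repository uses; `read_sysctl` and `check_baseline` are hypothetical helpers, and the only assumption about the system is the standard mapping of a sysctl name such as `vm.swappiness` to the path `/proc/sys/vm/swappiness`.

```python
"""Compare live sysctl values against the baseline above (illustrative sketch)."""
from __future__ import annotations

from pathlib import Path

# Expected values copied from the baseline file in this commit.
BASELINE = {
    "vm.page-cluster": "3",
    "kernel.panic": "10",
    "vm.panic_on_oom": "1",
    "vm.swappiness": "10",
}


def read_sysctl(name: str, root: Path = Path("/proc/sys")) -> str | None:
    """Read one sysctl by name; return None if it cannot be read."""
    path = root / name.replace(".", "/")
    try:
        return path.read_text().strip()
    except OSError:
        return None


def check_baseline(
    baseline: dict[str, str], root: Path = Path("/proc/sys")
) -> dict[str, tuple[str, str | None]]:
    """Return {name: (expected, actual)} for every parameter that differs."""
    drift = {}
    for name, expected in baseline.items():
        actual = read_sysctl(name, root)
        if actual != expected:
            drift[name] = (expected, actual)
    return drift


if __name__ == "__main__":
    drift = check_baseline(BASELINE)
    if not drift:
        print("all kernel parameters match the baseline")
    for name, (expected, actual) in drift.items():
        print(f"{name}: expected {expected}, found {actual}")
```

The `root` argument exists so the check can be pointed at a directory of fixture files instead of the live `/proc/sys` tree.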
