---
title: "Who ate my ClickHouse® memory?"
linkTitle: "Who ate my memory?"
weight: 100
description: >
  *"It was here a few minutes ago..."* — finding what is using ClickHouse® RAM.
keywords:
- clickhouse memory
- clickhouse memory usage
- MemoryTracking
- jemalloc
---

When `MemoryTracking` is high or the OS shows ClickHouse® RSS far above what
queries account for, the goal is to attribute the memory to one of its
consumers. The major ones are:

- **Caches** — mark cache, uncompressed cache, mmap cache, query cache,
filesystem cache, compiled-expression cache, primary-index cache.
- **Primary keys** loaded in memory for every active part (`system.parts.primary_key_bytes_in_memory_allocated`).
- **Dictionaries** (`system.dictionaries.bytes_allocated`).
- **Running queries** (`system.processes.memory_usage`) and **merges/mutations**
(`system.merges.memory_usage`).
- **In-memory engines** — `Memory`, `Set`, `Join`, `Buffer` tables.
- **Async inserts** waiting to flush (`system.asynchronous_inserts`).
- **In-memory parts** (`part_type = 'InMemory'`).
- **Virtual / accounting overhead** — file read buffers, thread stacks, mmaps.
- **jemalloc retained pages** — memory the allocator has freed but not yet
returned to the OS. This inflates RSS without being attributable to any
ClickHouse subsystem; running `SYSTEM JEMALLOC PURGE` first makes the rest
of the numbers comparable.

For the related limits and overcommit behavior, see
[memory configuration settings](/altinity-kb-setup-and-maintenance/altinity-kb-memory-configuration-settings/),
[Memory Overcommiter](/altinity-kb-setup-and-maintenance/altinity-kb-memory-overcommit/), and
[Configure ClickHouse for low memory environments](/altinity-kb-setup-and-maintenance/configure_clickhouse_for_low_mem_envs/).
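
Before drilling down, it is worth confirming how large the gap actually is between tracked allocations and OS-visible RSS. A minimal first check, assuming your version exposes the `MemoryResident` asynchronous metric:

```sql
-- Tracked allocations vs. OS-visible RSS; a large gap often points at
-- jemalloc retained pages or untracked overhead.
SELECT
    (SELECT formatReadableSize(value) FROM system.metrics
     WHERE metric = 'MemoryTracking') AS tracked,
    (SELECT formatReadableSize(value) FROM system.asynchronous_metrics
     WHERE metric = 'MemoryResident') AS rss;
```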

## All-in-one breakdown

Run this first. It purges jemalloc retained pages, then unions every memory
consumer into one `(group, name, val)` result so you can compare them at a
glance.

```sql
SYSTEM JEMALLOC PURGE;

-- … (several earlier branches of this UNION ALL are elided in this excerpt) …
SELECT 'AsyncInserts' as group, 'db:'||database as name, toInt64(sum(total_bytes)) FROM system.asynchronous_inserts GROUP BY database
UNION ALL
SELECT 'FileBuffersVirtual' as group, metric as name, toInt64(value * 2*1024*1024) FROM system.metrics WHERE metric like 'OpenFileFor%'
UNION ALL
SELECT 'ThreadStacksVirtual' as group, metric as name, toInt64(value * 8*1024*1024) FROM system.metrics WHERE metric = 'GlobalThread'
UNION ALL
SELECT 'UserMemoryTracking' as group, user as name, toInt64(memory_usage) FROM system.user_processes
UNION ALL
select 'QueryCacheBytes' as group, '' as name, toInt64(sum(result_size)) FROM system.query_cache
UNION ALL
SELECT 'MemoryTracking' as group, 'total' as name, toInt64(value) FROM system.metrics WHERE metric = 'MemoryTracking'
```

> Note: `FileBuffersVirtual` and `ThreadStacksVirtual` are *upper bounds* on
> reservation, not committed RSS — useful for spotting runaway thread/file-handle
> counts but not directly comparable to `MemoryTracking`.
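
For example, the thread-stack figure is simply the live thread count multiplied by an assumed 8 MiB default stack reservation (the same arithmetic the all-in-one query uses):

```sql
-- Upper-bound estimate: threads × 8 MiB assumed stack reservation.
SELECT
    value AS global_threads,
    formatReadableSize(value * 8 * 1024 * 1024) AS stack_reservation_upper_bound
FROM system.metrics
WHERE metric = 'GlobalThread';
```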

## Drill down by subsystem

Once the all-in-one query points at a suspect group, use these to inspect it.

```sql
-- Live memory/cache metrics
SELECT *, formatReadableSize(value)
FROM system.metrics
WHERE (metric ilike '%Cach%' or metric ilike '%Mem%') and value != 0
order by metric format PrettyCompactMonoBlock;

SELECT *, formatReadableSize(value)
FROM system.asynchronous_metrics
WHERE metric like '%Cach%' or metric like '%Mem%'
order by metric format PrettyCompactMonoBlock;

-- Last 10 minutes of memory/cache metrics from the async log
SELECT event_time, metric, value, formatReadableSize(value)
FROM system.asynchronous_metric_log
WHERE event_time > now() - 600 and (metric like '%Cach%' or metric like '%Mem%') and value <> 0
order by metric, event_time format PrettyCompactMonoBlock;

-- Dictionaries
SELECT formatReadableSize(sum(bytes_allocated)) FROM system.dictionaries;

-- In-memory engines
SELECT
database,
name,
formatReadableSize(total_bytes)
FROM system.tables
WHERE engine IN ('Memory','Set','Join');

-- Primary keys + in-memory parts
SELECT
sumIf(data_uncompressed_bytes, part_type = 'InMemory') as memory_parts,
formatReadableSize(sum(primary_key_bytes_in_memory)) AS primary_key_bytes_in_memory,
formatReadableSize(sum(primary_key_bytes_in_memory_allocated)) AS primary_key_bytes_in_memory_allocated
FROM system.parts;

-- Merges, in-flight queries, query cache
SELECT formatReadableSize(sum(memory_usage)) FROM system.merges;

SELECT formatReadableSize(sum(memory_usage)) FROM system.processes;

select formatReadableSize(sum(result_size)) FROM system.query_cache;

-- Top current queries by peak memory
SELECT
initial_query_id,
elapsed,
formatReadableSize(memory_usage),
formatReadableSize(peak_memory_usage),
query
FROM system.processes
ORDER BY peak_memory_usage DESC
LIMIT 10;

-- Top recent (last 2h) finished queries by memory
SELECT
type,
event_time,
initial_query_id,
formatReadableSize(memory_usage),
query
FROM system.query_log
WHERE (event_date >= today()) AND (event_time >= (now() - 7200))
ORDER BY memory_usage DESC
LIMIT 10;

```
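
If a cache is the suspect, `system.parts` can at least attribute mark data to tables. This is a sketch, not an exact accounting: `marks_bytes` is the on-disk size of marks, so it is an upper bound on what each table can occupy in the mark cache.

```sql
-- Tables whose marks could occupy the most mark-cache space.
SELECT
    database,
    table,
    formatReadableSize(sum(marks_bytes)) AS marks
FROM system.parts
WHERE active
GROUP BY database, table
ORDER BY sum(marks_bytes) DESC
LIMIT 10;
```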

## Polling memory live

When the suspect is moving (an active merge or query), poll the relevant
tables in a loop. Run from the host shell.

Just merges + processes:

```bash
for i in `seq 1 600`; do clickhouse-client --empty_result_for_aggregation_by_empty_set=0 -q "select (select 'Merges: \
'||formatReadableSize(sum(memory_usage)) from system.merges), (select \
'Processes: '||formatReadableSize(sum(memory_usage)) from system.processes)";\
sleep 3; done

Merges: 96.57 MiB Processes: 41.98 MiB
Merges: 82.24 MiB Processes: 41.91 MiB
Merges: 66.49 MiB Processes: 37.13 MiB
Merges: 67.78 MiB Processes: 37.13 MiB
```

Wider view — merges, queries, primary keys, in-memory tables, dictionaries:

```bash
echo " Merges Processes PrimaryK TempTabs Dicts"; \
for i in `seq 1 600`; do clickhouse-client --empty_result_for_aggregation_by_empty_set=0 -q "select \
(select leftPad(formatReadableSize(sum(memory_usage)),15, ' ') from system.merges)|| \
(select leftPad(formatReadableSize(sum(memory_usage)),15, ' ') from system.processes)|| \
(select leftPad(formatReadableSize(sum(primary_key_bytes_in_memory_allocated)),15, ' ') from system.parts)|| \
(select leftPad(formatReadableSize(sum(total_bytes)),15, ' ') from system.tables \
WHERE engine IN ('Memory','Set','Join'))||
(select leftPad(formatReadableSize(sum(bytes_allocated)),15, ' ') FROM system.dictionaries)
"; sleep 3; done

Merges Processes PrimaryK TempTabs Dicts
0.00 B 0.00 B 21.36 MiB 1.58 GiB 911.07 MiB

```

## Retrospective: peaks from query_log + part_log

If the spike already happened, reconstruct the timeline by replaying
allocations and releases from `query_log`, `part_log`, and `query_views_log`.
Each row contributes `+peak_memory_usage` at start and `-peak_memory_usage`
at end; the running sum approximates concurrent RAM use, broken out by event
type.

```sql
WITH
-- … (body of the query elided in this excerpt) …
ORDER BY timeframe
FORMAT PrettyCompactMonoBlock;
```
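
If the full query is unwieldy, the core replay idea can be sketched with `query_log` alone. This simplified illustration skips `part_log` and `query_views_log`, so it undercounts merge memory:

```sql
-- Each finished query contributes +peak at its start time and -peak at its
-- end time; the running sum approximates concurrent RAM use.
SELECT
    t,
    formatReadableSize(sum(delta) OVER (ORDER BY t ASC)) AS approx_ram_in_use
FROM
(
    SELECT query_start_time AS t, toInt64(memory_usage) AS delta
    FROM system.query_log
    WHERE type = 'QueryFinish' AND event_date = today()
    UNION ALL
    SELECT event_time AS t, -toInt64(memory_usage) AS delta
    FROM system.query_log
    WHERE type = 'QueryFinish' AND event_date = today()
)
ORDER BY t ASC;
```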

## Retrospective: trace_log

`trace_log` records `MemoryPeak` samples per query. This bucketizes them in
5-minute windows and surfaces the worst offender per bucket — useful when
the query is gone but the trace is still on disk.

```sql
WITH
-- … (body of the query elided in this excerpt) …
ORDER BY t ASC;
-- later on you can check particular query_ids in query_log
```
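
The same idea in miniature, assuming `MemoryPeak` samples are being collected into `trace_log` on your server:

```sql
-- Worst memory peak, and the query that hit it, per 5-minute bucket.
SELECT
    toStartOfFiveMinutes(event_time) AS t,
    formatReadableSize(max(size)) AS worst_peak,
    argMax(query_id, size) AS query_id
FROM system.trace_log
WHERE trace_type = 'MemoryPeak' AND event_date >= today() - 1
GROUP BY t
ORDER BY t ASC;
```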

## Server text logs

`MemoryTracker` lines in the server log show every time a query, merge, or
global tracker hits a peak or a limit — handy when system tables have already
rolled over.

```bash
grep MemoryTracker /var/log/clickhouse-server.log
zgrep MemoryTracker /var/log/clickhouse-server.log.*.gz
```

## Next steps

Once you've identified the culprit:

- Hitting `Memory limit (for query) exceeded` — see
[memory configuration settings](/altinity-kb-setup-and-maintenance/altinity-kb-memory-configuration-settings/)
and consider enabling the
[Memory Overcommiter](/altinity-kb-setup-and-maintenance/altinity-kb-memory-overcommit/).
- Caches/dictionaries dominating on a small box — see
[Configure ClickHouse for low memory environments](/altinity-kb-setup-and-maintenance/configure_clickhouse_for_low_mem_envs/).
- System tables themselves are large — see
[system tables eat my disk](/altinity-kb-setup-and-maintenance/altinity-kb-system-tables-eat-my-disk/).