ClickHouse Monitoring

Monitor ClickHouse parts, merges, mutations, and server metrics.

Basira collects ClickHouse-specific telemetry beyond standard query performance. These views help you understand storage health, background operations, and server resource usage.

Parts

ClickHouse stores data in immutable parts that are periodically merged into larger parts. The Parts page shows:

| Metric | Description |
| --- | --- |
| Table | Which table the part belongs to |
| Active parts | Number of parts currently in use |
| Part sizes | On-disk size of parts |
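
For reference, similar figures can be read from ClickHouse's system.parts table. The sketch below is illustrative and is not Basira's actual collection code: the query uses the real system.parts columns database, table, active, and bytes_on_disk, while the summarize_parts helper and the sample rows are hypothetical.

```python
from collections import defaultdict

# Query you could run with any ClickHouse client. active = 1 keeps only
# parts currently in use (inactive parts are leftovers from past merges).
ACTIVE_PARTS_SQL = """
SELECT database, table, bytes_on_disk
FROM system.parts
WHERE active = 1
"""

def summarize_parts(rows):
    """Aggregate (database, table, bytes_on_disk) rows per table."""
    summary = defaultdict(lambda: {"active_parts": 0, "bytes_on_disk": 0})
    for database, table, bytes_on_disk in rows:
        entry = summary[f"{database}.{table}"]
        entry["active_parts"] += 1
        entry["bytes_on_disk"] += bytes_on_disk
    return dict(summary)

# Hypothetical rows, shaped like the query result.
sample_rows = [
    ("default", "events", 1_000),
    ("default", "events", 2_500),
    ("default", "users", 400),
]
parts_summary = summarize_parts(sample_rows)
```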

What to watch for

  • Too many active parts: a high part count slows queries and can trigger "Too many parts" errors. Check that merges are keeping up.
  • Growing part count over time: inserts are outpacing merges. Batch inserts into fewer, larger writes first; raising thresholds such as parts_to_throw_insert or max_parts_in_total only postpones the error.
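
One way to batch inserts on the client side is to buffer rows and send them in larger groups. This is a minimal sketch under stated assumptions: InsertBatcher, its max_rows threshold, and the flush callable are hypothetical names, and the real INSERT call depends on whichever ClickHouse client you use.

```python
class InsertBatcher:
    """Buffer rows and flush them in large batches.

    An INSERT into a MergeTree table typically creates at least one new
    part, so fewer, larger inserts leave fewer parts to merge.
    """

    def __init__(self, flush, max_rows=10_000):
        self._flush = flush        # callable that performs the real INSERT
        self._max_rows = max_rows  # hypothetical threshold; tune per workload
        self._buffer = []

    def add(self, row):
        self._buffer.append(row)
        if len(self._buffer) >= self._max_rows:
            self.flush()

    def flush(self):
        if self._buffer:
            self._flush(self._buffer)
            self._buffer = []

# Usage with a stand-in flush function that only records batch sizes.
batch_sizes = []
batcher = InsertBatcher(lambda rows: batch_sizes.append(len(rows)), max_rows=3)
for i in range(7):
    batcher.add({"id": i})
batcher.flush()  # send the remainder
```

A production version would usually also flush on a timer so that rows do not sit in the buffer indefinitely during quiet periods.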

Merges

Background merges combine small parts into larger ones. The Merges page shows active merge operations:

| Metric | Description |
| --- | --- |
| Table | Table being merged |
| Progress | Percentage complete |
| Elapsed | How long the merge has been running |
| Rows / bytes read | Data processed so far |

What to watch for

  • Stalled merges: progress does not advance between checks, which may indicate disk I/O saturation.
  • High merge count: many concurrent merges consume CPU and disk bandwidth. Tune background_pool_size, the server setting that controls how many threads run background merges and mutations.
  • Merges can't keep up: if parts accumulate faster than merges complete, batch your inserts or reduce insert frequency.
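
A stalled merge can be detected by comparing two snapshots of system.merges taken some time apart. The sketch below assumes you collect the snapshots yourself; find_stalled_merges, min_delta, and the sample data are hypothetical, while progress is the 0 to 1 fraction that system.merges reports.

```python
MERGES_SQL = """
SELECT database, table, result_part_name, elapsed, progress
FROM system.merges
"""

def find_stalled_merges(previous, current, min_delta=0.001):
    """Return merges present in both snapshots whose progress barely moved.

    Each snapshot maps (database, table, result_part_name) -> progress.
    """
    stalled = []
    for key, progress in current.items():
        if key in previous and progress - previous[key] < min_delta:
            stalled.append(key)
    return stalled

# Hypothetical snapshots taken a few minutes apart.
earlier = {
    ("default", "events", "all_1_8_2"): 0.40,
    ("default", "users", "all_3_5_1"): 0.10,
}
later = {
    ("default", "events", "all_1_8_2"): 0.40,
    ("default", "users", "all_3_5_1"): 0.55,
}
stalled_merges = find_stalled_merges(earlier, later)
```

Merges that appear only in the later snapshot are ignored, since a single observation says nothing about their rate of progress.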

Mutations

Mutations are ALTER TABLE ... UPDATE/DELETE operations that rewrite parts in the background. The Mutations page shows:

| Metric | Description |
| --- | --- |
| Table | Table being mutated |
| Command | The ALTER statement |
| Status | Whether the mutation is running, complete, or stuck |
| Parts remaining | How many parts still need to be rewritten |

What to watch for

  • Stuck mutations: a mutation that stops progressing. Look in system.mutations for rows where is_done = 0 and latest_fail_reason is not empty.
  • Too many mutations: mutations run sequentially per table, so a long queue means each one waits on those ahead of it. Batch related changes into fewer operations.
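
Both checks can be sketched against rows from system.mutations. The query uses real columns of that table; the triage_mutations helper, its max_pending threshold, and the sample rows (including the failure text) are hypothetical and only illustrate the logic.

```python
from collections import Counter

PENDING_MUTATIONS_SQL = """
SELECT database, table, mutation_id, command, is_done,
       parts_to_do, latest_fail_reason
FROM system.mutations
WHERE is_done = 0
"""

def triage_mutations(rows, max_pending=5):
    """Split unfinished mutations into failing ones and crowded tables.

    rows: dicts with table, mutation_id, is_done, latest_fail_reason.
    max_pending is a hypothetical threshold, not a ClickHouse setting.
    """
    pending = [r for r in rows if not r["is_done"]]
    failing = [r["mutation_id"] for r in pending if r["latest_fail_reason"]]
    per_table = Counter(r["table"] for r in pending)
    crowded = sorted(t for t, n in per_table.items() if n > max_pending)
    return failing, crowded

# Hypothetical rows, shaped like the query result.
rows = [
    {"table": "events", "mutation_id": "mutation_7.txt", "is_done": 0,
     "latest_fail_reason": "example failure text"},
    {"table": "events", "mutation_id": "mutation_8.txt", "is_done": 0,
     "latest_fail_reason": ""},
    {"table": "users", "mutation_id": "mutation_2.txt", "is_done": 1,
     "latest_fail_reason": ""},
]
failing, crowded = triage_mutations(rows, max_pending=1)
```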

Server Metrics

The Server Metrics page shows system-level resource usage:

| Metric | Description |
| --- | --- |
| Memory usage | Current memory consumed by ClickHouse |
| CPU utilization | Server CPU usage |
| Connections | Active client connections |
| Disk space | Storage usage and availability |

What to watch for

  • Memory approaching limit: queries that exceed max_memory_usage are killed. Tune per-query memory limits or add RAM.
  • Disk space running low: ClickHouse needs free space for merges and temporary data. Set alerts on disk usage.
  • Connection count growing: check for connection leaks in your application.
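
As an example of a disk usage alert, the sketch below flags disks whose free fraction drops below a threshold. It assumes rows read from ClickHouse's system.disks table (name, free_space, total_space); the low_disks helper, the 20% threshold, and the sample values are hypothetical.

```python
DISKS_SQL = "SELECT name, free_space, total_space FROM system.disks"

def low_disks(rows, min_free_ratio=0.2):
    """Return names of disks whose free fraction is below min_free_ratio.

    min_free_ratio is a hypothetical alert threshold; pick one that
    leaves room for your largest merges.
    """
    return [
        name
        for name, free_space, total_space in rows
        if total_space > 0 and free_space / total_space < min_free_ratio
    ]

# Hypothetical rows: (name, free bytes, total bytes).
sample_disks = [
    ("default", 50 * 10**9, 1000 * 10**9),
    ("cold", 800 * 10**9, 2000 * 10**9),
]
alerts = low_disks(sample_disks)
```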
