One of the more powerful features of ClickHouse is its introspective capabilities. These can be easily leveraged to understand where the load on our multi-tenant ClickHouse servers is coming from.
Show me the queries:
The following query gives an at-a-glance overview of what is generating load on the cluster:
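The query itself isn't reproduced here, but a sketch of this kind of overview query against `system.query_log` looks like the following. The cluster name `posthog` and the exact columns selected are assumptions; adjust them for your deployment:

```sql
-- Sketch: top sources of load over the last day, grouped by normalized
-- query shape across all replicas. 'posthog' is a placeholder cluster name.
SELECT
    normalized_query_hash,
    any(query) AS example_query,
    count() AS query_count,
    formatReadableSize(sum(read_bytes)) AS read_bytes,
    formatReadableSize(sum(memory_usage)) AS memory_usage,
    sum(query_duration_ms) AS total_duration_ms
FROM clusterAllReplicas('posthog', system.query_log)
WHERE type = 'QueryFinish'
  AND event_time > now() - INTERVAL 1 DAY
GROUP BY normalized_query_hash
ORDER BY total_duration_ms DESC
LIMIT 10
```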
Advanced
To diagnose further, it's important to understand ClickHouse operations.
Useful dimensions to slice the data on:
- `query_duration_ms` - How long the query took
- `formatReadableSize(read_bytes)` - Total number of bytes read from all tables and table functions that participated in the query
- `formatReadableSize(memory_usage)` - Memory usage of the query
- `formatReadableSize(result_bytes)` - How big the response for the query was. Useful together with `not is_initial_query` to determine whether too much data streaming is going on
- `ProfileEvents['OSCPUVirtualTimeMicroseconds']` - How much time was spent by the CPU
- `ProfileEvents['ReadCompressedBytes']` - How much data was read from disk (compressed)
- `ProfileEvents['NetworkSendElapsedMicroseconds']` - How much time was spent sending data over the network
- `ProfileEvents['NetworkReceiveBytes']` - How much data was received over the network
- `ProfileEvents['NetworkSendBytes']` - How much data was sent over the network
The full list of valid measurements can be found in the ClickHouse source code.
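The dimensions above can be combined into a single slicing query. The following is a sketch, not a canonical query; the one-hour window and the ordering by CPU time are arbitrary choices:

```sql
-- Sketch: CPU and I/O cost per query shape over the last hour,
-- using the ProfileEvents counters listed above.
SELECT
    normalized_query_hash,
    sum(ProfileEvents['OSCPUVirtualTimeMicroseconds']) AS cpu_us,
    formatReadableSize(sum(ProfileEvents['ReadCompressedBytes'])) AS read_compressed,
    formatReadableSize(sum(ProfileEvents['NetworkSendBytes'])) AS network_sent
FROM system.query_log
WHERE type = 'QueryFinish'
  AND event_time > now() - INTERVAL 1 HOUR
GROUP BY normalized_query_hash
ORDER BY cpu_us DESC
LIMIT 10
```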
Other useful expressions:
- `is_initial_query` - Indicates whether this was the initial query or one pushed down from the coordinator. Note: `log_comment` is also forwarded to pushed-down queries
- `any(log_comment)` - Shows the structure of the log comment
- `getMacro('replica')` - Which replica was this on?
- `getMacro('shard')` - Which shard was this query made on?
- `getMacro('hostClusterType')` - Which cluster was this on? Online or offline?
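These expressions can be combined to see where distributed work is landing. A sketch, again assuming a cluster named `posthog`:

```sql
-- Sketch: finished queries per shard/replica, split by whether they were
-- initial queries or subqueries pushed down from a coordinator.
SELECT
    getMacro('shard') AS shard,
    getMacro('replica') AS replica,
    is_initial_query,
    count() AS queries,
    formatReadableSize(sum(result_bytes)) AS result_bytes
FROM clusterAllReplicas('posthog', system.query_log)
WHERE type = 'QueryFinish'
  AND event_time > now() - INTERVAL 1 HOUR
GROUP BY shard, replica, is_initial_query
ORDER BY shard, replica
```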
log_comment
We make extensive use of the `log_comment` column to add metadata to queries in order to make analysis simpler.
`log_comment` is set by specifying the `log_comment` setting when running a query. It then populates the column with the same name in the query log table.
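For example, attaching a comment at query time looks like this. The JSON payload shown is only illustrative; it mirrors the fields documented below but is not the exact structure our clients emit:

```sql
-- Any string works as a log_comment; we use a JSON object so the
-- fields can be pulled back out with JSONExtract* functions.
SELECT count()
FROM events
SETTINGS log_comment = '{"kind": "request", "query_type": "trends_total_volume", "team_id": 1}'
```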
Some useful things you'll find in log_comment include:
- `JSONExtractString(log_comment, 'kind')` - Where the query is from. Either `celery` or `request`
- `JSONExtractString(log_comment, 'query_type')` - Type of query, e.g. `trends_total_volume` or `funnel_conversion`
- `JSONExtractString(log_comment, 'id')` - Request path or task name (depending on kind)
- `JSONExtractString(log_comment, 'route_id')` - Which generic route id was responsible for the query (only set for kind=request)
- `JSONExtractInt(log_comment, 'user_id')` - user_id
- `JSONExtractInt(log_comment, 'team_id')` - team_id
- `JSONExtractString(log_comment, 'access_method')` - What access method was used? Blank indicates normal web traffic (only set for kind=request)
- `JSONExtractString(log_comment, 'http_referer')` - HTTP referer (only set for kind=request)
- `JSONExtractString(log_comment, 'http_user_agent')` - HTTP user agent (only set for kind=request)
- `JSONExtractString(log_comment, 'container_hostname')` - Kubernetes pod the query was initiated from
- `JSONExtractString(log_comment, 'workload')` - Either `WORKLOAD.ONLINE` or `WORKLOAD.OFFLINE`. This determines which node on the shard we prefer to send the query to
- `JSONExtractString(JSONExtractString(log_comment, 'query_settings'), '<setting_name>')` - Any settings passed along with the query can be found within the `query_settings` object. To get the value of a specific setting, you need to call the appropriate `JSONExtract*` function twice
- `JSONExtractString(log_comment, 'filter')` - The `Filter` object for an insight. You'll need to use array functions and `JSONExtract` functions in order to access deeply nested filter data
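Putting the extraction functions to work, a sketch of a per-tenant load breakdown (the one-day window and the filter on non-empty comments are arbitrary choices):

```sql
-- Sketch: query load per team and query type, pulled out of log_comment.
SELECT
    JSONExtractInt(log_comment, 'team_id') AS team_id,
    JSONExtractString(log_comment, 'query_type') AS query_type,
    count() AS queries,
    sum(query_duration_ms) AS total_duration_ms
FROM system.query_log
WHERE type = 'QueryFinish'
  AND event_time > now() - INTERVAL 1 DAY
  AND log_comment != ''
GROUP BY team_id, query_type
ORDER BY total_duration_ms DESC
LIMIT 20
```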