(1)aggregation_memory_efficient_merge_threads:
So, what about the mysterious "aggregation_memory_efficient_merge_threads" setting? It takes effect when "distributed_aggregation_memory_efficient" is enabled to limit memory usage during distributed aggregation, or when "max_bytes_before_external_group_by" is enabled.
It sets the number of threads used to merge intermediate results. A value of zero means the number of threads is determined automatically from the number of CPU cores. To merge intermediate data efficiently, the data is split into 256 buckets (partitions), and all data for each bucket is merged in one step.
With one merging thread, only one bucket (1/256 of the total intermediate data) resides in memory at a time. With more threads, proportionally more memory is used, but merging runs proportionally faster.
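The memory/speed trade-off can be made concrete with a little arithmetic. The following is a minimal sketch, not ClickHouse code: the function name `estimated_merge_memory` is hypothetical, and it assumes that all 256 buckets are roughly the same size and that each merging thread holds exactly one bucket at a time.

```python
NUM_BUCKETS = 256  # number of buckets (partitions) stated in the text


def estimated_merge_memory(total_bytes: int, threads: int) -> float:
    """Rough estimate of intermediate data resident in memory while merging.

    Assumption (not from ClickHouse itself): buckets are equally sized and
    each merging thread works on one bucket at a time.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    # There can never be more buckets in flight than buckets in total.
    buckets_in_memory = min(threads, NUM_BUCKETS)
    return total_bytes * buckets_in_memory / NUM_BUCKETS


if __name__ == "__main__":
    total = 64 * 1024**3  # hypothetical 64 GiB of intermediate data
    for n in (1, 4, 16):
        gib = estimated_merge_memory(total, n) / 1024**3
        print(f"{n} thread(s): about {gib} GiB in memory")
```

Under these assumptions, memory grows linearly with the thread count, which is the "proportionally more memory" described above. Real bucket sizes are usually uneven, so treat the result as an order-of-magnitude guide only.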
The setting defaulted to 1 in older versions of ClickHouse because there were bugs in multithreaded bucket-aware merging of intermediate data.
After the fix, the default value was changed to 0 (use an automatically determined number of threads). External and distributed aggregation now run much faster by default.
(2)listen_backlog:
The behavior of the backlog argument on TCP sockets changed with Linux 2.2. Now it specifies the queue length for completely established sockets waiting to be accepted, instead of the number of incomplete connection requests. The maximum length of the queue for incomplete sockets can be set using /proc/sys/net/ipv4/tcp_max_syn_backlog.
ClickHouse default value:
<listen_backlog>64</listen_backlog>
Increase this value under high concurrency.
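To show where the backlog value ends up, here is a minimal sketch of a server socket that passes it to listen(). This is not ClickHouse's own code: the helper name `open_listener` is hypothetical, and the value 64 simply mirrors the config default shown above. Note that on Linux the kernel may silently cap the requested backlog at net.core.somaxconn.

```python
import socket


def open_listener(backlog: int = 64, host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Open a TCP listening socket with the given accept-queue length.

    Hypothetical helper for illustration; port 0 lets the OS pick a free port.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((host, port))
        # backlog = queue length for fully established connections
        # that are waiting to be accepted.
        srv.listen(backlog)
    except OSError:
        srv.close()
        raise
    return srv


if __name__ == "__main__":
    server = open_listener(backlog=64)
    print("listening on port", server.getsockname()[1])
    server.close()
```

If clients connect faster than the server calls accept(), established connections pile up in this queue, which is why a larger value helps under high concurrency.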
(3)enable_scalar_subquery_optimization = 0
Resolves problems that occur when WITH statements are executed on a cluster.