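Introduction
The previous article analyzed the source code behind the JVM's default -XX:ParallelGCThreads value and arrived at the formula
8 + (n - 8) * (5/8), where n, as everyone knows, is the number of CPUs on the machine (the formula applies when n is greater than 8; otherwise the default is simply n).
But which kind of CPU exactly? Logical CPUs or physical CPUs? What is the difference between the two, and which one does the JVM use?
With these questions in mind, I dug further.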
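Clarifying machine CPU types
First, how do you inspect CPU information on a Linux machine? Most people know to look at /proc/cpuinfo, but what do the individual fields mean? See the figure below:
Physical CPUs
The physical id field uniquely identifies a physical CPU. The number of distinct IDs is the number of physical CPUs, that is, the number of CPUs installed in the server's CPU sockets. It can be obtained with the command: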
cat /proc/cpuinfo |grep "physical id"|sort |uniq|wc -l
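Logical CPUs
Modern CPUs are multi-core, so there are multiple processor entries that can actually execute tasks. In the screenshot, core id is the ID of a core within its physical CPU: a physical core is identified by the combination of physical id and core id, while each processor entry is one logical CPU. With hyper-threading (HT) enabled, each core exposes two logical CPUs, which doubles the number of processor entries rather than the raw processing power.
In general:
Logical CPUs = physical CPUs x cores per CPU (x 2 if hyper-threading is enabled)
It can also be obtained with the command: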
grep -c "^processor" /proc/cpuinfo
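The two shell commands above can also be expressed in code. The following is a minimal sketch, not anything taken from the JVM: it assumes /proc/cpuinfo-style text in which each processor entry is followed by its physical id and core id lines (the usual x86 layout), and the names CpuTopology and parse_cpuinfo are hypothetical.

```cpp
#include <istream>
#include <set>
#include <sstream>
#include <string>
#include <utility>

// Counts derived from /proc/cpuinfo-style text (hypothetical helper type).
struct CpuTopology {
    int packages = 0;  // distinct "physical id" values, i.e. physical CPUs
    int cores = 0;     // distinct (physical id, core id) pairs, i.e. physical cores
    int logical = 0;   // number of "processor" entries, i.e. logical CPUs
};

// Returns the integer after the ':' of a "key : value" line, or -1 if there is no ':'.
// Assumes the value is numeric, as it is for "physical id" and "core id".
static int field_value(const std::string& line) {
    std::string::size_type colon = line.find(':');
    if (colon == std::string::npos) return -1;
    return std::stoi(line.substr(colon + 1));  // stoi skips leading whitespace
}

CpuTopology parse_cpuinfo(std::istream& in) {
    CpuTopology t;
    std::set<int> packages;
    std::set<std::pair<int, int>> cores;
    int physical_id = 0;  // "physical id" of the entry currently being read
    std::string line;
    while (std::getline(in, line)) {
        if (line.rfind("processor", 0) == 0) {           // line starts with "processor"
            ++t.logical;
        } else if (line.rfind("physical id", 0) == 0) {
            physical_id = field_value(line);
            packages.insert(physical_id);
        } else if (line.rfind("core id", 0) == 0) {
            cores.insert(std::make_pair(physical_id, field_value(line)));
        }
    }
    t.packages = static_cast<int>(packages.size());
    t.cores = static_cast<int>(cores.size());
    return t;
}
```

To run it against a real machine, open /proc/cpuinfo with std::ifstream and pass the stream to parse_cpuinfo. On a single-socket, dual-core machine with hyper-threading enabled, one would expect 1 physical CPU, 2 cores and 4 logical CPUs.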
What the JVM uses
Straight to the code:
unsigned int Abstract_VM_Version::nof_parallel_worker_threads(
                                                      unsigned int num,
                                                      unsigned int den,
                                                      unsigned int switch_pt) {
  if (FLAG_IS_DEFAULT(ParallelGCThreads)) {
    assert(ParallelGCThreads == 0, "Default ParallelGCThreads is not 0");
    unsigned int threads;
    // For very large machines, there are diminishing returns
    // for large numbers of worker threads.  Instead of
    // hogging the whole system, use a fraction of the workers for every
    // processor after the first 8.  For example, on a 72 cpu machine
    // and a chosen fraction of 5/8
    // use 8 + (72 - 8) * (5/8) == 48 worker threads.
    unsigned int ncpus = (unsigned int) os::initial_active_processor_count();
    threads = (ncpus <= switch_pt) ?
              ncpus :
              (switch_pt + ((ncpus - switch_pt) * num) / den);
#ifndef _LP64
    // On 32-bit binaries the virtual address space available to the JVM
    // is usually limited to 2-3 GB (depends on the platform).
    // Do not use up address space with too many threads (stacks and per-thread
    // data). Note that x86 apps running on Win64 have 2 stacks per thread.
    // GC may more generally scale down threads by max heap size (etc), but the
    // consequences of over-provisioning threads are higher on 32-bit JVMS,
    // so add hard limit here:
    threads = MIN2(threads, (2*switch_pt));
#endif
    return threads;
  } else {
    return ParallelGCThreads;
  }
}
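To make the arithmetic concrete, here is a small standalone sketch of the same calculation. The function name default_parallel_gc_threads is hypothetical, and the default arguments num = 5, den = 8 and switch_pt = 8 are taken from the formula quoted in the introduction. The 32-bit cap and the flag check are left out.

```cpp
// Hypothetical standalone copy of the arithmetic in nof_parallel_worker_threads.
// Defaults follow the article's formula: 8 + (n - 8) * (5/8).
unsigned int default_parallel_gc_threads(unsigned int ncpus,
                                         unsigned int num = 5,
                                         unsigned int den = 8,
                                         unsigned int switch_pt = 8) {
    if (ncpus <= switch_pt) {
        return ncpus;  // up to switch_pt CPUs: one GC worker per CPU
    }
    // Beyond switch_pt: only num/den of each additional CPU counts.
    // Integer division truncates, exactly as in the HotSpot expression.
    return switch_pt + ((ncpus - switch_pt) * num) / den;
}
```

With these defaults, 4 CPUs give 4 threads, 16 CPUs give 8 + (8 * 5) / 8 = 13 threads, and 72 CPUs give 48 threads, matching the example in the HotSpot comment. Because the division is integral, 10 CPUs give 8 + 10 / 8 = 9 threads.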
unsigned int ncpus = (unsigned int) os::initial_active_processor_count();
How is this ncpus obtained? Let's read on.
static int initial_active_processor_count() {
  assert(_initial_active_processor_count > 0, "Initial active processor count not set yet.");
  return _initial_active_processor_count;
}
void os::initialize_initial_active_processor_count() {
  assert(_initial_active_processor_count == 0, "Initial active processor count already set.");
  _initial_active_processor_count = active_processor_count();
  log_debug(os)("Initial active processor count set to %d" , _initial_active_processor_count);
}
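active_processor_count() is implemented differently on each operating system, and each one obtains the available processors in its own way. The Linux implementation is shown below as an example: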
int os::active_processor_count() {
  cpu_set_t cpus;  // can represent at most 1024 (CPU_SETSIZE) processors
  cpu_set_t* cpus_p = &cpus;
  int cpus_size = sizeof(cpu_set_t);

  int configured_cpus = processor_count();  // upper bound on available cpus
  int cpu_count = 0;

  // old build platforms may not support dynamic cpu sets
#ifdef CPU_ALLOC

  // To enable easy testing of the dynamic path on different platforms we
  // introduce a diagnostic flag: UseCpuAllocPath
  if (configured_cpus >= CPU_SETSIZE || UseCpuAllocPath) {
    // kernel may use a mask bigger than cpu_set_t
    log_trace(os)("active_processor_count: using dynamic path %s"
                  "- configured processors: %d",
                  UseCpuAllocPath ? "(forced) " : "",
                  configured_cpus);
    cpus_p = CPU_ALLOC(configured_cpus);
    if (cpus_p != NULL) {
      cpus_size = CPU_ALLOC_SIZE(configured_cpus);
      // zero it just to be safe
      CPU_ZERO_S(cpus_size, cpus_p);
    }
    else {
      // failed to allocate so fallback to online cpus
      int online_cpus = ::sysconf(_SC_NPROCESSORS_ONLN);
      log_trace(os)("active_processor_count: "
                    "CPU_ALLOC failed (%s) - using "
                    "online processor count: %d",
                    os::strerror(errno), online_cpus);
      return online_cpus;
    }
  }
  else {
    log_trace(os)("active_processor_count: using static path - configured processors: %d",
                  configured_cpus);
  }
#else // CPU_ALLOC
  // these stubs won't be executed
#define CPU_COUNT_S(size, cpus) -1
#define CPU_FREE(cpus)

  log_trace(os)("active_processor_count: only static path available - configured processors: %d",
                configured_cpus);
#endif // CPU_ALLOC

  // pid 0 means the current thread - which we have to assume represents the process
  if (sched_getaffinity(0, cpus_size, cpus_p) == 0) {
    if (cpus_p != &cpus) { // can only be true when CPU_ALLOC used
      cpu_count = CPU_COUNT_S(cpus_size, cpus_p);
    }
    else {
      cpu_count = CPU_COUNT(cpus_p);
    }
    log_trace(os)("active_processor_count: sched_getaffinity processor count: %d", cpu_count);
  }
  else {
    cpu_count = ::sysconf(_SC_NPROCESSORS_ONLN);
    warning("sched_getaffinity failed (%s)- using online processor count (%d) "
            "which may exceed available processors", os::strerror(errno), cpu_count);
  }

  if (cpus_p != &cpus) { // can only be true when CPU_ALLOC used
    CPU_FREE(cpus_p);
  }

  assert(cpu_count > 0 && cpu_count <= processor_count(), "sanity check");
  return cpu_count;
}
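sysconf is declared in unistd.h. With _SC_NPROCESSORS_ONLN it returns the number of logical CPUs currently online; with _SC_NPROCESSORS_CONF it returns the number of logical CPUs configured in the system, including those that are offline or disabled.
As the code shows, the JVM first calls sched_getaffinity to count the logical CPUs the process is allowed to run on, and falls back to cpu_count = ::sysconf(_SC_NPROCESSORS_ONLN); only if that call fails. Either way, what it obtains is the number of logical CPUs actually available, not the number of physical CPUs.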
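The Linux calls involved can be tried outside the JVM. The sketch below is a simplified illustration, not the HotSpot code: it uses only the fixed-size cpu_set_t (no CPU_ALLOC path, so it assumes at most CPU_SETSIZE processors), and the function names are hypothetical.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // needed for sched_getaffinity and CPU_COUNT on glibc
#endif
#include <sched.h>
#include <unistd.h>

// Number of logical CPUs in the calling thread's affinity mask, or -1 on failure.
int affinity_cpu_count() {
    cpu_set_t cpus;
    CPU_ZERO(&cpus);
    if (sched_getaffinity(0, sizeof(cpus), &cpus) != 0) {
        return -1;
    }
    return CPU_COUNT(&cpus);
}

// Logical CPUs currently online.
int online_cpu_count() {
    return static_cast<int>(::sysconf(_SC_NPROCESSORS_ONLN));
}

// Logical CPUs configured in the system, including offline ones.
int configured_cpu_count() {
    return static_cast<int>(::sysconf(_SC_NPROCESSORS_CONF));
}

// Same order of preference as the Linux code above:
// affinity mask first, online count as the fallback.
int active_cpu_count() {
    int n = affinity_cpu_count();
    return n > 0 ? n : online_cpu_count();
}
```

Calling active_cpu_count() from a process started under taskset -c 0,1 should report 2 even on a machine with many more online CPUs, because the affinity mask restricts the process to those two logical CPUs.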
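Summary
With the spread of container technologies such as docker, the CPU information read from /proc/cpuinfo inside a container is very likely that of the physical host rather than the container's own share, which can lead to GC problems. In this situation, set -XX:ParallelGCThreads= and -XX:ConcGCThreads= explicitly, according to the share of the host's CPUs that the container is allowed to use, to tune the JVM.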