Java High Concurrency (3.4): JMH Performance Testing

Article link: https://blog.csdn.net/Athazement/article/details/102811485

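This article looks closely at how to use the JMH (Java Microbenchmark Harness) performance testing framework and how it works. It covers the basic concepts, configuration, analysis of test results in different modes, and worked comparisons of several data structures and concurrency control mechanisms.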

Why test performance

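Some concurrent programs are converted from serial ones, so the performance of the two algorithms needs to be compared.
When multithreading is introduced for business reasons, concurrency control between threads costs performance, and we need to assess whether the share of that overhead is acceptable.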

4.1 JMH

JMH (Java Microbenchmark Harness) is a framework released under the OpenJDK project and built specifically for performance
testing. Its precision reaches the nanosecond level.

4.2 Basic JMH usage

Importing the JMH packages
Import them with Maven. The pom.xml entries are as follows:

<dependency>
	<groupId>org.openjdk.jmh</groupId>
	<artifactId>jmh-core</artifactId>
	<version>1.20</version>
</dependency>
<dependency>
	<groupId>org.openjdk.jmh</groupId>
	<artifactId>jmh-generator-annprocess</artifactId>
	<version>1.20</version>
	<scope>provided</scope>
</dependency>

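JMH example program

import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

@BenchmarkMode(Mode.AverageTime) // measurement mode
@OutputTimeUnit(TimeUnit.MICROSECONDS) // unit used when reporting results
public class JMHSample_01_HelloWorld {
    @Benchmark
    public void wellHelloThere() {
        // this method was intentionally left blank.
        // System.out.println("ok");
    }

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(JMHSample_01_HelloWorld.class.getSimpleName())
                .forks(1).build();
        new Runner(opt).run();
    }
}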

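Enabling APT

APT (Annotation Processing Tool) processes the annotations in source code and is used to generate code.
Before a test starts, JMH uses the Java APT mechanism to generate the actual test code from the user's benchmark classes.

Setup steps

Install the m2e-apt plugin for Eclipse.
Preferences => Maven => Annotation Processing => select the "Automatically configure JDT APT" mode.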

Analyzing the test output

# JMH version: 1.20
# VM version: JDK 1.8.0_45, VM 25.45-b02
# VM invoker: D:\Desktop\study\java\jdk\jre\bin\java.exe
# VM options: -Dfile.encoding=UTF-8
# Warmup: 20 iterations, 1 s each
# Measurement: 20 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: geym.conc.ch3.jmh.JMHSample_01_HelloWorld.wellHelloThere

# Run progress: 0.00% complete, ETA 00:00:40
# Fork: 1 of 1
# Warmup Iteration   1: ≈ 10⁻⁴ us/op
...
# Warmup Iteration  20: ≈ 10⁻⁴ us/op
Iteration   1: ≈ 10⁻⁴ us/op
...
Iteration  20: ≈ 10⁻⁴ us/op

Result "geym.conc.ch3.jmh.JMHSample_01_HelloWorld.wellHelloThere": ≈ 10⁻⁴ us/op

# Run complete. Total time: 00:00:40

Benchmark                               Mode  Cnt   Score    Error  Units
JMHSample_01_HelloWorld.wellHelloThere  avgt   20  ≈ 10⁻⁴           us/op

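Output analysis

The first 10 lines give basic information about the run, including the path of the java executable, the number of warmup and measurement iterations, and the thread count.
The "Warmup Iteration" lines report performance during warmup; warming up lets the JVM fully optimize the code under test.
The "Iteration" lines report performance during the actual measurement.
The last line lists the benchmarked method, the mode, the number of measurements (Cnt), the score, and related information.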

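4.3 Basic JMH concepts and configuration

Mode

Throughput: overall throughput, that is, how many calls can be executed per unit of time (for example, per second).
AverageTime: average time per call, that is, how long each call takes.
SampleTime: random sampling; the output is the distribution of the sampled times, for example "99% of calls completed within xxx milliseconds".
SingleShotTime: runs the method only once. It is usually combined with a warmup count of 0 to measure cold-start performance (no warmup).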
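
Throughput and AverageTime describe the same measurement from two directions: for a single thread, one is roughly the reciprocal of the other once the units are aligned. The following is a minimal sketch of that arithmetic in plain Java. The class and method names are made up for illustration and are not part of JMH.

```java
import java.util.concurrent.TimeUnit;

/** Illustrative helper (not part of JMH): converts between the two views of one measurement. */
final class ModeArithmetic {
    private static final double MICROS_PER_SECOND = TimeUnit.SECONDS.toMicros(1);

    /** Average time in us/op to throughput in ops/s, assuming a single thread. */
    static double throughputPerSecond(double avgMicrosPerOp) {
        return MICROS_PER_SECOND / avgMicrosPerOp;
    }

    /** Throughput in ops/s to average time in us/op, assuming a single thread. */
    static double avgMicrosPerOp(double opsPerSecond) {
        return MICROS_PER_SECOND / opsPerSecond;
    }

    public static void main(String[] args) {
        // A call that takes 100 ms (100000 us) can run 10 times per second.
        System.out.println(throughputPerSecond(100_000.0)); // prints 10.0
        System.out.println(avgMicrosPerOp(10.0));           // prints 100000.0
    }
}
```

With several benchmark threads the relationship no longer holds one to one, because throughput is summed over threads while average time is still measured per call.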

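Iteration
The unit of measurement in JMH. With the settings shown in the output above, one iteration lasts 1 s, during which the benchmarked method is called continuously and figures such as throughput and average time are sampled and computed.

Warmup

Because of the JVM's JIT compiler, the same method takes a different amount of time before and after JIT compilation.
Warmup runs the code until it has been fully JIT-compiled, since usually only the post-JIT performance of a method is of interest.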
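
To make iterations and warmup concrete, here is a deliberately simplified sketch in plain Java. It is not how JMH is implemented (JMH generates its own measurement loops and guards against many JIT pitfalls). It only shows the idea: call a method repeatedly for a fixed time window, discard the warmup windows, and average the measured ones. All names are hypothetical.

```java
/** Simplified illustration of timed iterations; not JMH's actual implementation. */
final class IterationSketch {
    // Written by the sample workload so the work has a visible side effect.
    static volatile long sink;

    /** Calls op repeatedly for about durationNanos and returns the average ns per call. */
    static double runIteration(Runnable op, long durationNanos) {
        long start = System.nanoTime();
        long calls = 0;
        long elapsed;
        do {
            op.run();
            calls++;
            elapsed = System.nanoTime() - start;
        } while (elapsed < durationNanos);
        return (double) elapsed / calls;
    }

    /** Runs warmup iterations (results discarded), then averages the measured iterations. */
    static double measure(Runnable op, int warmups, int measurements, long durationNanos) {
        for (int i = 0; i < warmups; i++) {
            runIteration(op, durationNanos); // gives the JIT a chance to compile op
        }
        double total = 0;
        for (int i = 0; i < measurements; i++) {
            total += runIteration(op, durationNanos);
        }
        return total / measurements;
    }

    public static void main(String[] args) {
        Runnable work = () -> sink = Long.hashCode(System.nanoTime());
        double avgNanos = measure(work, 3, 3, 20_000_000L); // 20 ms windows keep the demo short
        System.out.printf("about %.1f ns per call%n", avgNanos);
    }
}
```

The loop reads the clock on every call, so the timer overhead is included in the result. That is one of the reasons a hand-rolled loop like this is only good for illustration.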

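State
A State specifies the scope of an object.

Thread scope (Scope.Thread): a separate object is created for each thread.
Benchmark scope (Scope.Benchmark): all threads share one instance.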
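
The difference between the two scopes can be imitated in plain Java. The sketch below is only an analogy: it uses a single shared object to stand in for Scope.Benchmark and a ThreadLocal to stand in for Scope.Thread, then counts how many distinct instances the worker threads touch. JMH manages its state objects through generated code, so the class and method names here are assumptions for illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Analogy for JMH state scopes using plain Java; not JMH code. */
final class ScopeSketch {
    static final class Holder {
        volatile double x = Math.PI;
    }

    /** Starts the given number of threads and counts the distinct Holder instances they use. */
    static int distinctInstances(int threads, Supplier<Holder> source) {
        Set<Holder> seen = ConcurrentHashMap.newKeySet(); // Holder uses identity equality
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                Holder state = source.get();
                state.x++;
                seen.add(state);
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        Holder shared = new Holder(); // like Scope.Benchmark
        ThreadLocal<Holder> perThread = ThreadLocal.withInitial(Holder::new); // like Scope.Thread
        System.out.println(distinctInstances(4, () -> shared));    // prints 1
        System.out.println(distinctInstances(4, perThread::get));  // prints 4
    }
}
```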

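Configuration classes (Options/OptionsBuilder)
Configure the test parameters before the run.

Select the benchmark classes (include)
Number of forked processes (forks)
Number of warmup iterations (warmupIterations)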

Options opt = new OptionsBuilder()
	.include(JMHSample_01_HelloWorld.class.getSimpleName())
	.forks(1).build();
new Runner(opt).run();

4.4 Modes in JMH

Test code

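@Benchmark
@BenchmarkMode(Mode.XXX) // replace XXX with the mode under test
@OutputTimeUnit(TimeUnit.SECONDS) // the AverageTime and SampleTime results below are reported in microseconds
public void measureThroughput() throws InterruptedException {
    TimeUnit.MILLISECONDS.sleep(100);
}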

Test results

Mode.Throughput

JMHSample_02_BenchmarkModes.measureThroughput thrpt 20 9.960 ± 0.007 ops/s
About 10 operations per second.

Mode.AverageTime

JMHSample_02_BenchmarkModes.measureAvgTime avgt 20 100449.572 ± 77.384 us/op
Each operation takes about 100 ms.

Mode.SampleTime

JMHSample_02_BenchmarkModes.measureSamples sample 200 100323.820 ± 83.746 us/op
JMHSample_02_BenchmarkModes.measureSamples:measureSamples p0.00 sample 99221.504 us/op
JMHSample_02_BenchmarkModes.measureSamples:measureSamples p0.50 sample 100270.380 us/op
JMHSample_02_BenchmarkModes.measureSamples:measureSamples p0.90 sample 100794.368 us/op
JMHSample_02_BenchmarkModes.measureSamples:measureSamples p0.99 sample 101055.201 us/op
JMHSample_02_BenchmarkModes.measureSamples:measureSamples p1.00 sample 101974.016 us/op
Each pN line is a percentile of the sampled call times. For example, p0.50 shows that half of the sampled calls completed within 100270.380 us.
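
To show what a percentile line means, the sketch below computes nearest-rank percentiles over a small array of made-up call times. The sample values are hypothetical, and nearest-rank is an assumption chosen for simplicity; JMH's own percentile calculation may differ in detail.

```java
import java.util.Arrays;

/** Nearest-rank percentiles over sampled call times; illustration only. */
final class PercentileSketch {
    /** Returns the smallest sample such that at least fraction p of all samples are <= it. */
    static double percentile(double[] samples, double p) {
        if (samples.length == 0 || p < 0.0 || p > 1.0) {
            throw new IllegalArgumentException("need at least one sample and 0 <= p <= 1");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length); // 1-based rank
        return sorted[Math.max(rank, 1) - 1];          // p = 0 maps to the minimum
    }

    public static void main(String[] args) {
        // Hypothetical sampled call times in milliseconds.
        double[] millis = {100.3, 99.2, 100.8, 101.0, 100.2, 100.1, 100.5, 99.9, 100.4, 102.0};
        System.out.println(percentile(millis, 0.0)); // prints 99.2 (fastest call)
        System.out.println(percentile(millis, 0.5)); // prints 100.3 (median)
        System.out.println(percentile(millis, 1.0)); // prints 102.0 (slowest call)
    }
}
```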

4.5 State in JMH

Code example

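import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

public class JMHSample_03_States {
    @State(Scope.Benchmark) // one instance shared by all threads
    public static class BenchmarkState {
        volatile double x = Math.PI;
    }

    @State(Scope.Thread) // each thread gets its own instance
    public static class ThreadState {
        volatile double x = Math.PI;
    }

    @Benchmark
    public void measureUnshared(ThreadState state) {
        state.x++;
    }

    @Benchmark
    public void measureShared(BenchmarkState state) {
        state.x++;
    }

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(JMHSample_03_States.class.getSimpleName())
                .threads(4)
                .forks(1)
                .build();
        new Runner(opt).run();
    }
}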

Result analysis

Benchmark                             Mode  Cnt          Score         Error  Units
JMHSample_03_States.measureShared    thrpt   20   77596034.965 ±  560383.574  ops/s
JMHSample_03_States.measureUnshared  thrpt   20  699479891.399 ± 3711396.990  ops/s

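When all threads share one copy of the data, writes are much slower: with 4 threads, measureShared reaches only about one ninth of the throughput of measureUnshared.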

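4.6 Thinking about performance

Comparing performance

The same module may perform differently in different environments.
For a rigorous comparison, the two modules must provide the same functionality and be tested in the same environment.
Two measures of performance
Time complexity
Space complexity

Performance comparison example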

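import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Benchmark)
public class MapTest {
    static Map<String, String> hashMap = new HashMap<>();
    static Map<String, String> syncHashMap = Collections.synchronizedMap(new HashMap<>());
    static Map<String, String> concurrentHashMap = new ConcurrentHashMap<>();

    @Setup
    public void setup() {
        // Fill all three maps with the same 10000 entries.
        for (int i = 0; i < 10000; i++) {
            hashMap.put(Integer.toString(i), Integer.toString(i));
            syncHashMap.put(Integer.toString(i), Integer.toString(i));
            concurrentHashMap.put(Integer.toString(i), Integer.toString(i));
        }
    }

    // The benchmarks discard their results, as in the measured code. Returning the
    // value (or passing it to a Blackhole) is the usual way to keep the JIT from
    // eliminating the call.
    @Benchmark
    public void hashMapGet() {
        hashMap.get("4");
    }

    @Benchmark
    public void syncHashMapGet() {
        syncHashMap.get("4");
    }

    @Benchmark
    public void concurrentHashMapGet() {
        concurrentHashMap.get("4");
    }

    @Benchmark
    public void hashMapSize() {
        hashMap.size();
    }

    @Benchmark
    public void syncHashMapSize() {
        syncHashMap.size();
    }

    @Benchmark
    public void concurrentHashMapSize() {
        concurrentHashMap.size();
    }

    public static void main(String[] args) throws RunnerException {
        // threads(2) gives the multi-threaded run; the single-threaded results
        // presumably come from the same code with threads(1).
        Options opt = new OptionsBuilder().include(MapTest.class.getSimpleName()).forks(1).warmupIterations(5)
                .measurementIterations(5).threads(2).build();
        new Runner(opt).run();
    }
}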

@Setup marks an initialization method; the annotated method runs before the benchmark.
Result analysis

Single-threaded test

Benchmark                       Mode  Cnt     Score     Error   Units
MapTest.concurrentHashMapGet   thrpt    5   138.200 ±   5.530  ops/us
MapTest.concurrentHashMapSize  thrpt    5   915.124 ± 146.810  ops/us
MapTest.hashMapGet             thrpt    5   157.456 ±  31.701  ops/us
MapTest.hashMapSize            thrpt    5  1705.856 ± 175.743  ops/us
MapTest.syncHashMapGet         thrpt    5    67.337 ±   7.518  ops/us
MapTest.syncHashMapSize        thrpt    5    76.763 ±   0.898  ops/us

Multi-threaded test

Benchmark                       Mode  Cnt     Score     Error   Units
MapTest.concurrentHashMapGet   thrpt    5   254.638 ±  31.406  ops/us
MapTest.concurrentHashMapSize  thrpt    5  1639.774 ± 189.014  ops/us
MapTest.hashMapGet             thrpt    5   290.629 ±  63.919  ops/us
MapTest.hashMapSize            thrpt    5  3213.160 ± 220.701  ops/us
MapTest.syncHashMapGet         thrpt    5    18.772 ±   0.366  ops/us
MapTest.syncHashMapSize        thrpt    5    22.952 ±   1.291  ops/us

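Analysis

HashMap: no locking
ConcurrentHashMap: fine-grained locking (segment locks before JDK 8; since JDK 8, CAS plus synchronized on individual bins, with lock-free reads)
Collections.synchronizedMap(new HashMap()): one global lock
Single-threaded: no lock > fine-grained locking > global lock (locking costs performance)
Multi-threaded: threads block on the global lock, so its throughput actually drops (syncHashMapGet falls from 67.337 to 18.772 ops/us), while the unlocked and fine-grained versions roughly double.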
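
A speed comparison between the two thread-safe maps only makes sense because they give the same answer under concurrent updates. The sketch below checks that in plain Java by letting several threads increment one counter entry through Map.merge. The class and method names are hypothetical. Plain HashMap is left out because it gives no such guarantee under concurrent writes.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Checks that both thread-safe maps count correctly under concurrent updates. */
final class MapCountingSketch {
    /** Each thread adds 1 to the "hits" entry incrementsPerThread times; returns the final count. */
    static int countConcurrently(Map<String, Integer> map, int threads, int incrementsPerThread) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    map.merge("hits", 1, Integer::sum); // atomic on both map types used below
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return map.getOrDefault("hits", 0);
    }

    public static void main(String[] args) {
        Map<String, Integer> global = Collections.synchronizedMap(new HashMap<>()); // one global lock
        Map<String, Integer> fine = new ConcurrentHashMap<>();                      // fine-grained
        System.out.println(countConcurrently(global, 4, 10_000)); // prints 40000
        System.out.println(countConcurrently(fine, 4, 10_000));   // prints 40000
    }
}
```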

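4.7 CopyOnWriteArrayList and ConcurrentLinkedQueue

CopyOnWriteArrayList improves concurrent read performance by copying the underlying array on every write.
ConcurrentLinkedQueue is a lock-free queue that relies on CAS operations to improve performance.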
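
The copy-on-write behaviour is easy to observe from an iterator. In the minimal sketch below (names are hypothetical), an iterator created before a write keeps reading the old array, so it neither sees the new element nor throws ConcurrentModificationException.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Shows that a CopyOnWriteArrayList iterator works on a snapshot of the array. */
final class CopyOnWriteSketch {
    /** Returns {elements seen by an iterator created before a write, list size after the write}. */
    static int[] snapshotDemo() {
        List<String> list = new CopyOnWriteArrayList<>(Arrays.asList("a", "b", "c"));
        Iterator<String> snapshot = list.iterator(); // bound to the current 3-element array
        list.add("d"); // the write copies the array; the iterator still holds the old one
        int seen = 0;
        while (snapshot.hasNext()) {
            snapshot.next();
            seen++;
        }
        return new int[] {seen, list.size()};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(snapshotDemo())); // prints [3, 4]
    }
}
```

Reads never wait on a lock, but every write pays for a full array copy, which is why the cost of a write grows with the size of the list.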

Performance test example

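import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Benchmark)
public class ListTest {
    CopyOnWriteArrayList<Object> smallCopyOnWriteList = new CopyOnWriteArrayList<>();
    ConcurrentLinkedQueue<Object> smallConcurrentList = new ConcurrentLinkedQueue<>();
    CopyOnWriteArrayList<Object> bigCopyOnWriteList = new CopyOnWriteArrayList<>();
    ConcurrentLinkedQueue<Object> bigConcurrentList = new ConcurrentLinkedQueue<>();

    @Setup
    public void setup() {
        for (int i = 0; i < 10; i++) {
            smallCopyOnWriteList.add(new Object());
            smallConcurrentList.add(new Object());
        }
        for (int i = 0; i < 1000; i++) {
            bigCopyOnWriteList.add(new Object());
            // Fixed: the original listing added to bigCopyOnWriteList twice here,
            // which left bigConcurrentList empty.
            bigConcurrentList.add(new Object());
        }
    }

    @Benchmark
    public void copyOnWriteGet() {
        smallCopyOnWriteList.get(0);
    }

    @Benchmark
    public void copyOnWriteSize() {
        smallCopyOnWriteList.size();
    }

    @Benchmark
    public void concurrentListGet() {
        smallConcurrentList.peek();
    }

    @Benchmark
    public void concurrentListSize() {
        smallConcurrentList.size();
    }

    @Benchmark
    public void smallCopyOnWriteWrite() {
        smallCopyOnWriteList.add(new Object());
        smallCopyOnWriteList.remove(0);
    }

    // Fixed in both queue write benchmarks: the original listing called remove(0),
    // which on a queue means remove(Object) and removes nothing. poll() removes the
    // head. The published results were measured with remove(0).
    @Benchmark
    public void smallConcurrentListWrite() {
        smallConcurrentList.add(new Object());
        smallConcurrentList.poll();
    }

    @Benchmark
    public void bigCopyOnWriteWrite() {
        bigCopyOnWriteList.add(new Object());
        bigCopyOnWriteList.remove(0);
    }

    @Benchmark
    public void bigConcurrentListWrite() {
        bigConcurrentList.offer(new Object());
        bigConcurrentList.poll();
    }

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder().include(ListTest.class.getSimpleName()).forks(1).warmupIterations(5)
                .measurementIterations(5).threads(4).build();
        new Runner(opt).run();
    }
}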

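Test result analysis

These figures were measured with the code as originally posted, in which both ConcurrentLinkedQueue write benchmarks call remove(0). On a queue that call resolves to remove(Object) with a boxed Integer 0 and removes nothing, so the queue keeps growing and every call scans it in full. That most likely explains why smallConcurrentListWrite and bigConcurrentListWrite report the same 0.012 ops/us, so those two rows say little about the cost of ordinary queue writes.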

Benchmark                           Mode  Cnt     Score     Error   Units
ListTest.bigConcurrentListWrite    thrpt    5     0.012 ±   0.007  ops/us
ListTest.bigCopyOnWriteWrite       thrpt    5     0.264 ±   0.026  ops/us
ListTest.concurrentListGet         thrpt    5  4206.582 ± 598.722  ops/us
ListTest.concurrentListSize        thrpt    5   310.722 ±  53.405  ops/us
ListTest.copyOnWriteGet            thrpt    5  4243.784 ± 326.868  ops/us
ListTest.copyOnWriteSize           thrpt    5  5403.908 ± 671.604  ops/us
ListTest.smallConcurrentListWrite  thrpt    5     0.012 ±   0.007  ops/us
ListTest.smallCopyOnWriteWrite     thrpt    5    10.162 ±   1.582  ops/us
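
When a write benchmark pairs each add with a removal, it is worth checking which remove overload is actually called. On a List, remove(0) removes the element at index 0. On a ConcurrentLinkedQueue there is no index-based remove, so remove(0) boxes the argument and calls remove(Object), which finds no matching element. The sketch below (hypothetical names, plain Java) shows the effect on the queue size.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

/** Compares remove(0) with poll() on a ConcurrentLinkedQueue. */
final class QueueRemoveSketch {
    /** Starts with 10 elements, then does `rounds` add-and-remove pairs; returns the final size. */
    static int sizeAfter(boolean usePoll, int rounds) {
        ConcurrentLinkedQueue<Object> queue = new ConcurrentLinkedQueue<>();
        for (int i = 0; i < 10; i++) {
            queue.add(new Object());
        }
        for (int i = 0; i < rounds; i++) {
            queue.add(new Object());
            if (usePoll) {
                queue.poll();    // removes the head element
            } else {
                queue.remove(0); // remove(Object) with Integer 0: scans the queue, removes nothing
            }
        }
        return queue.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfter(false, 100)); // prints 110: the queue grew on every round
        System.out.println(sizeAfter(true, 100));  // prints 10: the size stayed constant
    }
}
```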