Is the Java enhanced for loop faster than the traditional one?

My understanding was that enhanced for loops should be slower because they must use an Iterator. However, my code is producing mixed results. (Yes, I know that the logic in the loop body takes up the majority of the time spent in a loop.)

For a low number of iterations (100-1000), the enhanced for loop seems to be much faster, both with and without JIT. By contrast, with a high number of iterations (100000000), the traditional loop is much faster. What's going on here?

import java.util.ArrayList;
import java.util.List;

public class NewMain {

    public static void main(String[] args) {
        System.out.println("Warming up");
        int warmup = 1000000;
        for (int i = 0; i < warmup; i++) {
            runForLoop();
        }
        for (int i = 0; i < warmup; i++) {
            runEnhancedFor();
        }

        System.out.println("Running");
        int iterations = 100000000;

        // Time the indexed loop and print the average per call.
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            runForLoop();
        }
        System.out.println((System.nanoTime() - start) / iterations + "nS");

        // Time the enhanced for loop the same way.
        start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            runEnhancedFor();
        }
        System.out.println((System.nanoTime() - start) / iterations + "nS");
    }

    public static final List<Integer> array = new ArrayList<>(100);

    public static int l;

    public static void runForLoop() {
        for (int i = 0; i < array.size(); i++) {
            l += array.get(i);
        }
    }

    public static void runEnhancedFor() {
        for (int i : array) {
            l += i;
        }
    }
}

Solution

Faulty benchmarking. A non-exhaustive list of what is wrong:

No proper warmup: single-shot measurements are almost always wrong;

Mixing several code paths in a single method: we probably start compiling the method with the execution data available only for the first loop in the method;

Sources are predictable: should the loop compile, we can actually predict the result;

Results are dead-code eliminated: should the loop compile, we can throw the loop away.

Take your time listening to these talks, and going through these samples.

This is how to do it, arguably correctly, with JMH:

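import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

// The original was written against an older JMH, which spelled these
// @GenerateMicroBenchmark and BlackHole; current releases use
// @Benchmark and Blackhole, as below.
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@BenchmarkMode(Mode.AverageTime)
@Warmup(iterations = 3, time = 1)
@Measurement(iterations = 3, time = 1)
@Fork(3)
@State(Scope.Thread)
public class EnhancedFor {

    private static final int SIZE = 100;

    private List<Integer> list;

    @Setup
    public void setup() {
        list = new ArrayList<>(SIZE);
    }

    // Returning the sum keeps the loop from being dead-code eliminated.
    @Benchmark
    public int enhanced() {
        int s = 0;
        for (int i : list) {
            s += i;
        }
        return s;
    }

    @Benchmark
    public int indexed() {
        int s = 0;
        for (int i = 0; i < list.size(); i++) {
            s += list.get(i);
        }
        return s;
    }

    // The *_indi variants consume every element individually.
    @Benchmark
    public void enhanced_indi(Blackhole bh) {
        for (int i : list) {
            bh.consume(i);
        }
    }

    @Benchmark
    public void indexed_indi(Blackhole bh) {
        for (int i = 0; i < list.size(); i++) {
            bh.consume(list.get(i));
        }
    }
}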

...which yields something along the lines of:

Benchmark                        Mode   Samples      Mean   Mean error   Units
o.s.EnhancedFor.enhanced         avgt         9     8.162        0.057   ns/op
o.s.EnhancedFor.enhanced_indi    avgt         9     7.600        0.067   ns/op
o.s.EnhancedFor.indexed          avgt         9     2.226        0.091   ns/op
o.s.EnhancedFor.indexed_indi     avgt         9     2.116        0.064   ns/op

Now that is a minute difference between the enhanced and indexed loops, and it is naively explained by the different code paths taken to access the backing storage. However, the explanation is actually much simpler: OP FORGOT TO POPULATE THE LIST, which means the loop bodies ARE NEVER EVER EXECUTED, and the benchmark is actually measuring the cost of size() vs. iterator()!
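To see why an unpopulated list makes both loops degenerate, here is a minimal sketch. The class and method names (EmptyListDemo, countIndexed, countEnhanced) are made up for illustration and are not part of the original benchmark. It counts how often each loop body runs for a list created with new ArrayList<>(100), where 100 is only an initial capacity:

```java
import java.util.ArrayList;
import java.util.List;

public class EmptyListDemo {

    // Counts how many times the body of an indexed loop runs.
    static int countIndexed(List<Integer> list) {
        int bodyRuns = 0;
        for (int i = 0; i < list.size(); i++) {
            bodyRuns++;
        }
        return bodyRuns;
    }

    // Counts how many times the body of an enhanced for loop runs.
    static int countEnhanced(List<Integer> list) {
        int bodyRuns = 0;
        for (int ignored : list) {
            bodyRuns++;
        }
        return bodyRuns;
    }

    public static void main(String[] args) {
        // The constructor argument sets capacity, not size: the list is empty.
        List<Integer> list = new ArrayList<>(100);
        System.out.println("size = " + list.size());                        // size = 0
        System.out.println("indexed body runs = " + countIndexed(list));    // 0
        System.out.println("enhanced body runs = " + countEnhanced(list));  // 0
    }
}
```

With zero body executions, the only work left per call is one size() check in the indexed case and one iterator() plus hasNext() in the enhanced case.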

Fixing that:

@Setup
public void setup() {
    list = new ArrayList<>(SIZE);
    // Actually fill the list, so the loop bodies have something to do.
    for (int c = 0; c < SIZE; c++) {
        list.add(c);
    }
}

...which then yields:

Benchmark                        Mode   Samples      Mean   Mean error   Units
o.s.EnhancedFor.enhanced         avgt         9   171.154       25.892   ns/op
o.s.EnhancedFor.enhanced_indi    avgt         9   384.192        6.856   ns/op
o.s.EnhancedFor.indexed          avgt         9   148.679        1.357   ns/op
o.s.EnhancedFor.indexed_indi     avgt         9   465.684        0.860   ns/op

Note the differences are really minute even on the nano-scale, and the non-trivial loop bodies will consume the difference, if any. The differences here can be explained by how lucky we are in inlining get() and Iterator methods, and the optimizations that we could enjoy after those inlinings.
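For reference, this sketch puts the two access paths side by side. The class and method names (LoopForms, sumEnhanced, sumIterator, sumIndexed) are hypothetical. For an Iterable such as a List, the enhanced for loop is compiled to roughly the explicit Iterator form in the middle, so the comparison is between iterator()/hasNext()/next() and size()/get(i):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class LoopForms {

    // Enhanced for loop, as written in the benchmark.
    static int sumEnhanced(List<Integer> list) {
        int s = 0;
        for (int i : list) {
            s += i;
        }
        return s;
    }

    // Approximately what the compiler generates for the loop above.
    static int sumIterator(List<Integer> list) {
        int s = 0;
        for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
            int i = it.next(); // unboxes Integer to int
            s += i;
        }
        return s;
    }

    // Indexed loop: one size() and one get(i) call per element.
    static int sumIndexed(List<Integer> list) {
        int s = 0;
        for (int i = 0; i < list.size(); i++) {
            s += list.get(i);
        }
        return s;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(100);
        for (int c = 0; c < 100; c++) {
            list.add(c);
        }
        // All three forms compute the same sum: 0 + 1 + ... + 99 = 4950.
        System.out.println(sumEnhanced(list));
        System.out.println(sumIterator(list));
        System.out.println(sumIndexed(list));
    }
}
```

All three forms do the same logical work; whether they end up as the same machine code depends on what the JIT manages to inline, which is why the measured gap is small and unstable.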

Note the *_indi tests, which defeat the loop unrolling optimizations. The indexed loop performs better when it is successfully unrolled, but the opposite holds when unrolling is broken!

With headline numbers like these, the difference between indexed and enhanced is nothing more than of academic interest. Figuring out the exact generated code with -XX:+PrintAssembly for all the cases is left as an exercise to the reader :)
