Java stream splitting: split a Java stream into two lazy streams without a terminal operation

I understand that in general Java streams do not split. However, we have an involved and lengthy pipeline, at the end of which we have two different types of processing that share the first part of the pipeline.

Due to the size of the data, storing the intermediate stream product is not a viable solution. Neither is running the pipeline twice.

Basically, what we are looking for is a solution that is an operation on a stream that yields two (or more) streams that are lazily filled and able to be consumed in parallel. By that, I mean that if stream A is split into streams B and C, when streams B and C consume 10 elements, stream A consumes and provides those 10 elements, but if stream B then tries to consume more elements, it blocks until stream C also consumes them.

Is there any pre-made solution for this problem or any library we can look at? If not, where would we start to look if we want to implement this ourselves? Or is there a compelling reason not to implement this at all?

Solution

You can implement a custom Spliterator in order to achieve such behavior. We will split your streams into the common "source" and the different "consumers". The custom spliterator then forwards the elements from the source to each consumer. For this purpose, we will use a BlockingQueue (see this question).

Note that the difficult part here is not the spliterator/stream, but the syncing of the consumers around the queue, as the comments on your question already indicate. Still, however you implement the syncing, Spliterator helps to use streams with it.

import java.util.Spliterator;
import java.util.Spliterators.AbstractSpliterator;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// The members below are meant to be placed inside an enclosing class.

@SafeVarargs
public static <T> long streamForked(Stream<T> source, Consumer<Stream<T>>... consumers)
{
    // count() is only used to drive the source through the forking spliterator
    return StreamSupport.stream(new ForkingSpliterator<>(source, consumers), false).count();
}

private static class ForkingSpliterator<T>
    extends AbstractSpliterator<T>
{
    private Spliterator<T> sourceSpliterator;
    private BlockingQueue<T> queue = new LinkedBlockingQueue<>();
    private AtomicInteger nextToTake = new AtomicInteger(0);
    private AtomicInteger processed = new AtomicInteger(0);
    // volatile: written by the driving thread, read by the consumer threads
    private volatile boolean sourceDone;
    private int consumerCount;

    @SafeVarargs
    private ForkingSpliterator(Stream<T> source, Consumer<Stream<T>>... consumers)
    {
        super(Long.MAX_VALUE, 0);
        sourceSpliterator = source.spliterator();
        consumerCount = consumers.length;

        // one thread per consumer, each fed by its own ForkedConsumer spliterator
        for (int i = 0; i < consumers.length; i++)
        {
            int index = i;
            Consumer<Stream<T>> consumer = consumers[i];
            new Thread(new Runnable()
            {
                @Override
                public void run()
                {
                    consumer.accept(StreamSupport.stream(new ForkedConsumer(index), false));
                }
            }).start();
        }
    }

    @Override
    public boolean tryAdvance(Consumer<? super T> action)
    {
        // move one element from the source into the shared queue
        sourceDone = !sourceSpliterator.tryAdvance(queue::offer);
        return !sourceDone;
    }

    private class ForkedConsumer
        extends AbstractSpliterator<T>
    {
        private int index;

        private ForkedConsumer(int index)
        {
            super(Long.MAX_VALUE, 0);
            this.index = index;
        }

        @Override
        public boolean tryAdvance(Consumer<? super T> action)
        {
            // take next element when it's our turn (busy-wait)
            while (!nextToTake.compareAndSet(index, index + 1))
            {
            }

            T element;
            while ((element = queue.peek()) == null)
            {
                if (sourceDone)
                {
                    // element is null, and there won't be any more, so "terminate" this sub stream
                    return false;
                }
            }

            // push to consumer pipeline
            action.accept(element);

            if (consumerCount == processed.incrementAndGet())
            {
                // last consumer of this round: remove the element and start the next round
                queue.poll();
                processed.set(0);
                nextToTake.set(0);
            }
            return true;
        }
    }
}

With the approach used, the consumers work on each element in parallel, but wait for each other before starting on the next element.
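The lock-step behavior above is one way to do the syncing. As a looser variant, here is a minimal sketch (not part of the answer's code) that gives each consumer its own bounded queue, so a fast consumer may run ahead of a slow one by at most `capacity` elements before the producer blocks. The names `BoundedFork`, `fork`, `QueueSpliterator`, `END` and the `capacity` parameter are made up for this illustration. It assumes the source contains no null elements and that every consumer reads its stream to the end.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterators.AbstractSpliterator;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class BoundedFork {
    // Hypothetical end-of-stream marker, compared by identity.
    private static final Object END = new Object();

    /** Spliterator that blocks on take() and ends once it sees END. */
    private static final class QueueSpliterator<T> extends AbstractSpliterator<T> {
        private final BlockingQueue<Object> queue;
        private boolean finished;

        QueueSpliterator(BlockingQueue<Object> queue) {
            super(Long.MAX_VALUE, 0);
            this.queue = queue;
        }

        @Override
        @SuppressWarnings("unchecked")
        public boolean tryAdvance(Consumer<? super T> action) {
            if (finished) {
                return false;
            }
            try {
                Object next = queue.take();
                if (next == END) {
                    finished = true;
                    return false;
                }
                action.accept((T) next);
                return true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                finished = true;
                return false;
            }
        }
    }

    /** Pulls the source lazily on the calling thread and copies each element to every consumer. */
    @SafeVarargs
    public static <T> void fork(Stream<T> source, int capacity, Consumer<Stream<T>>... consumers)
            throws InterruptedException {
        List<BlockingQueue<Object>> queues = new ArrayList<>();
        List<Thread> threads = new ArrayList<>();
        for (Consumer<Stream<T>> consumer : consumers) {
            BlockingQueue<Object> queue = new ArrayBlockingQueue<>(capacity);
            queues.add(queue);
            Thread thread = new Thread(() ->
                    consumer.accept(StreamSupport.stream(new QueueSpliterator<T>(queue), false)));
            threads.add(thread);
            thread.start();
        }
        Iterator<T> iterator = source.iterator();
        while (iterator.hasNext()) {
            T element = iterator.next();
            for (BlockingQueue<Object> queue : queues) {
                queue.put(element); // blocks while this consumer is `capacity` elements behind
            }
        }
        for (BlockingQueue<Object> queue : queues) {
            queue.put(END);
        }
        for (Thread thread : threads) {
            thread.join();
        }
    }
}
```

Because the queues are bounded, at most `capacity` elements per consumer are held in memory at any time, and the source is only advanced when every queue has room. A consumer that stops early still blocks the producer in this sketch.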

Known issue

If one of the consumers is "shorter" than the others (e.g. because it calls limit()) it will also stop the other consumers and leave the threads hanging.
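One possible mitigation, sketched under assumptions that differ from the code above: if each consumer had its own bounded queue terminated by an end marker (rather than the shared queue used in this answer), the consumer thread could discard the remaining elements after its stream pipeline returns, so the producer is never left blocked. The names `DrainAfterLimit`, `drain`, `run` and `END` are hypothetical, and the "short" consumer is simulated with a plain loop instead of a real `limit()` call.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class DrainAfterLimit {
    // Hypothetical end-of-stream marker, compared by identity.
    static final Object END = new Object();

    /** Discards queued items until END is seen and returns how many were discarded. */
    static int drain(BlockingQueue<Object> queue) throws InterruptedException {
        int discarded = 0;
        while (queue.take() != END) {
            discarded++;
        }
        return discarded;
    }

    /**
     * A producer puts `total` integers followed by END into a small bounded queue.
     * The consumer reads only `wanted` of them (assumes wanted <= total), then drains.
     */
    static int run(int total, int wanted) throws InterruptedException {
        BlockingQueue<Object> queue = new ArrayBlockingQueue<>(2);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < total; i++) {
                    queue.put(i); // blocks whenever the queue is full
                }
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        for (int i = 0; i < wanted; i++) {
            queue.take(); // stands in for a pipeline that stops early, e.g. via limit(wanted)
        }
        int discarded = drain(queue); // without this, the producer would block forever
        producer.join();
        return discarded;
    }
}
```

The trade-off is that the source is still read to the end even though this consumer no longer needs the data; that is what keeps the other consumers supplied.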

Example

public static void sleep(long millis)
{
    // sleep for the given time plus up to 30 ms of random jitter
    try { Thread.sleep((long) (Math.random() * 30 + millis)); }
    catch (InterruptedException e) { Thread.currentThread().interrupt(); }
}

streamForked(Stream.of("1", "2", "3", "4", "5"),
    source -> source.map(word -> { sleep(50); return "fast " + word; }).forEach(System.out::println),
    source -> source.map(word -> { sleep(300); return "slow " + word; }).forEach(System.out::println),
    source -> source.map(word -> { sleep(50); return "2fast " + word; }).forEach(System.out::println));

Sample output (the order of the two fast consumers within a round varies between runs):

fast 1
2fast 1
slow 1
fast 2
2fast 2
slow 2
2fast 3
fast 3
slow 3
fast 4
2fast 4
slow 4
2fast 5
fast 5
slow 5
