Are Java stream stages sequential?

2023-09-07 01:12:12 Author: 裸奔の毛毛虫ゞ

I have a question about how the intermediate stages of a stream are sequenced: are the operations of one stage applied to all the items in the input stream before the next stage starts, or are all the stages applied to each item in turn?

I'm aware the question might not be easy to understand, so here is an example. Take the following stream pipeline:

// Requires java.util.Arrays and java.util.List.
List<String> strings = Arrays.asList("Are Java streams intermediate stages sequential?".split(" "));
strings.stream()
       .filter(word -> word.length() > 4)                   // keep words longer than 4 characters
       .peek(word -> System.out.println("f: " + word))      // log what passed the filter
       .map(word -> word.length())
       .peek(length -> System.out.println("m: " + length))  // log the mapped value
       .forEach(length -> System.out.println("-> " + length + "\n"));

I expected this code to output:

f: streams
f: intermediate
f: stages
f: sequential?

m: 7
m: 12
m: 6
m: 11

-> 7
-> 12
-> 6
-> 11

Instead, the output is:

f: streams
m: 7
-> 7

f: intermediate
m: 12
-> 12

f: stages
m: 6
-> 6

f: sequential?
m: 11
-> 11

Is this interleaving only an artifact of how the console output is displayed? Or is each item really processed through all the stages, one item at a time?

I can detail the question further if it's not clear enough.

Recommended answer

This behaviour enables optimisation of the code. If each intermediate operation had to process all the elements of a stream before the next intermediate operation could start, there would be no opportunity for optimisation.

So, to answer your question: each element moves along the stream pipeline vertically, one at a time (except for some stateful operations discussed later), which enables optimisation where possible.

In the example you've provided, each element moves along the stream pipeline vertically, one by one, because the pipeline contains no stateful operation.

As another example, say you are looking for the first String whose length is greater than 4. Processing all the elements before providing the result would be unnecessary and time-consuming.

Consider this simple example:

List<String> stringsList = Arrays.asList("1","12","123","1234","12345","123456","1234567");
int result = stringsList.stream()
                        .filter(s -> s.length() > 4)
                        .mapToInt(Integer::parseInt)  // parse straight to int, no boxing
                        .findFirst().orElse(0);

The filter intermediate operation above does not find all the elements whose length is greater than 4 and return a new stream of them. Instead, as soon as the first element whose length is greater than 4 is found, that element goes through to .mapToInt, findFirst then says "I've found the first element", and execution stops there. The result is therefore 12345.
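To make the early exit visible, here is a minimal sketch that records which elements the pipeline actually pulls from the source. The class name FindFirstTrace, the method firstLongerThanFour and the inspected list are hypothetical names introduced only for this illustration. It assumes a sequential stream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FindFirstTrace {
    // Returns the first number with more than 4 digits (or 0 if none),
    // and records every element pulled from the source in `inspected`.
    static int firstLongerThanFour(List<String> input, List<String> inspected) {
        return input.stream()
                    .peek(inspected::add)          // runs once per element that enters the pipeline
                    .filter(s -> s.length() > 4)
                    .mapToInt(Integer::parseInt)
                    .findFirst()                   // short-circuits after the first match
                    .orElse(0);
    }

    public static void main(String[] args) {
        List<String> stringsList =
                Arrays.asList("1", "12", "123", "1234", "12345", "123456", "1234567");
        List<String> inspected = new ArrayList<>();
        int result = firstLongerThanFour(stringsList, inspected);
        System.out.println("result = " + result);        // result = 12345
        System.out.println("inspected = " + inspected);  // inspected = [1, 12, 123, 1234, 12345]
    }
}
```

The elements "123456" and "1234567" never enter the pipeline, because findFirst stops the traversal once it has a value.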

Note that when a stateful intermediate operation such as sorted is included in a stream pipeline, that specific operation will traverse the entire stream. This makes complete sense if you think about it: in order to sort elements, you need to see all of them to determine which come first in the sort order.
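The barrier that sorted introduces can be observed with a peek on each side of it. This is a small sketch, assuming a sequential stream. SortedBarrierTrace and trace are hypothetical names used only here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SortedBarrierTrace {
    // Records the order in which elements reach each side of sorted().
    static List<String> trace(List<String> input) {
        List<String> events = new ArrayList<>();
        input.stream()
             .peek(s -> events.add("before: " + s))  // upstream of the barrier
             .sorted()                               // buffers every element before emitting any
             .peek(s -> events.add("after: " + s))   // downstream of the barrier
             .forEach(s -> events.add("-> " + s));
        return events;
    }

    public static void main(String[] args) {
        trace(Arrays.asList("c", "a", "b")).forEach(System.out::println);
        // before: c
        // before: a
        // before: b
        // after: a
        // -> a
        // after: b
        // -> b
        // after: c
        // -> c
    }
}
```

All three "before" events come first. Once sorted has seen everything, the elements flow through the remaining stages one at a time again.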

The distinct intermediate operation is also stateful. However, as @Holger has mentioned, unlike sorted it does not require traversing the entire stream: each distinct element can be passed down the pipeline immediately and may fulfil a short-circuiting condition.
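The difference from sorted can be sketched by combining distinct with the short-circuiting limit operation. The names DistinctTrace and firstDistinct are hypothetical, and the trace shown assumes a sequential stream over a plain list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class DistinctTrace {
    // Returns the first `n` distinct elements and records what happens
    // on either side of distinct() in `events`.
    static List<String> firstDistinct(List<String> input, int n, List<String> events) {
        return input.stream()
                    .peek(s -> events.add("before: " + s))
                    .distinct()                            // remembers seen elements, forwards new ones at once
                    .peek(s -> events.add("after: " + s))
                    .limit(n)                              // short-circuits once n elements have passed
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        List<String> result = firstDistinct(Arrays.asList("a", "a", "b", "c", "b"), 2, events);
        System.out.println(result);  // [a, b]
        System.out.println(events);  // [before: a, after: a, before: a, before: b, after: b]
    }
}
```

The duplicate "a" is consumed but not forwarded, and the trailing "c" and "b" are never pulled from the source.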

Stateless intermediate operations such as filter, map, etc. do not have to traverse the entire stream and can freely process one element at a time, vertically, as mentioned above.

Last but not least, it's also important to note that when the terminal operation is a short-circuiting operation, it can finish before traversing all the elements of the underlying stream.
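A short-circuiting terminal operation can even complete on an infinite stream. The sketch below uses anyMatch for this; AnyMatchTrace and hasMultipleOfFive are hypothetical names chosen for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class AnyMatchTrace {
    // Searches the infinite sequence 1, 2, 3, ... for a multiple of five
    // and records every element that was visited along the way.
    static boolean hasMultipleOfFive(List<Integer> visited) {
        return Stream.iterate(1, n -> n + 1)      // infinite source
                     .peek(visited::add)
                     .anyMatch(n -> n % 5 == 0);  // stops at the first match
    }

    public static void main(String[] args) {
        List<Integer> visited = new ArrayList<>();
        System.out.println(hasMultipleOfFive(visited));  // true
        System.out.println(visited);                     // [1, 2, 3, 4, 5]
    }
}
```

Without the short-circuit this pipeline would never terminate. With it, only five elements are ever generated.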

Further reading: Java 8 stream tutorial