I was really confused by the output from the files list stream
16:25:29.032 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel file with collection
16:25:29.037 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel file with collection
16:25:29.037 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel file with collection
16:25:29.037 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel file with collection
16:25:29.037 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel file with collection
16:25:29.037 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel file with collection
16:25:29.037 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel file with collection
16:25:29.037 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel file with collection
16:25:29.031 [main] INFO StressTest - compute from parallel file with collection
16:25:29.032 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection
16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection
16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection
16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection
16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection
16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection
16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection
16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection
16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection
16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection
16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection
16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection
16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection
16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection
16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection
16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection
16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection
16:25:29.032 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel file with collection
16:25:29.039 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel file with collection
16:25:29.037 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel file with collection
16:25:29.037 [main] INFO StressTest - compute from parallel file with collection
16:25:29.039 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel file with collection
16:25:29.039 [main] INFO StressTest - parallel compute for file names with collect
16:25:29.042 [main] INFO StressTest - compute from parallel string with collection
16:25:29.042 [main] INFO StressTest - compute from parallel string with collection
16:25:29.042 [main] INFO StressTest - compute from parallel string with collection
16:25:29.042 [main] INFO StressTest - compute from parallel string with collection
16:25:29.042 [main] INFO StressTest - compute from parallel string with collection
16:25:29.042 [main] INFO StressTest - compute from parallel string with collection
16:25:29.042 [main] INFO StressTest - compute from parallel string with collection
16:25:29.043 [main] INFO StressTest - compute from parallel string with collection
16:25:29.043 [main] INFO StressTest - compute from parallel string with collection
16:25:29.043 [main] INFO StressTest - compute from parallel string with collection
16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection
16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection
16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection
16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection
16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection
16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection
16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection
16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection
16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection
16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection
16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection
16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection
16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection
16:25:29.043 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel string with collection
16:25:29.043 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel string with collection
16:25:29.042 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel string with collection
16:25:29.043 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel string with collection
16:25:29.043 [main] INFO StressTest - compute from parallel string with collection
16:25:29.043 [main] INFO StressTest - compute from parallel string with collection
16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection
16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection
16:25:29.043 [main] INFO StressTest - parallel compute for file names
16:25:29.045 [main] INFO StressTest - compute from parallel string
16:25:29.045 [main] INFO StressTest - compute from parallel string
16:25:29.045 [main] INFO StressTest - compute from parallel string
16:25:29.045 [main] INFO StressTest - compute from parallel string
16:25:29.045 [main] INFO StressTest - compute from parallel string
16:25:29.045 [main] INFO StressTest - compute from parallel string
16:25:29.045 [main] INFO StressTest - compute from parallel string
16:25:29.045 [main] INFO StressTest - compute from parallel string
16:25:29.045 [main] INFO StressTest - compute from parallel string
16:25:29.048 [main] INFO StressTest - compute from parallel string
16:25:29.048 [main] INFO StressTest - compute from parallel string
16:25:29.048 [main] INFO StressTest - compute from parallel string
16:25:29.048 [main] INFO StressTest - compute from parallel string
16:25:29.048 [main] INFO StressTest - compute from parallel string
16:25:29.048 [main] INFO StressTest - compute from parallel string
16:25:29.048 [main] INFO StressTest - compute from parallel string
16:25:29.049 [main] INFO StressTest - compute from parallel string
16:25:29.049 [main] INFO StressTest - compute from parallel string
16:25:29.049 [main] INFO StressTest - compute from parallel string
16:25:29.049 [main] INFO StressTest - compute from parallel string
16:25:29.049 [main] INFO StressTest - compute from parallel string
16:25:29.049 [main] INFO StressTest - compute from parallel string
16:25:29.049 [main] INFO StressTest - compute from parallel string
16:25:29.049 [main] INFO StressTest - compute from parallel string
16:25:29.049 [main] INFO StressTest - compute from parallel string
16:25:29.049 [main] INFO StressTest - compute from parallel string
16:25:29.049 [main] INFO StressTest - compute from parallel string
16:25:29.049 [main] INFO StressTest - compute from parallel string
16:25:29.050 [main] INFO StressTest - compute from parallel string
16:25:29.050 [main] INFO StressTest - compute from parallel string
16:25:29.050 [main] INFO StressTest - compute from parallel string
16:25:29.050 [main] INFO StressTest - parallel compute for files
16:25:29.051 [main] INFO StressTest - compute from parallel file
16:25:29.051 [main] INFO StressTest - compute from parallel file
16:25:29.051 [main] INFO StressTest - compute from parallel file
16:25:29.052 [main] INFO StressTest - compute from parallel file
16:25:29.052 [main] INFO StressTest - compute from parallel file
16:25:29.052 [main] INFO StressTest - compute from parallel file
16:25:29.052 [main] INFO StressTest - compute from parallel file
16:25:29.052 [main] INFO StressTest - compute from parallel file
16:25:29.052 [main] INFO StressTest - compute from parallel file
16:25:29.052 [main] INFO StressTest - compute from parallel file
16:25:29.052 [main] INFO StressTest - compute from parallel file
16:25:29.052 [main] INFO StressTest - compute from parallel file
16:25:29.052 [main] INFO StressTest - compute from parallel file
16:25:29.052 [main] INFO StressTest - compute from parallel file
16:25:29.053 [main] INFO StressTest - compute from parallel file
16:25:29.053 [main] INFO StressTest - compute from parallel file
16:25:29.053 [main] INFO StressTest - compute from parallel file
16:25:29.053 [main] INFO StressTest - compute from parallel file
16:25:29.053 [main] INFO StressTest - compute from parallel file
16:25:29.053 [main] INFO StressTest - compute from parallel file
16:25:29.053 [main] INFO StressTest - compute from parallel file
16:25:29.053 [main] INFO StressTest - compute from parallel file
16:25:29.053 [main] INFO StressTest - compute from parallel file
16:25:29.053 [main] INFO StressTest - compute from parallel file
16:25:29.054 [main] INFO StressTest - compute from parallel file
16:25:29.054 [main] INFO StressTest - compute from parallel file
16:25:29.054 [main] INFO StressTest - compute from parallel file
16:25:29.054 [main] INFO StressTest - compute from parallel file
16:25:29.054 [main] INFO StressTest - compute from parallel file
16:25:29.054 [main] INFO StressTest - compute from parallel file
16:25:29.054 [main] INFO StressTest - compute from parallel fileProcess finished with exit code 0
corresponding to code
try {
Files.list(Paths.get("...."))
.parallel()
.filter(Files::isRegularFile)
.map(Path::toFile)
.filter(file -> file.getName().startsWith(Constants.AB_MODEL))
.collect(Collectors.toList())
.parallelStream()
.forEach(s -> {
log.info("compute from parallel file with collection");
});
} catch (IOException e) {
e.printStackTrace();
} log.info("parallel compute for file names with collect");
try {
Files.list(Paths.get("...."))
.parallel()
.filter(Files::isRegularFile)
.map(Path::toFile)
.filter(file -> file.getName().startsWith(Constants.AB_MODEL))
.map(file -> file.getName())
.collect(Collectors.toList())
.parallelStream()
.forEach(s -> {
log.info("compute from parallel string with collection");
});
} catch (IOException e) {
e.printStackTrace();
} log.info("parallel compute for file names");
try {
Files.list(Paths.get("...."))
.parallel()
.filter(Files::isRegularFile)
.map(Path::toFile)
.filter(file -> file.getName().startsWith(Constants.AB_MODEL))
.map(file -> file.getName())
.forEach(s -> {
log.info("compute from parallel string");
});
} catch (IOException e) {
e.printStackTrace();
} log.info("parallel compute for files");
try {
Files.list(Paths.get("...."))
.parallel()
.filter(Files::isRegularFile)
.map(Path::toFile)
.filter(file -> file.getName().startsWith(Constants.AB_MODEL))
.forEach(s -> {
log.info("compute from parallel file");
});
} catch (IOException e) {
e.printStackTrace();
}
so the parallel from Files.list is resulting in a single thread to process all files (~50 files).
Unless there is a collect to do a parallel stream again, then it will split into the common pool.
After a lot of investigate and research, turns out JCP has a really not crafted implementations on parallel:
source: http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-July/034539.html
so basically when it’s doing splitIterator, it was using Long.MAX_VALUE:
IteratorSpliterator (est. MAX_VALUE elements)
| |
ArraySpliterator (est. 1024 elements) IteratorSpliterator (est. MAX_VALUE elements)
| |
/---------------/ |
| |
ArraySpliterator (est. 2048 elements) IteratorSpliterator (est. MAX_VALUE elements)
| |
/---------------/ |
| |
ArraySpliterator (est. 3072 elements) IteratorSpliterator (est. MAX_VALUE elements)
| |
/---------------/ |
| |
ArraySpliterator (est. 856 elements) IteratorSpliterator (est. MAX_VALUE elements)
|
(split returns null: refuses to split anymore)
do { a[j] = i.next(); } while (++j < n && i.hasNext());