Java performance - loop unfolding
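While reading about optimization, I came across the topic of loop unfolding (also known as loop unrolling). After a quick search on Google, I couldn't find out whether Java's compiler does it or not.
So the best way was to try it myself.
I was actually quite surprised that by unfolding the loop manually I managed to speed things up, since I was quite sure modern compilers would do this for me.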
    // Plain loop: one Math.sin call per iteration.
    public static void folded() {
        System.out.println("folded:");
        long c1 = System.currentTimeMillis();
        for (int r = 0; r < 10; r++) {
            for (int i = 0; i < 500000; i++) {
                Math.sin(i);
            }
        }
        System.out.println(System.currentTimeMillis() - c1);
    }

    // Manually unfolded loop: ten Math.sin calls per iteration, step of 10.
    public static void unfolded() {
        System.out.println("unfolded:");
        long c1 = System.currentTimeMillis();
        for (int r = 0; r < 10; r++) {
            for (int i = 0; i < 500000; i += 10) {
                Math.sin(i);
                Math.sin(i + 1);
                Math.sin(i + 2);
                Math.sin(i + 3);
                Math.sin(i + 4);
                Math.sin(i + 5);
                Math.sin(i + 6);
                Math.sin(i + 7);
                Math.sin(i + 8);
                Math.sin(i + 9);
            }
        }
        System.out.println(System.currentTimeMillis() - c1);
    }
Result (counter 500'000), in ms:
folded: 453
unfolded: 114
Result (counter 5'000'000), in ms:
folded: 13850
unfolded: 11929
So which should I trust: manual optimization or the compiler? In my test, the results suggest that manual optimization is better.
An unfolded loop is useful when the unfolded operations can be parallelized. A lot of modern CPUs support vector instructions: https://en.wikipedia.org/wiki/vector_processor
Beginning with 7u40, Java's server compiler supports basic vector instructions (http://bugs.java.com/view_bug.do?bug_id=6340864), such as arrayA[0..n] + arrayB[0..n]
etc. Read more in "Do any JVM's JIT compilers generate code that uses vectorized floating point instructions?"
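To make the arrayA[0..n] + arrayB[0..n] case concrete, here is a minimal sketch of the kind of loop that could qualify: one simple arithmetic operation per element and no dependency between iterations. The class and method names are made up for illustration, and whether the JIT actually emits vector instructions for it depends on the JVM version and the CPU.

```java
import java.util.Arrays;

// Hypothetical example class, not taken from the original post.
class VectorAddSketch {

    // Element-wise sum of two arrays. Assumes b is at least as long as a.
    // Each iteration is independent of the others, which is the shape
    // a JIT compiler may be able to map onto CPU vector instructions.
    static float[] add(float[] a, float[] b) {
        float[] c = new float[a.length];
        for (int i = 0; i < a.length; i++) {
            c[i] = a[i] + b[i];
        }
        return c;
    }

    public static void main(String[] args) {
        float[] a = {1f, 2f, 3f, 4f};
        float[] b = {10f, 20f, 30f, 40f};
        System.out.println(Arrays.toString(add(a, b))); // prints [11.0, 22.0, 33.0, 44.0]
    }
}
```

Note that nothing in the source asks for vectorization. The loop is written the plain way and any vector code is left to the JIT.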
In your case the unfolded operation, Math.sin(...), is more than one CPU instruction. As a result, Java is not able to convert it into a known CPU vector instruction, so it cannot provide that performance benefit compared to the plain loop.
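For illustration, here is a small sketch of the two loop shapes with Math.sin, written so that the results are kept and the two variants can be checked against each other. The class name, the unroll factor of 4 and the summing of results are assumptions made for this example, not part of the original test. Each Math.sin call is still an ordinary scalar call per element, so the unfolded version mainly cuts down on loop-counter work.

```java
// Hypothetical example class, not taken from the original post.
class SinLoopsSketch {

    // Plain loop: one Math.sin call per iteration.
    static double folded(int n) {
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += Math.sin(i);
        }
        return sum;
    }

    // Manually unfolded loop: four Math.sin calls per iteration.
    // Assumes n is a multiple of 4, otherwise it processes extra elements.
    static double unfolded(int n) {
        double sum = 0.0;
        for (int i = 0; i < n; i += 4) {
            sum += Math.sin(i);
            sum += Math.sin(i + 1);
            sum += Math.sin(i + 2);
            sum += Math.sin(i + 3);
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 500_000; // same counter as the first test above
        System.out.println("folded sum:   " + folded(n));
        System.out.println("unfolded sum: " + unfolded(n));
    }
}
```

Both methods add the same values in the same order, so the sums should agree. Any time difference between them would then come from the loop structure and not from the work being done.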