So, (without any further justification of this comparison) I took the same source code for Java, Scala and Groovy as previously and took this modified version of the Groovy code for Groovy++:
So, I basically took the Groovy code, and added the @Typed annotation and some more type information. I ran this on my machine with Java 1.6.0 Update 17, Scala 2.7.7, Groovy 1.7 and Groovy++ which is based on Groovy 1.8.0-SNAPSHOT. Here are the results (in ms) for sorting 100.000 and 1.000.000 integers:
n | 100.000 | 1.000.000 |
---|---|---|
Java | 12 | 95 |
Scala | 12 | 95 |
Groovy | 3750 | 41700 |
Groovy++ | 515 | 4900 |
Obviously, while still Java and Scala are by far the fastest, Groovy++ makes a huge difference. Groovy takes about 312x the time of Java/Scala, Groovy++ just takes 42x.
While there isn't much difference in the source code of Groovy and Groovy++, there is of course a big difference in the generated byte code. I had a look at it with the Java Decompiler. The Groovy byte code contains lots of dynamic dispatch code, while there's not much in the Groovy++ byte code that indicates that it was compiled from Groovy. Here's a short excerpt:
public Object quicksort(int[] a, int L, int R) {
while (true) {
int m = ArraysMethods.getAt(a, DefaultGroovyPPMethods.div(DefaultGroovyPPMethods.box(L + R), DefaultGroovyPPMethods.box(2)).intValue());
int i = L;
int j = R;
while (i <= j) {
for (; ArraysMethods.getAt(a, i) < m; (i++));
for (; ArraysMethods.getAt(a, j) > m; (j--));
if (i <= j) {
swap(a, i, j);
(i++);
(j--);
}
}
if (L < j) quicksort(a, L, j);
if (R <= i) break; R = R; L = i; a = a; this = this; } return null;
}
This isn't even far from the byte code generated by java. There is just that autoboxing of integers, integer division and the array access with getAt(). Considering the early state of Groovy++ this looks very promising.
Update: There's is a follow-up post that shows how to tune the Groovy++ code to make it run as fast as Java and Scala (Thanks to ait, see comments).
7 comments:
Nice! Have you tried removing '.intValue()' from line 11? looks to me it's no longer needed and explicit boxing might be a bottleneck.
Thanks Andres. You're right, I could remove .intValue(), but it doesn't make a change. The value get unboxed afterwards, then:
With intValue():
int m = ArraysMethods.getAt(a, DefaultGroovyPPMethods.div(DefaultGroovyPPMethods.box(L + R), DefaultGroovyPPMethods.box(2)).intValue());
Without intValue():
int m = ArraysMethods.getAt(a, DefaultTypeTransformation.intUnbox(DefaultGroovyPPMethods.div(DefaultGroovyPPMethods.box(L + R), DefaultGroovyPPMethods.box(2))));
Good catch. We'll see how the next Groovy++ release compares. Alex and his team pay close attention to these type of tests ;-)
Nick, could you please update your results with following Groovy code.
I will follow up later with post why it is way faster
@Typed
package test
void swap(int[] a, int i, int j) {
def temp = a[i]
a[i] = a[j]
a[j] = temp
}
def quicksort(int[] a, int L, int R) {
int m = a[(L+R) >> 1]
int i=L
int j=R
while (i<=j) {
while (a[i]m) j--
if (i<=j) {
swap(a, i, j)
i++
j--
}
}
if (Li) quicksort(a,i,R)
}
def quicksort(int[] a) {
quicksort(a, 0, a.length-1)
}
// Sample data
int[] a = new int[1000000]
for (i in 0..> 1)+1
if (i%3==0) a[i] = -a[i]
}
int t1 = System.currentTimeMillis()
quicksort(a)
int t2 = System.currentTimeMillis()
println t2-t1
I for one would be very interested in how Fantom would do in such a test and if it would be as fast as Java and Scala.
Post a Comment