In this article we'll define a new operator that performs a completely new action, albeit one that can be performed by a method invocation. We'll define the ** operator with the meaning of "raise to a specific power". Thus, writing a**b will be the same as Math.pow(a,b). The operator ** can take numeric types as left and right operands and the return type will always be a double.
As a start, we must make the scanner and parser recognize our operator. This has been already discussed in the previous articles, so simply repeat the steps taken there. Initially, we won't be messing with precedence level, so give your new operator a precedence level of mulPrec (the same precedence as multiplication). Since the attribution phase checks that each operator has the appropriate parameters, you must add as many enterBinOp statements in SymTab as needed. The opcode you specify is irrelevant (as long as it is not error), since we are going to perform a tree translation before even that opcode is used. A sample definition could be:
enterBinop("**", doubleType, doubleType, doubleType, 0);
There are two ways of implementing a completely new binary operator : by translating the AST tree during the desugaring process or by emitting your own bytecodes. Now, bytecode emission is not a very pleasant activity, and it requires you to know the Java Virtual Machine Specification, so I'd suggest that whenever possible, take advantage of the translation process.
The tree translation must be performed at the appropriate step, considering two facts:
If the method to be called can raise a checked exception, the transformation must be done before the Flow step. Otherwise (as in our case), we can do that transformation in the desugaring step (Lower).
In the previous articles, we examined the steps taken by the compiler on the AST in order to produce the actual bytecodes, and one of them was a "desugar" process that converted "syntatic sugar" java constructs into simple and plain java code. This process is performed by the com.sun.tools.javac.comp.Lower class, which extends from the more general TreeTranslator class that you can use as a base to define your own translators. The translation process creates a new AST by visiting the original tree recursively, and using a basic rule that could be written as
translate( node ) {
for each child of node {
node.child = translate(node.child)
}
result = node
}
where "result" is a global instance variable
It is important to know that TreeTranslator does not apply translations to
Have this in mind if you want to make funny things like defining aliases for identifiers or chaning the meaning of literals.
A binary operator node (JCBinary class) has two children : rhs and lhs for (you guessed it :) the right hand side and the left hand side operands. So if you didn't want to perform any translation on the node itself, you would do something like:
tree.lhs = translate(tree.lhs); tree.rhs = translate(tree.rhs); result = tree;
Now the above code is not exactly correct and contains an error that has to be fixed, but for now we'll stick with it in order to understand the essence of the process. There's always time for fine-grained details.
Since the Lower class uses the visitor pattern, the translation of node XXXX is done in the visitXXX method. Thus, binary operator nodes are translated in visitBinary(), which looks like this:
0001public void visitBinary(JCBinary tree) { 0002 List<Type> formals = tree.operator.type.getParameterTypes(); 0003 JCTree lhs = tree.lhs = translate(tree.lhs, formals.head); 0004 switch (tree.tag) { 0005 case JCTree.OR: 0006 if (lhs.type.isTrue()) { 0007 result = lhs; 0008 return; 0009 } 0010 if (lhs.type.isFalse()) { 0011 result = translate(tree.rhs, formals.tail.head); 0012 return; 0013 } 0014 break; 0015 case JCTree.AND: 0016 if (lhs.type.isFalse()) { 0017 result = lhs; 0018 return; 0019 } 0020 if (lhs.type.isTrue()) { 0021 result = translate(tree.rhs, formals.tail.head); 0022 return; 0023 } 0024 break; 0025 } 0026 tree.rhs = translate(tree.rhs, formals.tail.head); 0027 result = tree; 0028} 0029
You can see some basic compiler optimizations here:
We want any a ** b expression to be translated to a call to Math.pow(a,b). When the translation process needs to buid new nodes, it takes two possible approaches:
The makeCall() method takes three parameters - the identifier on which you want to apply the method, the name of the method itself and a list of arguments, and returns a method invocation node. So in pseudocode, we'd like to do the following for a**b:
tree.lhs = translate(tree.lhs);
tree.rhs = translate(tree.rhs);
result = makeCall("Math","pow", {tree.lhs,tree.rhs} );
Of course, things aren't so simple:
So having in mind the things said above, we proceed to make the following changes to the visitBinary method:
public void visitBinary(JCBinary tree) { List<Type> formals = tree.operator.type.getParameterTypes(); JCTree lhs = tree.lhs = translate(tree.lhs, formals.head); switch (tree.tag) {case JCTree.POWER: Symbol math = rs.loadClass(attrEnv, names.fromString("java.lang.Math")); JCIdent identifier = make.Ident(math); tree.lhs = translate(tree.lhs); tree.rhs = translate(tree.rhs); List<JCExpression> arguments = List.of(tree.lhs,tree.rhs); result = makeCall( identifier, names.fromString("pow"), arguments ); return;case JCTree.OR: if (lhs.type.isTrue()) { result = lhs; return; } if (lhs.type.isFalse()) { ...
Now rebuild the compiler and try it with the following program:
public class Test { public static void main(String[] args) { System.out.println(5 ** (2**2)); } }
Finished? Not yet. As I mentioned before, writing tree.lhs = translate(tree.lhs) is not exactly right. In Java 5, boxing and autoboxing were introduced, meaning that things like this one work:
Integer i = new Integer(2); System.out.println(5 + i);
(the above code prints 5).
Now, this means that we should be able to write 5 ** i as well, but if you try this with the above youl'll get an error. This is because the translate() method we called before doesn't care about boxing/autoboxing - it just continues with the translation process. Whenever a node you have might require autoboxing / autounboxing, call translate(Tree t, Type type) instead,where type is the expected type. That translate method will see if the actual node type can be auto(un)boxed into the expected type, do it, and proceed furhter with the translation. How do you get the expected type of an operator? Call tree.operator.type.getParameterTypes(), which will return a List<Type>. The expected type of the lhs will be the first element of that list, and the expected type of the rhs will be the second element. So we can now fix the code we wrote as follows:
public void visitBinary(JCBinary tree) { List<Type> formals = tree.operator.type.getParameterTypes(); JCTree lhs = tree.lhs = translate(tree.lhs, formals.head); switch (tree.tag) { case JCTree.POWER: Symbol math = rs.loadClass(attrEnv, names.fromString("java.lang.Math")); JCIdent identifier = make.Ident(math);tree.lhs = translate(tree.lhs,formals.get(0)); tree.rhs = translate(tree.rhs,formals.get(1));List<JCExpression> arguments = List.of(tree.lhs,tree.rhs); result = makeCall( identifier, names.fromString("pow"), arguments ); ...
Or, if you are a lisp freak, you can use the tail and head properties of List, and write:
tree.lhs = translate(tree.lhs, formals.head); tree.rhs = translate(tree.rhs, formals.tail.head);
(this is the style used by the writers of the compiler, by the way). As a further optimization, note that the code for visitBinary() already computes the translation of the lhs and assigns it to tree.lhs before entering the switch, so you can directly skip that part in your code.
You should now be able to compile and run a class like this:
public class Test { public static void main(String[] args) { Integer i = new Integer(2); Short s = new Short((short)2); System.out.println(5 ** (i ** s)); } }
Now let's try to do some optimizations. Since 1something is always 1, we could check if the lhs operand of ** is 1, and if so,insert a literal node of 1 instead of calling Math.pow. For any node of the AST tree, you can get its type tag using node.type.tag. If the node has a constant value (known by the compiler), you can get it by calling node.type.constValue(). If we determine that our lhs operand is a constant 1, then we need not proceed further and we may substitute our subtree with a literal node of 1. To make a literal node, use the makeLit() helper method, which takes a type parameter and a value.
So we can now modify our code as follows:
public void visitBinary(JCBinary tree) { List<Type> formals = tree.operator.type.getParameterTypes(); JCTree lhs = tree.lhs = translate(tree.lhs, formals.head); switch (tree.tag) { case JCTree.POWER:if ( lhs.type.tag == TypeTags.INT && lhs.type.constValue() != null && lhs.type.constValue().equals(1)) { result = makeLit(Symtab.doubleType,new Double(1)); return; } Symbol math = rs.loadClass(attrEnv, names.fromString("java.lang.Math")); JCIdent identifier = make.Ident(math); tree.rhs = translate(tree.rhs,formals.get(1)); List<JCExpression> arguments = List.of(tree.lhs,tree.rhs); result = makeCall( identifier, names.fromString("pow"), arguments ); ...
Let's test it. Compile the following class:
public class Test { public static void main(String[] args) { System.out.println(1 ** (2 ** 5)); } }
And now let's use the handy disassembler included in javac on our compiled class, to see what was generated
javap -c Test
This will confirm that no call was produced:
F:\javac\compiler\dist\bin>javap -c Test
Compiled from "Test.java"
public class Test extends java.lang.Object{
public Test();
Code:
0: aload_0
1: invokespecial #1; //Method java/lang/Object."":()V
4: return
public static void main(java.lang.String[]);
Code:
0: getstatic #2; //Field java/lang/System.out:Ljava/io/PrintStream;
3: dconst_1
4: invokevirtual #3; //Method java/io/PrintStream.println:(D)V
7: return
}
The highlighted line (dconst_1) is the JVM instruction that stores a literal 1.
Be aware, though, that this optimization has side effects : whatever you put in the rhs of the operator will never be called:
public class Test { public static int getExponent() { System.out.println("You'll never see this"); return 5; } public static void main(String[] args) { System.out.println(1 ** (2 ** getExponent())); } }