Most of my technical interviews for a Java developer position include a puzzle in which the candidate has to implement two very similar interfaces in a single class:
// Implement both interfaces in a single class if possible
// Explain why possible or not possible
interface WithPrimitiveInt {
void m(int i);
}
interface WithInteger {
void m(Integer i);
}
Sometimes candidates who are not sure about the right answer prefer to solve the following puzzle instead (I give it to candidates later anyway):
interface S {
String m(int i);
}
interface V {
void m(int i);
}
Indeed, the latter puzzle appears to be much easier, and most candidates answer that implementing both methods in a single class shouldn’t be possible, because the signatures of S.m(int) and V.m(int) are the same while the return types are different. And this is absolutely correct.
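For contrast, the first puzzle does have a solution in plain Java, because m(int) and m(Integer) have different signatures and are therefore ordinary overloads. A minimal sketch follows; the class name Both and the field last are hypothetical and exist only to make the chosen overload visible.

```java
interface WithPrimitiveInt {
    void m(int i);
}

interface WithInteger {
    void m(Integer i);
}

// Hypothetical class: remembers which overload ran last.
class Both implements WithPrimitiveInt, WithInteger {
    String last = "";

    @Override
    public void m(int i) {
        last = "m(int)";
    }

    @Override
    public void m(Integer i) {
        last = "m(Integer)";
    }
}

class OverloadDemo {
    public static void main(String[] args) {
        Both both = new Both();
        ((WithPrimitiveInt) both).m(1); // only m(int) is declared by this interface
        System.out.println(both.last);  // m(int)
        ((WithInteger) both).m(1);      // the literal 1 is boxed to Integer
        System.out.println(both.last);  // m(Integer)
    }
}
```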
Sometimes, though, I ask another question on the topic:
Do you think it would make any sense to allow methods with the same signature but different return types to be implemented in a single class? Maybe in some hypothetical JVM-based language, or at least at the JVM level?
That’s kind of an open question, and I do not expect a single correct answer here. But although I do not expect one, a correct answer exists. A person who has worked a lot with the Reflection API, performed bytecode manipulation or read the JVM specification might know it.
Java method signature vs JVM method descriptor
A Java method signature (i.e. the method name and the parameter types) is enforced only by the Java compiler during compilation. The JVM, on the other hand, distinguishes methods in a class by the combination of the unqualified method name (simply the name of the method) and the method descriptor, which consists of a list of parameter descriptors and one return descriptor.
For example, if we wanted to invoke the method String m(int i) directly on a class foo.Bar, the bytecode would need to contain:
INVOKEVIRTUAL foo/Bar.m (I)Ljava/lang/String;
and for void m(int i) it would be:
INVOKEVIRTUAL foo/Bar.m (I)V
That said, the JVM is perfectly fine with String m(int i) and void m(int i) in a single class. All we need to do is generate the proper bytecode.
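Descriptors like the ones above don't have to be written by hand. As a small sketch, the standard java.lang.invoke.MethodType class can render them; the class name DescriptorDemo and the helper descriptor are made up for this illustration.

```java
import java.lang.invoke.MethodType;

class DescriptorDemo {
    // Renders the JVM method descriptor for the given return and parameter types.
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(descriptor(String.class, int.class));   // (I)Ljava/lang/String;
        System.out.println(descriptor(void.class, int.class));     // (I)V
        System.out.println(descriptor(void.class, Integer.class)); // (Ljava/lang/Integer;)V
    }
}
```

The first two lines differ only in the return descriptor, which is exactly the part that a Java signature ignores.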
Bytecode Kung Fu
We have the interfaces S and V; let’s now generate a class SV that implements both of them. Its representation in Java, if it were allowed, would look like this:
public class SV implements S, V {
public void m(int i) {
System.out.println("void m(int i)");
}
public String m(int i) {
System.out.println("String m(int i)");
return null;
}
}
To generate the bytecode we’ll use the ObjectWeb ASM library, which is low-level enough to give a feeling for what JVM bytecode is.
The full source code is shared on GitHub; here I’ll only list and explain the essential snippets.
ClassWriter cw = new ClassWriter(ClassWriter.COMPUTE_FRAMES); // (1)
// package edio.java.experiments
// public class SV implements S, V
cw.visit(V1_7, ACC_PUBLIC, "edio/java/experiments/SV", null, "java/lang/Object",
    new String[]{
        "edio/java/experiments/S",
        "edio/java/experiments/V"
    }
); // (2)
// constructor
MethodVisitor constructor =
    cw.visitMethod(ACC_PUBLIC, "<init>", "()V", null, null); // (3)
constructor.visitCode();
constructor.visitVarInsn(Opcodes.ALOAD, 0);
constructor.visitMethodInsn(
    Opcodes.INVOKESPECIAL, "java/lang/Object", "<init>", "()V");
constructor.visitInsn(Opcodes.RETURN);
constructor.visitMaxs(1, 1);
constructor.visitEnd();
// public String m(int i)
MethodVisitor mString =
    cw.visitMethod(ACC_PUBLIC, "m", "(I)Ljava/lang/String;", null, null);
mString.visitCode();
mString.visitFieldInsn(Opcodes.GETSTATIC, "java/lang/System", "out",
    "Ljava/io/PrintStream;"); // (4)
mString.visitLdcInsn("String"); // (5)
mString.visitMethodInsn(Opcodes.INVOKEVIRTUAL, "java/io/PrintStream", "println",
    "(Ljava/lang/String;)V");
mString.visitInsn(Opcodes.ACONST_NULL); // (6)
mString.visitInsn(Opcodes.ARETURN);
mString.visitMaxs(2, 2);
mString.visitEnd();
// public void m(int i)
MethodVisitor mVoid = cw.visitMethod(ACC_PUBLIC, "m", "(I)V", null, null);
// visitCode() marks the start of the method body, as in the other two methods
mVoid.visitCode();
mVoid.visitFieldInsn(Opcodes.GETSTATIC, "java/lang/System", "out",
    "Ljava/io/PrintStream;"); // (4)
mVoid.visitLdcInsn("void"); // (5)
mVoid.visitMethodInsn(Opcodes.INVOKEVIRTUAL, "java/io/PrintStream", "println",
    "(Ljava/lang/String;)V");
mVoid.visitInsn(Opcodes.RETURN); // (6)
mVoid.visitMaxs(2, 2);
mVoid.visitEnd();
cw.visitEnd();
1. We start by creating a ClassWriter to generate the bytecode.
2. Then we declare a class that implements the interfaces S and V.
3. Although our reference pseudo-Java code for SV didn’t contain any constructors, we must generate one anyway: when we do not declare a constructor in Java, the compiler implicitly generates an empty one for us.
4. In the method bodies we start by obtaining the out field of type java.io.PrintStream from the System class and pushing it onto the operand stack.
5. Then we load a constant ("String" or "void") onto the stack and invoke println on the obtained out reference with the string constant as an argument.
6. Finally, for String m(int i) we push a reference-typed constant with the value null onto the stack and use the correspondingly typed return instruction, ARETURN, to return the value to the caller. For void m(int i) we use the untyped RETURN, which only jumps back to the caller without returning a value.
To verify that our bytecode is correct (and I’ve been doing this all the time, iteratively fixing the issues), we write the generated class to the filesystem
Files.write(new File("/tmp/SV.class").toPath(), cw.toByteArray());
and use jad (a Java decompiler) to turn the bytecode back into Java:
$ jad -p /tmp/SV.class
The class file version is 51.0 (only 45.3, 46.0 and 47.0 are supported)
// Decompiled by Jad v1.5.8e. Copyright 2001 Pavel Kouznetsov.
// Jad home page: http://www.geocities.com/kpdus/jad.html
// Decompiler options: packimports(3)
package edio.java.experiments;
import java.io.PrintStream;
// Referenced classes of package edio.java.experiments:
// S, V
public class SV
implements S, V
{
public SV()
{
}
public String m(int i)
{
System.out.println("String");
return null;
}
public void m(int i)
{
System.out.println("void");
}
}
Close enough, I think.
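The "class file version is 51.0" note in the jad output corresponds to the V1_7 constant passed to cw.visit. As a rough sanity check that needs no decompiler, the header of a class file can be read directly. The sketch below is only an illustration: ClassFileHeader and version are hypothetical names, and the eight-byte array stands in for the first bytes of cw.toByteArray().

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class ClassFileHeader {
    // Returns "major.minor" read from the first 8 bytes of a class file.
    static String version(byte[] classBytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IOException("not a class file");
            }
            int minor = in.readUnsignedShort(); // minor_version comes first
            int major = in.readUnsignedShort(); // then major_version
            return major + "." + minor;
        }
    }

    public static void main(String[] args) throws IOException {
        // Hand-written header: magic number, minor version 0, major version 51 (Java 7).
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 51};
        System.out.println(version(header)); // 51.0
    }
}
```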
Using generated class in runtime
Successful decompilation by jad actually guarantees us nothing. jad warns us only about major problems with the bytecode, such as a discrepancy between the frame size and the local variables or a missing return statement. In general, though, we can’t be sure that our generated class will work at runtime.
To use the generated class at runtime we need to load it into the JVM somehow and then instantiate it.
Let’s implement our own AsmClassLoader. It is just a convenient wrapper around the ClassLoader.defineClass method:
public class AsmClassLoader extends ClassLoader {
public Class<?> defineAsmClass(String name, ClassWriter classWriter) {
byte[] bytes = classWriter.toByteArray();
return defineClass(name, bytes, 0, bytes.length);
}
}
Now let’s use that classloader and instantiate the class:
ClassWriter cw = SVGenerator.generateClass();
AsmClassLoader classLoader = new AsmClassLoader();
Class<?> generatedClazz = classLoader.defineAsmClass(SVGenerator.SV_FQCN, cw);
Object o = generatedClazz.newInstance();
Since our class is generated at runtime, we can’t cast to it in our source code. We can cast to the implemented interfaces, though, which makes non-reflective invocation possible:
((S)o).m(1);
((V)o).m(1);
If we execute the code, the output will be:
String
void
To some, the output might seem unexpected: we call the same (from Java’s perspective) method on an object, but the result differs depending on the interface we cast the object to. Mind-blowing, isn’t it?
Things become clearer if we think about the underlying bytecode. For the invocations we performed, the compiler generates an INVOKEINTERFACE instruction, and the method descriptor comes not from the class but from the interface.
Thus, for the first invocation we’ll have:
INVOKEINTERFACE edio/java/experiments/S.m (I)Ljava/lang/String;
and for the second one:
INVOKEINTERFACE edio/java/experiments/V.m (I)V
The object on which the invocation is performed is taken from the operand stack. And that is the power behind polymorphism in Java.
Bridge method is the name
One might ask: “So what is the point of all that? Will you ever use that kind of stuff in your code?”
The thing is that we rely on this virtually every time we write ordinary Java code. For example, covariant return types, generics and access to private fields from inner classes are implemented using similar compiler-generated “magic” in bytecode.
Consider an interface:
public interface ZeroProvider {
Number getZero();
}
and its implementation returning a covariant type:
public class IntegerZero implements ZeroProvider {
public Integer getZero() {
return 0;
}
}
Let’s now think about the following code:
IntegerZero iz = new IntegerZero();
iz.getZero();
ZeroProvider zp = iz;
zp.getZero();
For the iz.getZero() call the compiler will generate INVOKEVIRTUAL with the ()Ljava/lang/Integer; method descriptor, while for zp.getZero() it will generate INVOKEINTERFACE with the ()Ljava/lang/Number; method descriptor. We already know that the JVM dispatches a call on an object by method name and method descriptor. Since the descriptors are different, those two calls can’t be dispatched to the same method in our IntegerZero instance.
In fact, the compiler generates one additional method, which acts as a bridge between the real method we declared in the class and the method used during invocation via the interface. Hence the name: bridge method. If Java permitted this, the resulting code would look like:
public class IntegerZero implements ZeroProvider {
public Integer getZero() {
return 0;
}
// This is a synthetic bridge method, which is present only in bytecode.
// Java compiler wouldn't permit it.
public Number getZero() {
return this.getZero();
}
}
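The bridge method can be observed without touching bytecode at all, because reflection reports every method present in the class file. The following is a minimal sketch; BridgeDemo and describe are hypothetical helper names, and the order in which getDeclaredMethods returns the two methods is not specified.

```java
import java.lang.reflect.Method;

interface ZeroProvider {
    Number getZero();
}

class IntegerZero implements ZeroProvider {
    @Override
    public Integer getZero() {
        return 0;
    }
}

class BridgeDemo {
    // Formats a method as "<return type> <name>() bridge=<flag>".
    static String describe(Method m) {
        return m.getReturnType().getSimpleName() + " " + m.getName()
                + "() bridge=" + m.isBridge();
    }

    public static void main(String[] args) {
        // Expect two getZero() entries: the declared one and the compiler-generated bridge.
        for (Method m : IntegerZero.class.getDeclaredMethods()) {
            System.out.println(describe(m));
        }
    }
}
```

One of the two printed lines should read "Integer getZero() bridge=false" and the other "Number getZero() bridge=true", which matches the pseudo-Java listing above.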
Afterword
The Java programming language and the Java Virtual Machine are not to be confused: although they share a word in their names and although Java is the main language for the JVM, their possibilities and limitations are not always the same. Knowing the JVM helps a lot in understanding Java or any other JVM-based language, while knowing Java and its history helps in understanding certain decisions in the JVM’s design.