Overloading that is not permitted or Java bridge methods

Most of my technical interviews for a Java developer position include a puzzle in which the candidate should implement two very similar interfaces in a single class:

// Implement both interfaces in a single class if possible
// Explain why possible or not possible

interface WithPrimitiveInt {
  void m(int i);
}

interface WithInteger {
  void m(Integer i);
}
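For reference, here is a minimal sketch of one possible answer to this first puzzle. It assumes the two interfaces exactly as declared above; the names OverloadDemo, Both and calls are hypothetical and exist only for this illustration. Since m(int) and m(Integer) have different parameter types, they are ordinary overloads with distinct signatures, so a single class can implement both:

```java
import java.util.ArrayList;
import java.util.List;

class OverloadDemo {
  interface WithPrimitiveInt {
    void m(int i);
  }

  interface WithInteger {
    void m(Integer i);
  }

  // Hypothetical class name; the two methods are plain overloads.
  static class Both implements WithPrimitiveInt, WithInteger {
    // Records which overload was selected for each call.
    final List<String> calls = new ArrayList<>();

    @Override
    public void m(int i) {
      calls.add("m(int)");
    }

    @Override
    public void m(Integer i) {
      calls.add("m(Integer)");
    }
  }

  public static void main(String[] args) {
    Both both = new Both();
    both.m(1);                      // exact match without boxing: m(int)
    both.m(Integer.valueOf(1));     // exact match: m(Integer)
    ((WithInteger) both).m(1);      // only m(Integer) is visible here, so 1 is boxed
    System.out.println(both.calls); // [m(int), m(Integer), m(Integer)]
  }
}
```

The compiler picks the overload from the static types of the receiver and the argument, which is why the last call ends up in m(Integer).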

Sometimes candidates who are not sure about the right answer prefer to solve the following puzzle instead (I give it to them later anyway):

interface S {
  String m(int i);
}

interface V {
  void m(int i);
}

Indeed, the latter puzzle appears to be much easier, and most candidates answer that implementing both methods in a single class shouldn’t be possible, because the signatures of S.m(int) and V.m(int) are the same while the return types differ. As far as the Java language is concerned, this is absolutely correct.

Sometimes, though, I ask another question on the topic:

Do you think it would make any sense to allow methods with the same signature but different return types in a single class? Maybe in some hypothetical JVM-based language, or at least at the JVM level?

That’s kind of an open question, and I do not expect a single correct answer here. Nevertheless, a correct answer exists, and a person who has worked a lot with the reflection API, manipulated bytecode, or read the JVM specification might know it.

Java method signature vs JVM method descriptor

A Java method signature (i.e. the method name and the types of its parameters) is enforced only by the Java compiler, at compile time. The JVM, on the other hand, distinguishes methods in a class by the combination of the unqualified method name (simply the name of the method) and the method descriptor, which consists of a list of parameter descriptors and one return descriptor.

For example, if we wanted to invoke a method String m(int i) directly on a class foo.Bar, in the bytecode we’d need to have:

INVOKEVIRTUAL foo/Bar.m (I)Ljava/lang/String;

and for void m(int i) it would be:

INVOKEVIRTUAL foo/Bar.m (I)V

That said, the JVM is perfectly fine with String m(int i) and void m(int i) in a single class. All we need to do is generate the proper bytecode.
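The descriptor strings shown above can be reproduced from plain Java, without any bytecode tooling. The sketch below uses java.lang.invoke.MethodType to print them, and MethodHandles.Lookup.findVirtual to show that a lookup succeeds only when both the name and the full descriptor (return type included) match. The class DescriptorDemo and its helper methods descriptor and resolves are hypothetical names introduced for this illustration; S and V are redeclared as nested interfaces to keep the example self-contained.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class DescriptorDemo {
  interface S {
    String m(int i);
  }

  interface V {
    void m(int i);
  }

  // Builds a JVM method descriptor from a return type and parameter types.
  static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
    return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
  }

  // Returns true if a method with exactly this name and descriptor
  // can be resolved on 'owner'.
  static boolean resolves(Class<?> owner, String name, MethodType type) {
    try {
      MethodHandles.lookup().findVirtual(owner, name, type);
      return true;
    } catch (NoSuchMethodException | IllegalAccessException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(descriptor(String.class, int.class)); // (I)Ljava/lang/String;
    System.out.println(descriptor(void.class, int.class));   // (I)V

    MethodType stringOfInt = MethodType.methodType(String.class, int.class);
    MethodType voidOfInt = MethodType.methodType(void.class, int.class);
    System.out.println(resolves(S.class, "m", stringOfInt)); // true
    System.out.println(resolves(S.class, "m", voidOfInt));   // false: return type differs
    System.out.println(resolves(V.class, "m", voidOfInt));   // true
  }
}
```

Unlike Java overload resolution, the lookup never considers "a method named m taking an int" on its own; the return type is part of the key.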

Bytecode Kung Fu

We have the interfaces S and V; now let’s generate a class SV that implements both of them. If Java allowed it, the source would look like this:

public class SV implements S, V {
  public void m(int i) {
    System.out.println("void m(int i)");
  }
  public String m(int i) {
    System.out.println("String m(int i)");
    return null;
  }
}

To generate the bytecode we’ll use the ObjectWeb ASM library, which is low-level enough to give a feeling of what JVM bytecode is.

The full source code is shared on GitHub; here I’ll only list and explain the essential snippets.

ClassWriter cw = new ClassWriter(ClassWriter.COMPUTE_FRAMES); // (1)

// package edio.java.experiments
// public class SV implements S, V
cw.visit(V1_7, ACC_PUBLIC, "edio/java/experiments/SV", null, "java/lang/Object",
    new String[]{
      "edio/java/experiments/S",
      "edio/java/experiments/V"
    }
); // (2)

// constructor
MethodVisitor constructor =
    cw.visitMethod(ACC_PUBLIC, "<init>", "()V", null, null); // (3)
constructor.visitCode();
constructor.visitVarInsn(Opcodes.ALOAD, 0);
constructor.visitMethodInsn(
    Opcodes.INVOKESPECIAL, "java/lang/Object", "<init>", "()V");
constructor.visitInsn(Opcodes.RETURN);
constructor.visitMaxs(1, 1);
constructor.visitEnd();

// public String m(int i)
MethodVisitor mString =
    cw.visitMethod(ACC_PUBLIC, "m", "(I)Ljava/lang/String;", null, null);
mString.visitCode();
mString.visitFieldInsn(Opcodes.GETSTATIC, "java/lang/System", "out",
    "Ljava/io/PrintStream;"); // (4)
mString.visitLdcInsn("String"); // (5)
mString.visitMethodInsn(Opcodes.INVOKEVIRTUAL, "java/io/PrintStream", "println",
    "(Ljava/lang/String;)V");
mString.visitInsn(Opcodes.ACONST_NULL); // (6)
mString.visitInsn(Opcodes.ARETURN);
mString.visitMaxs(2, 2);
mString.visitEnd();

// public void m(int i)
MethodVisitor mVoid = cw.visitMethod(ACC_PUBLIC, "m", "(I)V", null, null);
mVoid.visitCode(); // start of the method body, as for the other methods
mVoid.visitFieldInsn(Opcodes.GETSTATIC, "java/lang/System", "out",
    "Ljava/io/PrintStream;"); // (4)
mVoid.visitLdcInsn("void"); // (5)
mVoid.visitMethodInsn(Opcodes.INVOKEVIRTUAL, "java/io/PrintStream", "println",
    "(Ljava/lang/String;)V");
mVoid.visitInsn(Opcodes.RETURN); // (6)
mVoid.visitMaxs(2, 2);
mVoid.visitEnd();

cw.visitEnd();

  1. We start by creating a ClassWriter to generate the bytecode.

  2. Then we declare a class that implements the interfaces S and V.

  3. Although our reference pseudo-Java code for SV doesn’t declare any constructors, we must generate one anyway: when a Java class declares no constructors, the compiler implicitly generates an empty default constructor for us.

  4. In the method bodies we start by reading the static field out, of type java.io.PrintStream, from the System class, which pushes it onto the operand stack.

  5. Then we load a constant ("String" or "void") onto the stack and invoke println on the obtained out reference, with the string constant as the argument.

  6. Finally, for String m(int i) we push a null reference constant onto the stack and use the correspondingly typed return instruction, ARETURN, to return the value to the caller. For void m(int i) we use the untyped RETURN, which only returns control to the caller without a value.

To verify that our bytecode is correct (and I’ve been doing this all the time, iteratively fixing the issues), we write the generated class to the filesystem

Files.write(new File("/tmp/SV.class").toPath(), cw.toByteArray());

and use jad (a Java decompiler) to turn the bytecode back into Java

$ jad -p /tmp/SV.class
The class file version is 51.0 (only 45.3, 46.0 and 47.0 are supported)
// Decompiled by Jad v1.5.8e. Copyright 2001 Pavel Kouznetsov.
// Jad home page: http://www.geocities.com/kpdus/jad.html
// Decompiler options: packimports(3)

package edio.java.experiments;

import java.io.PrintStream;

// Referenced classes of package edio.java.experiments:
//            S, V

public class SV
    implements S, V
{

    public SV()
    {
    }

    public String m(int i)
    {
        System.out.println("String");
        return null;
    }

    public void m(int i)
    {
        System.out.println("void");
    }
}

Close enough, I think.

Using the generated class at runtime

Successful decompilation by jad actually guarantees nothing. jad warns us about major problems with the bytecode, such as a mismatch between the frame size and the local variables or a missing return statement, but in general we can’t be sure that our generated class will work at runtime.

To use the generated class at runtime, we need to load it into the JVM somehow and then instantiate it.

Let’s implement our own AsmClassLoader. It is just a convenient wrapper around the ClassLoader.defineClass method:

public class AsmClassLoader extends ClassLoader {
  public Class<?> defineAsmClass(String name, ClassWriter classWriter) {
    byte[] bytes = classWriter.toByteArray();
    return defineClass(name, bytes, 0, bytes.length);
  }
}

Now let’s use that classloader and instantiate the class:

ClassWriter cw = SVGenerator.generateClass();
AsmClassLoader classLoader = new AsmClassLoader();
Class<?> generatedClazz = classLoader.defineAsmClass(SVGenerator.SV_FQCN, cw);
Object o = generatedClazz.getDeclaredConstructor().newInstance();

Since our class is generated at runtime, we can’t refer to it in our source code, so we can’t cast to it. We can cast to the implemented interfaces, though, which makes non-reflective invocation possible:

((S)o).m(1);
((V)o).m(1);

If we execute the code, the output will be:

String
void

To some, the output might seem unexpected: we call the same (from Java’s perspective) method on the same object, but the result depends on the interface we cast the object to. Mind-blowing, isn’t it?

Things become clearer if we think about the underlying bytecode. For each of these invocations the compiler generates an INVOKEINTERFACE instruction, and the method descriptor comes not from the class but from the interface.

Thus, for the first invocation we’ll have:

INVOKEINTERFACE edio/java/experiments/S.m (I)Ljava/lang/String;

and for the second one:

INVOKEINTERFACE edio/java/experiments/V.m (I)V

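The object on which the method is invoked is taken from the operand stack, and the JVM selects the implementation by looking up the method name and descriptor in that object’s class. This is the mechanism behind polymorphism in Java.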

Bridge method is the name

One might ask: “So what is the point of all that? Will you ever use that kind of stuff in your code?”

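The thing is that we rely on this virtually every time we write ordinary Java code. For example, covariant return types and generics are implemented with compiler-generated bridge methods, and access to private fields from inner classes relied on similar synthetic “magic” in bytecode (before nest-based access control arrived in Java 11).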

Consider an interface:

public interface ZeroProvider {
  Number getZero();
}

and its implementation returning a covariant type:

public class IntegerZero implements ZeroProvider {
  public Integer getZero() {
    return 0;
  }
}

Let’s now think about the following code:

IntegerZero iz = new IntegerZero();
iz.getZero();

ZeroProvider zp = iz;
zp.getZero();

For the iz.getZero() call the compiler will generate INVOKEVIRTUAL with the method descriptor ()Ljava/lang/Integer;, while for zp.getZero() it will generate INVOKEINTERFACE with the method descriptor ()Ljava/lang/Number;. We already know that the JVM dispatches a call on an object by method name and method descriptor. Since the descriptors are different, those two calls can’t be dispatched to the same method in our IntegerZero instance.

In fact, the compiler generates one additional method, which acts as a bridge between the real method we declared in the class and the method invoked via the interface. Hence the name: bridge method. If only Java permitted this, the resulting code would look like:

public class IntegerZero implements ZeroProvider {
  public Integer getZero() {
    return 0;
  }

  // This is a synthetic bridge method, which is present only in bytecode.
  // Java compiler wouldn't permit it.
  public Number getZero() {
    return this.getZero(); // in bytecode this invokes the Integer-returning getZero() above
  }
}
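The bridge method can be observed with plain reflection. The following minimal sketch redeclares ZeroProvider and IntegerZero as nested types to stay self-contained, and lists every getZero method the compiler put into IntegerZero; BridgeMethodDemo and describeGetZeroMethods are hypothetical names introduced for this illustration. It assumes the class is compiled with a standard Java compiler such as javac.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class BridgeMethodDemo {
  interface ZeroProvider {
    Number getZero();
  }

  static class IntegerZero implements ZeroProvider {
    @Override
    public Integer getZero() {
      return 0;
    }
  }

  // Lists every declared getZero method as "<return type> bridge=<flag>".
  static List<String> describeGetZeroMethods(Class<?> type) {
    List<String> lines = new ArrayList<>();
    for (Method method : type.getDeclaredMethods()) {
      if (method.getName().equals("getZero")) {
        lines.add(method.getReturnType().getSimpleName() + " bridge=" + method.isBridge());
      }
    }
    // getDeclaredMethods() returns methods in no particular order, so sort for stable output.
    Collections.sort(lines);
    return lines;
  }

  public static void main(String[] args) {
    for (String line : describeGetZeroMethods(IntegerZero.class)) {
      System.out.println(line);
    }
    // Prints:
    // Integer bridge=false
    // Number bridge=true
  }
}
```

Although the source declares a single getZero, the class file contains two methods with the same name and parameters that differ only in their return descriptor, which is exactly the situation the Java language forbids us to write by hand.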

Afterword

The Java programming language and the Java Virtual Machine are not to be confused: although they share a word in their names, and although Java is the main language for the JVM, their possibilities and limitations are not always the same. Knowing the JVM helps a lot in understanding Java or any other JVM-based language; knowing Java and its history, in turn, helps in understanding certain decisions in the JVM’s design.