Monday, 14 April 2008

Java: finding binary class dependencies with BCEL

Sometimes you need to find all the dependencies for a binary class. You might have a project that depends on a large product and want to figure out the minimum set of libraries to copy to create a build environment. You might want to check for missing dependencies during the kitting process.

package foo;

public class Bar implements Runnable {

  public void run() {
    System.out.println(java.io.File.class.getName());
    new somepackage.ExternalThing();
  }
  
}


The above code contains quite a few class dependencies. It implicitly extends java.lang.Object. It implements java.lang.Runnable. It uses the out field (a java.io.PrintStream) of java.lang.System. It also uses java.io.File and its class literal (a java.lang.Class). Lastly, it depends on somepackage.ExternalThing.

All these dependencies are included in J2SE except somepackage.ExternalThing, so this is the one we need to catch programmatically.

One way of accomplishing this is to use the Apache Byte Code Engineering Library (BCEL). BCEL can do all sorts of things with class files and makes inspecting them really easy.

import org.apache.bcel.Repository;
import org.apache.bcel.classfile.ConstantClass;
import org.apache.bcel.classfile.ConstantPool;
import org.apache.bcel.classfile.DescendingVisitor;
import org.apache.bcel.classfile.EmptyVisitor;
import org.apache.bcel.classfile.JavaClass;

public class DependencyEmitter extends EmptyVisitor {

  private JavaClass javaClass;
  
  public DependencyEmitter(JavaClass javaClass) {
    this.javaClass = javaClass;
  }
  
  @Override
  public void visitConstantClass(ConstantClass obj) {
    ConstantPool cp = javaClass.getConstantPool();
    String bytes = obj.getBytes(cp);
    System.out.println(bytes);
  }
  
  public static void main(String[] args) throws Exception {
    JavaClass javaClass = Repository.lookupClass("foo.Bar");
    DependencyEmitter visitor = new DependencyEmitter(javaClass);
    DescendingVisitor classWalker = new DescendingVisitor(javaClass, visitor);
    classWalker.visit();
  }

}


BCEL javadoc: http://jakarta.apache.org/bcel/apidocs/

EmptyVisitor is a do-nothing adapter implementation of the Visitor interface. By overriding the visitConstantClass method, we can be notified of any classes in the constant pool. Repository uses the classpath to locate classes, so the target class (foo.Bar) needs to be visible to the JVM.

This is the output:
foo/Bar
java/lang/Object
java/lang/Runnable
java/lang/System
java/io/File
java/lang/Class
java/io/PrintStream
somepackage/ExternalThing
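
To make the constant pool a little less abstract, here is a minimal sketch of a similar scan written directly against the class file format, using only the JDK instead of BCEL. It is not how BCEL is implemented, and the class ConstantPoolScan and its classNames method are hypothetical names made up for illustration. It assumes the constant pool tag layout from the JVM class file format: it keeps the CONSTANT_Utf8 and CONSTANT_Class entries and skips over everything else.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of BCEL.
public class ConstantPoolScan {

  /** Returns the names held by the CONSTANT_Class entries of a class file. */
  public static List<String> classNames(InputStream classFile) throws IOException {
    DataInputStream in = new DataInputStream(classFile);
    if (in.readInt() != 0xCAFEBABE) {
      throw new IOException("Not a class file");
    }
    in.readUnsignedShort(); // minor version
    in.readUnsignedShort(); // major version
    int count = in.readUnsignedShort();

    Map<Integer, String> utf8 = new HashMap<Integer, String>();
    List<Integer> nameIndexes = new ArrayList<Integer>();
    // Entries are numbered from 1 to count - 1
    for (int i = 1; i < count; i++) {
      int tag = in.readUnsignedByte();
      switch (tag) {
        case 1: // CONSTANT_Utf8: length-prefixed modified UTF-8
          utf8.put(i, in.readUTF());
          break;
        case 7: // CONSTANT_Class: index of the Utf8 entry holding the name
          nameIndexes.add(in.readUnsignedShort());
          break;
        case 5: case 6: // Long, Double: 8 bytes, and they occupy two slots
          in.readLong();
          i++;
          break;
        case 15: // MethodHandle: 3 bytes
          in.readFully(new byte[3]);
          break;
        case 8: case 16: case 19: case 20: // String, MethodType, Module, Package
          in.readUnsignedShort();
          break;
        case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
          in.readInt(); // Integer, Float, member refs, NameAndType, (Invoke)Dynamic
          break;
        default:
          throw new IOException("Unknown constant pool tag: " + tag);
      }
    }

    // Resolve each CONSTANT_Class entry to the text of its Utf8 entry
    List<String> names = new ArrayList<String>();
    for (int index : nameIndexes) {
      names.add(utf8.get(index));
    }
    return names;
  }

  public static void main(String[] args) throws IOException {
    // Class files are resources, so a JDK class can serve as sample input
    InputStream in = Runnable.class.getResourceAsStream("Runnable.class");
    if (in == null) {
      throw new IOException("Class file not found");
    }
    try {
      for (String name : classNames(in)) {
        System.out.println(name);
      }
    } finally {
      in.close();
    }
  }
}
```

The names come back in the same slash-separated form as the BCEL output. BCEL remains the more practical choice, since it also exposes the rest of the class file structure.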


Replace each '/' with '.' and we have the fully qualified class names of our dependencies. There is still some more work to do (special handling for array types, loading classes that are not on the JVM's classpath, and so on), but the potential should be obvious.
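
As a rough sketch of that follow-up work, the code below converts constant pool names to dotted class names, unwraps array entries such as [Ljava/lang/String;, and reports which names a class loader cannot resolve. The class DependencyNames and its methods are hypothetical names for illustration, and the sketch assumes array entries appear in descriptor form (leading '[' characters, with object element types wrapped in L...;).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of BCEL.
public class DependencyNames {

  /**
   * Converts a constant pool class entry to a dotted class name.
   * Returns null for arrays of primitives, which name no class.
   */
  public static String toClassName(String internalName) {
    String name = internalName;
    if (name.startsWith("[")) {
      // Array entries are descriptors such as [Ljava/lang/String; or [[I
      name = name.substring(name.lastIndexOf('[') + 1);
      if (!name.startsWith("L") || !name.endsWith(";")) {
        return null; // primitive element type, e.g. I or B
      }
      name = name.substring(1, name.length() - 1);
    }
    return name.replace('/', '.');
  }

  /** Returns the dotted names that the given class loader cannot resolve. */
  public static List<String> findMissing(List<String> internalNames, ClassLoader loader) {
    List<String> missing = new ArrayList<String>();
    for (String internalName : internalNames) {
      String className = toClassName(internalName);
      if (className == null) {
        continue;
      }
      try {
        // initialize=false: locate the class without running static initializers
        Class.forName(className, false, loader);
      } catch (ClassNotFoundException e) {
        missing.add(className);
      } catch (LinkageError e) {
        // Found but not loadable; treated as missing in this sketch
        missing.add(className);
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    List<String> names = Arrays.asList(
        "java/lang/Object", "java/io/PrintStream",
        "[Ljava/lang/String;", "[[I", "somepackage/ExternalThing");
    for (String name : findMissing(names, DependencyNames.class.getClassLoader())) {
      System.out.println("Missing: " + name);
    }
  }
}
```

Passing the class loader in explicitly means the same check could be run against a URLClassLoader built from a candidate set of libraries, rather than only against the classpath of the running JVM.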

Versions used:
Java 6 (jre1.6.0_03)
BCEL 5.2

2 comments:

  1. This code has a drawback: if you use something like String s="abc" or MyClass m=null; then the code does not pick up String or MyClass.

    For a class to be counted as used, it has to be used with the new operator, as in String s=new String("abc").

  2. @Anonymous - you are correct - the code will only pick up classes defined in the constant pool.

    In the case of a reference like this...

    public void myMethod() {
     MyClass x = null;
     System.out.println(x);
    }

    ...MyClass will not be referenced in the resultant bytecode (you can verify this using javap).

    In the case of code like this...

    public void myMethod() {
     String x = "abc";
     System.out.println(x);
    }

    ...the class will load the literal from the constant pool (via CONSTANT_String), so the String class isn't explicitly referenced.

    Similar things happen if you reference something like Integer.MAX_VALUE. This static final value will be in-lined at compile time, so the byte code stores only the value (CONSTANT_Integer) in its constant pool.

    I suspect it is technically impossible to pick up all compile-time dependencies by inspecting the byte code, but you should be able to pick up any runtime dependencies. If you absolutely must pick up every java.lang dependency, you may need to infer it from other values in the constant pool.
