Wednesday, 11 April 2018

JDK 11 and proxies in a world past sun.misc.Unsafe

With JDK 11 the first methods of sun.misc.Unsafe are retired. Among them, the defineClass method was removed. This method has been commonly used by code generation frameworks to define new classes in existing class loaders. While this method was convenient to use, its existence also rendered the JVM inherently unsafe, just as the name of its defining class suggests. By allowing a class to be defined in any class loader and package, it became possible to gain package-scoped access to any package by defining a class within it, thus breaching the boundaries of an otherwise encapsulated package or module.

With the goal of removing sun.misc.Unsafe, the OpenJDK started offering an alternative for defining classes at runtime. Since version 9, the MethodHandles.Lookup class offers a method defineClass similar to the unsafe version. However, the class definition is only permitted for a class that resides in the same package as the lookup’s hosting class. As a module can only resolve lookups for packages that are owned by a module or that are opened to it, classes can no longer be injected into packages that did not intend to give such access.

Oracle Java Tutorials and Materials, Oracle Java Certifications

Using method handle lookups, a class foo.Qux can be defined during runtime as follows:

MethodHandles.Lookup lookup = MethodHandles.lookup();
MethodHandles.Lookup privateLookup = MethodHandles.privateLookupIn(foo.Bar.class, lookup);
byte[] fooQuxClassFile = createClassFileForFooQuxClass();
privateLookup.defineClass(fooQuxClassFile);

In order to perform a class definition, an instance of MethodHandles.Lookup is required which can be retrieved by invoking the MethodHandles::lookup method. Invoking the latter method is call site sensitive; the returned instance will therefore represent the privileges of the class and package from within the method is invoked. To define a class in another package then the current one, a class from this package is required to resolve against it using MethodHandles::privateLookupIn. This will only be possible if this target class’s package resides in the same module as the original lookup class or if this package is explicitly opened to the lookup class’s module. If those requirements are not met, attempting to resolving the private lookup throws an IllegalAccessException, protecting the boundaries that are implied by the JPMS.

Of course, code generation libraries are also constrained by this limitation. Otherwise they could be used to create and inject malicious code. And since the creation of method handles is call-site sensitive, it is not possible to incorporate the new class definition mechanism without requiring users to do some additional work by providing an appropriate lookup instance that represents the privileges of their module.

When using Byte Buddy, the required changes are fortunately minimal. The library defines classes using a ClassDefinitionStrategy which is responsible for the loading a class from its binary format. Prior to Java 11, a class could be defined using reflection or sun.misc.Unsafe using ClassDefinitionStrategy.Default.INJECTION. To support Java 11, this strategy needs to be replaced by ClassDefinitionStrategy.UsingLookup.of(lookup) where the provided lookup must have access to the package in which a classes would reside.

Migrating cglib proxies to Byte Buddy


As of today, other code generation libraries do not provide such a mechanism and it is uncertain when and if such capabilities are added. Especially for cglib, API changes have proven problematic in the past due to the libraries old age and widespread use in legacy applications that are no longer updated and would not adopt modifications. For users that want to adopt Byte Buddy as a more modern and actively developed alternative, the following segment will therefore describe a possible migration.

As an example, we generate a proxy for the following sample class with a single method:

public class SampleClass {
  public String test() {
    return "foo";
  }
}

To create a proxy, the proxied class is normally subclassed where all methods are overridden to dispatch the interception logic. Doing so, we append a value bar to the return value of the original implementation as an example.

A cglib proxy is typically defined using the Enhancer class in combination with an MethodInterceptor. A method interceptor supplies the proxied instance, the proxied method and its arguments. Finally, it also provides an instance of MethodProxy which allows to invoke the original code.

Enhancer enhancer = new Enhancer();
enhancer.setSuperclass(SampleClass.class);
enhancer.setCallback(new MethodInterceptor() {
  @Override
  public Object intercept(Object obj, Method method, Object[] args, MethodProxy proxy) {
    return proxy.invokeSuper(obj, method, args) + "bar";
  }
});
SampleClass proxy = (SampleClass) enhancer.create();
assertEquals("foobar", proxy.test());

Note that the above code will cause a problem if any other method such as hashCode, equals or toString was invoked on the proxy instance. The first two methods would also be dispatched by the interceptor and therefore cause a class cast exception when cglib attemted to return the string-typed return value. In contrast, the toString method would work but return an unexpected result as the original implementation was prefixed to bar as a return value.

In Byte Buddy, proxies are not a dedicated concept but can be defined using the library’s generic code generation DSL. For an approach that is the most similar to cglib, using a MethodDelegation offers the easiest migration path. Such a delegation targets a user-defined interceptor class to which method calls are dispatched:

public class SampleClassInterceptor {
  public static String intercept(@SuperCall Callable<String> zuper) throws Exception {
    return zuper.call() + "bar";
  }
}

The above interceptor first invokes the original code via a helper instance that is provided by Byte Buddy on demand. A delegation to this interceptor is implemented using Byte Buddy’s code generation DSL as follows:

SampleClass proxy = new ByteBuddy()
  .subclass(SampleClass.class)
  .method(ElementMatchers.named("test"))
  .intercept(MethodDelegation.to(SampleClassInterceptor.class))
  .make()
  .load(someClassLoader, ClassLoadingStrategy.UsingLookup.of(MethodHandles
      .privateLookupIn(SampleClass.class, MethodHandles.lookup()))
  .getLoaded()
  .getDeclaredConstructor()
  .newInstance();
assertEquals("foobar", proxy.test());

Other than cglib, Byte Buddy requires to specify a method filter using an ElementMatcher. While filtering is perfectly possible in cglib, it is quite cumbersome and not explicitly required and therefore easily forgotten. In Byte Buddy, all methods can still be intercepted using the ElementMatchers.any() matcher but by requiring to specify such a matcher, users are hopefully reminded to make a meaningful choice.

With the above matcher, any time a method named test is invoked, the call will be delegated to the specified interceptor using a method delegation as discussed.

The interceptor that was introduced would however fail to dispatch methods that do not return a string instance. As a matter of fact, the proxy creation would yield an exception issued by Byte Buddy. It is however perfectly possible to define a more generic interceptor that can be applied to any method similar to the one offered by cglib’s MethodInterceptor:

public class SampleClassInterceptor {
  @RuntimeType
  public static Object intercept(
      @Origin Method method,
      @This Object self,
      @AllArguments Object[] args,
      @SuperCall Callable<String> zuper
  ) throws Exception {
    return zuper.call() + "bar";
  }
}

Of course, since the additional arguments of the interceptor are not used in this case, they can be omitted what renders the proxy more efficient. Byte Buddy will only provide arguments on demand and if they are actually required.

As the above proxy is stateless, the interception method is defined to be static. Again, this is an easy optimization as Byte Buddy otherwise needs to define a field in the proxy class that holds a reference to the interceptor instance. If an instance is however required, a delegation can be directed to a member method of an instance using MethodDelegation.to(new SampleClassInterceptor()).

Caching proxy classes for performance


When using Byte Buddy, proxy classes are not automatically cached. This means that a new class is generated and loaded every time the above code is run. With code generation and class definition being expensive operations, this is of course inefficient and should be avoided if proxy classes can be reused. In cglib, a previously generated class is returned if the input is identical for two enhancements what is typically true when running the same code segment twice. This approach is however rather error prone and often inefficient since a cache key can normally be calculated much easier. With Byte Buddy, a dedicated caching library can be used instead, if such a library already is available. Alternatively, Byte Buddy also offers a TypeCache that implements a simple cache for classes by a user-defined cache key. For example, the above class generation can be cached using the base class as a key using the following code:

TypeCache<Class<?>> typeCache = new TypeCache<>(TypeCache.Sort.SOFT);
Class<?> proxyType = typeCache.findOrInsert(classLoader, SampleClass.class, () -> new ByteBuddy()
  .subclass(SampleClass.class)
  .method(ElementMatchers.named("test"))
  .intercept(MethodDelegation.to(SampleClassInterceptor.class))
  .make()
  .load(someClassLoader, ClassLoadingStrategy.UsingLookup.of(MethodHandles
      .privateLookupIn(SampleClass.class, MethodHandles.lookup()))
  .getLoaded()
});

Unfortunately, caching classes in Java brings some caveats. If a proxy is created, it does of course subclass the class it proxies what makes this base class ineligible for garbage collection. Therefore, if the proxy class was referenced strongly, the key would also be referenced strongly. This would render the cache useless and open for memory leaks. Therefore, the proxy class must be referenced softly or weakly what is specified by the constructor argument. In the future, this problem might be resolved if Java introduced ephemerons as a reference type. At the same time, if garbage collection of proxy classes is not an issue, a ConcurrentMap can be used to compute a value on absence.

Broaden the usability of proxy classes


To embrace reuse of proxy classes, it is often meaningful to refactor proxy classes to be stateless and to rather isolate state into an instance field. This field can then be accessed during the interception using the mentioned dependency injection mechanism, for example, to make the suffix value configurable per proxy instance:

public class SampleClassInterceptor {
  public static String intercept(@SuperCall Callable<String> zuper,
        @FieldValue("qux") String suffix) throws Exception {
    return zuper.call() + suffix;
  }
}

The above interceptor now receives the value of a field qux as a second argument which can be declared using Byte Buddy’s type creation DSL:

TypeCache<Class<?>> typeCache = new TypeCache<>(TypeCache.Sort.SOFT);
Class<?> proxyType = typeCache.findOrInsert(classLoader, SampleClass.class, () -> new ByteBuddy()
    .subclass(SampleClass.class)
    .defineField(“qux”, String.class, Visibility.PUBLIC)
    .method(ElementMatchers.named("test"))
    .intercept(MethodDelegation.to(SampleClassInterceptor.class))
    .make()
    .load(someClassLoader, ClassLoadingStrategy.UsingLookup.of(MethodHandles
        .privateLookupIn(SampleClass.class, MethodHandles.lookup()))
    .getLoaded()
});

The field value can now be set on an every instance after its creation using Java reflection. In order to avoid reflection, the DSL can also be used to implement some interface that declares a setter method for the mentioned field which can be implemented using Byte Buddy’s FieldAccessor implementation.

Weighting Proxy runtime and creation performance


Finally, when creating proxies using Byte Buddy, some performance considerations need to be made. When generating code, there exists a trade-off between the performance of the code generation itself and the runtime performance of the generated code. Byte Buddy typically aims for creating code that runs as efficiently as possible what might require additional time for the creation of such code compared to cglib or other proxing libraries. This bases on the assumption that most applications run for a long time but only creates proxies a single time what does however not hold for all types of applications.

As an important difference to cglib, Byte Buddy generates a dedicated super call delegate per method that is intercepted rather then a single MethodProxy. These additional classes take more time to create and load but having these classes available results in better runtime performance for each method execution. If a proxied method is invoked in a loop, this difference can quickly be crucial. If runtime performance is however not a primary goal and it is more important that the proxy classes are created in short time, the following approach avoids the creating of additional classes altogether:

public class SampleClassInterceptor {
  public static String intercept(@SuperMethod Method zuper,
        @This Object target,
        @AllArguments Object[] arguments) throws Exception {
    return zuper.invoke(target, arguments) + "bar";
  }
}

Proxies in a modularized environment


Using the simple form of dependency-injection for interceptors rather then relying on a library-specific type such as cglib’s
MethodInterceptor, Byte Buddy facilitates another advantage in a modularized environment: since the generated proxy class will reference the interceptor class directly rather then referencing a library specific dispatcher type such as cglib’s MethodInterceptor, the proxied class’s module does not need to read Byte Buddy’s module. With cglib, the proxied class module must read cglib’s module which defines the MethodInterceptor interface rather then the module that implements such an interface. This will most likely be non-intuitive for users of a library that uses cglib as a transitive dependency, especially if the latter dependency is treated as an implementation detail that should not be exposed.

In some cases, it might not even be possible or desirable that the proxied class’s module reads the module of the framework which supplies the interceptor. For this case, Byte Buddy also offers a solution to avoid such a dependency altogether by using its
Advice component. This component works on code templates such as that in the following example:

public class SampleClassAdvice {
  @Advice.OnMethodExit
  public static void intercept(@Advice.Returned(readOnly = false) String returned) {
    returned += "bar";
  }
}

The above code might not appear to make much sense as it stands and as a matter of fact, it will never be executed. The class merely serves as a byte code template to Byte Buddy which reads the byte code of the annotated method which is then inlined into the generated proxy class. To do so, every parameter of the above method must be annotated to represent a value of the proxied method. In the above case, the annotation defines the parameter to define the method’s return value to which bar is appended as a suffix given the template. Given this advice class, a proxy class could be defined as follows:

new ByteBuddy()
  .subclass(SampleClass.class)
  .defineField(“qux”, String.class, Visibility.PUBLIC)
  .method(ElementMatchers.named(“test”))
  .intercept(Advice.to(SampleClassAdvice.class).wrap(SuperMethodCall.INSTANCE))
  .make()

By wrapping the advice around a SuperMethodCall, the above advice code will be inlined after the call to the overridden method has been made. To inline code before the original method call, the OnMethodEnter annotation can be used.

Supporting proxies on Java versions prior to 9 and past 10


When developing applications for the JVM, one can normally rely on applications that run on a particular version to also run on later versions. This has been true for a long time, even if internal API has been used. However, as a consequence of removing this internal API, this is no longer true as of Java 11 where code generation libraries that have relied on sun.misc.Unsafe will no longer work. At the same time, class definition via MethodHandles.Lookup is not available to JVMs prior to version 9.

As for Byte Buddy, it is the responsibility of a user to use a class loading strategy that is compatible with the current JVM. To support all JVMs, the following selection needs to be made:

ClassLoadingStrategy<ClassLoader> strategy;
if (ClassInjector.UsingLookup.isAvailable()) {
  Class<?> methodHandles = Class.forName("java.lang.invoke.MethodHandles");
  Object lookup = methodHandles.getMethod("lookup").invoke(null);
  Method privateLookupIn = methodHandles.getMethod("privateLookupIn",
      Class.class,
      Class.forName("java.lang.invoke.MethodHandles$Lookup"));
  Object privateLookup = privateLookupIn.invoke(null, targetClass, lookup);
  strategy = ClassLoadingStrategy.UsingLookup.of(privateLookup);
} else if (ClassInjector.UsingReflection.isAvailable()) {
  strategy = ClassLoadingStrateg.Default.INJECTION;
} else {
  throw new IllegalStateException(“No code generation strategy available”);
}

The above code uses reflection to resolve a method handle lookup and to resolve it. Doing so, the code can be compiled and loaded on JDKs prior to Java 9. Unfortunately, Byte Buddy cannot implement this code as a convenience since MethodHandles::lookup is call site sensitive such that the above must be defined in a class that resides in the user’s module and not within Byte Buddy.

Finally, it is worth considering to avoid class injection altogether. A proxy class can also be defined in a class loader of its own using the ClassLoadingStrategy.Default.WRAPPER strategy. This strategy is not using any internal API and will work on any JVM version. However, one must keep in mind the performance costs of creating a dedicated class loader. And finally, even if the package name of the proxy class is equal to the proxied class, by defining the proxy in a different class loaders, their runtime packages will no longer be considered as equal by the JVM thus not allowing to override any package-private methods.

Final thoughts


On a final note I want to express my opinion that retiring sun.misc.Unsafe is an important step towards a safer, modularized JVM despite the costs of this migration. Until this very powerful class is removed, any boundaries set by the JPMS can be circumvented by using the privileged access that sun.misc.Unsafe still offers. Without this removal, the JPMS costs all the inconvenience of additional encapsulation without the benefit of being able to rely on it.

Most developers on the JVM will most likely never experience any problems with these additional restrictions but as described, code generation and proxying libraries need to adapt these changes. For cglib, this does unfortunately mean that the end of the road is reached. Cglib was originally modeled as a more powerful version of Java’s built-in proxying API where it requires its own dispatcher API to be referenced by the proxy class similar to how Java’s API requires referencing of its types. However, these latter types reside in the java.base module which is always read by any module. For this reason, the Java proxying API still functions whereas the cglib model was broken irreparably. In the past, this has already made cglib a difficult candidate for OSGi enviornments but with the JPMS, cglib as a library does no longer function. A similar problem exists for the corresponding proxying API that is provided by Javassist.

The upside of this change is that the JVM finally offers a stable API for defining classes during an application’s runtime, a common operation that has relied on internal API for over twenty years. And with the exception of Javaagents that I think still require a more flexible approach, this means that future Java releases are guaranteed to always work once all users of proxies completed this final migration. And given that the development of cglib has been dormant for years with the library suffering many limitations, an eventual migration by today’s users of the library was inevitable in any case. The same might be true for Javassist proxies, as the latter library has not seen commits in almost a half year either.