mar.
2012
Setting final fields like a boss
I've recently been playing with final fields while working on my serialization library, Seren. And I've found a surprising, funny thing about them. You wouldn't believe it. It's astonishing ! (tease, tease)
Final fields 101
But first, a quick reminder on what a final field is.
As you (should) know :
- A final field is introduced with the "final" keyword (duh).
- After its initial assignment, its value can never be changed again. This feature is especially useful in multithreaded architectures, where the whole game consists of trying to prevent multiple threads from updating the same data concurrently... No update, no problem !
- The initial assignment must occur before the end of the instance initialization.
Beware of point #2, as it is not as obvious as it looks. If the field refers to an object (as opposed to a primitive type), it is the reference itself that cannot change (i.e., it cannot point to another object) ; the referenced object, on the other hand, can be modified (unless it is itself designed as immutable, but this is another topic). This is a fundamental difference with C's "const" keyword.
Point #3 means that you must initialize the final field sometime before the constructor ends, like this :
    public class FinalField {
        private final String message = "Hello world";
    }
Or like this, in the constructor :
    public class FinalField {
        private final String message;

        public FinalField() {
            this.message = "Hello world";
        }
    }
Or even in an instance initialization block :
    public class FinalField {
        private final String message;

        {
            this.message = "Hello world";
        }
    }
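To make point #2 above concrete, here is a minimal sketch of a final reference whose target is still mutable. The class and method names are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, only here to illustrate point #2.
class FinalReferenceDemo {
    // The reference is final; the list it points to is still mutable.
    private final List<String> messages = new ArrayList<>();

    void add(String message) {
        messages.add(message);              // fine: modifies the referenced object
        // messages = new ArrayList<>();    // would not compile: reassigns the final reference
    }

    int size() {
        return messages.size();
    }

    public static void main(String[] args) {
        FinalReferenceDemo demo = new FinalReferenceDemo();
        demo.add("Hello");
        demo.add("world");
        System.out.println(demo.size());
    }
}
```

The list grows even though the field is final, which is exactly why final alone does not make an object immutable.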
Final fields and serialization
This all looks fine and easy, but there is one special case where final fields must be assigned after construction : during deserialization, where the object's constructor is not called. Fortunately, the standard serialization system has some superpowers and is not bound by the usual restrictions, so everything works as expected.
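As a quick check of that claim, the sketch below round-trips an object through the standard serialization streams and counts constructor calls. The Greeting and DefaultSerializationDemo names are invented for the example, and checked exceptions are wrapped only to keep it short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical serializable class with a final field.
class Greeting implements Serializable {
    private static final long serialVersionUID = 1L;

    // Counts constructor calls, to show that deserialization bypasses the constructor.
    static int constructorCalls = 0;

    private final String message;

    Greeting(String message) {
        constructorCalls++;
        this.message = message;
    }

    String getMessage() {
        return message;
    }
}

class DefaultSerializationDemo {
    // Serializes the object to a byte array, then reads a copy back.
    static Greeting roundTrip(Greeting original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Greeting) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Greeting copy = roundTrip(new Greeting("Hello world"));
        System.out.println(copy.getMessage());
        System.out.println(Greeting.constructorCalls);
    }
}
```

The copy carries the original value of the final field, while the counter only reflects the explicit `new Greeting(...)` call.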
The trouble begins when you start to customize the serialization pipeline, by implementing the writeObject and readObject methods. If you manually saved a final field in the stream and want to restore it... you're basically screwed, because the compiler won't let you write the following :
    import java.io.IOException;
    import java.io.ObjectInputStream;
    import java.io.Serializable;

    public class FinalField implements Serializable {
        private final String message = "Hello world";

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            this.message = "Hello world"; // Compiler kills a kitten here
        }
    }
Seren, my serialization library, had to work with arbitrary fields, even the final ones... So I tried different techniques to bypass this restriction.
Using reflection
Using reflection to update a final field is legal in Java 7, especially in the context of deserialization, as per the JLS §17.5.3 :
In some cases, such as deserialization, the system will need to change the final fields of an object after construction. final fields can be changed via reflection and other implementation-dependent means. The only pattern in which this has reasonable semantics is one in which an object is constructed and then the final fields of the object are updated. The object should not be made visible to other threads, nor should the final fields be read, until all updates to the final fields of the object are complete. Freezes of a final field occur both at the end of the constructor in which the final field is set, and immediately after each modification of a final field via reflection or other special mechanism.
This may not be a very robust solution, though, as this technique was allowed in some versions of the JVM, forbidden in the next, then allowed again...
Plus, reflection comes at a significant cost in performance, readability and type safety.
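For reference, the reflective approach boils down to something like the sketch below. It only uses the standard java.lang.reflect.Field API ; the Message class and the setFinalField helper are invented for the example and are not Seren code. Note that the field is assigned in the constructor rather than initialized with a literal, so that the compiler cannot inline it as a constant.

```java
import java.lang.reflect.Field;

// Hypothetical class with a private final instance field.
class Message {
    private final String text;

    Message(String text) {
        this.text = text;
    }

    String getText() {
        return text;
    }
}

class ReflectionFinalDemo {
    // Overwrites an instance field by name, even if it is private and final.
    static void setFinalField(Object target, String fieldName, Object value) {
        try {
            Field field = target.getClass().getDeclaredField(fieldName);
            // Without this call, Field.set throws IllegalAccessException
            // for a private or final field.
            field.setAccessible(true);
            field.set(target, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Message message = new Message("before");
        setFinalField(message, "text", "after");
        System.out.println(message.getText());
    }
}
```

The string-based field lookup and the unchecked Object value show where the readability and type safety go.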
OK, what do we have next ?
Using Unsafe
Java Champion Heinz Kabutz, of JavaSpecialists fame, then kindly pointed me to the Unsafe class and provided some sample code to get me started.
sun.misc.Unsafe is an implementation-specific utility class that performs, you guessed it, unsafe operations. Obviously, it is not intended for general consumption, but this was all too tempting...
According to Heinz, the performance overhead is negligible. And portability does not seem to really be an issue either, at least on OpenJDK-compatible VMs : according to Doug Lea,
Portability of Unsafe constructions is not really a problem. We frequently communicate with all production JVM implementors to help ensure that they are correctly supported.
So why not. But I happened to find a better and simpler (and quite surprising) system before fully implementing the Unsafe solution, so I don't really know how it would have turned out.
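For the curious, an Unsafe-based version might look roughly like the sketch below. This is neither Heinz's sample nor Seren code : the Note and UnsafeFinalDemo names are invented, and it assumes a HotSpot-style JVM where sun.misc.Unsafe exposes its singleton through a private static field named theUnsafe. Recent JDKs deprecate these memory-access methods, so treat it as an illustration only.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical class with a private final instance field.
class Note {
    private final String text;

    Note(String text) {
        this.text = text;
    }

    String getText() {
        return text;
    }
}

class UnsafeFinalDemo {
    // Assumption: the Unsafe singleton lives in a private static field
    // called "theUnsafe", as it does on HotSpot/OpenJDK.
    static Unsafe unsafe() {
        try {
            Field holder = Unsafe.class.getDeclaredField("theUnsafe");
            holder.setAccessible(true);
            return (Unsafe) holder.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Writes straight into the object's memory, ignoring the final modifier.
    static void overwrite(Note target, String value) {
        try {
            Unsafe unsafe = unsafe();
            long offset = unsafe.objectFieldOffset(Note.class.getDeclaredField("text"));
            unsafe.putObject(target, offset, value);
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Note note = new Note("before");
        overwrite(note, "after");
        System.out.println(note.getText());
    }
}
```

In a real library the field offset would be computed once per class and cached, since only the final write needs to happen per object.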
Using... nothing !
Remember how the compiler wouldn't let us directly assign a new value to the final field from within the readObject method ? Would there still be a problem if... we didn't ask the compiler for permission at all ?
This is where I had an epiphany. Seren is a Java Agent, after all ; it uses byte-code engineering to enhance the serializable classes at load-time, that is, way after the compiler has checked for errors ! The only thing I didn't know was whether the JVM itself would actually agree to reassign the field. Turns out, it did. Hooray !
This came as quite a surprise to everyone, I must say. But it worked on Sun's JDK 5, 6 and 7, so this is how it is now implemented in Seren (any volunteers to test on exotic VMs ?).
Conclusion and pointers
As you can see, working on a technical library is quite fun and lets you learn a lot.
Here are some additional recommended resources :
- You can take a look at the final (pun intended) code on Seren's GitHub repository. Fork it, download it, play with it, it's free and easy to use !
- Also, I'll present a "quickie" session on Seren at Devoxx (April 18-20, Paris, France) !
- Finally, I recommend Heinz Kabutz's Java Specialists Master Course to learn about the gory details of serialization (and much more!). I present this course in French.
See you soon, and have fun with Java !
Comments
Interesting article. You're right pointing the fact that final and immutability are different notions. Many people are still confused with both terms.