Thursday, February 11, 2010

Data immutability in Java

Shared mutable state is one of the murkiest areas of concurrent programming in Java. To tackle this issue, it is generally advised to prefer data immutability over thread-safety, as the latter is hard to manage in large and complex applications. In this post I note a few measures for approaching data immutability in Java.

1. Always declare instance-level (and class-level) data members as "final". Assign only immutable or thread-safe objects to these variables. (See point #6 below on sharing mutable assignments in a thread-safe way.)

When you declare instance variables as final, you have to initialize them either at the point of declaration or in the constructor. This tends to lead to constructor-based dependency injection and a declarative design, which is a good thing.
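
As a rough illustration of point 1, here is a minimal sketch of constructor-based injection into final fields. The names `NameFormatter` and `Greeter` are hypothetical and exist only for this example; the sketch assumes the injected collaborator is itself immutable or thread-safe.

```java
// Hypothetical collaborator interface, invented for this sketch.
interface NameFormatter {
    String format(String name);
}

// Every field is final and assigned exactly once, in the constructor.
final class Greeter {
    private final NameFormatter formatter; // assumed immutable or thread-safe
    private final String greeting;         // String is immutable

    Greeter(final NameFormatter formatter, final String greeting) {
        if (formatter == null || greeting == null) {
            throw new NullPointerException("formatter and greeting are required");
        }
        this.formatter = formatter;
        this.greeting = greeting;
    }

    String greet(final String name) {
        return greeting + ", " + formatter.format(name);
    }
}
```

Because the dependencies arrive through the constructor, a `Greeter` can never be observed half-configured, and there is no setter that another thread could call later.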

2. Declare local variables as "final". The only exception is primitive loop counters, which have to be reassigned on every iteration and are kept as plain primitives for performance reasons:

for (int i = 0; i < 40; i++) {
    // blah
}


Edit: This point will not help concurrency, since local variables are never shared between threads, but it will make sure you don't accidentally bash the value in place by reassigning the variable.

3. Use immutable data structures unless you need to modify them immediately.


java.util.Collections.unmodifiableList(list)
java.util.Collections.unmodifiableSet(set)
java.util.Collections.unmodifiableMap(map)
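
One caveat worth sketching: the wrappers above return read-only views, not copies, so changes made to the backing collection remain visible through the view. The example below contrasts a view with a snapshot; the class and method names (`UnmodifiableDemo`, `readOnlyView`, `snapshot`) are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class UnmodifiableDemo {

    // Read-only view: writes through the view throw
    // UnsupportedOperationException, but later changes to 'backing'
    // are still visible through the returned list.
    static <T> List<T> readOnlyView(final List<T> backing) {
        return Collections.unmodifiableList(backing);
    }

    // Read-only snapshot: the elements are copied first, so later
    // changes to 'source' do not affect the returned list.
    static <T> List<T> snapshot(final List<T> source) {
        return Collections.unmodifiableList(new ArrayList<T>(source));
    }
}
```

If other code still holds a reference to the original list, the snapshot variant is the safer default when the result is shared across threads.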


Google Collections is useful for dealing with collections conveniently.

Edit: Consider using persistent collections, as they enforce immutability at the collection level without giving up efficiency.

4. For concurrent scenarios, use concurrency-optimized data structure implementations.


java.util.concurrent.CopyOnWriteArrayList
java.util.concurrent.CopyOnWriteArraySet
java.util.concurrent.ConcurrentHashMap
java.util.concurrent.ConcurrentLinkedQueue


Avoid the Collections.synchronizedXXX() methods: they serialize every access on a single lock, which effectively reduces concurrency on that collection to zero. I blogged about this earlier.
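
To make point 4 a little more concrete, here is a minimal sketch of a counter built on ConcurrentHashMap that needs no explicit locking. The class `HitCounter` and its methods are hypothetical names chosen for this example; it relies only on `ConcurrentMap.putIfAbsent` and `AtomicInteger.incrementAndGet`.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example: counts hits per key from many threads.
final class HitCounter {
    private final ConcurrentMap<String, AtomicInteger> hits =
            new ConcurrentHashMap<String, AtomicInteger>();

    // Records one hit and returns the updated count for the key.
    int record(final String key) {
        final AtomicInteger existing = hits.get(key);
        if (existing != null) {
            return existing.incrementAndGet();
        }
        final AtomicInteger fresh = new AtomicInteger();
        // putIfAbsent returns null if 'fresh' was stored, or the counter
        // that another thread managed to store first.
        final AtomicInteger raced = hits.putIfAbsent(key, fresh);
        return (raced == null ? fresh : raced).incrementAndGet();
    }

    // Returns the current count, or 0 if the key was never recorded.
    int count(final String key) {
        final AtomicInteger counter = hits.get(key);
        return counter == null ? 0 : counter.get();
    }
}
```

Threads updating different keys rarely contend with each other here, whereas a synchronized wrapper would make every caller queue on the same lock.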

5. Use reference copying while constructing new collections from immutable collections.


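// Requires java.util.ArrayList, java.util.Collections and java.util.List.
public <T> List<T> add(final List<T> old, final T element) {
    // Copy the element references into a fresh list; the old list is left untouched.
    final List<T> newlist = new ArrayList<T>(old);
    newlist.add(element);
    return Collections.unmodifiableList(newlist);
}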


Note: For large data structures this may carry a performance penalty, since every addition copies the whole collection.

6. Shared mutable assignments should be atomic. You can use the built-in library in Java 5+:


java.util.concurrent.atomic.AtomicBoolean
java.util.concurrent.atomic.AtomicInteger
java.util.concurrent.atomic.AtomicIntegerArray
java.util.concurrent.atomic.AtomicIntegerFieldUpdater
java.util.concurrent.atomic.AtomicLong
java.util.concurrent.atomic.AtomicLongArray
java.util.concurrent.atomic.AtomicLongFieldUpdater
java.util.concurrent.atomic.AtomicMarkableReference
java.util.concurrent.atomic.AtomicReference
java.util.concurrent.atomic.AtomicReferenceArray
java.util.concurrent.atomic.AtomicReferenceFieldUpdater
java.util.concurrent.atomic.AtomicStampedReference
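
As one possible way to combine point 6 with immutable data, the sketch below keeps an immutable list inside an AtomicReference and replaces it with a compare-and-set retry loop. The class `EventLog` and its methods are hypothetical names for this example, not part of any library.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical example: a shared log whose contents are always an
// unmodifiable list, swapped atomically on every append.
final class EventLog {
    private final AtomicReference<List<String>> events =
            new AtomicReference<List<String>>(Collections.<String>emptyList());

    void append(final String event) {
        while (true) {
            final List<String> current = events.get();
            final List<String> copy = new ArrayList<String>(current);
            copy.add(event);
            final List<String> next = Collections.unmodifiableList(copy);
            // Succeeds only if no other thread replaced 'current' in the
            // meantime; otherwise loop and retry against the newer list.
            if (events.compareAndSet(current, next)) {
                return;
            }
        }
    }

    // Readers get a consistent list that is never modified afterwards.
    List<String> snapshot() {
        return events.get();
    }
}
```

Readers never block and never see a partially updated list. The cost is a full copy per append, so this pattern suits data that is read far more often than it is written.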


7. Never "bash in place" - rather construct a new object upon every action that requires change in state. A desirable approach may look like this:


public class XYPos {
    public final int x;
    public final int y;
    public XYPos(final int ix, final int iy) {
        this.x = ix;
        this.y = iy;
    }
}


public class Navigator {
    public XYPos moveRight(final XYPos pos) { return new XYPos(pos.x + 1, pos.y); }
    public XYPos moveLeft (final XYPos pos) { return new XYPos(pos.x - 1, pos.y); }
    public XYPos moveUp   (final XYPos pos) { return new XYPos(pos.x, pos.y + 1); }
    public XYPos moveDown (final XYPos pos) { return new XYPos(pos.x, pos.y - 1); }
}


Your opinions, comments and feedback are welcome.

7 comments:

  1. You might be interested in Functional Java, which takes this thesis as far as Java allows (which is not very far in the greater picture).

  2. Or just use a language where you start from immutability, like Clojure.... :)

  3. @Tony I have noticed Functional Java and I agree, I wish Java as a language could allow more.

    @Alex Clojure is simply awesome! But as long as we are in Java... :-)

  4. I see a few problems with your post.

    Declaring local variables final has nothing to do with concurrency control, since stack frames (the containers where local variables are stored) are not shared between threads. Personally I find that final local variables cause a lot of syntactic clutter, so I almost never use them.

    There is a race problem in the equals method of your mutable class. The value could be set to null by another thread after the null check on value.

    Also, your toString and hashCode are potential victims of NPEs.

  5. @Peter All three of your points are valid - I have updated the post. Thanks! However, I feel the clutter issue is a matter of perspective and getting used to.

  6. Hi shantuna,

    I haven't encountered problems caused by the lack of final local variables, so I don't like to pay for something that has not caused problems on the systems I have been working on. But I guess it is a matter of taste.

    About the fixes, there is still a problem with the hashCode and toString.

    The whole class can be dropped, I think. Check AtomicReference for a more powerful alternative.

  7. @Peter That's right - AtomicReference does a better job. I have removed the Mutable abstraction entirely.

