Java TreeSet copying

There are (at least) three ways to shallow copy a TreeSet in Java: the copy constructor, addAll(), and clone(). A piece of code I’m running needs to copy a few fairly big TreeSets, so I was interested to see if there were any performance differences. It turns out that there are.


NavigableSet<String> originalSet = new TreeSet<>();

System.out.println("preparing");
for (int i = 0; i < 5000000; i++) {
originalSet.add(UUID.randomUUID().toString());
}
System.out.println("done");

System.out.println(originalSet.size());

for (int i = 0; i < 30; i++) {
long time = System.currentTimeMillis();
NavigableSet<String> copy1 = new TreeSet<>(originalSet);
System.out.println(1 + ": " + (System.currentTimeMillis() - time));

time = System.currentTimeMillis();
NavigableSet<String> copy2 = new TreeSet<>();
copy2.addAll(originalSet);
System.out.println(2 + ": " + (System.currentTimeMillis() - time));

time = System.currentTimeMillis();
NavigableSet<String> copy3 = (TreeSet<String>)((TreeSet<String>)originalSet).clone();
System.out.println(3 + ": " + (System.currentTimeMillis() - time));

So, we’re generating 5 000 000 random strings, and putting them into a TreeSet. We then measure the time taken to copy it three different ways, repeating 30 times to hopefully iron out effects of background noise.

Method Mean Std Dev
new TreeSet<>(originalSet) 1628 1241
addAll() 1611 1288
clone() 1788 1306

Conclusion: it would seem that there’s not much in it. I had a couple of earlier conclusions that implied otherwise, but I think these were just the effect of random noise. Look at those standard deviations: lots of variation in the results.

However, much better, I now think I can speed up my code by removing some of the need to copy these things, which is much better anyway!

Leave a Comment

Your email address will not be published.