Further Look At JVM Garbage Collection


14 June 2019 by Yan Yu

This is part 2 of my previous post. If you haven't read the first part, I strongly suggest reading it before diving into this one.

In this post, we take a further look at JVM garbage collection (GC): the garbage collectors that the JVM, or more precisely the HotSpot JVM, makes available.
The diagram created by Jon Masamitsu presents an overview of the available collectors.

[Diagram: overview of the available HotSpot collectors]

Serial / Serial Old


The Serial collector is a very old collector that runs in the young generation. When it performs garbage collection, all application threads are stopped (the famous "stop the world") until the GC completes. It uses the copying algorithm to perform GC.

The Serial Old collector, as its name suggests, is the tenured generation version of Serial, and it uses the mark-compact algorithm. It has two uses when running in server mode:

  • Used together with the Parallel Scavenge collector in JDK 1.5 and earlier
  • As a backup for the CMS collector, used when a Concurrent Mode Failure happens

Both Serial and Serial Old are single-threaded garbage collectors.


Developers can specify -XX:+UseSerialGC when running a program to use the Serial + Serial Old combination for GC.
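To check which collectors a given flag actually selects, one option is to ask the running JVM through the standard java.lang.management API. The following is a minimal sketch; the class name ListCollectors and its helper method are made up for illustration and are not from the original post.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the name is an assumption for this example.
class ListCollectors {

    // Returns the names of the collectors the running JVM reports.
    static List<String> activeCollectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // One line per collector, e.g. one for the young and one for the tenured generation.
        for (String name : activeCollectorNames()) {
            System.out.println(name);
        }
    }
}
```

After compiling with javac, running it as java -XX:+UseSerialGC ListCollectors should list one young generation and one tenured generation collector. On HotSpot the Serial pair typically shows up as "Copy" and "MarkSweepCompact", though the exact names can differ between JVM versions.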

 

ParNew


ParNew is the multi-threaded version of the Serial collector. Aside from using multiple threads, it shares the Serial collector's other characteristics, e.g. parameters, algorithm, and stop-the-world behavior.

By specifying -XX:+UseParNewGC, the JVM will use the ParNew + Serial Old combination to perform GC.

 

Parallel Scavenge / Parallel Old


Parallel Scavenge is a multi-threaded young generation collector that uses the copying algorithm. It is similar to ParNew, but its distinguishing feature is a focus on throughput. So what is throughput? It is the ratio of the CPU time spent running the application to the total CPU time consumed:

Throughput = Time spent on running application / (Time spent on running application + Time spent on garbage collection)
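The formula above can be turned into a small calculation. The sketch below applies it to figures the JVM reports about itself. It assumes that JVM uptime minus accumulated GC time is a fair stand-in for "time spent on running application", which is only an approximation. The class and method names are hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class; names are assumptions for this example.
class ThroughputEstimate {

    // Throughput = app time / (app time + GC time), as in the formula above.
    static double throughput(long appMillis, long gcMillis) {
        long total = appMillis + gcMillis;
        if (total <= 0) {
            throw new IllegalArgumentException("total time must be positive");
        }
        return (double) appMillis / total;
    }

    // Sums the accumulated collection time reported by all collectors.
    static long totalGcMillis() {
        long sum = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                sum += t;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gc = totalGcMillis();
        // Assumption: everything that is not GC time counts as application time.
        long app = Math.max(1, uptime - gc); // avoid a zero total right after startup
        System.out.printf("uptime=%d ms, gc=%d ms, throughput=%.4f%n",
                uptime, gc, throughput(app, gc));
    }
}
```

For example, a program that runs for 99 seconds and spends 1 second in GC has a throughput of 99 / (99 + 1) = 0.99.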

High throughput means CPU time is used effectively and the application's tasks finish quickly, which makes it ideal for programs running at the backend.

Parallel Old is the tenured generation version of Parallel Scavenge. It uses the mark-compact algorithm and first appeared in JDK 1.6.

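-XX:+UseParallelGC is the default for a JVM running in server mode up to JDK 8 (G1 became the default in JDK 9). In older releases it uses the Parallel Scavenge + Serial Old combination, and we can specify -XX:+UseParallelOldGC to use Parallel Scavenge + Parallel Old instead. From JDK 7u4 onward, -XX:+UseParallelGC already selects Parallel Old for the tenured generation.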

CMS


Concurrent Mark Sweep (CMS) is a collector that aims to shorten garbage collection pauses. As the name tells us, it uses the mark-sweep algorithm under the hood. Its short pauses make it ideal for programs with lots of end-user interaction. The collector performs GC in 4 steps:

  • Initial Mark
  • Concurrent Mark
  • Remark
  • Concurrent Sweep

Among the 4 steps, stop the world still happens in the initial mark and remark steps. Because CMS uses the mark-sweep algorithm, it has a disadvantage: the heap becomes fragmented after GC. Allocating a large object in the tenured generation can then fail for lack of a big enough contiguous block, even though plenty of memory is free in total, and this triggers a full GC. To overcome this problem, CMS provides the option -XX:+UseCMSCompactAtFullCollection, which tells CMS to compact memory when a full GC runs, at the cost of a longer GC pause.
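The fragmentation problem is easier to see with a toy model. The sketch below is not HotSpot code: it treats the heap as an array of equal-sized slots, which is an assumption made purely for illustration. It shows an allocation failing after a sweep even though enough slots are free in total, and then succeeding once the live slots are compacted.

```java
// Toy model (not HotSpot code): a heap of fixed-size slots, true = occupied.
class FragmentationDemo {

    // Counts free slots anywhere in the heap.
    static int totalFree(boolean[] heap) {
        int free = 0;
        for (boolean used : heap) {
            if (!used) {
                free++;
            }
        }
        return free;
    }

    // Length of the longest run of contiguous free slots.
    static int largestFreeBlock(boolean[] heap) {
        int best = 0;
        int run = 0;
        for (boolean used : heap) {
            run = used ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    // An allocation needs `size` contiguous free slots.
    static boolean canAllocate(boolean[] heap, int size) {
        return largestFreeBlock(heap) >= size;
    }

    // Compaction: slide all live slots to the start of the heap.
    static boolean[] compact(boolean[] heap) {
        boolean[] out = new boolean[heap.length];
        int live = heap.length - totalFree(heap);
        for (int i = 0; i < live; i++) {
            out[i] = true;
        }
        return out;
    }

    public static void main(String[] args) {
        // State after a sweep: every other slot is still live.
        boolean[] heap = {true, false, true, false, true, false, true, false};
        System.out.println("free slots: " + totalFree(heap));
        System.out.println("fits 3 before compaction: " + canAllocate(heap, 3));
        System.out.println("fits 3 after compaction: " + canAllocate(compact(heap), 3));
    }
}
```

In this model 4 of the 8 slots are free, yet a 3-slot object does not fit until compaction gathers the free slots into one block. This is the situation mark-sweep can leave behind and mark-compact avoids.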


G1


G1, or Garbage First, is a newer collector that first appeared in JDK 1.7 as a replacement for CMS. It uses the mark-compact algorithm. Compared with the other collectors introduced above, it has several advantages:

  • Parallel: it makes effective use of multiple processors; with the aid of multiple cores, stop-the-world time can be reduced considerably.
  • Generational GC: G1 still collects by generation; the difference is that it can manage GC for the entire JVM heap on its own, without being combined with other collectors.
  • Incremental compaction: G1 can compact the tenured generation incrementally instead of all at once, distributing the work across multiple shorter collections, which reduces pause time.
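The incremental idea in the last point can be sketched with a toy model. G1 divides the heap into regions and, in each pause, prefers the regions holding the most garbage, which is where the name "Garbage First" comes from. The code below is a deliberate simplification and not G1's real selection policy: the Region class, its fields, and the pause budget are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model of "garbage first" region selection; not G1's actual policy.
class GarbageFirstSketch {

    // Hypothetical region summary: garbage held and estimated cost to collect it.
    static final class Region {
        final String name;
        final int garbageKb;
        final int costMillis;

        Region(String name, int garbageKb, int costMillis) {
            this.name = name;
            this.garbageKb = garbageKb;
            this.costMillis = costMillis;
        }
    }

    // Picks regions with the most garbage first, skipping any region
    // that would push the pause past the given budget.
    static List<Region> chooseCollectionSet(List<Region> regions, int pauseBudgetMillis) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparingInt((Region r) -> r.garbageKb).reversed());

        List<Region> chosen = new ArrayList<>();
        int spent = 0;
        for (Region r : sorted) {
            if (spent + r.costMillis <= pauseBudgetMillis) {
                chosen.add(r);
                spent += r.costMillis;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region("A", 900, 40));
        regions.add(new Region("B", 100, 10));
        regions.add(new Region("C", 700, 30));
        regions.add(new Region("D", 400, 50));

        for (Region r : chooseCollectionSet(regions, 80)) {
            System.out.println(r.name);
        }
    }
}
```

With a budget of 80 ms, region D is left for a later collection because it would overshoot the budget, while the cheaper region B still fits. Spreading the work this way is what keeps each individual pause short.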


The End


Since the JVM has so many GC-related options to play with, what's covered in this post is just a tiny fraction and summary of them. To get a better sense of how different options affect your program, I suggest running it with different options; this will also give you an idea of how to tune parameters to optimize it.
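If you don't have a suitable program at hand, a small allocation-heavy workload is enough to start experimenting. The sketch below is one possible starting point; the class name, chunk size, and iteration counts are arbitrary choices for illustration, not recommendations.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical workload for trying out GC options; all sizes are arbitrary.
class GcPlayground {

    // Allocates many short-lived arrays and keeps a small fraction alive,
    // so both the young and the tenured generation see some activity.
    static int churn(int iterations, int keepEvery) {
        List<byte[]> survivors = new ArrayList<>();
        for (int i = 0; i < iterations; i++) {
            byte[] chunk = new byte[16 * 1024]; // 16 KB, usually garbage right away
            if (i % keepEvery == 0) {
                survivors.add(chunk); // retained until the method returns
            }
        }
        return survivors.size();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int kept = churn(200_000, 100);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("kept " + kept + " chunks in " + elapsedMs + " ms");

        // Report how often each collector ran and how long it took in total.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", time=" + gc.getCollectionTime() + " ms");
        }
    }
}
```

Running the same class with, for example, -XX:+UseSerialGC, -XX:+UseParallelGC, and -XX:+UseG1GC and comparing the reported counts and times gives a first impression of how the collectors differ. Which flags are accepted depends on your JDK version.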


Yan Yu

Software Developer
Yan graduated from Miami University with a BS in Software Engineering and an MS in Computer Science. He is a big fan of adventure sports, among which skiing and skydiving are his favorites. His ultimate goal is to start wingsuit flying one day. He is also a believer in Bitcoin and cryptocurrency, and is passionate about blockchain technology.