Differences Between CMS, G1GC, and ZGC – When Would You Choose Each in Production?

CMS (Concurrent Mark-Sweep):
CMS uses a generational heap layout, where memory is divided into the Young Generation (Eden + Survivor spaces) and the Old Generation, each occupying a contiguous range of the heap. CMS reclaims Old Generation space by sweeping dead objects onto free lists rather than compacting live ones, so the Old Generation tends to fragment over time.
G1GC (Garbage First GC):
G1GC is still generational, but it replaces the traditional contiguous Young/Old layout with a region-based heap. The heap is divided into equal-sized regions, and a region's role is not fixed: regions are assigned dynamically as Eden, Survivor, or Old based on heuristics. This flexible model supports incremental compaction and scales better, reducing fragmentation over time.
ZGC (Z Garbage Collector):
ZGC also uses a region-based heap, and it is designed for very large heaps (up to 16 TB on recent JDKs). It relies on colored pointers, which store GC metadata in the reference itself, to support concurrent marking and relocation. ZGC is also NUMA-aware, which makes it a good fit for low-latency applications running on large machines.
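
The layout differences above are visible at runtime through the standard java.lang.management API. The following is a minimal sketch that lists the heap memory pools of the running JVM; the class name HeapPools is made up for illustration, and the pool names it prints depend on the collector and JDK version in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the JDK.
final class HeapPools {
    // Returns the names of the heap memory pools exposed by the running JVM.
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Names vary by collector; G1, for example, typically reports
        // separate Eden, Survivor and Old Gen pools.
        heapPoolNames().forEach(System.out::println);
    }
}
```

Running the same program with different collector flags (for example -XX:+UseG1GC and -XX:+UseZGC) is a quick way to see how each collector presents its heap.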

CMS:
CMS performs most of the Old Generation GC concurrently with application threads, but it still has two stop-the-world (STW) phases: Initial Mark and Remark. The Remark phase in particular can cause noticeable pauses under heavy load, because it has to catch up with every reference the application changed during concurrent marking.
G1GC:
G1GC also has STW phases, but it lets you set a pause-time goal with -XX:MaxGCPauseMillis. It tries to meet this goal by sizing the Young Generation and limiting how many regions it collects in each pause. The goal is a soft target rather than a guarantee, but G1GC's pause times are more predictable than those of CMS.
ZGC:
ZGC is designed for very short pauses and achieves this by doing almost all of its work concurrently. Only brief STW phases remain, used mainly to synchronize the start and end of marking and the start of relocation. Early releases targeted pauses below 10 ms; recent JDKs typically stay around or below a millisecond, largely independent of heap size. This makes it a strong choice for latency-sensitive systems, although it does not provide hard real-time guarantees.
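
To see which collectors a JVM is running and how much collection work they have done so far, the standard GarbageCollectorMXBean interface can be queried. This is a minimal sketch; the class name GcBeans is made up for illustration. Bean names differ per collector, and depending on the collector the reported time may include concurrent work as well as pauses, so GC logs (-Xlog:gc on Java 9+) remain the more precise source for pause durations.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the JDK.
final class GcBeans {
    // One summary line per collector bean: name, collection count, accumulated time.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        describeCollectors().forEach(System.out::println);
    }
}
```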

CMS:
CMS does not perform compaction in the Old Generation. Over time, memory becomes fragmented, leading to allocation failures even if there’s enough total free space. This is known as a fragmentation-induced promotion failure, and it forces a fallback Full GC that compacts the heap but is expensive and fully STW.
G1GC:
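G1GC compacts memory as part of its evacuation process: during a collection pause, live objects are copied out of the selected regions into free ones, and the source regions are reclaimed whole. This keeps the heap largely defragmented and avoids most of the fragmentation problems seen with CMS. Evacuation is parallel and incremental, so compaction work is spread across many short pauses instead of one long one.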
ZGC:
ZGC relocates objects concurrently, while the application keeps running. It uses colored pointers together with load barriers (generational ZGC, added in Java 21, also uses store barriers), so a thread that loads a reference to a moved object is transparently redirected to the new location. As a result, ZGC keeps fragmentation under control without any long STW compaction phase.
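
The fragmentation problem described above can be illustrated with a toy model. The sketch below is not how any of these collectors is implemented: it uses a small array as a "heap", first-fit allocation, and a simple sliding compaction (G1GC and ZGC instead copy objects between regions). The class FragmentationDemo and its methods are invented for illustration. It shows an allocation failing even though enough total space is free, then succeeding after compaction.

```java
import java.util.Arrays;

// Toy model only: each slot holds an object id, or 0 when free.
final class FragmentationDemo {
    final int[] slots;

    FragmentationDemo(int size) {
        slots = new int[size];
    }

    // First-fit allocation of `len` contiguous free slots.
    // Returns the start index, or -1 if no contiguous run is large enough.
    int allocate(int id, int len) {
        int run = 0;
        for (int i = 0; i < slots.length; i++) {
            run = (slots[i] == 0) ? run + 1 : 0;
            if (run == len) {
                int start = i - len + 1;
                Arrays.fill(slots, start, i + 1, id);
                return start;
            }
        }
        return -1;
    }

    // Frees every slot owned by the given object id (a "sweep" without compaction).
    void free(int id) {
        for (int i = 0; i < slots.length; i++) {
            if (slots[i] == id) slots[i] = 0;
        }
    }

    // Total number of free slots, contiguous or not.
    int freeSlots() {
        int n = 0;
        for (int s : slots) if (s == 0) n++;
        return n;
    }

    // Sliding compaction: move live slots to the front, preserving their order.
    void compact() {
        int dst = 0;
        for (int i = 0; i < slots.length; i++) {
            if (slots[i] != 0) {
                int v = slots[i];
                slots[i] = 0;
                slots[dst++] = v;
            }
        }
    }

    public static void main(String[] args) {
        FragmentationDemo heap = new FragmentationDemo(8);
        for (int id = 1; id <= 4; id++) heap.allocate(id, 2); // heap is now full
        heap.free(1);
        heap.free(3);                                         // 4 slots free, in two gaps of 2
        System.out.println("free=" + heap.freeSlots()
                + " alloc(3)=" + heap.allocate(5, 3));        // fails: no run of 3
        heap.compact();
        System.out.println("after compact alloc(3)=" + heap.allocate(5, 3));
    }
}
```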

CMS:
CMS can offer reasonable throughput with short pauses, but its concurrent phases compete with the application for CPU. Under high allocation rates, or if concurrent cycles can’t keep up, it falls back to Full GCs, which are costly and hurt overall performance. It is generally considered less scalable for modern workloads.
G1GC:
G1GC balances throughput and latency well. The main knob is -XX:MaxGCPauseMillis, which sets a pause-time goal rather than a CPU budget: a lower goal generally means smaller, more frequent collections and therefore somewhat lower throughput, while a higher goal favors throughput. For most workloads, G1GC offers a good trade-off, making it suitable for general-purpose applications.
ZGC:
ZGC trades some throughput for very low pause times. Its barriers add a small cost to reference loads, especially while objects are being relocated, and its concurrent GC threads use CPU that would otherwise go to the application. This cost is usually acceptable for applications where latency matters more than raw throughput.
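
One rough way to quantify the throughput side of this trade-off is GC overhead: time spent in garbage collection as a percentage of JVM uptime. The sketch below is a simple estimate under stated assumptions; the class GcOverhead is invented for illustration, and because some collectors report concurrent work in their collection time, the number is an approximation rather than an exact measure of lost application time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, not part of the JDK.
final class GcOverhead {
    // GC time as a percentage of elapsed time.
    static double overheadPercent(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            throw new IllegalArgumentException("uptimeMillis must be positive");
        }
        return 100.0 * gcMillis / uptimeMillis;
    }

    // Estimates the overhead of the running JVM from its collector beans.
    static double currentOverheadPercent() {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when undefined for this collector
            if (t > 0) gcMillis += t;
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        return overheadPercent(gcMillis, Math.max(uptime, 1));
    }

    public static void main(String[] args) {
        System.out.printf("GC overhead: %.2f%%%n", currentOverheadPercent());
    }
}
```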

CMS:
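CMS provides lower pause times than the older stop-the-world collectors (Serial and Parallel), but its pauses are not deterministic. As the heap grows and the number of application threads increases, the Initial Mark and Remark pauses can produce latency spikes, and a fallback Full GC can pause the application for much longer. There is no way to bound pause times under pressure.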
G1GC:
G1GC is a latency-oriented GC with the ability to target specific pause durations. You can configure a goal, such as -XX:MaxGCPauseMillis=100, and G1GC will try to stay within it by adapting the size of collection sets and region processing. However, it’s still not truly real-time.
ZGC:
ZGC is designed for consistent, very low latency. Its STW phases are short enough (typically a millisecond or less on recent JDKs) that pause time is virtually independent of heap size. For SLAs that require GC pauses below about 5 ms, ZGC is one of the strongest choices among the collectors compared here, although it is not a hard real-time collector.
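
Whether a collector actually meets a latency target is best judged from measured pauses rather than from the configured goal. As a small sketch, the helper below computes a nearest-rank percentile over pause samples (for example, durations extracted from GC logs) and compares it with a goal. The class PauseSla, its method names, and the sample values in the usage example are all invented for illustration.

```java
import java.util.Arrays;

// Hypothetical helper class, not part of the JDK.
final class PauseSla {
    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it.
    static long percentile(long[] pausesMillis, double p) {
        if (pausesMillis.length == 0 || p <= 0 || p > 100) {
            throw new IllegalArgumentException("need samples and 0 < p <= 100");
        }
        long[] sorted = pausesMillis.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[rank - 1];
    }

    // True if the p-th percentile pause is within the goal.
    static boolean meetsGoal(long[] pausesMillis, double p, long goalMillis) {
        return percentile(pausesMillis, p) <= goalMillis;
    }

    public static void main(String[] args) {
        long[] samples = {4, 1, 100, 3, 2}; // made-up pause durations in ms
        System.out.println("p50 within 5 ms: " + meetsGoal(samples, 50, 5));
        System.out.println("p99 within 5 ms: " + meetsGoal(samples, 99, 5));
    }
}
```

Looking at a high percentile rather than the average matters here, because a single long Remark or Full GC pause can break an SLA even when typical pauses are short.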

CMS:
CMS is a poor fit for large heaps. As a rule of thumb, once the heap grows beyond roughly 8–16 GB, the concurrent phases take longer, fragmentation becomes a serious issue, and a fallback Full GC can pause the application for a long time. It is unsuitable for systems with hundreds of GB of heap.
G1GC:
G1GC scales reasonably well up to heaps of 32–64 GB, and in some configurations, even more. It uses incremental compaction, and the region-based model keeps memory operations manageable. However, for very large heaps, pause times may still be noticeable.
ZGC:
ZGC is built from the ground up to support very large heaps (up to 16 TB). Its core techniques (colored pointers and concurrent relocation) keep pause times consistent as the heap grows. This makes it well suited to in-memory databases, big data analytics, and machine learning inference servers.

CMS:
CMS is highly tunable but also fragile. Getting optimal performance requires carefully setting dozens of flags (CMSInitiatingOccupancyFraction, UseCMSInitiatingOccupancyOnly, etc.). Misconfiguration can lead to frequent full GCs or promotion failures.
G1GC:
G1GC offers a much simpler tuning model, with fewer knobs needed for most applications. The most important parameter is -XX:MaxGCPauseMillis. It’s easier to manage in production and doesn’t require deep GC expertise.
ZGC:
ZGC is designed to be self-tuning and “just works” out-of-the-box. It minimizes the need for manual configuration. There are very few tuning options, making it excellent for DevOps/SRE teams who want predictable and automatic GC behavior.
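
The difference in tuning surface can be made concrete by looking at a typical starting set of JVM flags for each collector. The sketch below builds such a flag list; the flags themselves are real HotSpot options, but the class GcFlags, the method flagsFor, and the chosen values (such as an occupancy fraction of 70) are illustrative assumptions, not recommendations.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the JDK.
final class GcFlags {
    enum Collector { CMS, G1, ZGC }

    // Builds an illustrative starting set of JVM flags for the given collector.
    static List<String> flagsFor(Collector collector, int heapGb, int pauseGoalMillis) {
        List<String> flags = new ArrayList<>();
        flags.add("-Xmx" + heapGb + "g");
        switch (collector) {
            case CMS:
                // Only valid on JDKs that still ship CMS (removed in Java 14).
                flags.add("-XX:+UseConcMarkSweepGC");
                // 70 is an example value; the right threshold is workload-specific.
                flags.add("-XX:CMSInitiatingOccupancyFraction=70");
                flags.add("-XX:+UseCMSInitiatingOccupancyOnly");
                break;
            case G1:
                flags.add("-XX:+UseG1GC");
                flags.add("-XX:MaxGCPauseMillis=" + pauseGoalMillis);
                break;
            case ZGC:
                // On Java 11-14, ZGC additionally needs -XX:+UnlockExperimentalVMOptions.
                // It has no pause-goal flag, so pauseGoalMillis is ignored here.
                flags.add("-XX:+UseZGC");
                break;
        }
        return flags;
    }

    public static void main(String[] args) {
        for (Collector c : Collector.values()) {
            System.out.println(c + ": java " + String.join(" ", flagsFor(c, 8, 100)));
        }
    }
}
```

Even in this small sketch, CMS needs collector-specific thresholds, G1GC needs a single pause goal, and ZGC needs little beyond the heap size.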

CMS:
Deprecated in Java 9 and removed in Java 14, so it is only available up to Java 13 and is most commonly found on Java 8. No longer maintained. Avoid in new applications.
G1GC:
Default GC since Java 9, and still the default in current releases. Fully supported and widely used in Java 11 LTS and Java 17 LTS. Recommended for most general-purpose server applications.
ZGC:
Introduced as experimental in Java 11, became production-ready in Java 15. Available in Java 17 LTS and beyond, with a generational mode added in Java 21. A strong fit for modern low-latency, cloud-native applications.
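
The guidance in this comparison can be condensed into a small decision helper. This is only a sketch of the rules of thumb above, not an authoritative selection algorithm: the class CollectorChooser, its parameters, and the thresholds (a 5 ms pause requirement, a 64 GB heap) are assumptions drawn loosely from the figures in the text, and real decisions should be validated with load tests and GC logs.

```java
// Hypothetical helper class encoding the rules of thumb from this comparison.
final class CollectorChooser {
    static String choose(int javaVersion, int heapGb, int pauseGoalMillis,
                         boolean hasExistingCmsTuning) {
        boolean zgcProductionReady = javaVersion >= 15; // experimental in 11-14
        boolean needsVeryLowPauses = pauseGoalMillis < 5;
        boolean veryLargeHeap = heapGb > 64;

        if (zgcProductionReady && (needsVeryLowPauses || veryLargeHeap)) {
            return "ZGC";
        }
        // CMS only makes sense for legacy systems that already depend on it,
        // and only on JDKs that still ship it (removed in Java 14).
        if (javaVersion < 14 && hasExistingCmsTuning) {
            return "CMS";
        }
        return "G1GC";
    }

    public static void main(String[] args) {
        System.out.println(choose(17, 16, 100, false)); // general-purpose service
        System.out.println(choose(17, 16, 2, false));   // strict pause requirement
        System.out.println(choose(8, 8, 100, true));    // legacy, CMS-tuned system
    }
}
```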