Java memory management does not specify any methodology rather mention objective of GC which cleans up JVM to free memory. GC is a background thread which can become active with higher priority at a non-predictable time. But developers have programmatic options to force a GC at required time.
System.gc() or Runtime.gc()
Although, separate GC methodology frees up JVM from memory management and let it stay focus on sheer execution of program, but GC has overhead of tracking object life-cycle and doing the cleanup behind the scene without disrupting program execution.
GC has three major responsibilities:
- Object allocation and Cleanup
- Preventing Heap fragmentation
- Reducing page fault in virtual memory
GC main job is
detect the garbage object and reclaim the heap by destroying these objects. In JVM, all the objects are under one root and if object is reachable from this root they are live object otherwise garbage. So,
reachability from is the root is the measure of GC operations.
For GC, its all about ref, if you have, stay alive, or get killed. And there are three types of ref can be maintained to stay alive
- Direct Object Ref.
- Stack variable obj ref.
- Constant pool ref. - string or objs.
Reference counting is a continuous process for GC for each object and it has to keep updating it. Here I present
first GC version algorithm for GC.
if (obj = new()) refCount++;
if (obj = objRef) refCount++;
if (obj = null) refCount--;
if (obj = newObjRef) refCount--;
if (refCount == 0) {
this.GC();
removeRefThis();
}
This algorithm is good for real-time applications where pause to do GC is prohibited. But this has many disadvantages like it may fail for cyclic reference situation.
It followed by next version of
Mark and Sweep Algorithm which works on tracking collector phenomenon .
Here I present marking and sweeping process.
Marking:
- Travel obj graph to all reachable object and mark them.
Sweeping:
- Clean up unmarked obj.
- Runs finalize on all the unreachable objs.
- Maintain all referenced objects on heap till finalize method completes.
Above algorithm bring another issue of Heap Fragmentation. To solve this, there is another processing of
Compacting collectors. Its job is to move marked object over free space. This process becomes easy because Java never deals with objects directly rather through
a obj ref table only through object ref table. To achieve this, it divides heap memory into two and utilize only one part of it for its operation. When this gets filled, it stops all operations and copies all live objects to other part and resumes operations. It is simpler way to de-fragment heap but drawback is that it keeps only half of memory ready for allocation.
======================= Old Notes ====================================
When Java was originally developed, the JDK shipped with a mark-and-sweep garbage collector. A mark-and-sweep garbage collector proceeds in two phases:
- Mark: identifies garbage objects
- Sweep: reclaims the memory for the garbage objects
Garbage objects are identified by traversing references from the current application stack frames; unreachable objects are assumed to be garbage.
Mark and sweep is a "stop-the-world" garbage collection technique; that is, all application threads stop until garbage collection completes, or until a higher-priority thread interrupts the garbage collector. If the garbage collector is interrupted, it must restart, which can lead to application thrashing with little apparent result. The other problem with mark and sweep is that many types of applications can't tolerate its stop-the-world nature. That is especially true of applications that require near real-time behavior or those that service large numbers of transaction-oriented clients.
Because of these problems, Sun Microsystems' Java HotSpot VM split the heap into three sections and added three garbage collection techniques. Splitting the heap allows different algorithms to be used for newly created objects and for objects that have been around for a while. This technique is based on the observation that most Java objects are small and short-lived. The heap's three sections are:
- Permanent space: used for JVM class and method objects
- Old object space: used for objects that have been around a while
- New (young) object space: used for newly created objects
The new object space is further subdivided into three parts: Eden, where all newly created objects go, and survivor spaces 1 and 2, where objects go before they become old. The survivor spaces make it easier to use copy-compaction with young objects; more details later.
The J2SE 1.3 garbage collection techniques are:
- Copy-compaction: used for new object space.
- Mark-compact: used in old object space. Similar to mark and sweep, mark-compact marks all unreachable objects; in the second phase, the unreachable objects compact. This technique avoids fragmentation problems and works well when the garbage collector runs infrequently.
- Incremental garbage collection (optional): Incremental GC creates a new middle section in the heap, which divides into multiple trains. Garbage is reclaimed from each train one at a time. This provides fewer, more frequent pauses for garbage collection, but it can decrease overall application performance. Incremental garbage collection can be enabled with the
-Xincgc command-line option.
All of these techniques are stop-the-world techniques. Though incremental garbage collection makes this effect less obvious, the application threads must still stop. That proves problematic for applications that can't afford to pause for garbage collection.
Garbage collection is based on live objects; that is, those reachable from the current stack space. Live objects are copied from new object space to survivor space (1 or 2), and then from survivor space to old object space. The amount of time objects spend in survivor space can be controlled with command-line parameters (see Tables 2 and 3 below).
The garbage collector typically runs in a low-priority thread, attempting to reclaim memory when the application is idle. This is fine for applications that regularly have idle time, such as graphical user interface (GUI)-driven applications. Unfortunately, if there is little or no idle time, the garbage collector may not get a chance to run.
Garbage collection can also be triggered if the heap's subregions are nearly full. In this case, the garbage collection thread's priority increases, thus increasing the chance that the garbage collection will run to completion. If the new generation is full, a minor collection is triggered; if the old generation is full, a major collection is triggered. The steps in a minor collection are:
- Copy objects from Eden to survivor space (1 or 2).
- Copy from survivor space 1 to survivor space 2, or vice versa. After a certain number of copies (controllable from the command line), an object becomes tenured, that is, a candidate for old object space.
- Tenured objects move from survivor space 1 or 2 to old object space.
A major collection uses the old generation garbage collector (mark-compact for J2SE 1.3) to reclaim old objects.
Reference Counting
- It is associated with program to execute in chunks.
- Advantage:
- Don't interrupts program for long.
- Good for real-time application.
- Disadvantage:
- Overhead of incrementing and decrementing ref counter
- Cant detect cyclic refs.
References: