Garbage Collection

Garbage collection is the automated process of reclaiming memory occupied by objects that are no longer in use. When nothing in the program refers to an object any longer, it's considered "garbage": it is no longer part of the state of the program. Its memory should be returned to the pool of free memory, to be reused by new objects or returned to the operating system.

Non-Garbage Collected Languages

There was a time when garbage collection was considered too heavy. Some languages used it, but the overhead of a garbage collector was certainly a burden. In languages without one, such as C or C++, memory must be manually allocated from the heap and, when you're finished with it, manually released. This is extremely efficient as there is little to no overhead, but it's not without its problems.

If a program allocates memory but forgets to deallocate it, the memory remains allocated until the program ends (at which point all its memory is returned to the operating system). This is called a memory leak, and it often causes a program to consume more and more memory until the operating system kills the process. This class of bug is extremely difficult to track down, and it is one of the biggest reasons for moving to higher level languages.

How the Garbage Collector Works

Ruby never asks you to free anything, so it has to work out for itself which objects are still in use. For example, if you allocate a String object and assign it to a variable (even a "local" variable outside of any object), that variable is one reference to the object. Assigning it to other local, global, constant, class or instance variables adds more references to that object. If you assign something else to a variable, that variable no longer refers to the original object.

These references define whether an object is in use, or is "garbage." A simple scheme, used by some languages, keeps a count of references on every object and frees the object when the count reaches zero. The standard Ruby interpreter takes a different approach, known as mark and sweep. During a garbage collection cycle, which happens periodically, it starts from the references the program can use directly (such as local variables, globals and constants), follows them to mark every object that can still be reached, and then frees every object left unmarked. While the specifics in any particular implementation may differ, this is generally how garbage collectors work.
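A rough way to watch the collector reclaim objects that are no longer referenced is to compare a GC.stat counter before and after a collection. The following is only a sketch: make_garbage is a hypothetical helper, and the :total_freed_objects key is an assumption about the Ruby version in use, since GC.stat key names have changed between releases.

```ruby
# Hypothetical helper: allocate many short-lived strings and return only
# their count, so none of the strings is referenced once the method returns.
def make_garbage(n)
  Array.new(n) { |i| "temporary string #{i}" }.size
end

# Assumed key name; it may be missing on older Ruby versions.
freed_before = GC.stat[:total_freed_objects]

created = make_garbage(100_000)
GC.start # ask for a collection cycle now

freed_after = GC.stat[:total_freed_objects]

puts "created #{created} strings"
puts "objects freed since the first reading: #{freed_after - freed_before}"
```

The exact number printed will vary from run to run, because the collector may also run on its own while the strings are being allocated.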

Disabling and Enabling the Garbage Collector

The garbage collector can run at inopportune times. For example, in a game or other interactive program, you may not want the garbage collector to run at certain moments. You can disable the garbage collector at any time. This may help your program be more responsive, or help heavy computations run faster, but it has some obvious drawbacks. While the collector is disabled, memory will not be deallocated or recycled in your program. If you allocate a lot of memory, usage will quickly grow out of hand. Also, when you turn the garbage collector back on, garbage collection cycles may take longer until the collector has had a chance to catch up. It's rarely useful to turn the garbage collector off, but you can turn it off using the GC.disable method and back on using the GC.enable method.
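One way to limit the risk is to disable the collector only around a specific piece of work and always switch it back on afterwards. The sketch below assumes a hypothetical helper named without_gc, which is not part of Ruby. It relies on GC.disable returning true when the collector was already disabled, so that it restores the previous state instead of blindly enabling.

```ruby
# Hypothetical helper: run a block with the garbage collector switched off,
# then restore whatever state the collector was in before.
def without_gc
  was_disabled = GC.disable # true if the collector was already disabled
  yield
ensure
  # Runs even if the block raises, so the collector is never left off by accident.
  GC.enable unless was_disabled
end

result = without_gc do
  # Stand-in for allocation-heavy work that should not be interrupted.
  Array.new(10_000) { |i| "item #{i}" }.size
end

puts "built #{result} items with the collector disabled"
```

Keeping the disabled region small means memory can only grow for the duration of the block.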

For short programs that allocate many objects, it may make sense to disable the garbage collector entirely. If the program is guaranteed to have a short run time, you can leave memory deallocation up to the operating system. The OS knows exactly what memory was allocated by your Ruby program (or, more accurately, by the C program that runs your Ruby program) and will free it all when the program ends. Why waste time collecting garbage at all in short scripts? However, if your script runs longer than expected (perhaps because a bug leaves it stuck in an infinite loop), it can grow to consume very large amounts of memory. Use this strategy with caution.

Running the Garbage Collector Manually

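Similarly, if you wish to have more control over when the garbage collector runs, you can call GC.start to run the garbage collector manually. Running this just before critical operations can improve responsiveness and performance slightly, since a collection is less likely to be needed in the middle of them. Whether GC.start has any effect while the collector is disabled with GC.disable depends on the Ruby version: in older versions it does nothing, while the documentation for recent versions states that it runs even if the collector has been manually disabled. Either way, an enabled collector can still run on its own, so you never have complete control over it.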
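As a minimal sketch of this pattern, the example below forces a collection just before a piece of work and uses GC.count to confirm that one took place. The critical_operation method is a hypothetical stand-in for whatever latency-sensitive code your program has, and the example assumes the collector has not been disabled.

```ruby
# Hypothetical stand-in for a latency-sensitive operation, such as
# rendering one frame of a game.
def critical_operation
  (1..1_000).reduce(:+)
end

runs_before = GC.count # number of collections so far
GC.start               # collect now, before the critical work begins
runs_after = GC.count

result = critical_operation

puts "collections triggered before the critical work: #{runs_after - runs_before}"
puts "result: #{result}"
```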

Getting Garbage Collector Statistics

It may be useful to get statistics about the garbage collector and memory usage in general. These are available via the GC.stat hash. Some useful members are GC.stat[:count], which tracks the number of times the garbage collector has run; GC.stat[:heap_live_num], which tracks how many object slots are currently in use; and GC.stat[:heap_free_num], which tracks the slots that have been freed but not returned to the operating system and can be reused by new objects. The exact key names depend on the Ruby version; later releases call the last two :heap_live_slots and :heap_free_slots.
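A short sketch of reading these statistics follows. Because the key names are not the same in every Ruby version, it tries the newer :heap_live_slots and :heap_free_slots spellings first and falls back to the older names used in this article. Treat the fallback order as an assumption to check against your own Ruby version.

```ruby
stats = GC.stat # a Hash of counters keyed by Symbol

# Key names vary by Ruby version: try the newer spelling first,
# then fall back to the older one.
live = stats[:heap_live_slots] || stats[:heap_live_num]
free = stats[:heap_free_slots] || stats[:heap_free_num]

puts "collections so far: #{stats[:count]}"
puts "live object slots:  #{live}"
puts "free object slots:  #{free}"
```

In a long-running program, logging these values at intervals gives a simple picture of whether the number of live objects keeps climbing.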

In large, long-running programs it may be useful to keep track of these numbers. They can give you some insight into how much memory your program is using. Memory is something you don't often think about on modern computers with gigabytes of it available, but keeping an eye on how much you're using, and dropping references to objects you no longer need so they can be collected, can be a big help.

©2014 About.com. All rights reserved.