Understanding and tuning the garbage collector in Ruby

January 22nd 2016

Nowadays most programming languages use a garbage collector (GC) to deallocate memory. GC is a form of automatic memory management: the garbage collector attempts to reclaim garbage, that is, memory occupied by objects that are no longer in use by the program. Garbage collection was invented by John McCarthy around 1959 to abstract away manual memory management in Lisp.

 

Ruby uses an incremental GC algorithm. To understand it, we first need to look at the GC algorithms Ruby used before. Mark and Sweep (M&S) is one of the simplest GC algorithms and has two phases: the first phase traverses all living objects and marks them as alive, the second phase collects all objects that were not marked. The problems with M&S are throughput and pause time: the program is stopped while the collector works, which slows down your Ruby application and makes for poor UI/UX.

These problems are addressed by the generational GC algorithm. Generational GC divides the heap into two groups, ‘young’ and ‘old’. Newly created objects are placed in the young group, and after surviving three GC runs they are promoted to the old group. In object-oriented programs most objects die young, so the algorithm normally runs only on the young group. Only when that does not free enough memory does it run on the old group as well. The drawback is that a run on the old group takes a lot of time, although most of the time only the young group is collected.
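
The two phases of Mark and Sweep described above can be sketched in a few lines of Ruby. This is only a toy model to show the idea: the Node class, the mark and sweep methods and the four-object heap are made up for this example and have nothing to do with how MRI represents objects internally.

```ruby
# Toy heap object: a name plus outgoing references. Not MRI's real structure.
class Node
  attr_reader :name, :refs
  attr_accessor :marked

  def initialize(name)
    @name = name
    @refs = []
    @marked = false
  end
end

# Phase 1: mark everything reachable from the roots.
def mark(roots)
  stack = roots.dup
  until stack.empty?
    node = stack.pop
    next if node.marked
    node.marked = true
    stack.concat(node.refs)
  end
end

# Phase 2: split the heap into marked (live) and unmarked (garbage) objects,
# then reset the marks so the next cycle starts clean.
def sweep(heap)
  live, garbage = heap.partition(&:marked)
  live.each { |node| node.marked = false }
  [live, garbage]
end

a, b, c, d = %w[a b c d].map { |n| Node.new(n) }
a.refs << b      # a -> b
c.refs << d      # c -> d, but nothing reachable points at c
heap = [a, b, c, d]

mark([a])        # a is the only root
live, garbage = sweep(heap)
puts "live: #{live.map(&:name).join(', ')}"        # live: a, b
puts "garbage: #{garbage.map(&:name).join(', ')}"  # garbage: c, d
```

Note that mark has to walk every reachable object before sweep can start, which is where the long pause comes from.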

 

To solve the drawback of the generational algorithm, Ruby uses an incremental GC algorithm. It splits one GC run into several steps and interleaves those steps with the execution of Ruby code, which turns one long pause into several short ones. The incremental algorithm has four phases. In the first phase all existing objects are set to unmarked. In the second phase, objects that are clearly alive, such as objects on the stack, are marked as objects that may reference unmarked objects. In the third phase we pick one object that may reference unmarked objects, visit every object it references, and mark each of those as an object that may reference unmarked objects; the original object is then marked as an object that has no references to unmarked objects. This is the step that makes the process incremental: it is repeated, a little at a time, until there are no objects left that may reference unmarked objects. In the fourth phase all objects that are still unmarked are collected, because every living object has been marked by then.

There is one problem with this marking: while Ruby code runs between GC steps, an object that is already marked as having no references to unmarked objects can obtain a reference to an unmarked object. To prevent that unmarked object from being collected we need a ‘write barrier’. A write barrier is invoked every time an object obtains a reference to a new object, so it detects when a reference from one object to another is made. When this happens, the object that was marked as having no references to unmarked objects is changed back to an object that may reference unmarked objects.
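
To make the four phases and the write barrier more concrete, here is a minimal sketch in Ruby. It uses the common tri-color naming, which the text above does not use: :white stands for unmarked, :grey for "may reference unmarked objects" and :black for "has no references to unmarked objects". The Obj and ToyIncrementalGC classes are invented for illustration and are not MRI's implementation.

```ruby
# Toy object for the sketch; every object starts out white (unmarked).
class Obj
  attr_reader :name, :refs
  attr_accessor :color

  def initialize(name)
    @name = name
    @refs = []
    @color = :white
  end
end

class ToyIncrementalGC
  def initialize(heap, roots)
    @heap = heap
    @grey = []
    @heap.each { |o| o.color = :white }   # phase 1: everything unmarked
    roots.each { |o| shade(o) }           # phase 2: roots become grey
  end

  # Phase 3, one increment: scan a single grey object, then hand control
  # back to the program. Returns false once no grey objects remain.
  def step
    obj = @grey.shift
    return false if obj.nil?
    obj.refs.each { |child| shade(child) }
    obj.color = :black
    true
  end

  # Write barrier: used whenever `parent` gains a reference to `child`.
  # A black parent pointing at a white child is turned grey again,
  # so it gets rescanned in a later increment.
  def write_barrier(parent, child)
    parent.refs << child
    return unless parent.color == :black && child.color == :white
    parent.color = :grey
    @grey << parent
  end

  # Phase 4: whatever is still white is unreachable.
  def sweep
    @heap.select { |o| o.color == :white }
  end

  private

  # Turn a white object grey and queue it for scanning.
  def shade(obj)
    return unless obj.color == :white
    obj.color = :grey
    @grey << obj
  end
end

a, b, c, d = %w[a b c d].map { |n| Obj.new(n) }
a.refs << b
gc = ToyIncrementalGC.new([a, b, c, d], [a])

gc.step                        # scans a: a becomes black, b becomes grey
gc.write_barrier(a, c)         # program runs between increments: black a -> white c
loop { break unless gc.step }  # finish marking in further increments

puts gc.sweep.map(&:name).inspect  # ["d"]
```

Without the write barrier, a would stay black after the first step, c would never be visited, and the sweep would wrongly collect c together with d.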

 

We can measure pause times with the gc_tracer gem, which can be found on https://rubygems.org. Ruby also provides a module called ‘ObjectSpace’ for inspecting the living objects managed by the GC; it is described in the official documentation of the language. MRI (Matz's Ruby Interpreter) stores objects, also known as RVALUEs, in heap pages of approximately 16 KB each. RVALUE structs consume different amounts of memory depending on the machine architecture: on 64-bit machines they consume 40 bytes, on 32-bit machines 20 to 24 bytes depending on the sub-architecture. To check how many objects fit in one heap page we can use the ObjectSpace module in IRB, and to see the GC variables we can run GC.stat.

 

2.1.2 :001 > require 'objspace'
 => true 

2.1.2 :002 > ObjectSpace.count_objects[:TOTAL] / GC.stat[:heap_used]
 => 407 
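
The measured value can be compared with a back-of-the-envelope estimate. The sketch below assumes the numbers from the text (16 KB pages, 40-byte RVALUEs). It also assumes that on Rubies newer than 2.1 the page count is reported under the key heap_allocated_pages instead of heap_used, so it tries both. On recent Ruby versions page and slot sizes may differ, so the measured number can be quite far from the estimate.

```ruby
# Rough expectation from the numbers in the text: 16 KB pages, 40-byte slots.
page_bytes = 16 * 1024
slot_bytes = 40
expected = page_bytes / slot_bytes   # integer division

# Measured value. :heap_used is the Ruby 2.1 key; newer versions are assumed
# to report the page count as :heap_allocated_pages, so try that first.
stat = GC.stat
pages = stat[:heap_allocated_pages] || stat[:heap_used]
measured = ObjectSpace.count_objects[:TOTAL] / pages

puts "expected about #{expected} slots per page, measured #{measured}"
```

The estimate comes out slightly above the 407 measured in the session above, presumably because each page also has to store some bookkeeping data of its own.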

 

The garbage collector frees the programmer from dealing with memory deallocation manually, and with that some classes of bugs are reduced or entirely eliminated: dangling pointer bugs, which occur when a piece of memory is freed while there are still pointers to it and one of those pointers is dereferenced; double free bugs, which occur when the program tries to free a region of memory that has already been freed; and some kinds of memory leaks, in which the program fails to free memory. Although GC helps the programmer with deallocation, it has certain disadvantages: it costs performance during program execution and it is incompatible with manual resource management. To speed things up we can tune the GC of a Ruby application, for example with a gem called TuneMyGC. Tuning is basically a tradeoff between tuning for speed, which results in using more memory, and tuning for low memory usage, which results in giving up speed. For tuning purposes the garbage collector exposes tuning variables in three categories:

  • Heap slot values: heap slots are the place where Ruby objects live
  • Malloc limits: off-heap storage for large strings, arrays and other structures
  • Growth factors: by how much to grow slots, malloc limits etc.
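
To get a feeling for what a growth factor does, here is a deliberately simplified model in Ruby. It is not MRI's actual code: the method next_heap_slots and its rule (multiply the slot count by the factor, but never add more than a maximum number of slots per step) are my own approximation of how a growth factor and a growth cap interact. The factor 1.8 is, as far as I know, Ruby's default heap growth factor.

```ruby
# Simplified model of heap growth, not MRI's implementation: each time the
# heap must grow, multiply the slot count by the growth factor, but never add
# more than max_growth slots in one step (0 means no cap).
def next_heap_slots(current, factor:, max_growth: 0)
  grown = (current * factor).ceil
  return grown if max_growth.zero?
  [grown, current + max_growth].min
end

# Compare an aggressive factor with a conservative, capped one.
[
  { factor: 1.8 },
  { factor: 1.03, max_growth: 88_792 }
].each do |settings|
  slots = 100_000
  steps = Array.new(3) { slots = next_heap_slots(slots, **settings) }
  puts "#{settings.inspect}: #{steps.join(' -> ')}"
end
```

A larger factor means the heap grows faster, so the GC runs less often but the process holds more memory; a small factor with a cap keeps memory tighter at the price of more frequent collections.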

 

To be sure that tuning the GC is successful, we need a reference for the state of memory before tuning. For that purpose we will use the gem ‘rack-mini-profiler’. Installing it is pretty easy: add the gem to your Gemfile, run bundle, and you are ready to go. After installation, add ‘?pp=profile-gc’ at the end of the URL and you will get a report of memory usage. In my case the report before tuning looks like this:

 

GC Stats: count : 39, heap_used : 1909, heap_length : 2368, heap_increment : 488, heap_live_slot : 371716, heap_free_slot : 406392, heap_final_slot : 0, heap_swept_slot : 406393, heap_eden_page_length : 1421, heap_tomb_page_length : 488, total_allocated_object : 2729592, total_freed_object : 2357876, malloc_increase : 624, malloc_limit : 32883343, minor_gc_count : 30, major_gc_count : 9, remembered_shady_object : 45994, remembered_shady_object_limit : 91988, old_object : 320040, old_object_limit : 640080, oldmalloc_increase : 1008, oldmalloc_limit : 23221058

 

With the default GC configuration we got a count of thirty-nine, which means that the GC ran thirty-nine times: a minor_gc_count of thirty means that thirty collections ran on the young group of objects, and a major_gc_count of nine means that nine collections ran on the old group of objects. As we said, tuning is a tradeoff between speed and memory usage, and we tune the garbage collector by setting a series of environment variables. For example, to trigger young-object GC runs more often you can lower RUBY_GC_HEAP_GROWTH_FACTOR or RUBY_GC_HEAP_GROWTH_MAX_SLOTS. You can set how much memory Ruby will allocate before running minor or major GC runs with RUBY_GC_MALLOC_LIMIT and RUBY_GC_OLDMALLOC_LIMIT. For my Rails application I used the following environment variables:

 

RUBY_GC_HEAP_INIT_SLOTS=997339 RUBY_GC_HEAP_FREE_SLOTS=626600 RUBY_GC_HEAP_GROWTH_FACTOR=1.03 RUBY_GC_HEAP_GROWTH_MAX_SLOTS=88792 RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=2.4 RUBY_GC_MALLOC_LIMIT=34393793 RUBY_GC_MALLOC_LIMIT_MAX=41272552 RUBY_GC_MALLOC_LIMIT_GROWTH_FACTOR=1.32 RUBY_GC_OLDMALLOC_LIMIT=39339204 RUBY_GC_OLDMALLOC_LIMIT_MAX=47207045 RUBY_GC_OLDMALLOC_LIMIT_GROWTH_FACTOR=1.2 rails s

 

Memory usage with the variables above:

 

GC Stats: count : 6, heap_used : 2444, heap_length : 2486, heap_increment : 1027, heap_live_slot : 236262, heap_free_slot : 759909, heap_final_slot : 0, heap_swept_slot : 340919, heap_eden_page_length : 1417, heap_tomb_page_length : 1027, total_allocated_object : 1889731, total_freed_object : 1653469, malloc_increase : 3680, malloc_limit : 34393793, minor_gc_count : 3, major_gc_count : 3, remembered_shady_object : 5206, remembered_shady_object_limit : 12494, old_object : 225943, old_object_limit : 542263, oldmalloc_increase : 4064, oldmalloc_limit : 39339204

 

After modifying the environment variables, the GC ran only six times, with three minor and three major collections, which is pretty good compared to thirty-nine times.
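
If you want to compare GC runs for a single piece of code instead of a whole request, you can read the same counters from GC.stat yourself. The helper gc_runs below is a hypothetical name of mine, and the string-allocating block is only a stand-in workload; the keys count, minor_gc_count and major_gc_count are the ones shown in the reports above.

```ruby
# Hypothetical helper: report how many GC runs happened while the block ran.
def gc_runs
  keys = %i[count minor_gc_count major_gc_count]
  before = GC.stat
  yield
  after = GC.stat
  keys.map { |key| [key, after[key] - before[key]] }.to_h
end

runs = gc_runs do
  200_000.times { "x" * 100 }   # allocate plenty of short-lived strings
end
puts runs.inspect
```

Running the same block with and without the tuned environment variables gives a quick before/after comparison, although the exact numbers will vary between machines and Ruby versions.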

Tuning an application with TuneMyGC is pretty straightforward: all we need to do is add the gem to the Gemfile, register the application, then boot it, and that is all. You can find this gem on https://rubygems.org/gems/tunemygc/versions/1.0.4 and its documentation on GitHub. The cool thing about this gem is that it suggests a GC configuration based on samples it collects while monitoring your app. First of all you need to register your application with:

 

 bundle exec tunemygc -r email@domain.com

 

After that you will get the following message: ‘Application raddit registered. Use RUBY_GC_TOKEN=17ab2004c6b2b04ba1bfce38d08ec97 in your environment’, but your RUBY_GC_TOKEN will probably be different. Then run your Rails server with RUBY_GC_TOKEN set:

 

RUBY_GC_TOKEN=17ab2004c6b2b04ba1bfce38d08ec972 RUBY_GC_TUNE=1 bundle exec rails s 

 

After this, when you shut down your server you will get a URL with the recommended GC configuration, which you can use to boost your application. The GC stats with the recommended configuration on my application look like this:

 

GC Stats: count : 7, heap_used : 1284, heap_length : 1322, heap_increment : 37, heap_live_slot : 236218, heap_free_slot : 287141, heap_final_slot : 0, heap_swept_slot : 284718, heap_eden_page_length : 1278, heap_tomb_page_length : 6, total_allocated_object : 1889444, total_freed_object : 1653226, malloc_increase : 3680, malloc_limit : 41272552, minor_gc_count : 4, major_gc_count : 3, remembered_shady_object : 5206, remembered_shady_object_limit : 10412, old_object : 225899, old_object_limit : 451798, oldmalloc_increase : 4064, oldmalloc_limit : 47207044
