Larry’s Blog

Object Serialization in Rails


Let’s talk about serializing objects in Rails today.

So, what’s the difference between serialization in Ruby and in Rails?

In short, Rails uses the ActiveSupport::JSON module to deal with JSON instead of using the gems directly.

Rails 3.2 & Rails 4.0

So why talk about Rails 3.2 when Rails 4.2.0.rc1 has been released and Rails 5.0 development has begun?

Because we still have applications running on Rails 3.2, and upgrading apps with as much complicated logic as ours is a real pain.

ActiveSupport::JSON

The ActiveSupport::JSON module provides a super simple API composed of two methods:

  • ActiveSupport::JSON.encode(object)
  • ActiveSupport::JSON.decode(string)

ActiveSupport::JSON.encode(object) takes a Ruby object and returns a JSON-encoded string. Conversely, ActiveSupport::JSON.decode(string) takes a JSON-encoded string and returns the corresponding Ruby object.

Here are a few examples:

[1] pry(main)> require 'rails'
=> true
[2] pry(main)> Rails.version
=> "3.2.21"
[3] pry(main)> j = ActiveSupport::JSON
=> ActiveSupport::JSON
[4] pry(main)> j.encode(23)
=> "23"
[5] pry(main)> j.encode("A string")
=> "\"A string\""
[6] pry(main)> j.encode({ :color => ["red", "green", "jellow"] })
=> "{\"color\":[\"red\",\"green\",\"jellow\"]}"
[7] pry(main)> j.encode({ :color => ["red", "green", "jellow"], :date => Time.now })
=> "{\"color\":[\"red\",\"green\",\"jellow\"],\"date\":\"2014-11-30T00:02:29+08:00\"}"
[8] pry(main)> j.decode(j.encode({ :color => ["red", "green", "jellow"], :date => Time.now }))
=> {"color"=>["red", "green", "jellow"], "date"=>"2014-11-30T00:02:38+08:00"}

As I mentioned in my previous article, a custom object needs two more methods to make its serialization work: to_json and self.json_create. Do we still have to write them when we are using Rails, that is, ActiveSupport::JSON?

The most common way to serialize custom objects in Rails is to store them in a string column in the database. We can do that by writing our own serializer for the custom object. I won’t cover that in this post; maybe I will in the near future.
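For orientation only, here is a minimal sketch of what such a serializer might look like. It assumes the coder convention of an object that responds to dump and load. The Preferences and PreferencesCoder names are hypothetical, and the sketch uses only the standard json library so that it runs outside Rails.

```ruby
require 'json'

# Hypothetical value object that we want to keep in a single string column.
class Preferences
  attr_reader :theme, :languages

  def initialize(theme, languages)
    @theme = theme
    @languages = languages
  end

  def ==(other)
    other.is_a?(Preferences) &&
      theme == other.theme &&
      languages == other.languages
  end
end

# Hypothetical coder: turns a Preferences into a JSON string and back.
class PreferencesCoder
  # Object -> String (what would be written to the column).
  def self.dump(prefs)
    return nil if prefs.nil?
    JSON.generate('theme' => prefs.theme, 'languages' => prefs.languages)
  end

  # String -> Object (what would be read back from the column).
  def self.load(string)
    return nil if string.nil? || string.empty?
    data = JSON.parse(string)
    Preferences.new(data['theme'], data['languages'])
  end
end

original = Preferences.new('dark', %w[en zh])
stored   = PreferencesCoder.dump(original)
restored = PreferencesCoder.load(stored)
```

In a model, a coder like this would typically be attached with something along the lines of serialize :preferences, PreferencesCoder. Treat that as an assumption to check against the Rails version you are running.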

Performance

So, like the previous post, we will pay some attention to the performance of serialization in Rails.

ActiveSupport depends on the multi_json gem, a simple library that lets you seamlessly swap between multiple JSON backends with intelligent defaulting. ActiveSupport::JSON uses the json gem as its default engine, which we can confirm with ActiveSupport::JSON.engine # => MultiJson::Adapters::JsonGem.

Now let’s do some benchmarking, e.g. after setting oj as ActiveSupport’s engine.
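To make the idea of "multiple backends with intelligent defaulting" concrete, here is a small sketch of the general technique. It is not multi_json’s actual implementation. The BACKENDS list and the pick_backend helper are hypothetical names, and the preference order (oj first, then the json library) is an assumption made for illustration.

```ruby
require 'json'

# Hypothetical preference list: library name plus an encoder, fastest first.
# The lambdas only touch Oj or JSON when they are called, so listing a
# library that is not installed is harmless.
BACKENDS = [
  ['oj',   ->(obj) { Oj.dump(obj, mode: :compat) }],
  ['json', ->(obj) { JSON.generate(obj) }]
].freeze

# Return the first backend whose library can be loaded.
def pick_backend
  BACKENDS.each do |lib, encoder|
    begin
      require lib
      return [lib, encoder]
    rescue LoadError
      next # not installed, try the next candidate
    end
  end
  raise LoadError, 'no JSON backend available'
end

backend_name, encode = pick_backend
puts "Using #{backend_name}: #{encode.call('a' => 1)}"
```

The calling code only ever sees the encode lambda, so swapping the backend does not change any call sites.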

benchmark_active_support_and_json_and_oj.rb
require 'rubygems'
require 'bundler/setup'

require 'active_support'
require 'active_support/core_ext/object/to_json'
require 'active_support/json/encoding'
require 'oj'
require 'benchmark/ips'

json_obj = {
  "a" => "Alpha",
  "b" => true,
  "c" => 12345,
  "d" => [true, [false, [-123456789, nil], 3.9676, ["Something else.", false], nil]],
  "e" => {
    "zero"  => nil,
    "one"   => 1,
    "two"   => 2,
    "three" => [3],
    "four"  => [0, 1, 2, 3, 4]
  },
  "f" => nil,
  "h" => {"a"=>{"b"=>{"c"=>{"d"=>{"e"=>{"f"=>{"g"=>nil}}}}}}},
  "i" => [[[[[[[nil]]]]]]]
}

ActiveSupport::JSON.engine = :oj

Benchmark.ips do |x|
  x.report("active_support") { json_obj.to_json }
  x.report("oj") { Oj.dump json_obj }
end

Output:

Calculating -------------------------------------
      active_support   250.000  i/100ms
                  oj    26.304k i/100ms
-------------------------------------------------
      active_support      2.532k (± 4.0%) i/s -     12.750k
                  oj    319.969k (± 5.7%) i/s -      1.605M

The result may just blow your mind. We already set the engine to Oj, right? The two are supposed to perform about the same. What’s wrong with active_support?

The root cause is a problem in ActiveSupport’s implementation. In the active_support/core_ext/object/to_json.rb file:

# Hack to load json gem first so we can overwrite its to_json.
begin
  require 'json'
rescue LoadError
end

# The JSON gem adds a few modules to Ruby core classes containing :to_json definition, overwriting
# their default behavior. That said, we need to define the basic to_json method in all of them,
# otherwise they will always use to_json gem implementation, which is backwards incompatible in
# several cases (for instance, the JSON implementation for Hash does not work) with inheritance
# and consequently classes as ActiveSupport::OrderedHash cannot be serialized to json.
[Object, Array, FalseClass, Float, Hash, Integer, NilClass, String, TrueClass].each do |klass|
  klass.class_eval do
    # Dumps object in JSON (JavaScript Object Notation). See www.json.org for more info.
    def to_json(options = nil)
      ActiveSupport::JSON.encode(self, options)
    end
  end
end

Did you see the to_json method? It is hard-coded to call ActiveSupport::JSON.encode every time.

This monkeypatch was introduced to fix a problem (see the comments above the code), but it also causes another pain: if you are using the JSON/oj/whatever gem adapter with Rails (AS::JSON::Encoding) and you have been calling object#to_json, you are actually using Rails’ pure-Ruby JSON encoder. This explains why the performance sucks so much.
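If you want to check which to_json an object really dispatches to, Ruby’s Method#owner and Method#source_location are handy. The sketch below uses a hypothetical PatchedHash subclass as a stand-in for a core class reopened with class_eval, so that it does not touch the real Hash. The to_json_origin helper is also a made-up name.

```ruby
require 'json'

# Report which class or module defines to_json for an object, and where.
def to_json_origin(obj)
  meth = obj.method(:to_json)
  { owner: meth.owner, source_location: meth.source_location }
end

# Hypothetical stand-in for a core class that has been monkeypatched.
class PatchedHash < Hash; end

PatchedHash.class_eval do
  # Whatever backend is configured elsewhere, this body always wins.
  def to_json(*_args)
    '"always the hard-coded encoder"'
  end
end

plain   = to_json_origin({})
patched = to_json_origin(PatchedHash.new)

puts "plain Hash:   #{plain.inspect}"
puts "patched Hash: #{patched.inspect}"
```

Running the same helper against a Hash inside a Rails 3.2 console should point at the file that defines the override, which is a quick way to confirm the behaviour described here.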

So how do we fix this? The solution is quite simple: apply a change similar to the one in active_support/core_ext/object/to_json.rb and monkeypatch to_json again so that it uses Oj to do the encoding.

benchmark_with_hard_coded_patch.rb
require 'rubygems'
require 'bundler/setup'

require 'active_support'
require 'active_support/core_ext/object/to_json'
require 'active_support/json/encoding'
require 'oj'
require 'benchmark/ips'

json_obj = {
  "a" => "Alpha",
  "b" => true,
  "c" => 12345,
  "d" => [true, [false, [-123_456_789, nil], 3.9676, ["Something else.", false], nil]],
  "e" => {
    "zero"  => nil,
    "one"   => 1,
    "two"   => 2,
    "three" => [3],
    "four"  => [0, 1, 2, 3, 4]
  },
  "f" => nil,
  "h" => {"a"=>{"b"=>{"c"=>{"d"=>{"e"=>{"f"=>{"g"=>nil}}}}}}},
  "i" => [[[[[[[nil]]]]]]]
}

[Object, Array, FalseClass, Float, Hash, Integer, NilClass, String, TrueClass].each do |klass|
  klass.class_eval do
    def to_json(opts = nil)
      Oj.dump(self, opts)
    end
  end
end

Benchmark.ips do |x|
  x.report("active_support") { json_obj.to_json }
  x.report("oj") { Oj.dump json_obj }
end

Output:

Calculating -------------------------------------
      active_support    24.917k i/100ms
                  oj    25.228k i/100ms
-------------------------------------------------
      active_support    289.260k (± 6.9%) i/s -      1.445M
                  oj    300.640k (± 6.4%) i/s -      1.514M

Someone has created a gem called rails-patch-json-encode to fix this problem. You can just use it in your app.

Rails 4.1

Rails 4.1 removes the multi_json dependency, so the previous patch/gem is unnecessary and will no longer work. Instead, if you want to use oj to deal with JSON, add the oj_mimic_json gem along with oj to your Gemfile so that Oj mimics the JSON gem and is used in its place by ActiveSupport’s JSON handling:

gem 'oj'
gem 'oj_mimic_json'

And the rest is just easy.
