Psych.load (for YAML loading) slow #2089

Open
@LillianZ

Psych.load is slower in TruffleRuby than in MRI, especially when cold. Below are the code to produce a YAML file with a similar format and size to the real one, the test script (the thermometer reports hot after ~40s for me), and the output. The actual YAML file takes ~8s to load and has ~1M characters.

             Cold        Hot
MRI          0.198234s   5.306ips
TruffleRuby  6.483627s   3.237ips

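To make the gap easier to read, the figures can be turned into slowdown ratios. This is only a small arithmetic sketch: the numbers are copied from the table and from the two final `Benchmark.measure` lines in the output, and the variable names are illustrative rather than part of the test script.

```ruby
# Figures copied from the table and the benchmark output in this issue.
cold_seconds = { mri: 0.198234, truffleruby: 6.483627 }
hot_ips      = { mri: 5.306,    truffleruby: 3.237 }
# Wall-clock time of the last Benchmark.measure call, run after warmup.
warm_seconds = { mri: 0.191423, truffleruby: 0.310655 }

# Times: lower is better, so divide TruffleRuby's time by MRI's.
cold_slowdown = cold_seconds[:truffleruby] / cold_seconds[:mri]
warm_slowdown = warm_seconds[:truffleruby] / warm_seconds[:mri]

# Throughput: higher is better, so divide MRI's i/s by TruffleRuby's.
hot_slowdown = hot_ips[:mri] / hot_ips[:truffleruby]

puts format("cold:             %.1fx slower", cold_slowdown)
puts format("hot (ips):        %.2fx slower", hot_slowdown)
puts format("warm single load: %.2fx slower", warm_slowdown)
```

With these inputs, the cold load is roughly 33x slower and the warmed-up load roughly 1.6x slower, whichever of the two warm measurements is used.
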
letters = "qwertyuiopasdfghjklzxcvbnm"
file_name = "YAML.txt"
fruits = ["apple", "banana", "kiwi", "orange", "pear"]

File.write(file_name, "# This is a very large yml file\n")
File.write(file_name, "---\n", mode: "a")
1835.times do
  length = rand(4..15)
  object_name = ""
  length.times {object_name += letters[rand(26)]}
  modifier = letters[rand(26)]+rand(10).to_s
  File.write(file_name, object_name + "_" + modifier + ":\n", mode: "a")
  File.write(file_name, "  reference: " + object_name + "_" + modifier + "\n", mode: "a")
  File.write(file_name, "  name: " + object_name.capitalize + "\n", mode: "a")
  File.write(file_name, "  name_again: " + object_name.capitalize + "\n", mode: "a")
  File.write(file_name, "  modifier: " + modifier+ "\n", mode: "a")
  File.write(file_name, "  random_float: " + rand(0..0.2).to_s + "\n", mode: "a")
  File.write(file_name, "  another_field: another_value\n", mode: "a")
  File.write(file_name, "  list:\n", mode: "a")
  fruit = fruits[rand(5)]
  File.write(file_name, "  - " + fruit + "\n", mode: "a")
  File.write(file_name, "  fruit: " + fruit + "\n", mode: "a")
  random_string = ""
  40.times {random_string += letters[rand(26)]}
  File.write(file_name, "  random_strings:\n", mode: "a")
  File.write(file_name, "    string1: " + object_name + "/" + object_name + "_" + modifier + "." + random_string + "\n", mode: "a")
  random_string = ""
  40.times {random_string += letters[rand(26)]}
  File.write(file_name, "    string2: " + object_name + "/" + object_name + "_" + modifier + "." + random_string + "\n", mode: "a")
  random_string = ""
  40.times {random_string += letters[rand(26)]}
  File.write(file_name, "  random_strings_again:\n", mode: "a")
  File.write(file_name, "    string1: " + object_name + "/" + object_name + "_" + modifier + "_again." + random_string + "\n", mode: "a")
  random_string = ""
  40.times {random_string += letters[rand(26)]}
  File.write(file_name, "    string2: " + object_name + "/" + object_name + "_" + modifier + "_again." + random_string + "\n", mode: "a")
end

require 'psych'
require 'benchmark'
require 'benchmark/ips'

file_name = "YAML.txt"

puts Benchmark.measure {Psych.load(File.read(file_name))}

Benchmark.ips do |x|
  x.config(:time => 5, :warmup => 50)
  x.report('Psych.load') { Psych.load(File.read(file_name))}
end

puts Benchmark.measure {Psych.load(File.read(file_name))}

ruby 2.6.5p114 (2019-10-01 revision 67812) [x86_64-darwin19]
  0.191097   0.005769   0.196866 (  0.198234)
Warming up --------------------------------------
          Psych.load     1.000  i/100ms
Calculating -------------------------------------
          Psych.load      5.306  (±18.8%) i/s -     27.000  in   5.140672s
  0.189628   0.001635   0.191263 (  0.191423)

truffleruby (Shopify) 20.3.0-dev-151cfcb0, like ruby 2.6.6, GraalVM CE JVM [x86_64-darwin]
 25.490235   0.578915  26.069150 (  6.483627)
Warming up --------------------------------------
          Psych.load     1.000  i/100ms
Calculating -------------------------------------
          Psych.load      3.237  (± 0.0%) i/s -     17.000  in   5.268170s
  0.529269   0.001507   0.530776 (  0.310655)
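
The output above only shows the first and last load plus an average, so it does not show how quickly each implementation warms up. A possible way to see the warmup curve is to time every `Psych.load` call individually. This is a sketch, not part of the original test script: the in-memory YAML document and the `time_loads` helper are hypothetical stand-ins, and the source is read once up front so that only parsing is timed.

```ruby
require 'psych'

# Hypothetical stand-in for the generated YAML.txt: a small document built in
# memory so the sketch runs without the generator script.
yaml = +"---\n"
200.times do |i|
  yaml << "item_#{i}:\n  name: Item#{i}\n  list:\n  - apple\n  fruit: apple\n"
end

# Time each Psych.load call separately. The source is already a String, so
# only parsing is measured, not File.read.
def time_loads(source, iterations)
  Array.new(iterations) do
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    Psych.load(source)
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  end
end

timings = time_loads(yaml, 20)
timings.each_with_index do |seconds, i|
  puts format("iteration %2d: %.6fs", i + 1, seconds)
end
```

To run it against the real workload, `yaml` could be replaced with `File.read("YAML.txt")` and the iteration count raised until the per-iteration time stops dropping.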
