February 5, 2019
A recently discovered vulnerability in NumPy, the widely used open source package for scientific computing in Python, allows for the execution of arbitrary, potentially malicious code.
NumPy is part of the SciPy ecosystem, which is a collection of open source software packages for mathematics, science, and engineering. NumPy is used in both industry and academia and has over 9,000 stars on GitHub and more than 700 contributors. According to the records available through PyPI’s BigQuery, NumPy has been downloaded nearly 4 million times in the last three months.
On 16th Jan 2019, GitHub user nanshihui opened an issue stating that the numpy.load function was vulnerable to the execution of arbitrary code because of how it used the Python pickle module. The pickle module serializes and deserializes Python objects. “Pickling” converts a Python object hierarchy into a byte stream, and “unpickling” takes a byte stream and converts it back into an object hierarchy. Accepting pickled arrays from untrusted sources is dangerous because there is no way to verify their contents before unpickling, and a pickle can instruct the unpickler to call functions of the sender's choosing. If a malicious payload is provided as input, it can execute as soon as it is unpickled.
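To make the risk concrete, here is a minimal sketch of how unpickling can run code. It uses only the standard library; the `Payload` class is a hypothetical name chosen for illustration, and the expression it evaluates is deliberately harmless. A real attacker could substitute any callable and arguments.

```python
import pickle


class Payload:
    """Hypothetical class used only to illustrate the mechanism."""

    def __reduce__(self):
        # pickle stores a (callable, args) pair for this object.
        # On load, the unpickler calls callable(*args) and uses the result.
        return (eval, ("6 * 7",))


data = pickle.dumps(Payload())  # the expression is stored, not evaluated
result = pickle.loads(data)     # eval("6 * 7") runs here, during loading

print(result)  # the value produced by the embedded expression, not a Payload
```

The receiver never asked to run `eval`; the byte stream itself requested it. That is why inspecting the result after loading is too late.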
The vulnerability is present in NumPy versions between 1.10 (released in October of 2015) and the most recent version, 1.16. More details on this vulnerability can be found here in Snyk’s database.
The affected NumPy versions allow pickled object arrays to be loaded by default. Loading them is only safe when the source is trusted, so allowing it by default is bad practice.
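As a rough aid, the version range quoted above can be checked with a few lines of Python. This is only a sketch: `in_reported_range` is a hypothetical helper, it compares just the major and minor numbers, and it reflects the range reported in this post at the time of writing, not any later fixes.

```python
def in_reported_range(version):
    """Return True if `version` falls in the 1.10 to 1.16 range named above."""
    # Only the first two components matter for this comparison.
    major, minor = (int(part) for part in version.split(".")[:2])
    return (1, 10) <= (major, minor) <= (1, 16)


print(in_reported_range("1.15.4"))  # True
print(in_reported_range("1.9.3"))   # False
```

To check an installed copy, pass `numpy.__version__` to the helper.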
Currently there are no known instances of this vulnerability being exploited, but that could change as awareness of the problem grows. The exploit does not require authentication, nor does it take much technical knowledge to use.
At the time of this writing, neither a patch nor an upgrade has been made available, although the NumPy team are working on it. You can make sure that you are not vulnerable by passing allow_pickle=False to any call to numpy.load that loads data from an untrusted source.
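The following sketch shows what that looks like in practice. It assumes NumPy is installed and uses in-memory buffers in place of files from an untrusted source; the variable names are illustrative. Plain numeric arrays load normally, while an object array, which can only be stored by pickling, is refused.

```python
import io

import numpy as np

# A numeric array is stored in the .npy format without any pickled data.
numeric_buf = io.BytesIO()
np.save(numeric_buf, np.arange(3))
numeric_buf.seek(0)
loaded = np.load(numeric_buf, allow_pickle=False)  # loads as usual

# An object array has to be pickled when it is saved.
object_buf = io.BytesIO()
np.save(object_buf, np.array([{"a": 1}, None], dtype=object), allow_pickle=True)
object_buf.seek(0)

try:
    np.load(object_buf, allow_pickle=False)
    rejected = False
except ValueError:
    # NumPy refuses to unpickle, so no embedded code gets a chance to run.
    rejected = True

print(loaded.tolist(), rejected)
```

Passing the flag explicitly keeps the call safe regardless of what the installed version's default happens to be.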
Are you affected?
If you are a Snyk user, you will be notified through your routine alerts if you have projects with vulnerable NumPy dependencies.