Skip to content

Commit 8d1d7cd

Browse files
authoredSep 10, 2021
Fix potential ReDoS (#37)
1 parent c1b5e45 commit 8d1d7cd

File tree

1 file changed

+1
-1
lines changed

1 file changed

+1
-1
lines changed
 

β€Žindex.js

+1-1
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
export default function ansiRegex({onlyFirst = false} = {}) {
22
const pattern = [
3-
'[\\u001B\\u009B][[\\]()#;?]*(?:(?:(?:[a-zA-Z\\d]*(?:;[-a-zA-Z\\d\\/#&.:=?%@~_]*)*)?\\u0007)',
3+
'[\\u001B\\u009B][[\\]()#;?]*(?:(?:(?:(?:;[-a-zA-Z\\d\\/#&.:=?%@~_]+)*|[a-zA-Z\\d]+(?:;[-a-zA-Z\\d\\/#&.:=?%@~_]*)*)?\\u0007)',
44
'(?:(?:\\d{1,4}(?:;\\d{0,4})*)?[\\dA-PR-TZcf-ntqry=><~]))'
55
].join('|');
66

13 commit comments

Comments
 (13)

BeigeBox commented on Sep 23, 2021

@BeigeBox

Which versions of ansi-regex are impacted by this vulnerability? Several scanners are flagging all versions of this library other than 5.0.1 and 6.0.1. I suspect some of the older versions aren't impacted, because the pattern variable is different (just spot checked I think v2.1.1 and v3).

Qix- commented on Sep 26, 2021

@Qix-
Member

Not sure because the security researchers are not doing their due diligence and are instead farming bounty money on e.g. huntr.dev, targeting the most used repositories for clout.

I'm really not sure what to do about this and it's eating a lot of time away. The vulnerability itself is not major - unless you're allowing long AND unsanitized user input to hit the API directly, the vulnerability doesn't affect you.

I have no idea which versions it actually affects because they didn't test it correctly.

thoger commented on Sep 27, 2021

@thoger

Confirmed 4.1.0 and 3.0.0 as affected testing using the provided reproducer. 2.1.1 does not reproduce the issue.

3.0.0 is the first affected, as that's the first version that includes 69bebf6 that the problematic part of the regex.

Qix- commented on Sep 27, 2021

@Qix-
Member

Thank you @thoger. πŸ™‚

arungpro commented on Oct 12, 2021

@arungpro

@thoger Does 5.0.0 or above, are not affected this? Can you confirm on the same?

Qix- commented on Oct 12, 2021

@Qix-
Member

@arungpro 5.0.0 is affected, 5.0.1 is not. Likewise, 6.0.0 is affected, 6.0.1 is not.

OnlineCop commented on Jan 10, 2022

@OnlineCop

Would someone be able to help me understand the exact formatting going on here?

I assume that the \ backslashes are escaped because this is within a string, meaning \u001B must be represented as \\u001B ?

I copied the "proposed" escaped original into regex101 and removed the escape characters to produce this:

[\u001B\u009B][[\]()#;?]*(?:(?:(?:(?:;[-a-zA-Z\d\/#&.:=?%@~_]+)*|[a-zA-Z\d]+(?:;[-a-zA-Z\d\/#&.:=?%@~_]*)*)?\u0007)

I then plugged that unescaped version back in to determine whether there were any problems with the pattern itself, and see that there is an incomplete/unclosed (?: which should probably be removed:

[\u001B\u009B][[\]()#;?]*(?:(?:;[-a-zA-Z\d\/#&.:=?%@~_]+)*|[a-zA-Z\d]+(?:;[-a-zA-Z\d\/#&.:=?%@~_]*)*)?\u0007

The next outer (?: ... ) group appears to have no alternations or quantifiers, so can probably also be removed:

(?:(?:;[-a-zA-Z\d\/#&.:=?%@~_]+)*|[a-zA-Z\d]+(?:;[-a-zA-Z\d\/#&.:=?%@~_]*)*)?\u0007)

The next (?: ... | ... )? group does contain an alternation, and is qualified by its ? quantifier. Immediately after this optional group, it looks for \u0007 at the end.

Within both alternations, the character groups [-a-zA-Z\d\/#&.:=?%@~_] can be shortened: [a-zA-Z0-9_] or [a-zA-Z\d_] can be replaced with \w, so: [-\w\/#&.:=?%@~].

That would look like this:

[\u001B\u009B][[\]()#;?]*(?:(?:;[-\w\/#&.:=?%@~]+)*|[a-zA-Z\d]+(?:;[-\w\/#&.:=?%@~]*)*)?\u0007

So, within the first alternation, it matches a semicolon ; followed by one or more of those character-group characters. Within the second alternation, it first looks for letters or numbers, followed by any number of semicolon ; followed by zero or more of those character-group characters. I'm not sure whether that would want to be a + one-or-more, since that would allow "abc;;;;;;;;;" to match there instead of requiring characters between those semicolons.

If you look at this example, I've changed the outer non-capturing (?: ... )? to a capturing group just for illustrative purposes so you can see how the semicolon is matched in one case but not in the other due to this zero or more quantifiers. You can see this when the semicolon is added to the text in this next example.

If both ;[-\w\/#&.:=?%@~] character groups are quantified with zero-or-more *, then subsequent semicolons ;; will match, while if they are quantified with one-or-more +, the match fails due to subsequent semicolons ;;.

lorand-horvath commented on Apr 2, 2022

@lorand-horvath

ansi-regex v3 and v4 should now include the vulnerability fixes:
3.0.1 in 419250f and c57d4c2
4.1.1 in 75a657d

romans-ovo commented on Jul 13, 2022

@romans-ovo

ansi-regex v3 and v4 should now include the vulnerability fixes: 3.0.1 in 419250f and c57d4c2 4.1.1 in 75a657d

Neither 3.0.1, nor 4.1.1 are listed on https://www.npmjs.com/package/ansi-regex (under Versions), btw.

Qix- commented on Jul 13, 2022

@Qix-
Member

image

You sure?

papb commented on Aug 1, 2022

@papb

Would someone be able to help me understand the exact formatting going on here?

@OnlineCop can you first help me understand what exactly is your question here? Are you trying to understand what the code is doing? Are you proposing a readability improvement? Or performance improvement? Sorry, you said a lot of things but I couldn't even understand what is your goal, roughly...

I assume that the \ backslashes are escaped because this is within a string, meaning \u001B must be represented as \\u001B ?

Yes, exactly.

OnlineCop commented on Aug 1, 2022

@OnlineCop

In the original, lines 2-5 were looking for:

  1. either \u001B (ESC) or \u009B (Β’ cent sign) followed by 0 or more of these characters: []()#;?

After that, it was looking for either:
2. any number of letters, numbers, semi-colons, or special characters that came before \u0007 (BEL)
- There were a lot of 0-or-more groups, all of which was also within its own (?:...)? 0-or-1 (optional) group, before that BEL.
- It was definitely something that could cause ReDoS.
3. or an optional group of digits and semi-colons, followed by a single character in the range [\dA-PR-TZcf-ntqry=><~].
- This optional group also had potential for a ReDoS.

In the revised version, lines 2-5 were looking for the same characters as 1. above, but increased the variety of characters allowed before \u0007 and still has a lot of potential for ReDoS (possibly more than before).

I don't understand what exactly is supposed to be matched here. Both BEFORE and AFTER versions have a high probability to cause catastrophic backtracking. Are there any tests that should match either of the BEFORE or AFTER regexes?

Qix- commented on Aug 2, 2022

@Qix-
Member

Feel free to create a reproduction case that exploits exponential time complexity and file an issue. Otherwise I'm not sure what you're trying to say.

Please sign in to comment.