Vulnerable Regular Expressions #212

cristianstaicu · 2017-09-06T14:45:41Z

The following regular expressions used in underscore and unescapeHTML methods are vulnerable to ReDoS:

/([A-Z\d]+)([A-Z][a-z])/g
/\&([^;]+);/g

The slowdown is moderately low (for 50,000 characters around 2 seconds matching time). I would suggest one of the following:

remove the regex,
anchor the regex,
limit the number of characters that can be matched by the repetition,
limit the input size.
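
As a rough illustration of the third suggestion, the repetition in the second pattern can be given an upper bound. This is only a sketch: the limit of 32 characters and the name `boundedEntityPattern` are assumptions made for illustration, not something taken from string.js.

```javascript
// Assumed upper bound on the entity body; pick a value that covers the
// entities you expect to see.
const boundedEntityPattern = /&([^;]{1,32});/g;

// With the bound, each '&' can cause at most 32 characters of backtracking
// instead of a scan to the end of the input.
console.log('a &amp; b'.replace(boundedEntityPattern, '[$1]')); // a [amp] b

// An over-long "entity" is simply left alone.
const tooLong = '&' + 'x'.repeat(100) + ';';
console.log(tooLong.replace(boundedEntityPattern, '[$1]') === tooLong); // true
```

The trade-off is that any entity-like sequence longer than the bound is no longer matched at all.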

If needed, I can provide an actual example showing the slowdown.
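
One rough way to observe the slowdown is to time each pattern against an input that can never match, so that every starting position backtracks to the end of the string. The sketch below assumes the patterns are applied with `String.prototype.replace`; the helper `timeReplace` and the input sizes are made up for illustration, and the timings depend on the engine and machine.

```javascript
// The two patterns quoted in the report.
const underscorePattern = /([A-Z\d]+)([A-Z][a-z])/g;
const unescapePattern = /\&([^;]+);/g;

// Hypothetical helper: time one replace() call and return the elapsed
// milliseconds together with the result.
function timeReplace(pattern, input) {
  const start = Date.now();
  const output = input.replace(pattern, '');
  return { ms: Date.now() - start, output };
}

for (const n of [2000, 4000, 8000]) {
  const upper = 'A'.repeat(n); // no lowercase letter, so the second group never matches
  const amps = '&'.repeat(n);  // no ';', so the entity never closes
  console.log(
    n,
    timeReplace(underscorePattern, upper).ms + ' ms',
    timeReplace(unescapePattern, amps).ms + ' ms'
  );
}
```

If the matching cost is quadratic, doubling the input length should roughly quadruple the time once the inputs are large enough to measure.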

kvanbere · 2017-09-26T03:40:40Z

anchor the regex

Could you please elaborate on the fix, I'm curious. Thank you!

cristianstaicu · 2017-09-26T11:53:40Z

Sorry, that was a standard comment I sent around. In this case it would not work! The anchoring solution is well suited when the regex is used for validation purposes. Let us say you want to check an input string is an email address. Instead of having a simple regex like:

/[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]/

you should have something like:

/^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]$/

Word boundaries can also be used as explained here:
https://www.regular-expressions.info/email.html

This will decrease the complexity of the matching since the engine will only try to match the regex once, not n (length of input string) times like in the case of the first regex.
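
A small sketch of the difference, using the two patterns exactly as written above (they only accept uppercase input and a single letter after the last dot, so the sample strings are chosen to fit):

```javascript
const unanchored = /[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]/;
const anchored = /^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]$/;

// The unanchored pattern accepts any string that merely contains a match.
console.log(unanchored.test('junk A@B.C junk')); // true

// The anchored pattern must match the whole string.
console.log(anchored.test('junk A@B.C junk')); // false
console.log(anchored.test('A@B.C'));           // true
```

This only shows the validation behaviour; whether the engine actually skips the other start positions is an implementation detail.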

kvanbere · 2017-09-26T12:56:34Z

Thanks for the explanation!

I hope this gets fixed soon, it's giving my package red lights :(

burt202 · 2017-09-26T14:17:18Z

This issue now seems to be picked up by an nsp check, making our builds red too. See https://nodesecurity.io/advisories/536

danzGentrack · 2017-09-29T03:07:23Z

The slowdown is moderately low (for 50,000 characters around 2 seconds matching time).

Just wondering why it is a problem? For example, the string "underscore" algorithm could be a very complex one and may need lots of seconds to process a 50,000 characters long string.

cristianstaicu · 2017-09-29T08:32:36Z

I recommend the following two articles for understanding the issue:
https://blog.liftsecurity.io/2014/11/03/regular-expression-dos-and-node.js/
https://dl.acm.org/citation.cfm?id=3065916

Basically, the problem comes from the fact that the regex matching is done in the main loop of Node.js and thus blocking this loop for 2 seconds is equivalent to blocking all the incoming requests for 2 seconds.
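
Because the match runs synchronously, capping the input length (one of the suggestions in the original report) bounds how long a single call can hold the loop. A minimal sketch, where `replaceBounded` and the default limit of 1000 characters are hypothetical names and values chosen for illustration:

```javascript
// Assumed default cap; choose a value that suits your payloads.
const MAX_INPUT_LENGTH = 1000;

// Reject oversized input before handing it to a backtracking regex,
// so that one call cannot block the event loop for long.
function replaceBounded(input, pattern, replacement, maxLength = MAX_INPUT_LENGTH) {
  if (typeof input !== 'string') {
    throw new TypeError('input must be a string');
  }
  if (input.length > maxLength) {
    throw new RangeError(`input longer than ${maxLength} characters`);
  }
  return input.replace(pattern, replacement);
}

console.log(replaceBounded('HTMLParser', /([A-Z\d]+)([A-Z][a-z])/g, '$1_$2')); // HTML_Parser
```

Whether to throw, truncate, or fall back to a slower but safe path on oversized input is a design choice for the caller.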

danzGentrack · 2017-10-01T20:21:09Z

Thanks for the info. The example in the first link is a classic evil regex and it can be written in a more efficient way.
var emailExpression = /^([a-zA-Z0-9_.-])+@(([a-zA-Z0-9-])+.)+([a-zA-Z0-9]{2,4})+$/;

The regex used in the "underscore" case is fine: /([A-Z\d]+)([A-Z][a-z])/g , which may not incur much performance penalty compared to an alternative implementation. I understand the impact of slowing regex matching to an application. My point is that if a regular expression is used correctly, then it is not a problem to be slow.

revelt · 2017-10-01T22:13:51Z

For posterity, Snyk also reports it:
https://snyk.io/vuln/npm:string:20170907

I'm looking forward to a solution. The HTML-stripping features of the competing libraries, for example, are very poor, and I can't find an alternative to String.js.

ericnorris · 2017-10-02T18:36:29Z

Hey there! I haven't looked too closely at this issue but it was linked by @revelt above. I'd definitely suggest using the striptags library for removing HTML. It is based on the PHP function strip_tags, which is widely known to be safe.

In the README I link to two articles, one indicating the regular expressions are not safe for removing dangerous HTML (here), and the second demonstrating that PHP's strip_tags function is not vulnerable (here).

https://github.com/ericnorris/striptags take a look to see if that works for you - it uses a state machine very similar to the one that is in the PHP internals instead of a regular expression. As a result it is super fast and just as secure!
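
To give a feel for the state-machine approach, here is a deliberately small sketch. It is not the striptags implementation (the real library handles more cases); `stripTagsSketch` is a hypothetical name, and the scanner only tracks whether it is inside a tag and inside a quoted attribute value.

```javascript
// Single pass over the input, so the running time is linear in its length.
function stripTagsSketch(html) {
  let out = '';
  let inTag = false; // true between '<' and the closing '>'
  let quote = null;  // active quote character while inside an attribute value

  for (const ch of html) {
    if (!inTag) {
      if (ch === '<') inTag = true;
      else out += ch;
    } else if (quote !== null) {
      if (ch === quote) quote = null; // closing quote ends the attribute value
    } else if (ch === '"' || ch === "'") {
      quote = ch; // a '>' inside quotes must not end the tag
    } else if (ch === '>') {
      inTag = false;
    }
  }
  return out; // an unterminated tag is dropped up to the end of the input
}

console.log(stripTagsSketch('<b>hi</b> there'));         // hi there
console.log(stripTagsSketch('<a title="x>y">link</a>')); // link
```

Note that this sketch treats every `<` as the start of a tag, so an input such as aaa<bbb is cut down to aaa.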

stevenvachon · 2017-10-11T13:36:26Z

This is a critical issue. The author and the contributors don't seem to care at all. Kill this project with fire.

revelt · 2017-10-11T13:38:18Z

@Steve come on, it's not that bad :) team is just regrouping, I'm sure guys will fix that eventually

yieme · 2017-10-18T15:41:01Z

@jprichardson, string.js is a nice library and I would prefer to continue using it; however, such a vulnerability isn't something that can be used in production. Are you aware of this issue or has the project been abandoned as the last update was about 11 months ago?

nconf-base · 2017-10-18T15:54:08Z

@revelt, I can appreciate the concerns raised by @Steve and @yieme correctly points out lack of update activity for almost a year. This vulnerability has been live for over a month. At what point should we consider "eventually" unacceptable?

#542) * Remove stringjs dependency due to vulnerability in string 3.3. It is used so little there is no need for the extra dependency in Swagger-tools. Source: CERT Name: https://nodesecurity.io/advisories/536 Url: https://nodesecurity.io/advisories/536 Source: CERT Name: jprichardson/string.js#212 Url: jprichardson/string.js#212 * Suggested changes.

revelt · 2017-10-18T16:20:33Z

@nconf-base I think now, because if you have string.js in production, your code is vulnerable now. Personally, I'm going to switch to alternatives asap and equally come back asap when it's fixed.

As far as my stuff is concerned, I have already replaced every string.js method with alternatives in my libraries, except that I can't find a good-quality, nothing-assumed tool to strip HTML tags from strings. Within October I plan to release an alternative, string-only, no-regex library which strips HTML, which will untie me completely from string.js.

I'm thinking, it can't be that all the methods of string.js are affected, can it? If some are OK, maybe we could split string.js into separate libs like lodash does with lodash.* and so on? This way, affected methods/libraries would be isolated and it would be easier to pinpoint the culprits. You know, divide and conquer approach...

nconf-base · 2017-10-18T16:38:30Z

@revelt, I think you are correct the issue only affects a couple of the methods, specifically underscore and unescapeHTML, as per https://nodesecurity.io/advisories/536

I think your idea of decomposing the larger string.js library into smaller pieces might help those who use the library without those particular methods. Splitting out the problem methods on a fork might be a start.

How does your strip tags method differ from https://github.com/ericnorris/striptags that eric pointed out?

revelt · 2017-10-18T16:45:17Z

@nconf-base it assumes too much, which results in a bunch of false positives. I just scratched the surface with that issue; many of my unit tests fail because of striptags if I switch. Same with other similarly-named libraries from the first npm results page; I raised issues on at least two others.

For the record, string.js copes fine stripping HTML from stuff like aaa<bbb.

ericnorris · 2017-10-18T18:12:25Z

@revelt, I'm not sure I would agree that striptags results in "false positives". Any function that ostensibly strips HTML must assume all input is HTML, and as such must remove tags, incomplete or not.

Leaving aaa<bbb may result in unsafe HTML if it is concatenated with stuff on the page. A more concrete example is if it is fed aaa<script src="evil.js". The tag is incomplete, but when placed on a page (or concatenated with other safe HTML) may end up producing valid HTML that results in evil.js executing. That's not acceptable for a library responsible for removing HTML.

If the input is "safe", then < should be HTML encoded, e.g. aaa&lt;bbb.
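
A minimal sketch of such encoding, assuming the text is destined for HTML element content or a quoted attribute value. `escapeHtml` and `HTML_ESCAPES` are hypothetical names, not part of string.js or striptags.

```javascript
// Characters that are significant in HTML and their entity replacements.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Replace each significant character; the character class has no
// repetition, so there is nothing for the regex engine to backtrack over.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml('aaa<bbb')); // aaa&lt;bbb
```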

@nconf-base I would strongly suggest using striptags! I am, of course, biased, but it is based on the very popular function from PHP (http://php.net/manual/en/function.strip-tags.php) and is battle-tested.

revelt · 2017-10-18T20:21:49Z

@ericnorris I see your point. Basically, there are two kinds of HTML-stripping cases and we both are proponents of each.

My case is Detergent.js, an app for cleaning email text copy: HTML can come in the input and there is a high risk of false positives, but zero chance of rogue code being executed, and if code does come, it comes unescaped. So text such as a < b and c > d is very likely and legit (consider that even b is a valid tag name).

Your case is a regular web-dev ops where you clean HTML in order to defuse potentially malicious code. Prioritising security over risk of false positives. All code assumed to be escaped and unescaped-one a no-no.

String.js was this middle-ground solution, being able to detect aaa<bbb correctly, yet, at the same time, reliably strip malicious HTML for years in everybody's apps. It tended both sides. I'd gladly use striptags if it tended both sides as well.

Do you see my point?

ghost · 2017-11-22T16:09:37Z

@revelt that's not tending both sides. < is never safe in HTML and under some legacy parsing modes I think the browser may even auto-complete what it thinks is an incorrectly truncated tag with >. Any library that leaves any < inside something to be embedded into HTML is simply unsafe to use for HTML output, plain and simple. (since < outside of a HTML tag is never valid)

Edit: If you refer to the use case of stripping first and then replacing any remaining < and > afterwards with &lt; and &gt; anyway, I suppose that's fair enough.

revelt · 2017-11-22T16:24:16Z

@Jonast Yes, you are right, it's not safe and not valid in HTML. But... Detergent accepts both HTML and pure text and outputs text with sprinkled HTML or without. There is a thin path in here where > is valid both as input and output. For example, if people want to remove invisible characters from some text copy. Like a < b and c > d\u0003 (invisible ETX character in the end). That's a legit request and Detergent will do that. I want to produce a < b and c > d, not a d\u0003 or a d. Do you see my point?

Since user pastes the copy into Detergent, input is assumed to be safe, so I see no problems letting them output unencoded < if they wish (encoding is turned on by default, but it can be turned off).
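
The invisible-character case can be sketched independently of any tag handling. This is not Detergent's code; `removeInvisible` is a hypothetical helper that drops ASCII control characters other than tab, line feed and carriage return, and leaves < and > untouched.

```javascript
// Control characters U+0000-U+0008, U+000B, U+000C, U+000E-U+001F and U+007F.
const INVISIBLE = /[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F]/g;

function removeInvisible(text) {
  return text.replace(INVISIBLE, '');
}

console.log(removeInvisible('a < b and c > d\u0003')); // a < b and c > d
```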

Having said that, I'm about to publish my type of HTML stripping tool which behaves like that.

StorytellerCZ · 2017-12-07T12:53:35Z

@az7arul Are you currently maintaining this or who is in charge here?

az7arul · 2017-12-07T14:11:36Z

Hi @StorytellerCZ, I wasn't maintaining this for a while. But for this I can take a look this weekend; meanwhile, any PR that addresses this is welcome.

ghost · 2017-12-07T16:55:54Z

// This function converts a binary string (can be HTML or JavaScript) to a safe string that can be used in a HTML file within JavaScript tags.
function EncodeHTML(S) { var i, C; S = S.split(""); for (i = 0; i < S.length; i++) { C = S[i].charCodeAt(0); if (C == 7) S[i] = "\t"; else if (C == 10) S[i] = "\r"; else if (C == 13) S[i] = "\n"; else if (C == 92) S[i] = "\\"; else if (C == 34 || C == 38 || C == 39 || C == 60 || C == 62 || C < 32 || C > 126) S[i] = "\x" + toHex(C); } return '"' + S.join("") + '"'; }

// This function converts a Js safe string with quotation marks around it to its original source. It's the opposite of EncodeHTML()
function DecodeHTML(S) { S += ""; if (S.length > 1) { if (S.charAt(0) == '"' || S.charAt(0) == "'") S = S.slice(1, -1); } S = S.split("\t").join("\t").split("\r").join("\r").split("\n").join("\n").split("\\").join("\").split("\x"); for (var i = 1; i < S.length; i++) S[i] = String.fromCharCode(parseInt(S[i].substr(0, 2), 16)) + S[i].slice(2); return S.join(""); }

ghost · 2017-12-07T16:58:22Z

Oops. I just noticed that the above code didn't come out right. I should have put that code within CODE section, not as plain text, because the double backslash characters have been replaced with single. :/

najibk · 2017-12-26T14:03:20Z

Hello, any news about this ?
Thanks

angleman · 2017-12-26T16:46:56Z

@az7arul, thank you for looking into this. We have nsp check as part of our deploy process and this security issue breaks the build.

revelt · 2018-03-01T11:30:05Z

@ankurloriya No, the all-in-one monolithic replacement doesn't exist.

But you can replace its methods one-by-one with libraries from npm. For example, I replaced replaceAll() and with replace-string, lines() with split-lines. There were no suitable HTML tag stripping libraries to replace stripTags() so I coded my own. If your unit tests are thorough they will pick up any inconsistencies (or their absence) of a replacement library. I found it difficult to judge replacements by their readmes, - my unit tests were the judge.
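
For the simpler methods, a replacement can be only a few lines. The helpers below are hypothetical stand-ins written for illustration; they are not the APIs of replace-string or split-lines, and they may not match string.js behaviour in every edge case, which is exactly what unit tests should check.

```javascript
// Literal (non-regex) replace-all via split/join.
function replaceAllLiteral(input, search, replacement) {
  if (search === '') return input; // avoid splitting into single characters
  return input.split(search).join(replacement);
}

// Split on Windows, old Mac and Unix line endings.
function splitLines(input) {
  return input.split(/\r\n|\r|\n/);
}

console.log(replaceAllLiteral('a.b.c', '.', '-')); // a-b-c
console.log(splitLines('one\r\ntwo\nthree'));      // [ 'one', 'two', 'three' ]
```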

Performance-wise, switching from monolithic string.js to separate libs will reduce your bundle's footprint too. You might even get some of them as Rollup'ed 3-in-1 builds today, where your Webpack/Rollup will tap their ES module build directly. On the other hand, it will take time for string.js to migrate to Rollup; it doesn't have an ES module build at the moment.

yieme · 2018-07-27T18:54:50Z

We changed to using various other libraries due to this security issue. It appears the project may be abandoned; per #218, the http://stringjs.com/ domain is apparently going away. Sad, it was a good library that had a lot of useful tools in one place. Hopefully it gets turned around for people. We just couldn't wait any longer for the security issue to be resolved. Cheers.

We use string@3.3.3 to generate our header id. string@3.3.3 is reported to be unsafe and contains vulnerabilities [1]. As a step towards removing string@3.3.3 as a dependencies, let's replace all usage of the 'string' methods with suitable equivalents. [1] - jprichardson/string.js#212

We no longer use string@3.3.3 in our codebase. string@3.3.3 is reported to be unsafe and contains vulnerabilities [1]. Let's remove string@3.3.3 as one of our dependencies. [1] - jprichardson/string.js#212

markdown-it-anchor@4.0.0 uses string@3.3.3. string@3.3.3 is reported to be unsafe and contains vulnerabilities [1]. It is also no longer in use in markdown-it-anchor@5.0.0. Let's upgrade markdown-it-anchor, so that it no longer uses string@3.3.3. [1] - jprichardson/string.js#212

markdown-it-table-of-contents@0.3.2 uses string@3.3.3. string@3.3.3 is reported to be unsafe and contains vulnerabilities [1]. It is also no longer in use in markdown-it-table-of-contents@0.4.0. Let's upgrade markdown-it-table-of-contents, so that it no longer uses string@3.3.3. [1] - jprichardson/string.js#212

smarusa mentioned this issue Oct 17, 2017

Remove stringjs dependency due to vulnerability in string 3.3. It is … apigee-127/swagger-tools#542

Merged

commenthol added a commit to commenthol/string.js that referenced this issue Mar 4, 2018

[jprichardson#212] Fix Vulnerable Regular Expressions

eab9511

commenthol mentioned this issue Mar 4, 2018

[#212] Fix Vulnerable Regular Expressions #217

Open

robsonbittencourt added a commit to hubot-js/hubot.js that referenced this issue Apr 21, 2018

Replace string package by voca

3a88914

String package have a vulnerability and was little used in project jprichardson/string.js#212

stuft2 mentioned this issue May 21, 2018

nsp string vulnerability apigee-127/swagger-tools#569

Closed

Manvel mentioned this issue Jun 5, 2018

High vulnerability error on "npm audit" valeriangalliat/markdown-it-anchor#44

Closed

rodrigooler mentioned this issue Jun 7, 2018

Vulnerability found in npm audit open-cli-tools/chokidar-cli#55

Closed

lgodmer mentioned this issue Jun 29, 2018

Remove dependency on string@3.3.3 which has a security vulnerability (ReDos) cmaas/markdown-it-table-of-contents#29

Closed

pjquirk mentioned this issue Jul 25, 2018

Remove dependency on strings.js 3.3.3 due to vulnerability report microsoft/azure-pipelines-tasks#7834

Merged

nicojs mentioned this issue Aug 1, 2018

Replace dependency string with voca leff/markdown-it-named-headers#9

Open

yamgent mentioned this issue Aug 3, 2018

Remove strings package MarkBind/vue-strap#90

Merged

dlong500 mentioned this issue Aug 9, 2018

remove string module dependency pofider/node-script-manager#15

Closed

This was referenced Sep 10, 2018

Fix vulnrerable dependency contentful/contentful-migration#133

Merged

Fix vulnrerable dependency contentful/contentful-cli#85

Merged

xinghengwang mentioned this issue Sep 11, 2018

Regular Expression Denial of Service Moesif/moesif-nodejs#4

Closed

kuceb mentioned this issue Sep 18, 2018

remove stringjs dependency, add grunt-cli danhper/node-git-cli#10

Merged

daohoangson added a commit to xfrocks/node_pubhubsubbub_pushserver that referenced this issue Sep 25, 2018

Use sanitize-html instead of string for security reason.

13f36b0

See jprichardson/string.js#212

Xenonym mentioned this issue Jan 15, 2019

Restore old anchor generation behaviour from before v1.15.0 MarkBind/markbind#578

Merged

mkonikov mentioned this issue Apr 17, 2020

High Severity Vulnerability Alert #226

Open

djfdyuruiry mentioned this issue May 23, 2021

High vulnerability NPM with inner dependency djfdyuruiry/pretty-markdown-pdf#18

Open
