Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Harden url fetcher and don't crash on non-ASCII urls #219

Merged
merged 1 commit into from
Mar 28, 2016

Conversation

xPaw
Copy link
Member

@xPaw xPaw commented Mar 25, 2016

Sending https://google.com\x0f would crash some lounge instances due to some obscurities in node.

This PR does following:

  • Strip IRC formatters from messages before searching for urls
  • Encode urls to prevent crashing
  • Set timeout and max redirects

@xPaw xPaw added the Type: Bug Issues that report and PRs that solve any defects that cause unexpected behaviors. label Mar 25, 2016
@@ -103,6 +102,8 @@ function fetch(url, cb) {
try {
var req = request.get({
url: url,
maxRedirects: 3,
timeout: 3000,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these arbitrary? If so, maybe we could go a bit higher, like 5/10000, but I don't have an opinion.
I guess we would empirically adjust these values, worst case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly arbitrary, 3 seconds should be enough to grab page title though. Having too big of a timeout won't be very nice to the user.

@maxpoulin64
Copy link
Member

Looks good to me, tested working. 👍

I agree with @astorije here that 3 redirects and 3 seconds is a bit low. It's somewhat easy to blow the 3 redirects. Short url + access to invalid area (= redirect to home page of the site) + redirect to regional page, we've reached 3. Some sites can take over 3 seconds to load when crowded (Reddit/imgur are easy examples) too, so 5/10000 sounds reasonable without being abusive.

@xPaw
Copy link
Member Author

xPaw commented Mar 27, 2016

Do keep in mind that timeout does not kick in while content is loading

On Sun, 27 Mar 2016, 02:23 Max-P, notifications@github.com wrote:

Looks good to me, tested working. [image: 👍]

I agree with @astorije https://github.com/astorije here that 3
redirects and 3 seconds is a bit low. It's somewhat easy to blow the 3
redirects. Short url + access to invalid area (= redirect to home page of
the site) + redirect to regional page, we've reached 3. Some sites can take
over 3 seconds to load when crowded (Reddit/imgur are easy examples) too,
so 5/10000 sounds reasonable without being abusive.


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub
#219 (comment)

@maxpoulin64
Copy link
Member

So that's a timeout for establishing the connection? I'm fine with that then.

@astorije
Copy link
Member

@xPaw, it makes sense re: title/content, but I often notice that long loading times happen even before the page starts getting in. I'm fine with 5 or 10 seconds then, still wondering if 3 is not a bit too short in some situations.
What about the redirects? @maxpoulin64 has a good point.

@xPaw
Copy link
Member Author

xPaw commented Mar 27, 2016

It should be, but needs to be looked in more closely to know for sure.

On Sun, 27 Mar 2016, 02:31 Max-P, notifications@github.com wrote:

So that's a timeout for establishing the connection? I'm fine with that
then.


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub
#219 (comment)

@xPaw
Copy link
Member Author

xPaw commented Mar 27, 2016

I've tweaked it to 5/5000

@@ -16,10 +16,9 @@ module.exports = function(irc, network) {
}

var links = [];
var split = data.message.split(" ");
var split = data.message.replace(/\x02|\x1D|\x1F|\x16|\x0F|\x03(?:[0-9]{1,2}(?:,[0-9]{1,2})?)?/g, "").split(" ");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the record/archive purpose, this is the same check as this one, to clean up messages from colors, boldness, ...

@astorije
Copy link
Member

👍, thanks for this @xPaw!

@astorije astorije merged commit df5e032 into master Mar 28, 2016
@astorije astorije changed the title Harden url fetcher Harden url fetcher and don't crash on non-ASCII urls Mar 28, 2016
@astorije astorije deleted the xpaw/limit-req branch March 28, 2016 03:18
@astorije astorije self-assigned this Apr 2, 2016
@astorije astorije added this to the 1.4.1 milestone Apr 1, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Bug Issues that report and PRs that solve any defects that cause unexpected behaviors.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants