Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added ability to set the timeout of an HTTP request #502

Merged
merged 5 commits into from
Jul 6, 2023
Merged
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 13 additions & 0 deletions src/HtmlAgilityPack.Shared/HtmlWeb.cs
Original file line number Diff line number Diff line change
Expand Up @@ -98,6 +98,7 @@ public partial class HtmlWeb
private bool _usingCacheAndLoad;
private bool _usingCacheIfExists;
private string _userAgent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:x.x.x) Gecko/20041107 Firefox/x.x";
private int _timeout = 100000;

/// <summary>
/// Occurs after an HTTP request has been executed.
Expand Down Expand Up @@ -801,6 +802,15 @@ internal static HttpClient GetSharedHttpClient(string userAgent)
/// <value>The automatic decompression.</value>
public DecompressionMethods AutomaticDecompression { get; set; }

/// <summary>
/// Gets or sets the timeout value in milliseconds.
/// </summary>
public int Timeout
{
get { return _timeout; }
set { if (value < 0) { _timeout = 0; } else { _timeout = value; } }
Copy link
Contributor

@elgonzo elgonzo Jul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should an assignment like HtmlWeb.Timeout = -4567 be equivalent to HtmlWeb.Timeout = 0, and why and how would some HAP user possibly expect -4567 being the same as 0 for timeout values?

In my opinion it would rather make more sense to throw an ArgumentOutOfRangeException for TimeOut values less than or equal to zero except Timeout.Infinite (Timeout.Infinite is the value -1 which indicates inifinite timeout), equivalent to the behavior of HttpWebRequest.Timeout (rejecting negative values except Timeout.Infinite) and HttpClient.Timeout (rejecting TimeSpan values that are zero or less than zero except TimeSpan.Infinite).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I adjusted the setter to throw a ArgumentOutOfRangeException if the value is less than or equal to zero. Thanks!

Copy link
Contributor

@elgonzo elgonzo Jul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that i edited my comment, as i originally did not account for Timeout.Infinite (which is the value -1)

So, basically, the setter should throw if value <= 0 && value != Timeout.Infinite.

(My apologies for this oversight in my original comment.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, I do not expect any users to be waiting infinitely for a document to load. The default value for HttpClient.Timeout and HttpWebRequest.Timeout is 100,000 ms, or 100 seconds. Adding an infinite timeout is a bit troublesome since HttpWebRequest requires an integer while HttpClient requires a Timespan object to be passed instead. I think throwing an exception if the value is zero or less is more of a practical approach.

Copy link
Contributor

@elgonzo elgonzo Jul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding an infinite timeout is a bit troublesome since HttpWebRequest requires an integer while HttpClient requires a Timespan object to be passed instead.

No, it's actually not troublesome :-) because Timeout.InfiniteTimeSpan is per definition a TimeSpan value of -1 millisecond. ThereforeTimeSpan.FromMilliseconds(-1) (i.e., TimeSpan.FromMilliseconds(Timeout.Infinite)) equals Timeout.InfiniteTimeSpan. No special treatment necessary here, just pass the integer timeout value to TimeSpan.FromMilliseconds and Bob's your uncle...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh really? I thought TimeSpan.FromMilliseconds(-1) would throw an exception that it's out of bounds. In this case, we can change that if statement to be this if (value <= 0 && value != -1) { throw new ArgumentOutOfRangeException(); } There you go, works as expected.

}

/// <summary>
/// Gets or Sets a value indicating if document encoding must be automatically detected.
/// </summary>
Expand Down Expand Up @@ -1571,6 +1581,7 @@ private HttpStatusCode Get(Uri uri, string method, string path, HtmlDocument doc
bool oldFile = false;

req = WebRequest.Create(uri) as HttpWebRequest;
req.Timeout = Timeout;
req.Method = method;
req.UserAgent = UserAgent;
req.AutomaticDecompression = AutomaticDecompression;
Expand Down Expand Up @@ -1841,6 +1852,8 @@ private HttpStatusCode Get(Uri uri, string method, string path, HtmlDocument doc
using (var handler = new HttpClientHandler())
using (var client = new HttpClient(handler))
{
client.Timeout = TimeSpan.FromMilliseconds(Timeout);

if(CaptureRedirect)
{
handler.AllowAutoRedirect = false;
Expand Down