I found an existing issue on this topic, but it was closed without being resolved, so I am opening a new one.
Sometimes gocrawl does not meet our requirements and we need to use goquery to fetch and parse data. If we use goquery too frequently, the website may block us, so being able to configure the user-agent is necessary.
Do you have any plans to implement this?
If you do not plan to add this feature, could you suggest how we can set a user-agent or proxy when we call the method below?
doc, err := goquery.NewDocument(url)
Thanks a lot
As mentioned in #173, NewDocument(url) is just a helper function that should not have been added in the first place. Goquery is not concerned with how you get the HTML; it is about manipulating that HTML.
To set the user-agent, the recommendation I made in the issue you linked still stands: use Go's stdlib (or any other network request package) to make the request, with full control over the user-agent and anything else about the request. Once you get a response you're happy with, pass its body to NewDocumentFromReader.