Added Sleep time because timeout error #2029

Closed
wants to merge 1 commit into from

Conversation

Nabeelshar
Contributor

All of these template websites block crawlers when many URLs are requested at the same time, which results in a timeout error. To work around that, I have added sleep calls totalling 30 seconds.
Suggestions

- Maybe someone can improve this by looking into how the websites check incoming requests (cache, IP, etc.).

@Nabeelshar changed the title from "Added Sleep time because timeout counter" to "Added Sleep time because timeout error" on Aug 5, 2023
@dipu-bd
Owner

dipu-bd commented Aug 9, 2023

This should not work as intended. The concurrency is being handled by TaskManager.

Check a code sample of how to reduce concurrency here:

self.init_executor(1)
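
For context, here is a minimal sketch of why the worker count, rather than a sleep inside an individual task, bounds how many requests run at once. It uses only the Python standard library. `run_with_workers` and `fake_fetch` are hypothetical names for illustration and are not part of this project or its TaskManager; the assumption is that `init_executor(1)` behaves roughly like a pool with a single worker.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_with_workers(urls, workers):
    """Run placeholder fetch tasks on a thread pool.

    Returns the results and the peak number of tasks that were
    in flight at the same time.
    """
    lock = threading.Lock()
    active = 0
    peak = 0

    def fake_fetch(url):
        # Hypothetical stand-in for a real HTTP request.
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)  # simulate network latency
        with lock:
            active -= 1
        return url

    # max_workers caps how many tasks can run simultaneously.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(fake_fetch, urls))
    return results, peak
```

With `workers=1` the peak never exceeds one task, no matter how many URLs are queued. A sleep inside `fake_fetch` would only make each task slower, and with more workers several tasks could still hit the site at the same moment.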

@Nabeelshar
Contributor Author

Nabeelshar commented Aug 19, 2023

It is working. I will try to look into the TaskManager code later.

This should not work as intended. T

@dipu-bd
Owner

dipu-bd commented Aug 27, 2023

You can now use the ratelimiter thanks to #2053. Example:

self.init_executor(ratelimit=1.4)
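
To illustrate the general idea of a rate limiter, here is a rough standard-library sketch. It is not the implementation from #2053. `SimpleRateLimiter` is a hypothetical class, and treating the value as "calls per second" is an assumption; the meaning of `ratelimit=1.4` in this project may differ.

```python
import threading
import time


class SimpleRateLimiter:
    """Space out calls so that at most `calls_per_second` start per second.

    Hypothetical helper for illustration only.
    """

    def __init__(self, calls_per_second):
        if calls_per_second <= 0:
            raise ValueError("calls_per_second must be positive")
        self._interval = 1.0 / calls_per_second
        self._lock = threading.Lock()
        self._next_time = time.monotonic()

    def wait(self):
        """Block until this caller's slot arrives; return the delay in seconds."""
        with self._lock:
            now = time.monotonic()
            # Reserve the next free slot, then release the lock before
            # sleeping so other threads can reserve the slots after it.
            scheduled = max(now, self._next_time)
            self._next_time = scheduled + self._interval
        delay = scheduled - now
        if delay > 0:
            time.sleep(delay)
        return delay
```

Each worker would call `wait()` right before sending a request. Unlike a fixed sleep, this spaces out requests across all threads, so the overall request rate stays bounded even when several workers are running.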

@dipu-bd closed this on Aug 27, 2023