Replies: 6 comments 18 replies
-
No, there is no way for anyone to detect blocking, just like in the messenger apps. But you can definitely change the user-agent string.
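As a rough illustration of what changing the user-agent string means at the HTTP level, here is a minimal sketch using only the standard library. It does not show pywebcopy's own option for this (that name is not given in this thread); `build_request` and `CUSTOM_USER_AGENT` are hypothetical names made up for the example.

```python
from urllib.request import Request

# Hypothetical value for illustration only; not taken from pywebcopy.
CUSTOM_USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64) example-archiver/0.1"


def build_request(url: str, user_agent: str = CUSTOM_USER_AGENT) -> Request:
    """Return a Request object that carries a custom User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


request = build_request("https://example.com/")
# urllib stores header names via str.capitalize(), hence "User-agent" here.
print(request.get_header("User-agent"))
```

No network call is made here; the request object is only built and inspected. The same header would need to be set on whatever session or client actually performs the download.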
-
Yes, there is this equivalent config key
-
Well, as of pywebcopy 7.0.2 you have to manually blacklist or whitelist domains in the session object; this is a design limitation.
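To make the blacklist/whitelist idea concrete, here is a small standalone sketch of a domain filter. It is not pywebcopy's session API (the thread does not show it); `host_matches` and `is_allowed` are hypothetical helpers, and the rule that a blacklist entry wins over a whitelist entry is an assumption of this example.

```python
from urllib.parse import urlsplit


def host_matches(host: str, domain: str) -> bool:
    """True if host equals domain or is a subdomain of it."""
    return host == domain or host.endswith("." + domain)


def is_allowed(url: str, whitelist=(), blacklist=()) -> bool:
    """Decide whether a URL may be fetched (hypothetical helper).

    Assumed policy: a blacklist match always rejects; if a whitelist
    is given, only hosts matching it are accepted; otherwise accept.
    """
    host = (urlsplit(url).hostname or "").lower()
    if any(host_matches(host, d) for d in blacklist):
        return False
    if whitelist:
        return any(host_matches(host, d) for d in whitelist)
    return True


urls = [
    "https://docs.example.com/guide",
    "https://ads.example.com/banner.js",
    "https://other.org/page",
]
# Keep only URLs on example.com, excluding its ads subdomain.
kept = [u for u in urls if is_allowed(u, ("example.com",), ("ads.example.com",))]
print(kept)
```

A filter like this would be applied to each discovered link before it is queued for download.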
-
The record of all the URLs along with their downloaded paths is stored in a
-
You have the
-
What happens when you visit a social media site daily or even hourly? Nothing.
-
Intended goal: Download whole "self-help" websites so that individual page entries can be hot-plugged into LLMs and used to cluster webpage content.
Requirement:
open_in_browser would not make much sense (similar to how bs4 is always faster than selenium for other scraper solutions)
bypass_robots may be an issue, but delay=None would be good
threaded between multiple IPs/computers would be better than just concurrently downloading from multiple websites
Questions: