Cloudflare protection handling can recurse infinitely with some user-agents #1490
Comments
Title is misleading; @jef Recommend renaming title to "Cloudflare protection handling can recurse infinitely with some user-agents"

@jef We should re-open this issue. It still exists: Config, not that it should matter:

To clarify, I guess the recursion issue has "technically" been resolved, so maybe we don't need to re-open this issue after all. However, there is absolutely zero chance of ever properly scraping Zotac, because it says cloudflare, waiting every single time.

This means you've been identified as a bot and are being blocked. There is
nothing wrong with the code; you are just hitting Zotac too frequently and
getting blocked. I am still successfully running this code and getting
results from Zotac.
How frequently are you hitting them? Also, all my Amazon hits are coming up as CAPTCHA. Same problem? I have them all set to the defaults, by the way.

Even with more relaxed settings, I am still getting Cloudflare timeouts for Zotac. The Amazon CAPTCHA seems to have cleared up. What settings do you use?

If you've already been flagged and blocked then you will not be able to hit
their site again without cycling your IP address.
Have you changed IP address since you were first sent into the infinite
cloudflare loop?
PAGE_BACKOFF_MIN=60000
PAGE_SLEEP_MIN=10000
PAGE_SLEEP_MAX=12022
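
For anyone tuning these values, here is a rough sketch of how a random delay between PAGE_SLEEP_MIN and PAGE_SLEEP_MAX could be drawn. It assumes the values are milliseconds, which the thread does not state outright, and the helper names `parseMs` and `randomSleepMs` are made up for illustration; this is not the project's actual code.

```typescript
// Hypothetical helper: parse a millisecond setting, falling back when the value is missing or invalid.
function parseMs(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = Number(raw);
  return Number.isFinite(n) && n >= 0 ? n : fallback;
}

// Hypothetical helper: pick a whole-millisecond delay in [min, max], inclusive.
// The random source is injectable so the behaviour can be checked deterministically.
function randomSleepMs(
  min: number,
  max: number,
  random: () => number = Math.random
): number {
  const lo = Math.min(min, max);
  const hi = Math.max(min, max);
  return lo + Math.floor(random() * (hi - lo + 1));
}

// Example with the values quoted in this thread (assumed to be milliseconds).
const sleepMin = parseMs("10000", 5000);
const sleepMax = parseMs("12022", 10000);
const delay = randomSleepMs(sleepMin, sleepMax);
```

With the values above, every delay lands between 10000 and 12022 inclusive, so the spacing between page loads varies by roughly two seconds.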
--
Brian "Neatchee" Resnik
No. I'm able to access their site just fine and view the stock myself from the very same IP address.

That's not the same. Puppeteer (which we use for automating the browser)
leaves some "fingerprints" so they can and will block those access attempts
while allowing your normal browser.
Please try getting a new IP address and see if you have better results.
Hmm. You're right, they changed something. I am now only getting very
intermittent success from Zotac, though it usually 403s rather than
infinitely looping on Cloudflare.
Anyway, it's definitely nothing to do with the way we're handling it. They
just seem to have gotten savvy to our methods.
I'll try to do some troubleshooting when I get the chance.
On Mon, Feb 1, 2021, 5:01 PM DeeJayhX wrote:
Here's an attempt with multiple IP address changes (pretty much a new IP
between each failure). I guess Zotac, of all places, must just have the best
bot detection on the planet.
[image: image]
<https://user-images.githubusercontent.com/1514573/106537325-f0161c00-64ae-11eb-9525-8c6a442e22f6.png>
Expected Behavior
Stores with Cloudflare protection should be able to wait and correctly load the webpage to capture the stock information.
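
One way to get "wait, then load" without any risk of unbounded recursion is a plain loop with a fixed attempt cap. The sketch below is only an illustration under stated assumptions: the `ChallengeProbe` interface, `waitSchedule` and `waitForChallenge` are hypothetical names, not the project's API, and how the challenge page is detected is left to the caller.

```typescript
// Hypothetical abstraction over the browser page; not the project's actual API.
interface ChallengeProbe {
  // Resolves to true while the Cloudflare interstitial is still being shown.
  isChallengePage(): Promise<boolean>;
  // Resolves after roughly `ms` milliseconds.
  sleep(ms: number): Promise<void>;
}

// Delays grow linearly and the list has a fixed length, so the wait always terminates.
function waitSchedule(maxAttempts: number, stepMs: number): number[] {
  return Array.from(
    { length: Math.max(0, maxAttempts) },
    (_, i) => (i + 1) * stepMs
  );
}

// Returns true once the challenge has cleared, false if it never cleared within the cap.
async function waitForChallenge(
  probe: ChallengeProbe,
  maxAttempts = 5,
  stepMs = 1000
): Promise<boolean> {
  for (const delayMs of waitSchedule(maxAttempts, stepMs)) {
    if (!(await probe.isChallengePage())) return true;
    await probe.sleep(delayMs);
  }
  // One final check after the last sleep.
  return !(await probe.isChallengePage());
}
```

A caller would treat a `false` result as "still blocked" and move on to the next product instead of retrying the same page indefinitely.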
Current Behavior
Stores such as CCL (UK) do not work with the current implementation of DDoS protection handling due to the use of the top-user-agents module. It stays on the same product and shows as CLOUDFARE, WAITING
I did some digging and found that the cause is the unconventional user agents that the randomiser sometimes picks; with those, stores behind Cloudflare never progress to the intended webpage.
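
As a possible workaround, the candidate list could be filtered down to strings that look like a mainstream desktop browser before one is picked at random. This is a minimal sketch: the issue does not say which user agents were the problem, so the regular expression is an illustrative guess, and `isConventionalUserAgent` and `pickUserAgent` are hypothetical helpers that take a plain string array rather than relying on any particular module's API.

```typescript
// Illustrative pattern (an assumption): desktop Chrome, Firefox or Safari style strings.
const CONVENTIONAL_UA =
  /^Mozilla\/5\.0 \((Windows NT|Macintosh|X11)[^)]*\).*(Chrome|Firefox|Safari)\/[\d.]+/;

// Hypothetical helper: true when the string resembles a mainstream desktop browser.
function isConventionalUserAgent(ua: string): boolean {
  return CONVENTIONAL_UA.test(ua);
}

// Hypothetical helper: pick a random conventional user agent, or the fallback if none qualify.
function pickUserAgent(
  candidates: string[],
  fallback: string,
  random: () => number = Math.random
): string {
  const usable = candidates.filter(isConventionalUserAgent);
  if (usable.length === 0) return fallback;
  return usable[Math.floor(random() * usable.length)];
}
```

The fallback keeps the scraper working even when every candidate is rejected, at the cost of sending the same user agent each time.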
Steps to Reproduce
CLOUDFARE, WAITING endlessly