feat: remove stale user agents, use top 50
This change should vastly improve success rate.
jef committed Dec 18, 2020
1 parent f86a825 commit 6e2a162
Showing 4 changed files with 530 additions and 58 deletions.

5 comments on commit 6e2a162

@gigi2006

Hey @jef, I think it's great how you keep this going. With every update, I'm one of the first to join in :)

But I have a question about the top user agents: do you think using the top 50 makes sense? Because then we will soon have cycled through all of them.

I alone run 5 VMs (each on a different IP via VPN), and once the top 50 have all been used, I think every one of them will run into captcha problems.

@jef
Owner Author

@jef jef commented on 6e2a162 Dec 18, 2020


If I understand your question correctly, you're asking whether using the top 50 will eventually lead to captchas?

It's very possible, but this makes the sites believe that we are modern users, so hopefully we won't get as many captchas.
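
To illustrate the idea, here is a minimal Python sketch of picking a user agent at random from a pool of current browser strings. It is not the code from this commit: the three-entry pool and the names `USER_AGENTS`, `pick_user_agent`, and `build_headers` are hypothetical placeholders, and a real pool would hold the roughly 50 most common user agents.

```python
import random

# Hypothetical pool for illustration only; a real list would contain
# the ~50 most common current user agents.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/14.0.2 Safari/605.1.15",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:84.0) "
    "Gecko/20100101 Firefox/84.0",
]


def pick_user_agent(pool=USER_AGENTS, rng=random):
    """Return one user agent chosen uniformly at random from the pool."""
    if not pool:
        raise ValueError("user agent pool is empty")
    return rng.choice(pool)


def build_headers(pool=USER_AGENTS, rng=random):
    """Build request headers carrying a randomly chosen user agent."""
    return {"User-Agent": pick_user_agent(pool, rng)}
```

Because each request draws independently, the pool is never "used up"; the same 50 strings simply keep recurring across requests and machines.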

@gigi2006

@jef

Yeah, you understood me correctly. I actually still get captchas.

But only on Amazon, MediaMarkt and Saturn, all German stores.

I have also tried umpteen different things: a VPN in Austria, a VPN in Switzerland, and I change VPN almost every 6 hours, but I still keep having problems. Not constantly, though; I do get 1-2 hours of rest from it.

@jef
Owner Author

@jef jef commented on 6e2a162 Dec 21, 2020


Wow... That's pretty odd. I know some users who run behind a VPN, and even in the cloud, and they don't suffer from it much. Perhaps @neatchee can chime in. They have a similar setup and haven't run into many captchas.

@neatchee
Contributor

@neatchee neatchee commented on 6e2a162 Dec 22, 2020


@gigi2006 Are those the only sites you're hitting? I do occasionally get a captcha here or there but it typically resolves. I have needed to rotate my IP a few times when they catch on but it's no more than once a day at the absolute worst.

I've found that the number of sites + products has a huge impact; fewer sites means more frequently hitting the same pages, and the more you hit the same page/site, the more likely you are to get captcha'd.

This change actually resolved a big set of my issues; prior to this it was almost non-stop captchas and 403s because the user agents were so old.

I will also note that certain sites have become more aggressive about filtering over time; Zotac is a great example.

Additionally, if you're using a well-known VPN provider, MANY sites have those entire IP blocks set up for much stricter captcha rules.
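
The point above about the number of sites and products can be made concrete with a little arithmetic. This is only a sketch under a stated assumption: the scraper cycles evenly through all monitored pages at a roughly fixed overall request rate. The function name and the numbers are hypothetical, not measurements.

```python
def hits_per_page_per_hour(total_requests_per_hour, num_pages):
    """Average hourly hits on each page, assuming requests are spread evenly."""
    if num_pages <= 0:
        raise ValueError("num_pages must be positive")
    return total_requests_per_hour / num_pages


# Hypothetical figures: the same overall request rate, spread over
# few pages versus many pages.
few_pages = hits_per_page_per_hour(600, 3)    # 200.0 hits per page per hour
many_pages = hits_per_page_per_hour(600, 30)  # 20.0 hits per page per hour
```

Under that assumption, watching ten times as many pages cuts the load on any single page to a tenth, which fits the observation that fewer sites means hitting the same pages more often.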
