[twitter] search now requires login #3942

Open
ClosedPort22 opened this issue Apr 21, 2023 · 8 comments

Comments

@ClosedPort22
Contributor

ClosedPort22 commented Apr 21, 2023

It seems that the search endpoint now returns 403 when not logged in.

Related issues:
JustAnotherArchivist/snscrape#846
trevorhobenshield/twitter-api-client#22

Edit: Nitter has switched to the newer GraphQL-based search API, which doesn't seem to be login-gated yet: zedeus/nitter@1ac389e

@Twi-Hard

I'm surprised it hasn't been mentioned yet. It's been working again for a couple of days now. I'm not using any form of authentication.

❯ gdl --version
1.25.3-dev

@ClosedPort22
Contributor Author

It still shows 403 Forbidden for me.

@Twi-Hard

Twi-Hard commented Apr 26, 2023

I tried running it in Docker with a VPN in 25 different cities and it worked fine there too. I'm not sure why it's only working for me. The syndication API returns 404, but I assume that was already brought up. If any logs or anything else would help, let me know.

Edit: I didn't use a VPN when running outside Docker.

@rosswillett

I'm also running within Docker, and it returns 403 regardless of VPN.

@Twi-Hard, can you try a connection using this command, and/or provide a similar command that I could test against?

gallery-dl -o videos=true https://twitter.com/search?q=from%3Aelonmusk%20since%3A2023-04-10&src=typed_query&f=video

Here's my output:

root@c7a8ce023c2b:/app# gallery-dl -o videos=true https://twitter.com/search?q=from%3Aelonmusk%20since%3A2023-04-10&src=typed_query&f=video
[1] 23987
[2] 23988
root@c7a8ce023c2b:/app# [twitter][error] 403 Forbidden (Forbidden.)

Adding --verbose narrows the issue down: the first request to graphql comes back 200, but the follow-up request to adaptive.json returns 403.
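
A side note on the output above: the `[1] 23987` and `[2] 23988` lines look like shell job notifications, which suggests the unquoted `&` characters split the command, so gallery-dl may have received only the part of the URL before `&src=typed_query`. Quoting the URL keeps the full query intact. A minimal Python sketch of that quoting, using only the standard library (the helper name `build_command` is hypothetical and not part of gallery-dl):

```python
import shlex


def build_command(url, options=("videos=true",)):
    """Return a shell-safe gallery-dl command line for the given URL."""
    args = ["gallery-dl"]
    for opt in options:
        args += ["-o", opt]
    args.append(url)
    # shlex.quote wraps any argument containing shell metacharacters
    # (such as '&' or '?') in single quotes and leaves safe ones alone.
    return " ".join(shlex.quote(a) for a in args)


url = ("https://twitter.com/search"
       "?q=from%3Aelonmusk%20since%3A2023-04-10&src=typed_query&f=video")
print(build_command(url))
```

This only addresses how the URL reaches gallery-dl; it is not expected to change the 403 returned by the login-gated endpoint.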

@Twi-Hard

Oh, I completely missed that this is only about searches. I only download user timelines and didn't realize that doesn't exercise the search endpoint. Sorry about that.

@mikf
Owner

mikf commented Apr 30, 2023

Edit: Nitter has switched to the newer GraphQL-based search API, which doesn't seem to be login-gated yet: zedeus/nitter@1ac389e

I tried implementing this, but it no longer works, as is also being reported on Nitter's issue tracker.
Here's the patch in case someone wants it:

patch
diff --git a/gallery_dl/extractor/twitter.py b/gallery_dl/extractor/twitter.py
index 5e68f138..0eb126f3 100644
--- a/gallery_dl/extractor/twitter.py
+++ b/gallery_dl/extractor/twitter.py
@@ -1053,6 +1053,8 @@ class TwitterAPI():
             cookies.set("ct0", csrf_token, domain=cookiedomain)
 
         auth_token = cookies.get("auth_token", domain=cookiedomain)
+        if not auth_token:
+            self.search_adaptive = self.search_graphql
 
         self.headers = {
             "Accept": "*/*",
@@ -1265,6 +1267,18 @@ class TwitterAPI():
         params["spelling_corrections"] = "1"
         return self._pagination_legacy(endpoint, params)
 
+    def search_graphql(self, query):
+        endpoint = "/graphql/gkjsKepM6gl_HmFWoWKfgg/SearchTimeline"
+        variables = {
+            "rawQuery": query,
+            "count": 20,
+            "product": "Latest",
+            "withDownvotePerspective": False,
+            "withReactionsMetadata": False,
+            "withReactionsPerspective": False
+        }
+        return self._pagination_tweets(endpoint, variables)
+
     def live_event_timeline(self, event_id):
         endpoint = "/2/live_event/timeline/{}.json".format(event_id)
         params = self.params.copy()
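
The first hunk of the patch relies on rebinding a method on the instance: when no `auth_token` cookie is present, `search_adaptive` is pointed at `search_graphql`, so existing callers need no changes. A stripped-down sketch of that pattern follows. The class `SearchAPI`, its constructor argument, and the returned tuples are hypothetical stand-ins for illustration, not gallery-dl's actual code:

```python
class SearchAPI:
    """Toy stand-in for an API client with two search backends."""

    def __init__(self, auth_token=None):
        # Guests (no auth_token) get the GraphQL search transparently:
        # the instance attribute shadows the class-level method.
        if not auth_token:
            self.search_adaptive = self.search_graphql

    def search_adaptive(self, query):
        # Placeholder for the legacy, login-gated search call.
        return ("adaptive", query)

    def search_graphql(self, query):
        # Placeholder for the GraphQL-based search call.
        return ("graphql", query)


print(SearchAPI().search_adaptive("from:example"))
print(SearchAPI(auth_token="token").search_adaptive("from:example"))
```

The trade-off of this approach is that the choice is made once at construction time; a later login on the same instance would not switch back unless the attribute is reset.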

@github-userx

Edit: Nitter has switched to the newer GraphQL-based search API, which doesn't seem to be login-gated yet: zedeus/nitter@1ac389e

I tried implementing this, but it does not work anymore as is also being reported on Nitter's issue tracker. Here's the patch in case someone wants it:

patch

Thanks.

Looks like Twitter isn't the only one going downhill when it comes to open (API) access and scraping. Soon we probably won’t be able to download Reddit content anymore.

https://old.reddit.com/r/reddit/comments/12qwagm/an_update_regarding_reddits_api/

https://old.reddit.com/r/apolloapp/comments/12ram0f/had_a_few_calls_with_reddit_today_about_the/

https://old.reddit.com/r/redditsync/comments/12qwwjh/an_update_regarding_reddits_api_changes_to_how/

mikf unpinned this issue May 20, 2023
mikf added a commit that referenced this issue Jun 1, 2023
for guest users; selectable with 'search-endpoint' option.

adapted from JustAnotherArchivist/snscrape@9c7b888

@mikf
Owner

mikf commented Jun 1, 2023

Searching without login should work again with 54cf1fa. Let's see how long it will last.
