When using gallery-dl to archive posts on Twitter, only a small fraction of the total replies in a thread are downloaded.
This behavior seems to partly come from the _expand_tweets function:
    def _expand_tweets(self, tweets):
        seen = set()
        for tweet in tweets:
            obj = tweet["legacy"] if "legacy" in tweet else tweet
            cid = obj.get("conversation_id_str")
            if not cid:
                tid = obj["id_str"]
                self.log.warning(
                    "Unable to expand %s (no 'conversation_id')", tid)
                continue
            if cid in seen:
                self.log.debug(
                    "Skipping expansion of %s (previously seen)", cid)
                continue
            seen.add(cid)
            try:
                yield from self.api.tweet_detail(cid)
            except Exception:
                yield tweet
Adding the conversation_id to seen anywhere, either before or after the tweet is yielded, results in all of the top-level replies to a tweet being treated as already seen before they are expanded:
[twitter][debug] Skipping expansion of 1812956662950400214 (previously seen)
[twitter][debug] Skipping expansion of 1812956662950400214 (previously seen)
[twitter][debug] Skipping expansion of 1812956662950400214 (previously seen)
[twitter][debug] Skipping expansion of 1812956662950400214 (previously seen)
[twitter][debug] Skipping expansion of 1812956662950400214 (previously seen)
[twitter][debug] Skipping expansion of 1812956662950400214 (previously seen)
Removing seen.add(cid) results in more (but not all) replies being downloaded, but causes the function to loop until "skip": kicks in.
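To make the two behaviors above easier to compare, here is a minimal sketch that simulates the dedup logic without touching the API. It is not gallery-dl code: count_expansions, the dedupe flag, and the tweet IDs are made up for illustration, and it assumes (as the debug log suggests) that every reply in a thread carries the root tweet's ID as its conversation_id_str.

```python
from collections import Counter

# Simulated entries with made-up IDs: one root tweet and three replies.
# Assumption: all of them share the root tweet's ID as conversation_id_str.
TWEETS = [
    {"id_str": "100", "conversation_id_str": "100"},
    {"id_str": "101", "conversation_id_str": "100"},
    {"id_str": "102", "conversation_id_str": "100"},
    {"id_str": "103", "conversation_id_str": "100"},
]


def count_expansions(tweets, dedupe=True):
    """Count how often each conversation would be requested.

    Hypothetical stand-in for _expand_tweets: instead of calling
    tweet_detail(), it only records which conversation IDs would be fetched.
    """
    seen = set()
    requests = Counter()
    for tweet in tweets:
        cid = tweet.get("conversation_id_str")
        if not cid:
            continue
        if dedupe:
            if cid in seen:
                continue  # same branch that logs "previously seen"
            seen.add(cid)
        requests[cid] += 1
    return requests


with_seen = count_expansions(TWEETS, dedupe=True)
without_seen = count_expansions(TWEETS, dedupe=False)
print(with_seen["100"], without_seen["100"])
```

With the seen set, the conversation is requested once and every later tweet is skipped; without it, the same conversation is requested once per tweet, which would match the repeated fetching I see until "skip": stops it.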
The problem is that replies nested deeper than a direct reply to the main tweet (i.e. a reply to a reply to a reply) do not get downloaded. The only exception is when author[name] is the same as user[name], and only when it's one reply level deep.
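One way to check which reply levels actually ended up in an archive is to compute each tweet's depth from its parent ID. This is only a rough sketch: reply_depths is a hypothetical helper, the sample data is invented, and the field names id_str and in_reply_to_status_id_str are assumed from the raw "legacy" tweet objects, so the metadata gallery-dl writes may use different keys.

```python
from collections import Counter


def reply_depths(tweets):
    """Map each tweet ID to its depth in the reply tree (root tweet = 0).

    A tweet whose parent is not in the input gets depth None, because its
    position in the thread cannot be determined.
    """
    parents = {t["id_str"]: t.get("in_reply_to_status_id_str") for t in tweets}
    depths = {}

    def depth(tid):
        if tid not in depths:
            parent = parents[tid]
            if parent is None:
                depths[tid] = 0       # not a reply: root of the thread
            elif parent not in parents:
                depths[tid] = None    # parent tweet was not archived
            else:
                above = depth(parent)
                depths[tid] = None if above is None else above + 1
        return depths[tid]

    for tid in parents:
        depth(tid)
    return depths


# Invented sample: root, reply, reply-to-reply, and a reply whose parent
# ("103") is missing from the archive.
ARCHIVED = [
    {"id_str": "100", "in_reply_to_status_id_str": None},
    {"id_str": "101", "in_reply_to_status_id_str": "100"},
    {"id_str": "102", "in_reply_to_status_id_str": "101"},
    {"id_str": "104", "in_reply_to_status_id_str": "103"},
]

depths = reply_depths(ARCHIVED)
per_level = Counter(depths.values())
print(depths)
print(per_level)
```

Running something like this over the downloaded metadata should show whether anything at depth 2 or deeper was saved, and whether the few deeper replies that did come through belong to the thread author.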
I've tested other configs and have run into the same issue, but I haven't seen anybody else mention this, so I'm wondering whether there is a fix within the extractor config or whether it's an API limitation.
My current Twitter config: