Tumblr, not all images are downloaded #48
Just as I was tinkering around with Tumblr and thinking about opening a new issue.. 😄 Testing with The returned array for When I'm home I'll try it with this: Might depend on the post type.. PS:
DiSiqueira/TumblrDownloader and gallery-dl both use Tumblr's old API, which only reports 123 posts (not images) in total. https://github.com/bbolli/tumblr-utils/blob/master/tumblr_backup.py actually downloaded more than 300 images for wrapmagazine/illustration. It might be time to switch to the new API version ...
Tumblr API v2 is up and running and produces the same results as the old API, so the initial amount of 123 images appears to be correct. It seems that using tags gives some pretty counter-intuitive results for
How do you get to 123 posts in total without any tags set? From Additionally: Can also be obtained via: Besides, 150 posts for illustration, while 123 in total, does not really make much sense 😄 [1] https://www.tumblr.com/docs/en/api/v2#posts As I understand it..
The numbers above are for posts with "Type" set to There are indeed 287 posts in total ( Applying the illustration tag changes There is also other "weird" or unexpected behavior when using tags: requesting, for example, posts 51 to 100 sometimes only gets you, let's say, 32 posts instead of the expected 50, even though there are more posts after that.
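The short-page behavior described above suggests that a client should not treat a page with fewer posts than requested as the end of the blog. A minimal sketch of that idea follows. It is not gallery-dl's actual code: `fetch_page` is a hypothetical callback standing in for the real API request, and the sketch assumes that offsets count request windows and that the reported total is a usable upper bound for the offset.

```python
from typing import Callable, Dict, Iterator, List, Tuple

# Hypothetical page-fetching callback (an assumption, not a real API):
# takes (offset, limit) and returns (posts_on_this_page, reported_total).
FetchPage = Callable[[int, int], Tuple[List[Dict], int]]


def iter_posts(fetch_page: FetchPage, limit: int = 50) -> Iterator[Dict]:
    """Yield every post, tolerating pages that come back short."""
    offset = 0
    while True:
        posts, total = fetch_page(offset, limit)
        yield from posts
        # Advance by the requested window size, not by len(posts):
        # a short page does not mean the end has been reached.
        offset += limit
        if offset >= total:
            break
```

The loop only stops once the offset passes the reported total, so a window that returns 32 posts instead of 50 does not cut the download short.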
Well, it does make sense, not accounting for type, 287 posts in total, of which 150 have the tag 'illustration'.
Indeed. I see what you mean. I think this basically means that you can't do something like that Because the API simply does not support it (because of additional load?). Or to be more specific, maybe you actually can, but have to ignore I just tried But okay, this is all not really the issue, I'd say, because relying on The crux is the way Tumblr works, which is a bit needlessly complicated (others might argue it's flexible), I'd say. So I'm really not surprised that this discussion thread here exists 😄 The 'Make a post' functionality on your Tumblr Dashboard lets you pick between the seven types, but not all blogs on Tumblr make the sensible choice to only use Photo (which can be a single photo post or a photo set) for images. Some users have a habit of using the Link feature, which automatically creates embedded images if used in conjunction with certain sites (I can definitely say Instagram, and I think Flickr as well, probably more). And there's of course the Text post, which lets you insert photos and even videos with the click of a single button (and the obligatory GIF search, obviously) for more joyful inlined content. On top of that, you can do the same for Quote. So, basically, full HTML as the post body.
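Since the post body can be full HTML, images outside of Photo posts have to be found by scanning that HTML. A minimal sketch of such a scan, using only Python's standard `html.parser`, could look like the following. The names `find_inline_images` and `_ImgCollector` are made up for illustration, and this is not how gallery-dl itself does it.

```python
from html.parser import HTMLParser
from typing import List


class _ImgCollector(HTMLParser):
    """Collect the src attribute of every <img> tag, in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.urls: List[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this;
        # self-closing tags like <img/> are routed here as well.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.urls.append(src)


def find_inline_images(body_html: str) -> List[str]:
    """Return all image URLs embedded in an HTML post body."""
    parser = _ImgCollector()
    parser.feed(body_html)
    parser.close()
    return parser.urls
```

For example, `find_inline_images(post_body)` could be called on the body of a Text or Quote post to pick up images that a Photo-only download would miss.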
This adds support for audio and video posts (most videos are shared from youtube/instagram which isn't supported -> youtube-dl), as well as link posts and image-search inside of text posts. Most of this is just WIP and will need some sort of improvement and options to enable/disable different media types etc.
- posts   : list of post-types to inspect
- inline  : scan post bodies for inline images
- external: follow external links
I think the last commit pretty much implements everything @Hrxn's last paragraph hints at (even if it took me far longer than it should have):
By default everything should behave like it did before and only get images from "Photo" posts, but it is now possible to configure gallery-dl to get everything ... hopefully. (*)
Tumblr even supports Danbooru and Pixiv, which is really not what I would have expected.
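To make the three options from the commit message concrete, here is a small sketch that builds a configuration fragment and prints it as JSON. The option names `posts`, `inline` and `external` come from the commit message; the nesting under `extractor` and `tumblr`, the helper name `tumblr_config`, and the example values are assumptions, so check the project's configuration documentation before relying on them.

```python
import json
from typing import Dict, Iterable


def tumblr_config(post_types: Iterable[str],
                  inline: bool = True,
                  external: bool = False) -> Dict:
    """Build a config fragment for the Tumblr extractor (hypothetical helper)."""
    return {
        "extractor": {          # assumed nesting, not confirmed by the thread
            "tumblr": {
                "posts": list(post_types),   # post types to inspect
                "inline": bool(inline),      # scan post bodies for images
                "external": bool(external),  # follow external links
            }
        }
    }


if __name__ == "__main__":
    # Example values only: inspect photo, text and link posts.
    print(json.dumps(tumblr_config(["photo", "text", "link"]), indent=4))
```

Leaving `external` off by default in the sketch mirrors the idea that following external links is the most invasive of the three switches.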
Hey, great news!
Probably best used together with
Great idea!
Expected, doesn't work in the browser either.
Expected, I think. Probably caused by a set of "standard" GIFs on Tumblr, displayed in the editor interface etc. for quick access as "reaction GIFs", I presume. As far as I know, they still use an older URL address scheme. And I saw this is already fixed with b14de6f, basically.
That is true. But this is usual behaviour, I'd say, not just for Tumblr. And it is still better than what youtube-dl does, for example, which doesn't use these 'raw' URLs and thus returns 720p at best.
For the most part yes, that is what it is useful for, as most external links seem to point to YouTube, Instagram, Vine, etc., but I have also found a user linking to his Flickr images, which would then be downloaded using the flickr extractors.
I don't think this necessarily has anything to do with "standard" GIFs, especially when looking at what kind of GIFs are affected by this, but then again I don't use Tumblr myself. What I've figured out so far is that all GIF URLs end in either There are even some audio files which have similar problems (403 Forbidden, infinite redirect) and there is nothing that can be done about that.
First of all, great project! The Tumblr extractor seems to download only a limited number of images.
E.g.
gallery-dl gives me 78 images, while DiSiqueira/TumblrDownloader gives me all 123 images.