flickr not downloading #6360

Closed
hammerheaddf opened this issue Oct 22, 2024 · 17 comments

@hammerheaddf

Running gallery-dl 1.27.6 on Ubuntu arm64 (Raspberry Pi).
I created custom API keys following the documentation and added them to .gallery-dl.conf.
When trying to download from user Jo Watt (NSFW), the API returns no results:

$ gallery-dl -v https://www.flickr.com/photos/196255447@N06/
[gallery-dl][debug] Version 1.27.6
[gallery-dl][debug] Python 3.12.3 - Linux-6.8.0-1013-raspi-aarch64-with-glibc2.39
[gallery-dl][debug] requests 2.32.3 - urllib3 2.2.3
[gallery-dl][debug] Configuration Files ['${HOME}/.gallery-dl.conf']
[gallery-dl][debug] Starting DownloadJob for 'https://www.flickr.com/photos/196255447@N06/'
[flickr][debug] Using FlickrUserExtractor for 'https://www.flickr.com/photos/196255447@N06/'
[flickr][debug] Using custom api_key authentication
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): api.flickr.com:443
[urllib3.connectionpool][debug] https://api.flickr.com:443 "GET /services/rest/?url=https%3A%2F%2Fwww.flickr.com%2Fphotos%2F196255447%40N06&method=flickr.urls.lookupUser&format=json&nojsoncallback=1&api_key=myverylongapikey HTTP/11" 200 None
[flickr][debug] Sleeping 1.90 seconds (request)
[urllib3.connectionpool][debug] https://api.flickr.com:443 "GET /services/rest/?user_id=196255447%40N06&extras=description%2Cdate_upload%2Ctags%2Cviews%2Cmedia%2Cpath_alias%2Cowner_name%2Curl_o%2Curl_6k%2Curl_5k%2Curl_4k%2Curl_3k%2Curl_k%2Curl_h%2Curl_l&page=1&method=flickr.people.getPhotos&format=json&nojsoncallback=1&api_key=myverylongapikey HTTP/11" 200 None
[flickr][info] No results for https://www.flickr.com/photos/196255447@N06/

Funnily enough, when I access the same API endpoint in Firefox on the same machine, the request returns the expected JSON. The only difference I spotted is that Firefox uses HTTP/2, whereas urllib3 uses HTTP/1.1.
This is the request and response from Firefox:

GET	https://api.flickr.com/services/rest/?url=https://www.flickr.com/photos/196255447@N06&method=flickr.urls.lookupUser&format=json&nojsoncallback=1&api_key=myverylongapikey

{"user":{"id":"196255447@N06","username":{"_content":"Hotwife Jo Watt"}},"stat":"ok"}
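
For anyone who wants to reproduce the lookup outside gallery-dl and the browser, here is a minimal sketch using only the Python standard library. The helper names `build_lookup_url` and `parse_user_id` are hypothetical, and the API key is the placeholder from the log, not a real key. It builds the same query string as the request above and reads the user id from a response shaped like the one Firefox returned.

```python
import json
from urllib.parse import urlencode

API_ENDPOINT = "https://api.flickr.com/services/rest/"


def build_lookup_url(profile_url, api_key):
    """Build a flickr.urls.lookupUser request URL like the one in the debug log."""
    params = {
        "url": profile_url,
        "method": "flickr.urls.lookupUser",
        "format": "json",
        "nojsoncallback": 1,
        "api_key": api_key,  # placeholder value, not a real key
    }
    return API_ENDPOINT + "?" + urlencode(params)


def parse_user_id(body):
    """Return the user id from a lookupUser response, or None if the call failed."""
    data = json.loads(body)
    if data.get("stat") != "ok":
        return None
    return data["user"]["id"]


url = build_lookup_url("https://www.flickr.com/photos/196255447@N06", "myverylongapikey")
sample = '{"user":{"id":"196255447@N06","username":{"_content":"Hotwife Jo Watt"}},"stat":"ok"}'
print(url)
print(parse_user_id(sample))
```

Fetching the printed URL with any HTTP client and passing the body to `parse_user_id` would show whether the API itself answers differently from what gallery-dl reports.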

I configured gallery-dl to pull cookies from the local Firefox installation, and even to emulate Firefox via the browser option. The results were the same as above. Here's my config snippet for flickr:

"flickr": {
    "user-agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:131.0) Gecko/20100101 Firefox/131.0",
    "cookies": ["firefox"],
    "cookies-update": "cookies.txt",
    "browser": "firefox",
    "api-key": "myverylongapikey",
    "api-secret": "myapisecret",
    "access-token": "myverylongtoken",
    "access-secret": "myothersecret"
}

Does anybody have an idea why flickr is acting up?

@mikf
Owner

mikf commented Oct 22, 2024

            "access-secret": "myothersecret"

This should be access-token-secret, but I'm not sure if it makes a difference here.
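
A misnamed key like this is easy to miss by eye. Below is a rough sketch of a check that flags credential-looking keys with unexpected names. The set of expected keys is taken only from the names mentioned in this thread (it is an assumption, not an official list), and `unexpected_auth_keys` is a hypothetical helper, not part of gallery-dl.

```python
# Key names collected from this thread only; this is NOT an official list.
EXPECTED_FLICKR_KEYS = {"api-key", "api-secret", "access-token", "access-token-secret"}


def unexpected_auth_keys(section):
    """Return credential-looking keys that are not in the expected set."""
    suspects = []
    for key in section:
        looks_like_auth = key.startswith(("api-", "access-"))
        if looks_like_auth and key not in EXPECTED_FLICKR_KEYS:
            suspects.append(key)
    return sorted(suspects)


# The "flickr" section from the original report, with placeholder values.
flickr_section = {
    "browser": "firefox",
    "api-key": "myverylongapikey",
    "api-secret": "myapisecret",
    "access-token": "myverylongtoken",
    "access-secret": "myothersecret",
}
print(unexpected_auth_keys(flickr_section))
```

Keys that do not start with `api-` or `access-` (such as `browser`) are ignored, so the check stays narrow on purpose.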

@hammerheaddf
Author

hammerheaddf commented Oct 22, 2024

This should be access-token-secret, but I'm not sure if it makes a difference here.

Indeed, it does not. I corrected the config and got the same results.

@ramiropistoia

Downloading without logging in fetches a few images and then keeps returning 403 Forbidden errors. This did not happen before.

@hammerheaddf
Author

UPDATE:
I dropped the browser emulation config and tried again to scrape the user I mentioned in the OP.
Here's the current config:

"flickr": {
    "api-key": "myverylongapikey",
    "api-secret": "myapisecret",
    "access-token": "myverylongtoken",
    "access-token-secret": "myothersecret"
}

This time it downloaded some of the files. Others, however, failed with either 403 or 429 errors.
Here's a snippet of the output from running gallery-dl -v:

[urllib3.connectionpool][debug] Resetting dropped connection: live.staticflickr.com
[urllib3.connectionpool][debug] https://live.staticflickr.com:443 "GET /65535/53131855411_51145ae91c_3k.jpg HTTP/11" 403 93
[downloader.http][warning] '403 Forbidden' for 'https://live.staticflickr.com/65535/53131855411_51145ae91c_3k.jpg'
[download][error] Failed to download flickr_53131855411.jpg
[urllib3.connectionpool][debug] https://api.flickr.com:443 "GET /services/rest/?photo_id=53131855436&method=flickr.photos.getSizes&format=json&nojsoncallback=1 HTTP/11" 200 None
/mypath/flickr/Hotwife Jo Watt/flickr_53131855436.jpg
[urllib3.connectionpool][debug] Resetting dropped connection: live.staticflickr.com
[urllib3.connectionpool][debug] https://live.staticflickr.com:443 "GET /65535/53132337418_0bb7087959_4k.jpg HTTP/11" 429 117
[downloader.http][warning] '429 Too Many Requests' for 'https://live.staticflickr.com/65535/53132337418_0bb7087959_4k.jpg'

What in the world could be happening now?

@BrokenBrainiac

BrokenBrainiac commented Nov 1, 2024

A few days ago I started having issues with Flickr images not downloading (403 errors), whereas before everything worked 100%. I am not a professional computer guy, just an image collector. I went through my conf file and redid all of the fields mentioned above by hammerheaddf for my own system: I exported cookies for Flickr, set the correct User-Agent (this field is very important for Twitter), created my own app in the Flickr App Garden, and used OAuth to get the two extra access fields. I have also increased the delays; sleep is 5.1 seconds, for instance.

But when I start downloading I fetch a few images, then 403 errors alternate with fetches, then just endless '403 Forbidden'. I start a new session 10 minutes later and the same thing happens. The file names are 'funny' because I am trying out renaming rules to get more useful information than just a string of numbers.

Bits from the log:
[gallery-dl][debug] Starting DownloadJob for 'https://www.flickr.com/photos/biodivlibrary/albums/72157718896659461'
[flickr][debug] Using FlickrAlbumExtractor for 'https://www.flickr.com/photos/biodivlibrary/albums/72157718896659461'
[flickr][debug] Using custom OAuth1.0 authentication
[flickr][debug] Sleeping 2.10 seconds (extractor)
[urllib3.connectionpool][debug] https://live.staticflickr.com:443 "GET /65535/50144350763_192199ef2f_o.jpg HTTP/11" 200 1170900
[flickr][debug] Sleeping 5.10 seconds (download)
[urllib3.connectionpool][debug] https://live.staticflickr.com:443 "GET /65535/50145133807_bcc4eb70a7_o.jpg HTTP/11" 403 93
[downloader.http][warning] '403 Forbidden' for 'https://live.staticflickr.com/65535/50145133807_bcc4eb70a7_o.jpg'
[download][error] Failed to download The birds of North AmericaNew York, U.S.A. ꞉Published under__n72_w1150 - 50145133807_2087x2784.jpg
[flickr][debug] Sleeping 5.10 seconds (download)
[urllib3.connectionpool][debug] Resetting dropped connection: live.staticflickr.com
[urllib3.connectionpool][debug] https://live.staticflickr.com:443 "GET /65535/50144895516_3d4334bce9_o.jpg HTTP/11" 200 1261589
[flickr][debug] Sleeping 5.10 seconds (download)
...
[flickr][debug] Sleeping 5.10 seconds (download)
[urllib3.connectionpool][debug] Resetting dropped connection: live.staticflickr.com
[urllib3.connectionpool][debug] https://live.staticflickr.com:443 "GET /65535/50058214946_41bc28037c_o.jpg HTTP/11" 403 93
[downloader.http][warning] '403 Forbidden' for 'https://live.staticflickr.com/65535/50058214946_41bc28037c_o.jpg'
[download][error] Failed to download The birds of AmericaNew York ꞉G.R. Lockwood,[1871], c1839.__n262_w1150 - 50058214946_1979x3200.jpg
[flickr][debug] Sleeping 5.10 seconds (download)
[urllib3.connectionpool][debug] Resetting dropped connection: live.staticflickr.com
[urllib3.connectionpool][debug] https://live.staticflickr.com:443 "GET /65535/50058215281_973648e23c_o.jpg HTTP/11" 403 93
[downloader.http][warning] '403 Forbidden' for 'https://live.staticflickr.com/65535/50058215281_973648e23c_o.jpg'
[download][error] Failed to download The birds of AmericaNew York ꞉G.R. Lockwood,[1871], c1839.__n266_w1150 - 50058215281_1985x3200.jpg
[flickr][debug] Sleeping 5.10 seconds (download)

Interestingly, when I copy and paste the JPEG URLs into my browser, I get:
403 Forbidden
Request forbidden by administrative rules.
The OAuth access token doesn't seem to bypass the "administrative rule". I can still view and download the full-size originals (_o) by browsing the site normally in my browser; it's only automated downloading that is the problem. Downloading a set of 100 images one by one is unenjoyable and a massive waste of limited time.
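
To see how the failures are distributed over a session, the debug output can be tallied by status code. This is a small sketch that assumes the urllib3 debug line format shown in the logs in this thread; `tally_status_codes` is a hypothetical helper, and the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Matches the status code that follows the quoted request line in urllib3 debug output.
STATUS_RE = re.compile(r'"GET \S+ HTTP/[\d.]+" (\d{3})\b')


def tally_status_codes(lines, host="live.staticflickr.com"):
    """Count HTTP status codes for requests to the given host."""
    counts = Counter()
    for line in lines:
        if host not in line:
            continue
        match = STATUS_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


log_lines = [
    '[urllib3.connectionpool][debug] https://live.staticflickr.com:443 "GET /65535/50144350763_192199ef2f_o.jpg HTTP/11" 200 1170900',
    '[urllib3.connectionpool][debug] https://live.staticflickr.com:443 "GET /65535/50145133807_bcc4eb70a7_o.jpg HTTP/11" 403 93',
    "[downloader.http][warning] '403 Forbidden' for 'https://live.staticflickr.com/65535/50145133807_bcc4eb70a7_o.jpg'",
    "[urllib3.connectionpool][debug] Resetting dropped connection: live.staticflickr.com",
    '[urllib3.connectionpool][debug] https://live.staticflickr.com:443 "GET /65535/50144895516_3d4334bce9_o.jpg HTTP/11" 200 1261589',
    '[urllib3.connectionpool][debug] https://live.staticflickr.com:443 "GET /65535/50058214946_41bc28037c_o.jpg HTTP/11" 403 93',
]
counts = tally_status_codes(log_lines)
print(dict(counts))
```

Only the connection-pool lines are counted; the warning and "Resetting dropped connection" lines mention the host but carry no request line, so they are skipped.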

@clouds321

I have also been having problems with Flickr recently. I think they are rate limiting the free API heavily now.

After around 15 consecutive downloads I start to get errors, 403 but also 429

[downloader.http][warning] '429 Too Many Requests' for 'https://live.staticflickr.com/

@clouds321

Increasing the sleep time even more seems to have helped.

@mikf
Owner

mikf commented Nov 7, 2024

The retry-codes option together with a high-enough retry count might also be useful.

"retry-codes": [403, 420],
"retries": 50,
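
To illustrate the general idea behind these two options, here is a generic retry loop. It is only a sketch and not gallery-dl's downloader code: `download_with_retries` and its `fetch` callable are hypothetical, and the default values simply mirror the numbers in the snippet above.

```python
import time


def download_with_retries(fetch, url, retry_codes=(403, 420), retries=50, delay=0.0):
    """Call fetch(url) until it returns a status outside retry_codes.

    'fetch' is any callable that returns an HTTP status code.
    Returns (last_status, attempts). Illustration only.
    """
    attempts = 0
    while True:
        status = fetch(url)
        attempts += 1
        # Stop on a non-retryable status, or once the first attempt
        # plus 'retries' retries have all been used up.
        if status not in retry_codes or attempts > retries:
            return status, attempts
        time.sleep(delay * attempts)  # wait a little longer after each failure


# Simulated server: two 403 responses, then success.
responses = iter([403, 403, 200])
status, attempts = download_with_retries(
    lambda url: next(responses),
    "https://live.staticflickr.com/65535/50145133807_bcc4eb70a7_o.jpg",
)
print(status, attempts)
```

A status code is only retried if it appears in `retry_codes`, so any code not listed there ends the loop on the first attempt in this sketch.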

@Hrxn
Contributor

Hrxn commented Nov 9, 2024

It seems that you cannot view all content on flickr anymore, at least that is what I am seeing here with my account.

There used to be a setting in the account options that allows you to see restricted content (as in NSFW content). That setting is still there, but I cannot activate it anymore, because apparently a flickr Pro account is now required.

Not sure if this is just my flickr test account, or if you guys are seeing the same thing as well.

@mikf
Owner

mikf commented Nov 9, 2024

I think I found a solution.

Instead of using regular image URLs, transform them to "download" URLs by adding _d to the end of the file name, before the extension:

https://live.staticflickr.com/7463/16089302239_de18cd8017_b.jpg
https://live.staticflickr.com/7463/16089302239_de18cd8017_b_d.jpg

Patch:

diff --git a/gallery_dl/extractor/flickr.py b/gallery_dl/extractor/flickr.py
index df252ee3..eb5c6418 100644
--- a/gallery_dl/extractor/flickr.py
+++ b/gallery_dl/extractor/flickr.py
@@ -45,7 +45,7 @@ class FlickrExtractor(Extractor):
                 self.log.debug("", exc_info=exc)
             else:
                 photo.update(data)
-                url = photo["url"]
+                url = self._file_url(photo)
                 yield Message.Directory, photo
                 yield Message.Url, url, text.nameext_from_url(url, photo)
 
@@ -57,6 +57,13 @@ class FlickrExtractor(Extractor):
     def photos(self):
         """Return an iterable with all relevant photo objects"""
 
+    def _file_url(self, photo):
+        if "video" in photo:
+            return photo["url"]
+
+        path, _, ext = photo["url"].rpartition(".")
+        return path + "_d." + ext
+
 
 class FlickrImageExtractor(FlickrExtractor):
     """Extractor for individual images from flickr.com"""
@@ -98,7 +105,7 @@ class FlickrImageExtractor(FlickrExtractor):
                 if isinstance(value, dict):
                     location[key] = value["_content"]
 
-        url = photo["url"]
+        url = self._file_url(photo)
         yield Message.Directory, photo
         yield Message.Url, url, text.nameext_from_url(url, photo)
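
The URL rewrite in the patch can be tried on its own. The sketch below applies the same `rpartition` logic as `_file_url()` to a plain URL string; `to_download_url` is a hypothetical standalone name, and the video case from the patch is left out because it needs the full photo object.

```python
def to_download_url(url):
    """Insert '_d' before the file extension, as the patch does for photos."""
    path, _, ext = url.rpartition(".")
    return path + "_d." + ext


regular = "https://live.staticflickr.com/7463/16089302239_de18cd8017_b.jpg"
print(to_download_url(regular))
```

Because `rpartition` splits on the last dot, only the file extension is separated off and the dots in the host name are left alone.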
 

@gamer191

@mikf Is there a reason you haven't merged that patch into the main project?

@misteramazingyt

misteramazingyt commented Nov 10, 2024

@mikf this is now unfortunately returning the following:

[downloader.http][warning] '403 Forbidden' for 'https://live.staticflickr.com/7085/7345096684_8b6e616120_o.jpg'

@mikf
Owner

mikf commented Nov 10, 2024

@misteramazingyt
You are not using the patch from #6360 (comment), otherwise it would use https://live.staticflickr.com/7085/7345096684_8b6e616120_o_d.jpg as URL.

@tommyylc

Excuse me, I am using the standalone EXE version, how can I use this patch?

@mikf
Owner

mikf commented Nov 19, 2024

gallery-dl --update-to dev

@tommyylc

It works now! Thank you very much!

mikf unpinned this issue Nov 22, 2024

@misteramazingyt

Working now! Thank you for the assistance!
