Questions, Feedback, Suggestions #5 #6582

Open
mikf opened this issue Dec 1, 2024 · 46 comments

@mikf
Owner

mikf commented Dec 1, 2024

Continuation of the previous issue, serving as a central place for any sort of question or suggestion that doesn't deserve its own separate issue.

Links to older issues: #11, #74, #146, #5262.

@SpiffyChatterbox

Any news or thoughts on version 2.0? Anything we can do to help?

@noshii117

Is Facebook supported? Whenever I try to download from a page using this:

gallery-dl --cookies-from-browser firefox https://www.facebook.com/pagenamehere

it says Unsupported URL

@Hrxn
Contributor

Hrxn commented Dec 5, 2024

@noshii117 Yes, it should be. Make sure you are actually running the latest version of gallery-dl.

@biggestsonicfan

Got a bit of a pickle. I use gallery-dl inside a WSL instance. Not usually a problem because I can point to a mount path. However, I've come across an instance where I need to use yt-dlp with gallery-dl, and I don't know how I'd pass my Windows browser's cookies to yt-dlp in the .gallery-dl.conf file.

@mikf
Owner Author

mikf commented Dec 6, 2024

@biggestsonicfan
Not entirely sure if this works in WSL, but try exporting them as Netscape cookies.txt and pass this file's path via cmdline-args or raw-options.

            "cmdline-args": [
                "--cookies", "C:/path/to/cookies.txt"
            ],
            
            "raw-options": {
                "cookiefile": "C:/path/to/cookies.txt"
            }
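
For context, both of these are ytdl downloader options, so a complete config file using the raw-options variant might look like this (the path is a placeholder):

{
    "downloader": {
        "ytdl": {
            "raw-options": {
                "cookiefile": "C:/path/to/cookies.txt"
            }
        }
    }
}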

When using yt-dlp as downloader, it can directly use gallery-dl's cookies via forward-cookies.

@SpiffyChatterbox
I'll try to start working on it with the start of 2025, but no promises.

@nightbrd

nightbrd commented Dec 9, 2024

Hey there, is there any way to change how an enumeration index is put into a file name when using "skip": "enumerate"? E.g. instead of file.1.ext, turn it into file (1).ext?

@mikf
Owner Author

mikf commented Dec 9, 2024

@nightbrd
There isn't. The format for "skip": "enumerate" is currently hardcoded.

@Hrxn
Contributor

Hrxn commented Dec 9, 2024

@nightbrd You can simply use one of the bajillion already existing tools for renaming files, or write a script of your own.

As long as the names are consistent, you can easily turn something like file.1.ext into file (1).ext
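
For example, a minimal Python sketch of such a rename script (the directory path is a placeholder, and it assumes the exact file.N.ext pattern that "skip": "enumerate" produces):

import os
import re

DIRECTORY = "/path/to/downloads"  # placeholder

# matches gallery-dl's enumerate pattern: "file.1.ext" -> ("file", "1", "ext");
# note it would also match names that happen to end in ".<digits>.<ext>"
PATTERN = re.compile(r"^(.+)\.(\d+)\.([^.]+)$")

for entry in os.listdir(DIRECTORY):
    match = PATTERN.match(entry)
    if match:
        name, num, ext = match.groups()
        new_name = "{} ({}).{}".format(name, num, ext)
        os.rename(os.path.join(DIRECTORY, entry),
                  os.path.join(DIRECTORY, new_name))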

@noshii117

@noshii117 Yes, it should be. Make sure you are actually running the latest version of gallery-dl.

Very late, but yes, it's the first thing I did, and it's the same result.

@biggestsonicfan

biggestsonicfan commented Dec 14, 2024

@mikf

When using yt-dlp as downloader, it can directly use gallery-dl's cookies via forward-cookies.

Tried about everything at this point and I am unsure if it's a race condition or something is just not getting passed correctly, but please see attached verbose output.
verbose-ytdlp.txt

The link itself is a publicly shared video on Patreon.

EDIT: Not gonna double post, but I will ask: Is there a way I can run my own automated tests to see if my configuration will give me the results I want, instead of constantly going back and forth with my config file? I'm still trying to figure out how to dump my JSON metadata into "paid-posts" and "unpaid-posts" and categorize the metadata into the corresponding cost tier. I feel like it shouldn't be this hard, but I'm ashamed to have spent as much time as I have trying to get it to work.

@Hrxn
Contributor

Hrxn commented Dec 14, 2024

Is there a way I can run my own automated tests to see if my configuration will give me the results I want, instead of constantly going back and forth with my config file? I'm still trying to figure out how to dump my JSON metadata into "paid-posts" and "unpaid-posts" and categorize the metadata into the corresponding cost tier. I feel like it shouldn't be this hard, but I'm ashamed to have spent as much time as I have trying to get it to work.

Not sure what kind of automated testing exactly you mean here, but gallery-dl has had these --print options for a couple of months now, and they should be able to help with everything related to conditional formatting options etc. (I have not really tried them myself yet, to be honest, but I want to, because they seem very useful.)

I'm not sure if --print implies something like --simulate, but if not it's not a problem to use them together.

@roastme

roastme commented Dec 15, 2024

Is artfight.net supported?

@biggestsonicfan

biggestsonicfan commented Dec 15, 2024

I'm not sure if --print implies something like --simulate, but if not it's not a problem to use them together.

I actually can't get --print to work with --simulate but I think I can use it with --no-download for now, and I do think print will give me the output of what I want. Thanks much!

EDIT: I feel so stupid. Ever since this post, I thought I needed to use locals().get in the filter. Instead isRestricted == False and isRestricted == True are just working a treat.

@rsn-yk

rsn-yk commented Dec 21, 2024

Is it possible to store the Deviation UUID for each download?

As you can only download 10,000 images it would help if I could un-favourite some to download more. To un-fave them I could use curl - https://www.deviantart.com/developers/console/collections/collections_unfave/af7303d7e9023da0bbd6df11c2f38728.

Even the DeviantArt site itself fails to display more than 10,000 - it shows I have 1700 pages, but stops at 417 (at 24 items per page, that's 10,008 images).

@mikf
Owner Author

mikf commented Dec 21, 2024

@rsn-yk
You could use --print-to-file file:deviationid FILENAME or a custom metadata post processor in general to write the UUID of each downloaded file to FILENAME. An archive with {deviationid} as archive-format might also work.
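
As a rough sketch of the archive variant (path is a placeholder):

"deviantart": {
    "archive": "/path/to/deviantart-archive.sqlite3",
    "archive-format": "{deviationid}"
}

The UUID of each downloaded file would then end up as an entry in that SQLite database, which you could query directly.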

@roastme
No, see docs/supportedsites.

@mikf
Owner Author

mikf commented Dec 21, 2024

I'm not sure if --print implies something like --simulate, but if not it's not a problem to use them together.

--print does not imply any other options. It is implemented as a metadata post processor with "filename": "-", and therefore works only for the default DownloadJob. --simulate (SimulationJob), --get-urls (UrlJob), or any other jobs don't run post processors.

@mikf
Owner Author

mikf commented Dec 21, 2024

@biggestsonicfan

Tried about everything at this point and I am unsure if it's a race condition or something is just not getting passed correctly, but please see attached verbose output.

For whatever reason your link downloads without any problems for me:
https://gist.githubusercontent.com/mikf/7b18358c40e4a8051651c14605fbaae1/raw/05b82e50f4acaa29fc47794857be3613f8720c38/patreon_ytdlp_video.log

Maybe because I'm not passing a session_id cookie, so it doesn't do this extra HEAD request before invoking yt-dlp:

urllib3.connectionpool: https://www.patreon.com:443 "HEAD /api/video/387527528/video.m3u8 HTTP/1.1" 302 0
urllib3.connectionpool: Starting new HTTPS connection (1): stream.mux.com:443
urllib3.connectionpool: https://stream.mux.com:443 "HEAD /nimWUWbI5FNH0200g7tzpKbNFd01en1dSk9Mvj7S9WdpRM.m3u8?token=eyJhbGciOiJSUzI1NiIsImtpZCI6Ik5CY3o3Sk5RcUNmdDdWcmo5MWhra2lEY3Vyc2xtRGNmSU1oSFUzallZMDI0IiwidHlwIjoiSldUIn0.eyJzdWIiOiJuaW1XVVdiSTVGTkgwMjAwZzd0enBLYk5GZDAxZW4xZFNrOU12ajdTOVdkcFJNIiwiZXhwIjoxNzM0MzAwMDAwLCJhdWQiOiJ2IiwicGxheWJhY2tfcmVzdHJpY3Rpb25faWQiOiJGUVExRFZGS2dTZEtKSnFQTVg1T1ZBdk0wMnYyZk9vVll5UnVjV1hGUlVUdyJ9.OV3XcWjfz8U5PZyScwoKu18AdIR8fydlPL0BolC9ikCQzFp2UgDqGYPyFep-5vai_CdCpk24TJVUyXEXNqIaBOQ23SxxQucoN17Pk_92FXEGfCAYsrEeAzxEoE7pOb1qTkTAn65CuVo1URv5K9p2dTTjc8X7YKNb1tOpUwUDKNyDdbaAo8Ah_1e4G-APVYEpUfzBIH4QqcK2xHFBxMGtk2Mr4rDkBi0nyf26WUhFJRZojDdTXM0jbfHfHCBgAMd2FbMZYs82jjpsE61hTHpObGo4iZ6uNuEsAPa-Wby-AoiNwYMQ_uQCFDiIB_QEx4Up7k_GZJSIYOhVHovhtl6PFQ HTTP/1.1" 200 0

It is also not forwarding gallery-dl cookies to yt-dlp for you. Try it with -o forward-cookies=1

[downloader.ytdl][debug] Forwarding cookies to yt_dlp.YoutubeDL

Is there a way I can run my own automated tests to see if my configuration will give me the results I want

If your results include post processor files, then --simulate is not an option.

Maybe specifying a different base-directory (-d), disabling archives, changing any absolute paths, and then just letting gallery-dl run would be suitable: -d "/tmp/" -o archive= -O archive=

Ever since this post, I thought I needed to use locals().get in the filter. Instead isRestricted == False and isRestricted == True are just working a treat.

locals().get('isRestricted') should work, it will return None when isRestricted is not defined, but you get its actual value if it is defined.

There have been some changes to how accessing undefined variables in filters works in general, see filters-environment. Instead of raising a NameError exception, it now silently evaluates any filter expression as false whenever it would raise an exception.

isRestricted == True and isRestricted == False

I'd recommend isRestricted and not isRestricted instead as those are more general.
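
As a rough sketch of how such a filter could gate a conditional post processor (the metadata options here are illustrative placeholders, not a ready-made recipe):

    {
        "name": "metadata",
        "event": "post",
        "filter": "not isRestricted",
        "directory": "unpaid-posts"
    }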

@Hrxn
Contributor

Hrxn commented Dec 21, 2024

Okay, so for the record, when doing something like "debugging" your conditional naming settings used in your config, you probably want to use --print together with --no-download
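
For example, something like this (the format string is just an illustration):

gallery-dl --no-download --print "{category} {subcategory} {filename}.{extension}" URL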

@Wiiplay123
Contributor

Is there a way to make the current extractor abort only if it encounters X amount of threads with no new posts? It's downloading them in order from old to new, so it has to go through all the old posts in a thread before finding any new ones.

@mikf
Owner Author

mikf commented Dec 21, 2024

@Wiiplay123
There is -A / --abort:

  -A, --abort N               Stop current extractor run after N consecutive
                              file downloads were skipped

Depending on the site and if it provides date metadata, you could also use something like the following to stop when encountering a file before 2024-12-01:

--filter "date >= datetime(2024, 12, 1) or abort()"

@Wiiplay123
Contributor

I use that for other cases, but it doesn't work here because it's encountering the old files first.
Basically, I want it to act like this:

Thread 1 Post 1 (Old)
Thread 1 Post 2 (New)
Thread 2 Post 1 (Old)
Thread 2 Post 2 (Old)
(Abort here)
Thread 3 Post 1 (Old)

@biggestsonicfan

biggestsonicfan commented Dec 21, 2024

@mikf

locals().get('isRestricted') should work, it will return None when isRestricted is not defined, but you get its actual value if it is defined.

That was my issue, it always returned None, which the filter refused to evaluate so I couldn't get conditional postprocessors to run at all.

It is also not forwarding gallery-dl cookies to yt-dlp for you. Try it with -o forward-cookies=1

That did the trick! So I will use gallery-dl patreon.com/home -o forward-cookies=1 from now on!

EDIT: Actually, is there a global forward-cookies I can use for the extractor config in .gallery-dl.conf?

@mikf
Owner Author

mikf commented Dec 21, 2024

@biggestsonicfan
forward-cookies is a ytdl downloader option. You can enable it there.

{
    "downloader": {
        "ytdl": {
            "forward-cookies": true
        }
    }
}

Actually, forward-cookies is enabled by default since v1.28.0 so you probably have it disabled somewhere in your config file.

@mikf
Owner Author

mikf commented Dec 22, 2024

@Wiiplay123
If I understand your problem correctly, it is possible to achieve something like this with a bunch of python post processors. The following will stop after processing 3 (THREAD_MAX) threads without new files:

config.json

{
    "extractor": {
        "postprocessors": [
            {
                "name": "python",
                "event": "init",
                "function": "/tmp/chan.py:thread_init"
            },
            {
                "name": "python",
                "event": "finalize",
                "function": "/tmp/chan.py:thread_done"
            },
            {
                "name": "python",
                "event": "file",
                "function": "/tmp/chan.py:reset"
            }
        ]
    }
}

chan.py

from gallery_dl import exception

THREAD_MAX = 3  # terminate after this many consecutive threads without new files
THREAD_CNT = 0  # threads processed since the last downloaded file


def reset(_):
    # "file" event: a file was actually downloaded, so reset the counter
    global THREAD_CNT
    THREAD_CNT = 0

def thread_init(_):
    # "init" event: a new thread starts
    global THREAD_CNT
    THREAD_CNT += 1

def thread_done(_):
    # "finalize" event: stop once THREAD_MAX threads in a row yielded no new files
    if THREAD_CNT >= THREAD_MAX:
        print("DONE")
        raise exception.TerminateExtraction()

@Wiiplay123
Contributor

Took me a while to get around to trying it, just made a few changes and it works perfectly! Makes the extractor go a LOT faster.

I added a second reset for a "metadata" event that I added to metadata.py that runs every time metadata is written without skipping, to account for text-only posts. I upgraded a couple of the extractors to work better with text posts. I'll push the changes to my repo when I have time.

@Nightmare-Serene

This comment was marked as spam.

@Hrxn
Contributor

Hrxn commented Dec 29, 2024

^ #6721 please don't post (even more) spam here

@biggestsonicfan

Actually, forward-cookies is enabled by default since v1.28.0 so you probably have it disabled somewhere in your config file.

Ding ding ding! I did, in my downloader settings, oops! All fixed!

@WyohKnott
Contributor

[SubscribeStar.adult] Embeds and attachments not downloading

^ #6721 please don't post (even more) spam here

Fixing it in #6758. It needs a review by owner/contributors.

@tisfyx

tisfyx commented Jan 3, 2025

I have access to a danbooru instance that is running on a custom domain.
Is there a way to use gallery-dl to download from it? Right now it says [gallery-dl][error] Unsupported URL, which makes sense, but I'm pretty sure just using the danbooru extractor on it would work if I could figure out how.

@mikf
Owner Author

mikf commented Jan 3, 2025

@tisfyx
Prefix its URLs with Danbooru: or add an entry for Danbooru to your config file as outlined here: #1658
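
For the prefix variant, that would look something like this (hypothetical domain):

gallery-dl "danbooru:https://danbooru.example.org/posts?tags=example"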

@tisfyx

tisfyx commented Jan 3, 2025

That worked, thank you for the quick reply!

@OutshineIssue

Can someone help me write a script that opens the downloaded image? If one image is downloaded, it should open that image; if multiple images are downloaded, it should open the containing folder. If it's not possible let me know.

@WyohKnott
Contributor

WyohKnott commented Jan 4, 2025

Can someone help me write a script that opens the downloaded image? If one image is downloaded, it should open that image; if multiple images are downloaded, it should open the containing folder. If it's not possible let me know.

Use a postprocessor in your config file:

            "postprocessors": [
                {
                    "name": "exec",
                    "event": "post-after",
                    "command": "mpv --loop-playlist=inf --image-display-duration=5 {_directory}"
                }

or for example

                {
                    "name": "exec",
                    "event": "post-after",
                    "command": "gwenview --slideshow {_directory}"
                }
            ]

The post-after event happens after all files have been downloaded. You can pass any of three parameters:

  • {_path} for the full path to the last file downloaded
  • {_directory} for the path to the directory where files have been downloaded
  • {_filename} for only the filename of the last file downloaded

@WyohKnott
Contributor

Is there a proper way to apply multiple formatters in filenames?

I want to do something like {content!H!g[:120]} but I get an error: FilenameFormatError: Applying filename format string failed (ValueError: expected ':' after conversion specifier)

@mikf
Owner Author

mikf commented Jan 5, 2025

Use the C format specifier, which applies the listed conversions before the format spec given after the slash:

{content:CHg/[:120]}

@purple5pumpkin235

If I want to install using pipx on Ubuntu 24.04, is this the correct install command?:
pipx install gallery-dl

@WyohKnott
Contributor

If I want to install using pipx on Ubuntu 24.04, is this the correct install command?: pipx install gallery-dl

This is not officially supported, but you can do it that way, yes.

@OutshineIssue

OutshineIssue commented Jan 8, 2025


@WyohKnott Here's the code I put together based on yours and some information I found, but I'm running into an error: "'exec' initialization failed: KeyError: 'command'".

{
	"extractor": {
		"instagram": {
			"postprocessors": ["exec"]
		}
	},
	"postprocessor":  {
            "name": "exec",
            "event": "post-after",
            "command": "explorer.exe {_directory}"
        }
}

@mikf
Owner Author

mikf commented Jan 8, 2025

@OutshineIssue

{
    "extractor": {
        "instagram": {
            "postprocessors": ["exec-explorer"]
        }
    },

    "postprocessor":  {
        "exec-explorer": {
            "name"   : "exec",
            "event"  : "post-after",
            "command": "explorer.exe {_directory}"
        }
    }
}
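
(As I understand the config format: entries under "postprocessor" define named post processor configurations, and strings in a "postprocessors" list reference them by name, which is why the unnamed object in the previous snippet was never found and "exec" failed with KeyError: 'command'.)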

@baodrate

re-raising this suggestion (#5262 (comment)) since it might have been missed the first time (feel free to shoot it down though)

could the default config be allowed to be in toml, so the user does not have to specify --config-toml FILE on the command line every time?

i.e. add to gallery_dl.config._default_configs:

* `${XDG_CONFIG_HOME}/gallery-dl/config.toml`

* `~/.config/gallery-dl/config.toml`

(And it would probably make sense to also add the equivalent yaml paths)

@ghbook

ghbook commented Jan 12, 2025

Hi @mikf, how can I use the g: or generic extractor with an input txt file, as in gallery-dl -i <txtfile>?

And is it possible to define postprocessors in an extractor file?

@mikf
Owner Author

mikf commented Jan 19, 2025

@baodrate
I'd like to avoid adding even more possible config file paths to the list if possible, but I guess two more wouldn't be that bad.

so the user does not have to specify --config-toml FILE on the command line every time

What about creating an alias that includes --config-toml?

alias gallery-dl='gallery-dl --config-ignore --config-toml FILE'

@ghbook
Either prefix all URLs in <txtfile> with g: or generic:,

or disable all extractor modules except generic and enable it to be used for all otherwise unsupported URLs:

gallery-dl -o extractor.modules=generic -o extractor.generic.enabled=1 -i <txtfile>

extractor file

What do you mean by that? A file given by --input-file?

@arisboch

How do I download all the replies to a Bluesky post made by the post's author themselves? I can't even manage to download all the replies. Here's the relevant config section:

        "bluesky":
        {
        	"filename": "bluesky {author['handle']} {post_id} {num}.{extension}",
        	"directory": ["bluesky {author['handle']} {post_id}"],
        	"include": ["posts", "replies", "media"],
			"metadata": true,
			"reposts": true,
			"quoted": true
        },

@ghbook

ghbook commented Jan 22, 2025

and is it possible to define postprocessors in extractor file.

extractor file

What do you mean by that? A file given by --input-file?

It's another question, not related to the input file. I was talking about a .py file like reddit.py in the extractor folder. I've never seen postprocessors defined in a .py file along with the directory_fmt and filename_fmt properties; they are always defined in the config.json file. So I have been wondering whether it's possible to define them in the class or its init method. Any example would be helpful.

Also, one last question: are there any helper methods to get the real directory and filename inside the items method? I need to check whether a file already exists in one of my extractors before making a request in its items method; the given URL already has all the keys required. The reason is an API rate limit per day.

@mikf
Owner Author

mikf commented Jan 24, 2025

@arisboch
For post URLs like https://bsky.app/profile/mikf.bsky.social/post/3l46q5glfex27, you can use the depth option to get replies and --filter "user['did'] == author['did']" to filter out any from users other than the post's author.

gallery-dl -o depth=5 -o metadata=1 --filter "user['did'] == author['did']" https://bsky.app/profile/mikf.bsky.social/post/3l46q5glfex27

or, as config options:

    "depth": 50,
    "metadata": true,
    "image-filter": "user['did'] == author['did']"

For "timeline" URLs like https://bsky.app/profile/bsky.app, this is not supported yet.


@ghbook
An extractor's only job is data extraction. It has no concept of a file system, files, directories, etc. and doesn't care how its extracted data is eventually used. There is no builtin way to specify default post processors or to access the paths where files are downloaded to. You should be able to modify the code to add a reference to the current job object to an extractor and check for existing files and access paths using the Job's internals, but this is "officially" not supported.
