Questions, Feedback, and Suggestions #4 #5262
For most sites I'm able to sort files into year/month folders like this:
However for redgifs it doesn't look like there's a date keyword available for |
There's a typo in
|
There's also another typo in |
Can you grab all the media from quoted tweets? Example. |
#5262 (comment) It's implemented as a search for 'quoted_tweet_id:…' on Twitter.
#5262 (comment) This one was on the same line as the previous one ... (9fd851c)
Regarding typos, thanks for pointing them out. @biggestsonicfan |
EDIT: Actually, I think there's just something wrong with that URL. I had it saved for a long time and searching that tag normally gives a different URL ( |
You could use |
Is there support to remove metadata like this?
Post-processor: "filter-metadata":
{
    "name": "metadata",
    "mode": "delete",
    "event": "prepare",
    "fields": ["preview[images][0][resolutions]"]
}
I've tried a few variations but no dice.
"fields": ["preview[images][][resolutions]"]
"fields": ["preview[images][N][resolutions]"]
"fields": ["preview['images'][0]['resolutions']"] |
Hello, I left a comment in #4168 . Does the |
@taskhawk
def remove_resolutions(metadata):
    for image in metadata["preview"]["images"]:
        del image["resolutions"]
(untested, might need some check whether
@YuanGYao |
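A slightly more defensive variant of the function above, as a sketch only. It assumes the metadata layout from the question (a `preview` dict holding an `images` list whose entries may carry a `resolutions` key) and silently skips anything that is missing. The name `remove_resolutions_safe` is made up here, and how the function is hooked into gallery-dl is not shown.

```python
def remove_resolutions_safe(metadata):
    """Delete the 'resolutions' entry from every preview image, if present."""
    preview = metadata.get("preview")
    if not isinstance(preview, dict):
        # No preview data at all: nothing to remove.
        return metadata
    for image in preview.get("images") or ():
        if isinstance(image, dict):
            # pop() with a default does not raise when the key is absent.
            image.pop("resolutions", None)
    return metadata
```

Posts without a `preview` entry pass through unchanged instead of raising `KeyError`.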
@mikf |
Not sure if I'm missing something, but are directory specific configurations exclusive to running gallery-dl via the executable? Basically, I have a directory for regular tags, and a directory for artist tags. For regular tags I use So right now the only way I know to get this per-directory configuration to work, is to copy the gallery-dl executable everywhere I want to use a master configuration override. Am I missing something? It feels like there should be a better way. |
Huh? No, the configuration works always in the same way. You're simply using different configuration files? |
From the readme:
I want to override my master configuration |
You can load additional configuration files from the console with:
You just need to specify the path to the file and any options there will overwrite your main configuration file. Edit: From my understanding, yeah, automatic loading of local config files in each directory is only possible having the standalone executable in each directory. Are different directory options the only thing you need? |
Thanks, that's exactly what I was looking for! Guess I didn't read the documentation thoroughly enough. For now the only thing I'd want to override is the directory structure for artist tags. I don't think it's possible to determine from the metadata alone if a given tag is the name of an artist or not, so I thought the best way to go about it is to just have a separate directory for artists, and use a configuration override. So yeah, loading that override with the -c flag works great for that purpose, thanks again! |
You kinda can, but you need to enable
"gelbooru": {
    "directory": {
        "search_tags in tags_artists": ["{category}", "{search_tags[0]!u}", "{search_tags}", "{date:%Y}", "{date:%m}"],
        "" : ["{category}", "{search_tags}", "{date:%Y}", "{date:%m}"]
    },
    "tags": true
},
Set Of course, this depends on the artists being correctly tagged. Not sure if it happens on Gelbooru, but at least in other boorus and booru-like sites I've come across posts with the artist tagged as a general tag instead of an artist tag.
Another limitation is that your search tag can only include one artist at a time, doing more will require a more complex expression to check all tags are present in
What I do instead is that I inject a keyword to influence where it will be saved, like this:
And in my config I have
"gelbooru": {
    "directory": ["boorus", "{search_tags_type}", "{search_tags}"]
},
You can have:
"gelbooru": {
    "directory": {
        "search_tags_type == 'artists'": ["{category}", "{search_tags[0]!u}", "{search_tags}", "{date:%Y}", "{date:%m}"],
        "" : ["{category}", "{search_tags}", "{date:%Y}", "{date:%m}"]
    }
},
You can do this for other tag types, like general, copyright, characters, etc.
Because it's a chore to type that option every time I made a wrapper script, so I just call it like this because artists is my default:
For other tag types I can do:
|
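A wrapper along those lines might look roughly like the sketch below. It is not the script from the comment above: the function name is hypothetical, and it assumes the extra keyword is injected through gallery-dl's `keywords` option, passed on the command line as `-o KEY=VALUE` with a JSON value. The function only builds the argument list, so it can be inspected before anything is run.

```python
import json


def build_command(urls, tag_type="artists"):
    """Return an argv list that injects 'search_tags_type' as an extra keyword."""
    # Assumption: additional metadata is supplied via the 'keywords' option
    # and the value after '=' is parsed as JSON.
    keywords = json.dumps({"search_tags_type": tag_type})
    return ["gallery-dl", "-o", "keywords=" + keywords, *urls]
```

The resulting list can be handed to `subprocess.run()`. Calling `build_command(urls, "characters")` would select a different tag type.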
Thanks for pointing out there's a tags option available for the gelbooru extractor. I already used it in the kemono extractor to get the name of the artist, but it didn't occur to me that gelbooru might also have such an option (and just accepted that the tags aren't categorized). For artists I store all the URLs in their respective gelbooru.txt, rule34.txt, etc. files like so:
And then just run |
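One way to run every per-site URL file with the override config is sketched below. This is an illustration, not the command from the comment above: the function name and the example paths are hypothetical, and it relies only on the `-c` (additional config file) and `-i` (input file) flags.

```python
from pathlib import Path


def build_commands(directory, override_config):
    """Return one gallery-dl argv list per *.txt URL file in 'directory'."""
    commands = []
    for url_file in sorted(Path(directory).glob("*.txt")):
        commands.append([
            "gallery-dl",
            "-c", str(override_config),  # extra config applied on top of the main one
            "-i", str(url_file),         # read URLs from this file
        ])
    return commands
```

Each returned list can be passed to `subprocess.run()`, for example `build_commands("artists/some_artist", "artists.conf")`.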
When I'm making an extractor, what do I do if the site doesn't have different URL patterns for different page types? Every single page is just a numerical ID that could be a forum post, image, blog post, or something completely different. |
@Wiiplay123 You handle everything with a single extractor and decide what type of result to return on the fly. The |
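The "decide on the fly" idea can be sketched in plain Python, independent of gallery-dl's extractor classes. Everything here is hypothetical: the HTML markers, the page types and the handler functions are made up, and a real extractor would yield gallery-dl messages rather than plain dicts.

```python
def classify_page(html):
    """Guess the page type from (made-up) markers in the HTML."""
    if 'class="forum-post"' in html:
        return "forum"
    if 'class="image-view"' in html:
        return "image"
    if 'class="blog-entry"' in html:
        return "blog"
    return "unknown"


def handle_forum(page_id, html):
    return {"type": "forum", "id": page_id}


def handle_image(page_id, html):
    return {"type": "image", "id": page_id}


def handle_blog(page_id, html):
    return {"type": "blog", "id": page_id}


# One URL pattern, several result types: dispatch on the detected type.
HANDLERS = {"forum": handle_forum, "image": handle_image, "blog": handle_blog}


def extract(page_id, html):
    """Fetch-once, classify, then delegate; None means 'nothing to extract'."""
    handler = HANDLERS.get(classify_page(html))
    return handler(page_id, html) if handler else None
```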
Hi, what options should I use in my config file to change the format of dates in metadata files? I would like to use And would it also be possible to do this for json files that ytdl creates? I downloaded some videos with gallery-dl but the dates got saved as |
Now it says 1.26.9.dev0, despite the built wheel clearly saying
I feel like something has gone awry for sure. Try creating a fresh venv and installing in that, just in case? |
Ah thanks, I figured it out. Apparently I have billions of And doing So, I have to run I suspect this is caused by
Maybe we shouldn't let the users use it unless really needed, @mikf ? (Or change to --force-reinstall instead.) Log if interested
|
Is there a way to download specifically the revisions on an artist's page on kemono.su? For example, one artist has had many of their posts updated with a revision that removed the content, while the original revision retains them. There are hundreds of posts on their page like that, so I was wondering if there was a way to set it to download the original revisions for all of them automatically. |
I just noticed that the latest gallery-dl release made this "just werk":
I still don't know whether it was possible to download such artwork using gallery-dl before (I thought it was, so I was just asking for someone to explain to me in simple terms how to do it), but, again, it "just werks" now, so, much appreciated. |
So Seiga is now region-locked. Can I proxy/wireguard just that extractor? EDIT: I've managed to get Wireguard locally to proxy via a port using wireproxy, but I just need a post(pre)processor to launch it as a daemon and close it when it's done. EDIT2: Figured it out:
|
I hate posting so frequently here but I hate making new issues more. This is once again an issue for me. I've just supported a user that has a preview image and download urls in their post. I normally parse the json files with a python script, however this preview image had been downloaded previously and I don't overwrite json data anymore. So I will re-run the user with skip set to EDIT: I also don't get how the metadata archive works either. Will the metadata entry be the same as the one for the extractor? |
@biggestsonicfan
A
|
Could it be allowed that the default config be in TOML, so the user does not have to specify i.e. add to
(And it would probably make sense to also add the equivalent yaml paths) |
The example link I provided no longer seems to be online, but I just noticed when downloading a profile on misskey.gg that the 5/5 timeout error no longer happened. But, it also doesn't appear that it added any older media that I assumed was being skipped. I didn't actually look at what was causing the 5/5 timeout error to see if it was media, but, since it appears to "just werk" at this point, I assume what was timing out simply was not media at all. I don't know. Either way, I am saying that I just noticed this is no longer reproducible. |
@mikf |
|
#5262 (comment) fixes regression introduced in 9e72968 'argparse' sets a flag and changes its behavior when using something that looks like a negative number as option string, '-4' and '-6' in this case.
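The behaviour described in that commit message can be reproduced with plain `argparse`: once a parser defines an option string that looks like a negative number (such as `-4`), a value like `-1` is no longer accepted as the argument of another option. A minimal sketch follows. The `--offset` option and the helper names are made up for illustration and are not gallery-dl's actual options.

```python
import argparse


class RaisingParser(argparse.ArgumentParser):
    """ArgumentParser that raises instead of printing usage and exiting."""

    def error(self, message):
        raise ValueError(message)


def parse_offset(define_dash4):
    """Parse ['--offset', '-1'], optionally with a '-4' flag defined."""
    parser = RaisingParser(prog="demo")
    parser.add_argument("--offset")
    if define_dash4:
        # An option string that looks like a negative number flips
        # argparse into treating '-1' as an (unknown) option.
        parser.add_argument("-4", dest="ipv4", action="store_true")
    try:
        return parser.parse_args(["--offset", "-1"]).offset
    except ValueError:
        return None
```

Without `-4`, the call returns the string `-1`. With `-4` defined, parsing fails because `--offset` no longer sees an argument. Writing the value as `--offset=-1` sidesteps the ambiguity.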
Can the postprocessor use multiple filters? I'm trying but I'm getting |
@God-damnit-all @biggestsonicfan |
Gotcha. It might be nice to clarify that in the post-processor docs, as that's where I got the idea to use it as a list. My idea is to use filters to run specific postprocessors in order if:
Which I think would resolve to:
|
#5262 (comment) allow (theoretically*) all filter expression statements to be a list of individual filters (*) except for 'filename' and 'directory' conditionals, as dict keys cannot be lists
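How a list of filters could collapse into one expression is sketched below. This assumes the individual filters are combined with `and`, which is an assumption about the implementation rather than something stated in this thread. The helper names are hypothetical.

```python
def combine_filters(filters):
    """Join a string or a list of expression strings into one expression."""
    if isinstance(filters, str):
        return filters
    # Parenthesise each part so operator precedence inside it is preserved.
    return "(" + ") and (".join(filters) + ")"


def matches(filters, metadata):
    """Evaluate the combined expression with metadata fields as variables."""
    # eval() is only acceptable here because the expressions come from the
    # user's own configuration, never from untrusted input.
    expression = combine_filters(filters)
    return bool(eval(expression, {"__builtins__": {}}, dict(metadata)))
```

For example, `matches(["extension == 'zip'", "num > 1"], {"extension": "zip", "num": 2})` evaluates both conditions against the same metadata.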
@biggestsonicfan |
I am gently moving away from PixivUtil2 for my pixiv downloads and would like to configure gallery-dl to match its settings configuration as closely as possible. I am unable to actually encode ugoira files, however. Pixiv config settings:
Post processor settings:
|
@biggestsonicfan Set your I'd also recommend installing |
I know issues about anti-crawling from Instagram have been asked about billions of times, but I never saw it being so strict. I download newly added posts/stories from merely 8 accounts once a day. Given that I barely download anything most of the time (these 8 accounts don't really update that frequently), I wonder how they even detect it. Is the way we request their API endpoints too easy to spot? Anyway, if there is any insight on how to avoid or mitigate this, it would be very helpful. |
I already notice a slight delay in loading this, so I'd suggest closing this and opening |
The page is quite laggy now, yes. |
Would you please consider uploading nightly builds onto WinGet, Scoop, Chocolatey, or some other package manager? It would be great to have the latest changes with the ability of package managers to keep gallery-dl updated automatically. Especially with the frequency of updates. For example, the latest release is v1.27.7 (October 25th, 2024) but currently 123 commits behind It would be great to have the latest commits and support in cases where broken modules were updated for sites that constantly change or additional support was added. |
Closing this as suggested by Hrxn (#5262 (comment)). |
@fireattack @Infinitay And if you really want the latest commits, you can always |
Continuation of the previous issue as a central place for any sort of question or suggestion not deserving their own separate issue.
Links to older issues: #11, #74, #146.