Pixiv downloading "Work cannot be displayed" image #4327
Can confirm, seeing this too. Refreshing the login token doesn't help, and it doesn't seem to be related to whether the image is R-18 or not. Update 2: This might also be related to upbit/pixivpy#275. Update 3: If you check https://www.pixiv.net/info.php?cid=1&lang=en, there are announcements about the suspension and reinstatement of their mobile apps in the app stores, related to content in the apps. I think they might be doing some kind of semi-manual filtering now, which causes this lag between the mobile app API and the website. Update 4: The metadata is incomplete too for these images (no tags). |
Doesn't appear to be happening on my end. |
I ran into this but I can no longer reproduce it, so must've been something temporary. |
It is still happening (as I'm writing this), but the lag between what is available on the mobile app API and what's visible on the site has decreased. Currently I see about an 8-10 minute lag until an image shows up on the mobile app (looking at the posting times). You can reproduce this if you go to the site, search for some very common tag like "illustration" and try to download the newest entry chronologically (check that it was posted in the last few minutes). The other question is whether there is any content that won't be available on the mobile API at all. I haven't encountered anything like that yet, but since this whole thing might be because the mobile app does some additional filtering due to app store requirements, it can't be ruled out. For now I think a temporary solution would be to catch the cases where an invalid image is returned (easy to detect from the URL) and either raise an error or wait 5, 10, or 15 minutes before retrying, as is done for rate limits. If the lag between the site and image availability in the API remains low, this might be enough, maybe along with an informational message in these cases. Ultimately, if the time lag between the mobile API and the site keeps randomly increasing and decreasing, or the mobile API becomes filtered in some other way, a switch to the non-mobile API (the one the website uses) might be needed. |
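The "detect from the URL, then wait and retry" idea above could be sketched roughly as follows. This is only an illustration, not gallery-dl's actual code: `fetch_url_with_retry`, `get_url` and the `delays` schedule are hypothetical names, and the marker string is simply taken from the placeholder URL reported in this thread.

```python
import time

# Assumption: placeholder images are recognizable by this path fragment,
# based on the URL reported in this thread.
PLACEHOLDER_MARKER = "/common/images/limit_sanity_level_"


def is_placeholder(url):
    """Return True if the URL looks like the "Work cannot be displayed" image."""
    return PLACEHOLDER_MARKER in url


def fetch_url_with_retry(get_url, delays=(300, 600, 900), sleep=time.sleep):
    """Call get_url() until it stops returning a placeholder URL.

    get_url is a hypothetical callable that asks the app API for the image
    URL of one work. Waits 5, 10, then 15 minutes by default and returns
    None if the placeholder is still served after the last wait.
    """
    url = get_url()
    for delay in delays:
        if not is_placeholder(url):
            return url
        sleep(delay)
        url = get_url()
    return None if is_placeholder(url) else url
```

Passing `sleep` as a parameter keeps the sketch testable without actually waiting; a caller that gets `None` back could log an explicit warning instead of skipping silently.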
Still happening on my end. Also only for basically brand new pictures. |
Happens for me when I'm downloading from my bookmarks, but doesn't happen when I use the search page or individual posts. |
https://s.pximg.net/common/images/limit_sanity_level_360.png images now get ignored. To manually ignore them, enable url-metadata and
I've noticed that search results, and only those, do not include R-18G works. |
Works for me, it might be your account settings (there is a separate toggle for r18g iirc) |
These settings are enabled for all of my accounts. Are these "Work cannot be displayed" images still a thing or did Pixiv somehow fix whatever these were meant for? (I've never encountered one of these or a "Skipping 'sanity_level' warning" logging message myself) |
There's still at least a few minutes of lag before images displayed on the website also appear in the app, so if you happen to download very recent image URLs, those will still produce the sanity_level image. I think we will just have to live with this for the time being, since it is probably not worth a rewrite to switch to the API that the website uses. |
Yeah, I'd really want to avoid using the website API if at all possible. It is a lot slower, requires an extra request for each individual post, and, more importantly, would need exported cookies for authentication, which expire in a month or so. I did try to rewrite the current extractor back when auth with username and password got disabled and it wasn't a "pleasant" experience, to say the least. |
Just a short summary of #4421 (comment). In Pixiv's Android application, and therefore in gallery-dl too:
|
It seems the caption "bug" is "fixed". Upd 2023.10.08: The "bug" has returned. |
Encountered another image that won't download (giving skip sanity_level warning in the log), |
Same issue here: #4760 (comment) |
Since gallery-dl seems to silently skip it, maybe add a more explicit error/warning? I only noticed it was being skipped after passing |
Might be a good idea to add these post URLs to the output of |
The best solution would be falling back to a secondary extractor that doesn't use Pixiv's mobile API. It's like @thatfuckingbird pointed out: Pixiv is taking measures to keep their mobile apps in the stores. Unfortunately, the automatic flagging is rather trigger-happy, producing many false positives. From what I saw, there also seems to be no publicly visible indicator and no way to appeal the flag, so finding a way around it is very important for every data hoarder. |
I think it makes sense to add support for using the web API in addition to the Android app's API. Since the mobile API does not return shadow-banned artworks, it would require an extra call to get all artwork IDs with the site's API:
    Object.keys((await (await fetch("https://www.pixiv.net/ajax/user/1657441/profile/all?lang=en")).json()).body.illusts)
That way, you can find the missing artworks. To get the info for them:
    (await (await fetch("https://www.pixiv.net/ajax/illust/113897896?lang=en")).json()).body
For ugoira, also:
    (await (await fetch("https://www.pixiv.net/ajax/illust/113897896/ugoira_meta?lang=en")).json()).body
However, it seems it's not possible to detect whether the caption was removed (in the app API) due to a soft shadow ban or whether the author simply did not add one. For example: While these So, it needs to use the site's API each time when
JS code to collect all infos from
    const headers = {
        // "user-agent": `...`,
        // "cookie": `...`,
    };
    const profileId = document.location.pathname.match(/(?<=users\/)\d+/)[0]; // https://www.pixiv.net/en/users/7386235
    const ids = Object.keys((await (await fetch(`https://www.pixiv.net/ajax/user/${profileId}/profile/all?lang=en`, {
        headers: {
            "referer": `https://www.pixiv.net/en/users/${profileId}`,
            ...headers
        }
    })).json()).body.illusts);
    const json = {};
    for (const id of ids) {
        const body = (await (await fetch(`https://www.pixiv.net/ajax/illust/${id}?lang=en`, {
            headers: {
                "referer": `https://www.pixiv.net/en/artworks/${id}`,
                ...headers,
            }
        })).json()).body;
        json[id] = body;
    }
    downloadBlob(new Blob([JSON.stringify(json, null, " ")]), `[pixiv][json] ${profileId}—${json[ids[0]]?.userName} (${ids.length}).json`, document.location);
    function downloadBlob(blob, name, url) {
        const anchor = document.createElement("a");
        anchor.setAttribute("download", name || "");
        const blobUrl = URL.createObjectURL(blob);
        anchor.href = blobUrl + (url ? ("#" + url) : "");
        anchor.click();
        setTimeout(() => URL.revokeObjectURL(blobUrl), 3000);
    } |
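For the Python side, the "get all IDs from the site's API, then find what the app API missed" step might look something like the sketch below. The endpoint URL is the one quoted above; everything else (`http_get_json`, `web_illust_ids`, `missing_from_app`, the `fetch_json` parameter) is a hypothetical name for illustration, and a real request may additionally need the user-agent and cookie headers hinted at in the JS snippet.

```python
import json
import urllib.request

# Endpoint quoted earlier in this thread.
PROFILE_ALL = "https://www.pixiv.net/ajax/user/{user_id}/profile/all?lang=en"


def http_get_json(url):
    """Minimal JSON GET; assumes no extra authentication headers are required."""
    request = urllib.request.Request(
        url, headers={"Referer": "https://www.pixiv.net/"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def web_illust_ids(user_id, fetch_json=http_get_json):
    """Return the set of artwork IDs listed under body.illusts (as strings)."""
    body = fetch_json(PROFILE_ALL.format(user_id=user_id)).get("body") or {}
    illusts = body.get("illusts") or {}
    return set(map(str, illusts))


def missing_from_app(web_ids, app_ids):
    """IDs visible on the website but absent from the app API results."""
    missing = set(map(str, web_ids)) - set(map(str, app_ids))
    return sorted(missing, key=int)
```

The IDs returned by `missing_from_app` are the ones that would then need a per-artwork request against the site's API.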
Optional mixed mode:
It's a lesser problem than the missing images/descriptions (which may contain useful links). Using other API endpoints seems very simple; however, as far as I can see, the JSON data they return is formatted a bit differently. |
@AlttiRi Is there a way for us to manually implement this in the meantime? I think what you propose makes the most sense, which is to keep the current default behavior, but if anything is missing or if there is an error thrown (e.g. the 'sanity_level' warning), then the web API should take over. I've found Pixiv to be extremely inconsistent in applying 'sanity_level' labels, and it would be of great use to not be obstructed by it. I'm not sure how difficult of an addition this would be, or if there are any other tools out there that avoid it, but until it is bypassed I wonder what can be done temporarily to preserve the functionality. |
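A rough sketch of that "app API first, web API only when needed" behavior is given below. It is not how gallery-dl implements it: `resolve_work`, `app_lookup`, `web_lookup` and `is_blocked` are hypothetical, and both lookups are assumed to return dicts already normalized to share a `"caption"` key (the real app and web responses are formatted differently).

```python
def resolve_work(work_id, app_lookup, web_lookup, is_blocked):
    """Return (work, source), preferring the app API result.

    app_lookup / web_lookup: hypothetical callables mapping an ID to a
    metadata dict, or None if the work is not returned at all.
    is_blocked: predicate that is True for "Work cannot be displayed" results.
    """
    work = app_lookup(work_id)
    if work is None or is_blocked(work):
        # Missing or flagged in the app API: the web API takes over.
        return web_lookup(work_id), "web"
    if not work.get("caption"):
        # An empty caption may be genuine or may have been stripped,
        # so ask the web API and merge its caption if it has one.
        web_work = web_lookup(work_id)
        if web_work and web_work.get("caption"):
            return {**work, "caption": web_work["caption"]}, "app+web"
    return work, "app"
```

The extra web request is only made for blocked, missing, or caption-less works, so the default path keeps the speed of the app API.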
I only explained how it should be implemented in Python code in pixiv.py. |
The first step towards a complete workaround is done: c5be50f. Now it is at least possible to download |
make extra requests for empty captions independent of 'sanity'
Pixiv changed the "Work cannot be displayed" / limit_sanity_level_360.png URLs to I think |
The App API now returns https://s.pximg.net/common/images/limit_unviewable_360.png as URL for "Work cannot be displayed" artworks.
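Since there are now two placeholder file names, a URL check that covers both and reports which one was hit (useful for a more explicit log message) could be sketched like this. The function name and the returned labels are made up for illustration; only the host and the two file name prefixes come from the URLs mentioned in this issue.

```python
import posixpath
from urllib.parse import urlsplit

# File name prefixes taken from the two placeholder URLs seen in this issue.
# The labels on the right are arbitrary and only meant for log messages.
PLACEHOLDER_PREFIXES = {
    "limit_sanity_level": "sanity_level",
    "limit_unviewable": "unviewable",
}


def placeholder_kind(url):
    """Return a label if url is a known placeholder image, else None."""
    parts = urlsplit(url)
    if parts.hostname != "s.pximg.net":
        return None
    name = posixpath.basename(parts.path)
    for prefix, kind in PLACEHOLDER_PREFIXES.items():
        # Match "limit_unviewable_360.png" and similar size variants.
        if name.startswith(prefix + "_"):
            return kind
    return None
```

Matching on the prefix rather than the full file name is a guess that other sizes than `_360` follow the same naming scheme.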
Starting today, when I try to download an image from Pixiv, it downloads an image with Japanese text that says "This work cannot be displayed" instead of the actual image.
It seems to only happen on posts that were recently posted, as images that were uploaded yesterday and older download fine.