Fix issue with fanbox coverImageUrl (#1181) #1185

Merged
merged 7 commits into from
Oct 28, 2022

Conversation

KamenReader
Contributor

@KamenReader KamenReader commented Oct 26, 2022

This addresses issue #1181. It looks like the post list and the posts themselves use different keys for the cover image (I assume this is new, since the error is only showing up now). An entry from the list uses a "cover" object that either has "type" and "url" keys or nothing, so calling parsePost on it during post initialization in parsePosts breaks, because parsePost expects a plain coverImageUrl key. The change here handles the cover differently depending on whether it finds the coverImageUrl key. I am unsure whether any other checking is needed when "cover" is found instead. The "url" key seems to always come with a "type" key whose value is "cover_image", so it may be necessary to check for that too before assuming there is always a "url" whenever cover is not None, in case there is another cover "type" that I am unaware of.
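
A minimal sketch of the key fallback described above, for illustration only. It assumes a post is available as a plain dict; the helper name `extract_cover_url` and the example URLs are made up and are not the code in this PR.

```python
def extract_cover_url(js_post):
    """Return the cover image URL from a post dict, or None if there is none."""
    # Full-post format: a flat "coverImageUrl" key (its value may be null).
    if "coverImageUrl" in js_post:
        return js_post["coverImageUrl"]
    # Post-list format: a "cover" object that carries "url" only when a cover exists.
    cover = js_post.get("cover")
    if cover and "url" in cover:
        return cover["url"]
    return None


# Hypothetical inputs covering both formats and the "no cover" case.
print(extract_cover_url({"coverImageUrl": "https://example.invalid/a.jpeg"}))
print(extract_cover_url({"cover": {"type": "cover_image", "url": "https://example.invalid/b.jpeg"}}))
print(extract_cover_url({"cover": None}))
```

Checking for the key rather than for a truthy value keeps the old format working even when coverImageUrl is present but null.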

It looks like the post list and posts themselves use a different key for the cover image (I am assuming this is new, due to the error only showing up now) so calling ParsePost on the list (which uses a "cover" object which either has a url key or nothing) will break as it is expecting simply a coverImageUrl key.  The change here will handle it differently depending on whether it finds the coverImageUrl key or not.
Apparently the embed json uses cover.url instead of coverImageUrl now, similarly to the post list.
While gross, this change is likely necessary to prevent problems with an embedded post that has no cover image.  If someone can come up with a better check, please provide one.  If the cover URL behaves similarly to the post list, in the event that there is no cover image for a post, the "cover" object will be empty, so trying to retrieve the value for the "url" key will fail and crash PixivUtil2.  This change verifies that the "url" key exists before retrieving the value.
@KamenReader
Contributor Author

Upon further testing, it seems there exists another place where coverImageUrl is no longer used and has been replaced by a cover object which may or may not contain a url, and that is embedded posts. The subsequent updates should address that case as well.

@KamenReader
Contributor Author

It looks like the build checker is using an outdated embed test, since it still uses the old coverImageUrl format.

Update test data with new cover url format
@Nandaka
Owner

Nandaka commented Oct 26, 2022

Did they also change the JSON for the old posts? Sometimes pixiv keeps the old JSON format for old posts.

@shinji257

shinji257 commented Oct 26, 2022

Downloading by post id actually works fine still but it breaks by artist ID. The examples below may be considered NSFW although I think the posts themselves may be safe...

Example Artist: 2029329
Example Post: 4654957

I intentionally picked a post that was uploaded yesterday, and it is available to everyone (no sub required).

EDIT: Looks like there is another break with the changes, and this one does break when the post itself is being retrieved. The same post also triggers the coverImageUrl error that this patch is working to fix when attempted on the current release code (without this patch).
Example Post (same artist): 4526574

Traceback (most recent call last):
  File "PixivUtil2.py", line 1736, in main
  File "PixivUtil2.py", line 1470, in main_loop
  File "PixivUtil2.py", line 970, in menu_fanbox_download_by_post_id
  File "PixivBrowserFactory.pyc", line 1032, in fanboxGetPostById
  File "PixivModelFanbox.pyc", line 68, in __init__
  File "PixivModelFanbox.pyc", line 73, in parse_post_details
  File "PixivModelFanbox.pyc", line 290, in parseBody
  File "PixivModelFanbox.pyc", line 344, in get_embed_url_data
AttributeError: 'NoneType' object has no attribute 'get'
Unknown Error, please check the log file: (<class 'AttributeError'>, AttributeError("'NoneType' object has no attribute 'get'"), <traceback object at 0x000001FE96763E00>)

@KamenReader
Contributor Author

KamenReader commented Oct 26, 2022

Did they also change the JSON for the old posts? Sometimes pixiv keeps the old JSON format for old posts.

I did an apples to apples comparison by signing up for the creator (kurikara) and using the same post (4071336) that was used in the original embed test file. I could have replaced it with my json dump, but the only part that seemed to be relevant was the changed cover url format (which indeed follows the new format). Also, my own dump of the json fails on the descriptionUrlList count check because I don't have the higher tier from the embed (I bought the 100JPY tier for the test), so I can't access the 1500JPY postInfo.excerpt that is used for the preview text (which the link checker combs through).

Downloading by post id actually works fine still but it breaks by artist ID. The examples below may be considered NSFW although I think the posts themselves may be safe...

Using my patch-1 branch, I had no issues downloading the artist above ("f2", "shigemarushigeru"), including the post you specified. What error did you get?

EDIT: Looks like there is another break with the changes but this one does break when the post is trying to be retrieved.... also triggers the coverImageUrl error that this

Looks like the change for specifically checking for cover.url actually didn't address the situation where cover is an empty object (i.e. no cover image). The next version should fix that.
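
For reference, the AttributeError in the traceback above is what a chained lookup raises when "cover" is present but null. A minimal sketch of a None-safe lookup, assuming the embed data is a plain dict; the function name and sample dict are made up for illustration and this is not the actual get_embed_url_data code.

```python
def embed_cover_url(post_info):
    """Return the embed's cover URL, or None when there is no cover."""
    # dict.get's default only applies when the key is missing. A key that is
    # present with a null value still yields None, so fall back with `or`.
    cover = post_info.get("cover") or {}
    return cover.get("url")


no_cover = {"cover": None}  # hypothetical embed data without a cover image

try:
    # The default {} is ignored here because the "cover" key exists.
    no_cover.get("cover", {}).get("url")
except AttributeError as err:
    print("unsafe lookup failed:", err)

print(embed_cover_url(no_cover))
```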

Address situation where embedded post's cover object is None
break instead of redundant assignment of root = None, since it's no longer necessary
@shinji257

Those changes seem to have taken care of it. I can now go through the artist without any errors. I'll do the full pass here with all my followed fanbox artists and see if anything else crops up.

@Hecatom

Hecatom commented Oct 26, 2022

Ok, I am not sure what I am doing wrong, but when I run the program after downloading the updates/patches, it keeps giving me the error

Getting artist information from https://api.fanbox.cc/creator.get?userId=78786499
Member Url: https://www.pixiv.net/ajax/user/78786499/profile/all
Processing FanboxArtist(78786499, nivi, NiVi), page 1
Getting posts from https://api.fanbox.cc/post.listCreator?creatorId=nivi&limit=10
Traceback (most recent call last):
  File "PixivUtil2.py", line 1736, in main
    np_is_valid, op_is_valid, selection = main_loop(ewd, op_is_valid, selection, np_is_valid, args, options)
  File "PixivUtil2.py", line 1468, in main_loop
    menu_fanbox_download_by_id(op_is_valid, args, options)
  File "PixivUtil2.py", line 1000, in menu_fanbox_download_by_id
    PixivFanboxHandler.process_fanbox_artist_by_id(sys.modules[__name__],
  File "PixivFanboxHandler.pyc", line 50, in process_fanbox_artist_by_id
  File "PixivBrowserFactory.pyc", line 1018, in fanboxGetPostsFromArtist
  File "PixivModelFanbox.pyc", line 566, in parsePosts
  File "PixivModelFanbox.pyc", line 67, in __init__
  File "PixivModelFanbox.pyc", line 94, in parsePost
KeyError: 'coverImageUrl'
Unknown Error, please check the log file: (<class 'KeyError'>, KeyError('coverImageUrl'), <traceback object at 0x043CF7E8>)

I assume that I applied them incorrectly

@KamenReader
Contributor Author

Ok, I am not sure what I am doing wrong, but when I run the program after downloading the updates/patches, it keeps giving me the error

Are you sure you applied them to the most recent version of the source code (v20220924)? Did you also try simply downloading the entire branch and testing that? That error looks like it's what the original code did, which is check for coverImageUrl in the post initialization, which should no longer happen after the first change.

@shinji257

I checked Artist 78786499 (nivi) and it is working fine here with my build using KamenReader's code. With that said I did come across a new issue here while doing my pass.

Post  = 1300379
Title = Mobius
Type  = image
Created Date  = 2020-08-11T17:19:05+09:00
Is Restricted = False
No Cover Image for post: 1300379.
Image Count = 2
Downloading image 0 from https://downloads.fanbox.cc/files/post/1300379/s5EPhPTPjS36RAut7RSyvQfB.pdf
Saved to I:\PixivUtil2\images\FANBOX (5418138)\1300379_p0_s5EPhPTPjS36RAut7RSyvQfB - Mobius.pdf
Local file exists: I:\PixivUtil2\images\FANBOX (5418138)\1300379_p0_s5EPhPTPjS36RAut7RSyvQfB - Mobius.pdf
Downloading image 1 from https://downloads.fanbox.cc/files/post/1300379/Pv0Hpu3BtvofEHSvcwV9S7rU.pdf
Saved to I:\PixivUtil2\images\FANBOX (5418138)\1300379_p1_Pv0Hpu3BtvofEHSvcwV9S7rU - Mobius.pdf
Local file exists: I:\PixivUtil2\images\FANBOX (5418138)\1300379_p1_Pv0Hpu3BtvofEHSvcwV9S7rU - Mobius.pdf
Processing FanboxArtist(5418138, ytsnow, YTsnow), page 19
Getting posts from https://api.fanbox.cc/post.listCreator?creatorId=ytsnow&maxPublishedDatetime=2020-08-10%2016%3A40%3A30&maxId=1297594&limit=10
Traceback (most recent call last):
  File "PixivUtil2.py", line 1736, in main
  File "PixivUtil2.py", line 1472, in main_loop
  File "PixivUtil2.py", line 944, in menu_fanbox_download_from_list
  File "PixivFanboxHandler.pyc", line 50, in process_fanbox_artist_by_id
  File "PixivBrowserFactory.pyc", line 1018, in fanboxGetPostsFromArtist
  File "PixivModelFanbox.pyc", line 576, in parsePosts
  File "PixivModelFanbox.pyc", line 67, in __init__
  File "PixivModelFanbox.pyc", line 99, in parsePost
KeyError: 'url'
Unknown Error, please check the log file: (<class 'KeyError'>, KeyError('url'), <traceback object at 0x000002D3036CAFC0>)

@KamenReader
Contributor Author

KamenReader commented Oct 26, 2022

I checked Artist 78786499 (nivi) and it is working fine here with my build using KamenReader's code. With that said I did come across a new issue here while doing my pass.

Could you provide a json dump of the post? If you could add an

import json

to the top of your PixivModelFanbox.py file then insert the following at line 93, that should give you a file with some formatted json from the post where it breaks.

        with open('postjson.txt', 'w', encoding='utf-8') as f:
            json.dump(jsPost, f, indent=4, ensure_ascii=False)

This looks like the situation I hypothesized earlier where "cover" isn't an empty object but it also for some reason doesn't have a "url" key, so I'll need to see what the cover object has instead.

@Hecatom

Hecatom commented Oct 26, 2022

v20220924)

That is the thing, I am not really sure if I am applying it correctly, or if the steps I am doing are not correct.

As it is now, I am using a clean version of 20220924, apply the master and then the patch, then run the exe.

I assume that instead of running the exe I should use python3 PixivUtil2.py?

@KamenReader
Contributor Author

I assume that instead of running the exe I should use python3 PixivUtil2.py?

Correct. You should either run the PixivUtil2.py file directly or create a new executable using something like pyinstaller and replace the old one with it.

@Hecatom

Hecatom commented Oct 27, 2022

I assume that instead of running the exe I should use python3 PixivUtil2.py?

Correct. You should either run the PixivUtil2.py file directly or create a new executable using something like pyinstaller and replace the old one with it.

Well, it seems there is something on my end that doesn't allow me to either run the Python script or compile a new exe that works.
Running the .py gives me an error due to the import of the _sockets, and creating a .exe with pyinstaller simply crashes when trying to run it :/

@shinji257

I checked Artist 78786499 (nivi) and it is working fine here with my build using KamenReader's code. With that said I did come across a new issue here while doing my pass.

Could you provide a json dump of the post? If you could add an

import json

to the top of your PixivModelFanbox.py file then insert the following at line 93, that should give you a file with some formatted json from the post where it breaks.

        with open('postjson.txt', 'w', encoding='utf-8') as f:
            json.dump(jsPost, f, indent=4, ensure_ascii=False)

This looks like the situation I hypothesized earlier where "cover" isn't an empty object but it also for some reason doesn't have a "url" key, so I'll need to see what the cover object has instead.

Here you go.

{
    "id": "1248166",
    "title": "YTmaskvideo_005",
    "feeRequired": 1080,
    "publishedDatetime": "2020-07-25T15:57:45+09:00",
    "updatedDatetime": "2020-07-25T15:57:45+09:00",
    "tags": [],
    "isLiked": false,
    "likeCount": 19,
    "commentCount": 2,
    "isRestricted": false,
    "user": {
        "userId": "5418138",
        "name": "YTsnow2013",
        "iconUrl": "https://pixiv.pximg.net/c/160x160_90_a2_g5/fanbox/public/images/user/5418138/icon/Hhr9hAN6HkLaXlt0e25BgeCg.jpeg"
    },
    "creatorId": "ytsnow",
    "hasAdultContent": true,
    "cover": {
        "type": "video",
        "video": {
            "serviceProvider": "vimeo",
            "videoId": "441540358"
        }
    },
    "excerpt": ""
}

@KamenReader
Contributor Author

KamenReader commented Oct 27, 2022

Running the .py gives me an error due to the import of the _sockets, and creating a .exe with pyinstaller simply crashes when trying to run it :/

Do you have all the dependencies from requirements.txt installed? Assuming you have pip, you should be able to just run

pip install beautifulsoup4
pip install certifi
pip install demjson3
pip install mechanize
pip install Pillow
pip install PySocks
pip install colorama
pip install cloudscraper

and that should take care of them.

Here you go.

The cover... is a vimeo video? What? What does it look like in the creator's post list? Is it an actual clickable video in the post list, even without opening the post itself? Or is it just a frame? Could you provide the json from the post by doing a download of that post specifically? The same code from before should provide a dump of the post's json via an "f3" "1248166". I'd like to see what the post's coverImageUrl says.

Anecdotally, I'm 78 artists deep in an "f1" with no page limit run, and I haven't encountered an error yet, so this must not be something most creators do.

Address issue that arises when cover is a type other than "cover_image"
@KamenReader
Contributor Author

KamenReader commented Oct 27, 2022

Okay, it took a while, but I finally got to one with a cover of type video (this one from YouTube). The video itself was indeed embedded in the post list (though it had apparently been removed from YouTube). The coverImageUrl for the post itself was null, so I guess the best choice would be to filter out based on cover type. The newest fix resolves the issue of covers that are types other than "cover_image".
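
As an illustration only (not the merged code), filtering on the cover type could be sketched like this. The helper name `cover_image_url` is hypothetical, and the sample dict is trimmed from the JSON dump posted earlier in this thread.

```python
def cover_image_url(js_post):
    """Return the cover URL only when the cover is an image, else None."""
    # "cover" may be missing, null, or an object of another type (e.g. "video").
    cover = js_post.get("cover") or {}
    if cover.get("type") == "cover_image":
        return cover.get("url")
    return None


# Trimmed from the dump of post 1248166: the cover is a video and has no "url".
video_post = {
    "id": "1248166",
    "cover": {
        "type": "video",
        "video": {"serviceProvider": "vimeo", "videoId": "441540358"},
    },
}

if cover_image_url(video_post) is None:
    print("No Cover Image for post: " + video_post["id"] + ".")
```

Matching on "cover_image" explicitly means any other cover type is treated as having no cover image instead of raising KeyError: 'url'.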

@shinji257

shinji257 commented Oct 27, 2022

Running the .py gives me an error due to the import of the _sockets, and creating a .exe with pyinstaller simply crashes when trying to run it :/

Do you have all the dependencies from requirements.txt installed? Assuming you have pip, you should be able to just run

pip install certifi
pip install demjson3
pip install mechanize
pip install Pillow
pip install PySocks
pip install colorama
pip install cloudscraper

and that should take care of them.

Here you go.

The cover... is a vimeo video? What? What does it look like in the creator's post list? Is it an actual clickable video in the post list, even without opening the post itself? Or is it just a frame? Could you provide the json from the post by doing a download of that post specifically? The same code from before should provide a dump of the post's json via an "f3" "1248166". I'd like to see what the post's coverImageUrl says.

Anecdotally, I'm 78 artists deep in an "f1" with no page limit run, and I haven't encountered an error yet, so this must not be something most creators do.

Late reply, but yes. It was an embedded Vimeo video, fully playable from the posts list as well.

Okay, it took a while, but I finally got to one with a cover of type video (this one from YouTube). The video itself was indeed embedded in the post list (though it had apparently been removed from YouTube). The coverImageUrl for the post itself was null, so I guess the best choice would be to filter out based on cover type. The newest fix resolves the issue of covers that are types other than "cover_image".

That took care of it. It now correctly says there is no cover or images available for the post which makes sense. I will again run through all my artists with this new version and see if it catches any weird gotchas in the handling of the data.

EDIT: I was able to go through all my subscribed and followed artists (around 240) without getting any errors. ;)

@Nandaka
Owner

Nandaka commented Oct 28, 2022

Thanks @KamenReader for the fix and @shinji257 for verifying it on Fanbox (I don't really use it) 😄

@Nandaka Nandaka merged commit 8ed988b into Nandaka:master Oct 28, 2022
@KamenReader KamenReader deleted the patch-1 branch November 30, 2022 04:03