support multi-page URL #79
Thanks for your contribution. As a general rule, you should check your code with In this case you should just add, for example ("https://chan.sankakucomplex.com/?tags=marie_rose&page=98&next=3874906", None), to the list.
gallery_dl/extractor/idolcomplex.py
- r"/\?(?:[^&#]*&)*tags=([^&#]+)"]
+ r"/\?(?:[^&#]*&)*tags=([^&#]+)"
+ r"((?:[^&#]*&)*page=([^&#]+))*"
+ r"((?:[^&#]*&)*next=([^&#]+))*"]
Instead of capturing every query parameter on its own, capture the whole query string ((?:\?([^#]*)?) and parse it with text.parse_query(...)
... OK ignore this one. What I wrote wouldn't work. Just leave it as it is.
gallery_dl/extractor/sankaku.py
- r"/\?(?:[^&#]*&)*tags=([^&#]+)"]
+ r"/\?(?:[^&#]*&)*tags=([^&#]+)"
+ r"((?:[^&#]*&)*page=([^&#]+))*"
+ r"((?:[^&#]*&)*next=([^&#]+))*"]
Same here
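To see which group numbers the extended pattern produces, here is a minimal sketch that applies only the query-string portion quoted in the diff to the example URL from this PR. The host prefix of the real extractor pattern is not shown in the diff and is left out; `QUERY_PATTERN` and `query_groups` are illustrative names, not part of gallery-dl.

```python
import re

# Only the query-string part of the pattern from the diff; the real
# extractor pattern also has a host prefix that is not reproduced here.
QUERY_PATTERN = re.compile(
    r"/\?(?:[^&#]*&)*tags=([^&#]+)"
    r"((?:[^&#]*&)*page=([^&#]+))*"
    r"((?:[^&#]*&)*next=([^&#]+))*"
)


def query_groups(url):
    """Return (tags, page, next) as captured by groups 1, 3 and 5."""
    match = QUERY_PATTERN.search(url)
    if match is None:
        return None
    return match.group(1), match.group(3), match.group(5)


print(query_groups(
    "https://chan.sankakucomplex.com/?tags=marie_rose&page=98&next=3874906"))
print(query_groups("https://chan.sankakucomplex.com/?tags=marie_rose"))
```

Groups 2 and 4 are the outer wrappers, so the page and next values end up in groups 3 and 5, and both are None when the parameter is absent. The captures also appear to depend on parameter order: with next placed before page, this fragment leaves group 5 empty.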
gallery_dl/extractor/sankaku.py
@@ -150,6 +152,8 @@ class SankakuTagExtractor(SankakuExtractor):
     def __init__(self, match):
         SankakuExtractor.__init__(self)
         self.tags = text.unquote(match.group(1).replace("+", " "))
+        self.start_page = 1 if match.group(3) is None else int(text.unquote(match.group(3)))
Use util.safe_int(..., 1). You don't need to check if this particular group is None, safe_int() already handles that.
Also don't unquote something that should be a number.
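For illustration, a helper in the spirit of util.safe_int() might look like the sketch below. This is a stand-alone approximation written for this example, not gallery-dl's actual implementation; it only assumes the behaviour described in the comment (return the default when the value is None or not a valid integer).

```python
def safe_int(value, default=0):
    """Convert value to int; fall back to default on None or bad input."""
    try:
        return int(value)
    except (ValueError, TypeError):
        # int(None) raises TypeError, int("abc") raises ValueError.
        return default


# A missing regex group (None) falls back to the default page number.
start_page = safe_int(None, 1)
# A captured digit string is converted directly, with no unquoting step.
other_page = safe_int("98", 1)
print(start_page, other_page)
```

With a helper like this, the explicit `is None` check and the conditional expression in the diff collapse into a single call.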
gallery_dl/extractor/sankaku.py
@@ -150,6 +152,8 @@ class SankakuTagExtractor(SankakuExtractor):
     def __init__(self, match):
         SankakuExtractor.__init__(self)
         self.tags = text.unquote(match.group(1).replace("+", " "))
+        self.start_page = 1 if match.group(3) is None else int(text.unquote(match.group(3)))
+        self.next = 0 if match.group(5) is None else int(text.unquote(match.group(5)))
Same here: util.safe_int(..., 0).
gallery_dl/extractor/sankaku.py
@@ -170,6 +174,7 @@ def get_metadata(self):

     def get_posts(self):
         params = {"tags": self.tags, "page": self.start_page}
+        if self.next > 0 : params["next"] = self.next
No space between the condition and the colon (:). flake8 should tell you that as well.
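As a sketch of the style being asked for, the conditional goes on its own line with no space before the colon. The function name `build_params` and its arguments are hypothetical, chosen only to make the snippet self-contained; in the PR the values live on the extractor instance.

```python
def build_params(tags, start_page, next_id=0):
    """Build the query parameters; add "next" only when it is set."""
    params = {"tags": tags, "page": start_page}
    if next_id > 0:
        params["next"] = next_id
    return params


print(build_params("marie_rose", 98, 3874906))
print(build_params("marie_rose", 1))
```

Written this way, the line should no longer trigger the pycodestyle checks that flake8 reports for whitespace before a colon and for a compound statement on one line.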
Nice, thank you. The only remaining issue is the regex, which now also matches the URLs of the other two extractors.
@mikf Oh, I did not see your message. It seems we still need to commit it once. 😂
Hello, I modified sankaku.py and idolcomplex.py. They now support multi-page URLs like https://chan.sankakucomplex.com/?tags=marie_rose&page=98&next=3874906 (NSFW). Please review my code.