[deviantart] add /view URL support #3367
The /view URL comes in very handy when only the index of the deviation (or its base36 equivalent) is known.
gallery_dl/extractor/deviantart.py
class DeviantartViewExtractor(DeviantartExtractor):
    """Extractor for single deviations from a /view URL"""
    subcategory = "view"
    pattern = (r"(?:https?://)?(?:www\.)?deviantart\.com/()()"
Are ()() in-dev placeholders?
No, the first two groups are treated as an optional username, so you need two empty groups for URLs without a username, as in DeviantartWatchPostsExtractor.
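To illustrate why the empty groups matter, here is a minimal sketch. SIMPLIFIED_BASE is a hypothetical stand-in for the shared base pattern (the real one is not shown in this thread), the tail of the /view pattern is assumed, and username() only mimics a base class that reads the username from groups 1 and 2.

```python
import re

# Hypothetical stand-in for the shared base pattern (an assumption, not the
# project's actual definition): group 1 is a username taken from the path,
# group 2 is a username taken from the subdomain.
SIMPLIFIED_BASE = (
    r"(?:https?://)?(?:"
    r"(?:www\.)?deviantart\.com/([\w-]+)|"
    r"(?!www\.)([\w-]+)\.deviantart\.com)"
)

WITH_USER = re.compile(SIMPLIFIED_BASE + r"/(art|journal)/(?:[^/?#]+-)?(\d+)")

# /view URLs carry no username, so two empty groups "()()" fill slots 1 and 2.
# The "view/(\d+)" tail is assumed for this sketch.
VIEW = re.compile(r"(?:https?://)?(?:www\.)?deviantart\.com/()()view/(\d+)")


def username(match):
    """Mimic shared code that expects the username in group 1 or group 2."""
    return match.group(1) or match.group(2) or None
```

With the placeholders in place, shared code can keep reading groups 1 and 2 without raising an IndexError; it simply gets no username for /view URLs.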
It would also be possible to slightly extend the existing extractor for single deviations instead. Deviantart does not seem to care about username, type, or slug in deviation URLs, so we can just use default names if none are given.
diff --git a/gallery_dl/extractor/deviantart.py b/gallery_dl/extractor/deviantart.py
index df59be4a..597abccf 100644
--- a/gallery_dl/extractor/deviantart.py
+++ b/gallery_dl/extractor/deviantart.py
@@ -854,7 +854,9 @@ class DeviantartDeviationExtractor(DeviantartExtractor):
     """Extractor for single deviations"""
     subcategory = "deviation"
     archive_fmt = "g_{_username}_{index}.{extension}"
-    pattern = BASE_PATTERN + r"/(art|journal)/(?:[^/?#]+-)?(\d+)"
+    pattern = (BASE_PATTERN + r"/(art|journal)/(?:[^/?#]+-)?(\d+)"
+               r"|(?:https?://)?(?:www\.)?deviantart\.com/"
+               r"(?:view/|view(?:-full)?\.php/*\?(?:[^#]+&)?id=)(\d+)")
     test = (
         (("https://www.deviantart.com/shimoda7/art/For-the-sake-10073852"), {
             "options": (("original", 0),),
@@ -919,11 +921,12 @@ class DeviantartDeviationExtractor(DeviantartExtractor):
     def __init__(self, match):
         DeviantartExtractor.__init__(self, match)
         self.type = match.group(3)
-        self.deviation_id = match.group(4)
+        self.deviation_id = match.group(4) or match.group(5)
 
     def deviations(self):
         url = "{}/{}/{}/{}".format(
-            self.root, self.user, self.type, self.deviation_id)
+            self.root, self.user or "u", self.type or "art", self.deviation_id)
+
         uuid = text.extract(self._limited_request(url).text,
                             '"deviationUuid\\":\\"', '\\')[0]
         if not uuid:
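The effect of the extended pattern can be tried outside the extractor with a rough standalone sketch. SIMPLIFIED_BASE is again a hypothetical stand-in for BASE_PATTERN, and canonical_url() is an illustrative helper, not part of gallery-dl; only the pattern tail and the "u" / "art" defaults are taken from the diff.

```python
import re

# Hypothetical stand-in for BASE_PATTERN (assumption): groups 1 and 2 hold
# the username from the path or from the subdomain.
SIMPLIFIED_BASE = (
    r"(?:https?://)?(?:"
    r"(?:www\.)?deviantart\.com/([\w-]+)|"
    r"(?!www\.)([\w-]+)\.deviantart\.com)"
)

# Same alternation as in the diff: groups 3 and 4 belong to regular
# deviation URLs, group 5 to /view, view.php and view-full.php URLs.
PATTERN = re.compile(
    SIMPLIFIED_BASE + r"/(art|journal)/(?:[^/?#]+-)?(\d+)"
    r"|(?:https?://)?(?:www\.)?deviantart\.com/"
    r"(?:view/|view(?:-full)?\.php/*\?(?:[^#]+&)?id=)(\d+)"
)


def canonical_url(url, root="https://www.deviantart.com"):
    """Illustrative helper: build the page URL the way deviations() does."""
    match = PATTERN.match(url)
    if match is None:
        return None
    user = match.group(1) or match.group(2)
    kind = match.group(3)
    deviation_id = match.group(4) or match.group(5)
    # Groups of the alternative that did not match are None, so fall back
    # to the placeholder names used in the diff.
    return "{}/{}/{}/{}".format(root, user or "u", kind or "art", deviation_id)
```

For a /view URL, only group 5 participates in the match, so the username and type fall back to the placeholders and the result has the form https://www.deviantart.com/u/art/<id>.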
83b33c8 to 6614d94
Was a bit on the fence about guessing the full URL since there's no guarantee that it'll continue to work, but I suppose it works.
URLs I found when searching for deviantart.com/view on web.archive.org:
- https://www.deviantart.com/view/<id>
  - http:// redirects you to https://
  - deviantart.com redirects you to www.deviantart.com
  - https://www.deviantart.com/<username>/<art|journal>/<id>
  - offset parameter, e.g. https://www.deviantart.com/view/14864502/?offset=80 (wayback machine), don't know what it means
- https://www.deviantart.com/view.php?id=<id>
- https://www.deviantart.com/view-full.php?id=<id>
  - http:// redirects you to https://
  - deviantart.com redirects you to www.deviantart.com
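The two redirects noted above (scheme and host) can be imitated locally with a small sketch. normalize() is a hypothetical helper written for illustration; it only mirrors the observed redirect behaviour and does not contact the site.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Apply the two observed redirects: http -> https, bare host -> www."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host == "deviantart.com":
        host = "www.deviantart.com"
    # Path, query and fragment are passed through unchanged.
    return urlunsplit(("https", host, parts.path, parts.query, parts.fragment))
```

This leaves query parameters such as offset untouched, since their meaning is unknown.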