-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
@handle_urls() with item return type #84
Conversation
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## master #84 +/- ##
=======================================
Coverage 99.81% 99.82%
=======================================
Files 17 18 +1
Lines 553 583 +30
=======================================
+ Hits 552 582 +30
Misses 1 1
|
hey @BurnzZ! I think we can solve "Handling the case of not needing all fields (recommendation for frameworks like scrapy-poet)" separately. These seem to be two independent features. |
4cc42c2
to
3a5499c
Compare
e38bd52
to
de15a86
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have reviewed only 11.5/27 files so far, but I think this is enough feedback for a first round of thorough review.
I won’t fight on the capitalization front, but for the record I am still against it 😅
Co-authored-by: Adrián Chaves <adrian@chaves.io>
Co-authored-by: Adrián Chaves <adrian@chaves.io>
web_poet/rules.py
Outdated
If ``instead_of=None``, this simply means that the Page Object assigned in | ||
the ``use`` parameter will be utilized for all URLs matching the URL pattern | ||
in ``for_patterns``. However, if ``instead_of=ReplacedPageObject``, then it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't look correct: if instead_of=None
, then Page Object won't be used to replace another Page Object, regardless of URL pattern. It might be used on the provided URL pattern if to_return
is set, but this has nothing to do with instead_of=None
, as it should work regardless of instead_of
value.
So, in the example below:
@handle_urls("example.com")
class MyPage(ItemPage):
# ...
handle_urls is a pure documentation. You can't use it instead of any other page object: instead_of is not set, so it is unknown which POs can MyPage replace. You also can't use it to return a certain data type, because the item class MyPage can return is unknown.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@BurnzZ let me send a PR with some doc-related suggestions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed the incorrect parts, and also removed some of the correct parts :) It seems this kind of documentation belongs somewhere else - it describes how the information stored in ApplyRule is used by other components.
Co-authored-by: Kevin Lloyd Bernal <kevin@zyte.com>
Documentation improvements for overrides (apply rules)
Thanks @BurnzZ, great work! |
Addresses approach 1 of #77
TODO:
to_return
item using@handle_urls
OverrideRule
to something else (candidate:ApplyRule
)Consider renaming(hopefully to match theOverrideRule.instead_of
to something else@handle_urls
parameter). Or the other way around: renaming@handle_urls
'soverrides
parameter.handle_urls
'sdata_type
to something else (ensure that it's consistent withOverrideRule
as well)Handling the case of not needing all fields (recommendation for frameworks likescrapy-poet
)