[#529] NEW: regex matched version of have.text & co

+ [#528] DOCS: docstrings for have.*_like collection conditions and also more verbose parameter names + Fix misleading absence of waiting in slicing behavior
yashaka · May 24, 2024 · 703df6b
1 parent a7da02d
commit 703df6b

Showing 10 changed files with 416 additions and 43 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -105,35 +105,89 @@ TODOs:
 
 `.have.exact_texts(1, 2.0, '3')` is now possible, and will be treated as `['1', '2.0', '3']`
 
-### list globs, text wildcards and regex support in texts_like conditions
+### regex support for element conditions that assert element text
 
-List of conditions added (still marked as experimental with `_` prefix):
+List of element conditions added:
 
-- `have._exact_texts_like(*exact_texts_or_list_globs: Union[str, int, float])`
-- `have._exact_texts_like(*exact_texts_or_list_globs: Union[str, int, float]).where(**globs_to_override)`
-- `have._texts_like(*contained_texts_or_list_globs: Union[str, int, float])`
-- `have._texts_like(*contained_texts_or_list_globs: Union[str, int, float]).where(**glob_to_override)`
-- `have._texts_like(*regex_patterns_or_list_globs: Union[str, int, float]).with_regex`
+- `have.text_matching(regex_pattern: str | int | float)`
+  - = `match.text_pattern(regex_pattern: str | int | float)`
+
+Examples of usage:
+
+```python
+from selene import browser, have
+
+...
+# GivenPage(browser.driver).opened_with_body(
+#     '''
+#     <ul>Hello:
+#         <li>1) One!!!</li>
+#         <li>2) Two...</li>
+#         <li>3) Three???</li>
+#     </ul>
+#     '''
+# )
+
+# in addition to:
+browser.all('li').first.should(have.text('One'))
+# this is an alternative to the previous match, but via regex:
+browser.all('li').first.should(have.text_matching(r'.*One.*'))
+# with more powerful features:
+browser.all('li').first.should(have.text_matching(r'\d\) One(.)\1\1'))
+# ^ and $ can be used but don't add much value, because the match works the same as the previous one
+browser.all('li').first.should(have.text_matching(r'^\d\) One(.)\1\1$'))
+
+# there is also a similar collection condition that
+# matches each pattern to each element text in the collection
+# in the corresponding order:
+browser.all('li').should(have.texts_matching(
+  r'\d\) One!+', r'.*', r'.*'
+))
+# that is also equivalent to:
+browser.all('li').should(have._texts_like(
+  r'\d\) One(.)\1\1', ..., ...
+).with_regex)
+# or even:
+browser.all('li').should(have._texts_like(
+  r'\d\) One(.)\1\1', (...,)  # = one or more
+).with_regex)
+# And with a smart approach you can mix conditions to achieve more with less:
+browser.all('li')[:3].should(have.text_matching(
+  r'\d\) \w+(.)\1\1'
+).each)
+```
+
+### list globs, text wildcards and regex support in texts_like collection conditions
+
+List of collection conditions added (still marked as experimental with `_` prefix):
+
+- `have._exact_texts_like(*texts_or_item_placeholders: str | int | float)`
+- `have._exact_texts_like(*texts_or_item_placeholders: str | int | float).where(**placeholders_to_override)`
+- `have._texts_like(*contained_texts_or_item_placeholders: str | int | float)`
+- `have._texts_like(*contained_texts_or_item_placeholders: str | int | float).where(**placeholders_to_override)`
+- `have._texts_like(*regex_patterns_or_item_placeholders: str | int | float).with_regex`
   - is an alias to `have._text_patterns_like`
-- `have._text_patterns(*regex_patterns).with_regex`
+- `have._text_patterns(*regex_patterns)`
   - like `have.texts` but with regex patterns as expected, i.e. no list globs support
-- `have._texts_like(*texts_with_wildcards_or_list_globs: Union[str, int, float]).with_wildcards`
-- `have._texts_like(*texts_with_wildcards_or_list_globs: Union[str, int, float]).where_wildcards(**to_override)`
+- `have._texts_like(*texts_with_wildcards_or_item_placeholders: str | int | float).with_wildcards`
+- `have._texts_like(*texts_with_wildcards_or_item_placeholders: str | int | float).where_wildcards(**to_override)`
 - corresponding `have.no.*` versions of same conditions
 
 Where:
 
-- default list globs are:
+- default list glob placeholders are:
   - `[...]` matches **zero or one** item of any text in the list
   - `...` matches **exactly one** item of any text in the list
   - `(...,)` matches one **or more** items of any text in the list
   - `[(...,)]` matches **zero** or more items of any text in the list
-- all globs can be mixed in the same list of expected items in any order
+- all globbing placeholders can be mixed in the same list of expected items in any order
 - regex patterns can't use `^` (start of text) and `$` (end of text)
     because they are implicit, and if added explicitly will break the match
 - supported wildcards can be overridden and defaults are:
   - `*` matches **zero or more** of any characters in a text item
   - `?` matches **exactly one** of any character in a text item
+- flattening of expected list items is not supported, unlike in `have.texts` and `have.exact_texts`,
+    because `[]` is used in list globs. So you can't use nested lists or tuples to format the expected list of items.
 
 Warning:
 
@@ -160,7 +214,7 @@ browser.all('li').should(have._exact_texts_like(
     '1) One!!!', '2) Two!!!', ..., ..., ...  # = exactly one
 ))
 browser.all('li').should(have._texts_like(
-    '\d\) One!+', '\d.*', ..., ..., ...
+    r'\d\) One!+', r'\d.*', ..., ..., ...
 ).with_regex)
 browser.all('li').should(have._texts_like(
     '?) One*', '?) Two*', ..., ..., ...
@@ -296,6 +350,19 @@ Providing a brief overview of the modules and how to define your own custom comm
 
 Just "autocomplete" is disabled, methods still work;)
 
+### Fix misleading absence of waiting in slicing behavior
+
+Now this will fail:
+
+```python
+from selene import browser, have
+...
+browser.all('.non-existing')[:1].should(have.text('something').each)
+```
+And that's good, because a slice implies the expected number of elements.
+
+Before this fix it would pass, which contradicted the other "get element by index" behavior :D
+
 ### Fix path of screenshot and pagesource for Windows
 
 Thanks to [Cameron Shimmin](https://github.com/cshimm) and Edale Miguel for PR [#525](https://github.com/yashaka/selene/pull/525)
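
The list globbing placeholders described in the changelog entry above can be illustrated with a small standalone sketch. This is not selene's implementation: `matches_like`, `_to_regex` and the `SEP` separator are hypothetical names, and the approach (escape each expected text, merge everything into one regex, fully match it against the joined actual texts) is only an assumption suggested by the note that `^` and `$` are implicit. It covers the exact-text case only.

```python
import re

SEP = '\x1f'  # hypothetical separator, assumed never to occur in item texts
ITEM = f'[^{SEP}]*{SEP}'  # regex fragment for exactly one item of any text


def _to_regex(expected) -> str:
    """Translate one expected item or placeholder into a regex fragment."""
    if expected is ...:
        return ITEM  # exactly one item
    if expected == [...]:
        return f'(?:{ITEM})?'  # zero or one item
    if expected == (...,):
        return f'(?:{ITEM})+'  # one or more items
    if expected == [(...,)]:
        return f'(?:{ITEM})*'  # zero or more items
    return re.escape(str(expected)) + SEP  # exact text of one item


def matches_like(actual_texts, *expected) -> bool:
    """Match the whole list at once, so start and end are implicitly anchored."""
    pattern = ''.join(_to_regex(item) for item in expected)
    actual = ''.join(text + SEP for text in actual_texts)
    return re.fullmatch(pattern, actual) is not None


texts = ['1) One!!!', '2) Two...', '3) Three???']
print(matches_like(texts, '1) One!!!', ..., ...))
print(matches_like(texts, '1) One!!!', (...,)))
print(matches_like(texts, [...], '1) One!!!', [(...,)]))
```

Because list literals such as `[...]` already carry placeholder meaning in this scheme, a nested list of texts can't also mean "flatten me", which is in line with the flattening limitation noted in the changelog.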

diff --git a/selene/core/entity.py b/selene/core/entity.py
@@ -653,7 +653,7 @@ def __call__(self) -> typing.Sequence[WebElement]:
 
     @property
     def cached(self) -> Collection:
-        webelements = self()
+        webelements = self.locate()
         return Collection(Locator(f'{self}.cached', lambda: webelements), self.config)
 
     def __iter__(self):
@@ -670,6 +670,8 @@ def __len__(self):
         return self.get(query.size)
 
     # TODO: add config.index_collection_from_1, disabled by default
+    # TODO: consider additional number param, that counts from 1
+    #       if provided instead of index
     def element(self, index: int) -> Element:
         def find() -> WebElement:
             webelements = self.locate()
@@ -720,7 +722,20 @@ def sliced(
         step: int = 1,
     ) -> Collection:
         def find() -> typing.Sequence[WebElement]:
-            webelements = self()
+            webelements = self.locate()
+            length = len(webelements)
+            if start is not None and start != 0 and start >= length:
+                raise AssertionError(
+                    f'not enough elements to slice collection '
+                    f'from START on index={start}, '
+                    f'actual elements collection length is {length}'
+                )
+            if stop is not None and stop != -1 and length < stop:
+                raise AssertionError(
+                    'not enough elements to slice collection '
+                    f'from {start or "START"} to STOP at index={stop}, '
+                    f'actual elements collection length is {length}'
+                )
 
             # TODO: assert length according to provided start, stop...
 
@@ -758,7 +773,7 @@ def by(
         condition = (
             condition
             if isinstance(condition, Condition)
-            else Condition(str(condition), condition)
+            else Condition(str(condition), condition)  # TODO: check here for fn name
         )
 
         return Collection(
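
The length guards added to `sliced` above can be tried in isolation. The sketch below copies the two conditions onto a plain sequence; `checked_slice` is a hypothetical helper, not part of selene, and the error messages are shortened.

```python
from typing import Optional, Sequence


def checked_slice(
    items: Sequence,
    start: Optional[int] = None,
    stop: Optional[int] = None,
    step: int = 1,
) -> Sequence:
    """Slice items, failing loudly when there are too few of them."""
    length = len(items)
    # same guard as in the hunk above: a non-zero start must point
    # inside the sequence...
    if start is not None and start != 0 and start >= length:
        raise AssertionError(
            f'not enough elements to slice from index={start}, '
            f'actual length is {length}'
        )
    # ...and a stop other than -1 must not exceed the length
    if stop is not None and stop != -1 and length < stop:
        raise AssertionError(
            f'not enough elements to slice up to index={stop}, '
            f'actual length is {length}'
        )
    return items[start:stop:step]


print(checked_slice(['a', 'b', 'c'], stop=2))
try:
    checked_slice([], stop=1)  # mirrors browser.all('.non-existing')[:1]
except AssertionError as error:
    print(f'failed as expected: {error}')
```

Plain Python slicing would silently return an empty list for `[][:1]`. Raising instead means the failure surfaces when the sliced collection is located, which matches the changelog's note that the slice example now fails instead of passing.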

diff --git a/selene/core/match.py b/selene/core/match.py
@@ -105,6 +105,14 @@ def element_has_exact_text(expected: str) -> Condition[Element]:
     return element_has_text(expected, 'has exact text', predicate.equals)
 
 
+def text_pattern(expected: str) -> Condition[Element]:
+    return ElementCondition.raise_if_not_actual(
+        f'has text matching {expected}',
+        query.text,
+        predicate.matches(expected),
+    )
+
+
 def element_has_js_property(name: str):
     # TODO: should we keep simpler but less obvious name - *_has_property ?
     def property_value(element: Element):
@@ -398,7 +406,7 @@ def actual_visible_texts(collection: Collection) -> List[str]:
 
     return CollectionCondition.raise_if_not_actual(
         f'has texts {expected_}',
-        actual_visible_texts,
+        Query('visible texts', actual_visible_texts),
         predicate.equals_by_contains_to_list(expected_),
     )
 
@@ -816,6 +824,11 @@ def __init__(
 
 
 # TODO: add an alias from texts(*expected).with_regex to text_patterns_like
+#       hm, but then it would be natural
+#       if we disable implicit ^ and $ for each item text
+#       and so we make it inconsistent with the behavior of *_like versions
+#       then probably we should explicitly document that we are not going
+#       to add such type of condition at all
 class _text_patterns(_text_patterns_like):
     """Condition to match visible texts of all elements in a collection
     with supported item placeholders to include/exclude items from match
@@ -836,7 +849,7 @@ def __init__(
         _name='text patterns',
     ):  # noqa
         super().__init__(
-            *expected,
+            *helpers.flatten(expected),  # TODO: document
             _process_patterns=_process_patterns,
             _negated=_negated,
             _name_prefix=_name_prefix,
@@ -845,6 +858,11 @@ def __init__(
         # disable globs (doing after __init__ to override defaults)
         self._globs = ()
 
+    # TODO: consider refactoring so this attribute is not even inherited
+    def where(self, **placeholders_to_override):
+        """Just a placeholder. This attribute is not supported for this condition"""
+        raise AttributeError('.where(**) is not supported on text_patterns condition')
+
     # TODO: can and should we disable here the .where method?
     #       shouldn't we just simply implement it in a straightforward style
     #       similar to match.exact_texts?
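
The `text_pattern` condition added in this file delegates to `predicate.matches(expected)`, whose body is not part of this diff. The sketch below assumes whole-text matching via `re.fullmatch`, which is consistent with the changelog remark that explicit `^` and `$` don't change the result; `text_matches` is a hypothetical stand-in, not selene's predicate.

```python
import re


def text_matches(pattern: str, actual: str) -> bool:
    # ASSUMPTION: the whole text has to match the pattern (re.fullmatch);
    # the real predicate.matches may be implemented differently
    return re.fullmatch(pattern, actual) is not None


texts = ['1) One!!!', '2) Two...', '3) Three???']

# "contains"-style match expressed as a regex
print(text_matches(r'.*One.*', texts[0]))
# back-references: the character after "One" must repeat three times
print(text_matches(r'\d\) One(.)\1\1', texts[0]))
# explicit anchors give the same result under the full-match assumption
print(text_matches(r'^\d\) One(.)\1\1$', texts[0]))
# one pattern applied to each item, similar in spirit to `.each`
print(all(text_matches(r'\d\) \w+(.)\1\1', text) for text in texts))
```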

diff --git a/selene/support/conditions/have.py b/selene/support/conditions/have.py
@@ -32,15 +32,19 @@
 no = _not_
 
 
-def exact_text(value) -> Condition[Element]:
+def exact_text(value: str) -> Condition[Element]:
     return match.element_has_exact_text(value)
 
 
 # TODO: consider accepting int
-def text(partial_value) -> Condition[Element]:
+def text(partial_value: str) -> Condition[Element]:
     return match.element_has_text(partial_value)
 
 
+def text_matching(regex_pattern: str) -> Condition[Element]:
+    return match.text_pattern(regex_pattern)
+
+
 # TODO: should we use here js.property style (and below for js.returned(...))
 def js_property(name: str, value: Optional[str] = None):
     if value:
@@ -137,28 +141,127 @@ def size_greater_than_or_equal(number: int) -> Condition[Collection]:
 
 
 # TODO: consider accepting ints
-def texts(*partial_values: Union[str, Iterable[str]]) -> Condition[Collection]:
+def texts(*partial_values: str | Iterable[str]) -> Condition[Collection]:
     return match.collection_has_texts(*partial_values)
 
 
 def exact_texts(*values: str | int | float | Iterable[str]):
     return match.collection_has_exact_texts(*values)
 
 
-def _exact_texts_like(*values: str | int | float | Iterable):
-    return match._exact_texts_like(*values)
+def _exact_texts_like(*texts_or_item_placeholders: str | int | float | Iterable):
+    """List-globbing version of
+    [have.exact_texts(*texts)][selene.support.conditions.have.exact_texts]
+    allowing the use of item placeholders instead of text items.
+
+    Default list globbing placeholders are:
+
+    - `[...]` matches **zero or one** item of any text in the list
+    - `...` matches **exactly one** item of any text in the list
+    - `(...,)` matches one **or more** items of any text in the list
+    - `[(...,)]` matches **zero** or more items of any text in the list
+
+    Placeholders can be overridden in the following manner:
+    `have._exact_texts_like(*texts_or_item_placeholders).where(**placeholders_to_override)`
+
+    Nested lists of text items (for better formatting of expected texts)
+    are not supported, unlike in `have.exact_texts(*items)`,
+    because list literals are used as placeholders for list globbing."""
+    return match._exact_texts_like(*texts_or_item_placeholders)
+
+
+# could be named as texts_matching_like
+# but seems like "matching like" confuses too much...
+# yet, we want to keep _like suffix
+# as identifier of "globbing" nature of the list match
+def _text_patterns_like(
+    *regex_patterns_or_item_placeholders: str | int | float | Iterable,
+):
+    """List-globbing version of
+    [have.texts_matching(*regex_patterns)][selene.support.conditions.have.texts_matching]
+    allowing the use of item placeholders instead of text items.
+
+    Default list globbing placeholders are:
+
+    - `[...]` matches **zero or one** item of any text in the list
+    - `...` matches **exactly one** item of any text in the list
+    - `(...,)` matches one **or more** items of any text in the list
+    - `[(...,)]` matches **zero** or more items of any text in the list
+
+    Placeholders can be overridden in the following manner:
+    `have._texts_like(*text_items_or_placeholders).where(**placeholders_to_override)`
+
+    !!! warning
+
+        Nested lists of text items (for better formatting of expected texts)
+        are not supported,
+        unlike in [`have.texts(*texts)`][selene.support.conditions.have.texts],
+        because list literals are used as placeholders for list globbing.
+
+    !!! warning
+
+        Unlike in [`have.texts_matching(*regex_patterns)`][selene.support.conditions.have.texts_matching],
+        regex patterns for this condition
+        can't use `^` (start of text) and `$` (end of text),
+        because they are implicit as a result of merging for globbing implementation,
+        and if added explicitly will break the match.
+    """
+    return match._text_patterns_like(*regex_patterns_or_item_placeholders)
+
+
+def texts_matching(*regex_patterns: str | int | float | Iterable):
+    """Regex version of [have.texts(*partial_values)][selene.support.conditions.have.texts]
+    allowing the use of regex patterns instead of text items matched by contains.
+    """
+    return match._text_patterns(*regex_patterns)
+
+
+def _texts_like(*contained_texts_or_item_placeholders: str | int | float | Iterable):
+    """List-globbing version of [have.texts(*partial_values)][selene.support.conditions.have.texts]
+    allowing the use of item placeholders instead of text items.
+
+    Default list globbing placeholders are:
+
+    - `[...]` matches **zero or one** item of any text in the list
+    - `...` matches **exactly one** item of any text in the list
+    - `(...,)` matches one **or more** items of any text in the list
+    - `[(...,)]` matches **zero** or more items of any text in the list
+
+    Placeholders can be overridden in the following manner:
+    `have._texts_like(*text_items_or_placeholders).where(**placeholders_to_override)`
+
+    !!! warning
+
+        Nested lists of text items (for better formatting of expected texts)
+        are not supported, unlike in
+        [`have.texts(*texts)`][selene.support.conditions.have.texts],
+        because list literals are used as placeholders for list globbing.
 
+    Text items are matched by contains, but can be matched by regex patterns
+    if modified via the `.with_regex` property, making the actual signature equivalent to
+    `have._texts_like(*regex_patterns_or_item_placeholders).with_regex`.
+    In fact, calling `.with_regex` just forwards the implementation to
+    [have._text_patterns_like(*regex_patterns_or_item_placeholders)][selene.support.conditions.have._text_patterns_like].
 
-def _text_patterns_like(*values: str | int | float | Iterable):
-    return match._text_patterns_like(*values)
+    !!! warning
 
+        Unlike in [`have.texts_matching(*regex_patterns)`][selene.support.conditions.have.texts_matching],
+        regex patterns can't use `^` (start of text) and `$` (end of text),
+        because they are implicit, and if added explicitly will break the match.
 
-def _text_patterns(*values: str | int | float | Iterable):
-    return match._text_patterns(*values)
+    If modified via `.with_wildcards`,
+    matching switches to wildcard-based patterns,
+    making the actual signature equivalent to:
+    `have._texts_like(*texts_with_wildcards_or_item_placeholders).with_wildcards`
+    or
+    `have._texts_like(*texts_with_wildcards_or_item_placeholders).where_wildcards(**to_override)`
 
+    Supported wildcards can be overridden and defaults are:
 
-def _texts_like(*values: str | int | float | Iterable):
-    return match._texts_like(*values)
+    - `*` matches **zero or more** of any characters in a text item
+    - `?` matches **exactly one** of any character in a text item
+    """
+    return match._texts_like(*contained_texts_or_item_placeholders)
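
The wildcard defaults listed in the docstring above (`*` for zero or more characters, `?` for exactly one) can be sketched as a translation to regex. This is an illustration under assumptions, not selene's code: `wildcard_to_regex` and `matches_wildcards` are hypothetical helpers, the keyword names `zero_or_more` and `exactly_one` are invented for the example, and full-text matching is assumed.

```python
import re


def wildcard_to_regex(text: str, zero_or_more: str = '*', exactly_one: str = '?') -> str:
    """Translate a text with wildcards into an equivalent regex pattern."""
    parts = []
    for char in text:
        if char == zero_or_more:
            parts.append('.*')  # zero or more of any characters
        elif char == exactly_one:
            parts.append('.')  # exactly one of any character
        else:
            parts.append(re.escape(char))  # everything else is literal
    return ''.join(parts)


def matches_wildcards(text_with_wildcards: str, actual: str, **to_override) -> bool:
    # ASSUMPTION: the whole actual text must match the translated pattern
    pattern = wildcard_to_regex(text_with_wildcards, **to_override)
    return re.fullmatch(pattern, actual, re.DOTALL) is not None


print(matches_wildcards('?) One*', '1) One!!!'))
# overriding the wildcard characters (hypothetical keyword names)
print(matches_wildcards('_) One%', '1) One!!!', zero_or_more='%', exactly_one='_'))
```

Escaping every non-wildcard character keeps symbols like `)` or `.` literal, so text that looks like regex syntax is still compared as plain text.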
 
 
 def url(exact_value: str) -> Condition[Browser]: