Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to select path segments using Python re regexprs; follow on to PR#146 & #177 #186

Open
wants to merge 17 commits into
base: master
Choose a base branch
from

Conversation

AlainLich
Copy link

Hi @moomoohk ,

Following your comments, this PR substitutes to #177 and includes only required changes to implement and test the feature. I hope that this fits your expectations.

Also, the PR is based on your latest commit abb76a1 (origin/master, origin/HEAD, master, Sat Mar 25 21:07:41 2023 +0300).

Tests through workflows at https://github.com/AlainLich/dpath-python/actions/runs/4651582912

Regards

Alain

 Restructured as requested, strictly required mods
    - flake8 compliant
    - nose2 tested
    - tox tested for Python3.11 and Pypy3.9
    - improved some tests (require verification of result...)
    - documentation README.rst synchronized

  Still TBD:
    - check of .github/workflows

  Test environment:
       - Running on ARM-64
          Linux version 5.15.49-linuxkit (root@buildkitsandbox)
           (gcc (Alpine 10.2.1_pre1) 10.2.1 20201203, GNU ld (GNU Binutils) 2.35.2)
           dpath-maintainers#1 SMP PREEMPT Tue Sep 13 07:51:32 UTC 2022
       - Python 3.11.2
       - pypy3 --version : Python 3.9.16 (7.3.11+dfsg-2, Feb 06 2023, 17:14:22)
                           [PyPy 7.3.11 with GCC 12.2.0]
  - new file:   .github/workflows/python3Test.yml
  - remove this for PR
- f-strings support = for self-documenting expression
  introduced with Python 3.8
  - includes only required changes for feature, as suggested in @moomoohk
    comment on April 5th
  - test improvements
  - tested on python3.8,...,3.11 and pypy3.7 & 3.9
@moomoohk
Copy link
Collaborator

Thanks Alain I will begin looking over your changes. My first pass will mainly aimed at style discrepancies and then I'll dive deeper into functionality.
Have a great weekend.

@moomoohk moomoohk self-assigned this Apr 14, 2023
dpath/__init__.py Outdated Show resolved Hide resolved
dpath/__init__.py Outdated Show resolved Hide resolved
dpath/__init__.py Outdated Show resolved Hide resolved
dpath/__init__.py Outdated Show resolved Hide resolved
dpath/__init__.py Outdated Show resolved Hide resolved
dpath/segments.py Show resolved Hide resolved
dpath/segments.py Outdated Show resolved Hide resolved
dpath/__init__.py Outdated Show resolved Hide resolved
tox.ini Outdated Show resolved Hide resolved
dpath/options.py Outdated Show resolved Hide resolved
@AlainLich
Copy link
Author

Hi @moomoohk,

Just answered your questions, I plan to push to Github later today or tomorrow

Have a good day,
Alain

AlainLich and others added 6 commits April 15, 2023 15:00
  - types: addition of GlobExtend where both a glob and a regex are allowed
  - code : __init__.py: improved _split_path
  - added InvalidRegex exception
  - test improvements (assert expected result) and additions
  - change default behaviour DPATH_ACCEPT_RE_REGEXP_IN_STRING=False, kept documentation current

Flake8 compliant, tested Python11/Pypy9 on debian/ARM64

Also added a workflow for testing in arbitrary branch, expected to be suppressed in PR,
but more testing is in order at this time!!
    - made option DPATH_ACCEPT_RE_REGEXP_IN_STRING configurable from sys. environ
    - separated in test_various_exts tests which require this to be True, other tests
      may always be run
    - added tox-set-rex.ini which sets (whereas the default is unset)
    - python3Test.yml runs both tox configurations, reduced number of Python & Pypy configs
@moomoohk
Copy link
Collaborator

Hi Alain,
I've taken the liberty of polishing off the feature-related code and preparing it for release.
The final outstanding blocker is the tests you've written. I couldn't help but notice they don't conform with the style of the existing tests. Additionally, they seem to be rather messy...
They seem to be quite comprehensive, which is great, but I don't see how I will be able to maintain them in the future.

What do you think we can do about this?

P.S. I've also noticed you've added a new GitHub workflow just for testing this feature. Is it truly necessary? Again, I would like this PR to be minimal and focused. This change entails the extra envar logic you've added to options.py, which I think needs to be considered separately.

Regards

@AlainLich
Copy link
Author

Hi @moomoohk,
I agree that my test files are messy, however I had not given much thought yet about this (yet!).

  1. my latest updates:

    • Once I understood that the feature DPATH_ACCEPT_RE_REGEXP_IN_STRING would likely be disabled by default, I decided that it would be necessary to be able to run the whole suite with the feature either enabled or disabled. I changed my workflow to apply this. This is selectable via parameters in the tox.ini
      • BTW: I anyway need to have my own workflow since yours have non modifiable branch names, and my PR code sits on a specific branch. I do not expect that workflow to be kept on your site.
    • Also, I reorganized my test file to separate tests that require DPATH_ACCEPT_RE_REGEXP_IN_STRING from the rest
    • I thought that using an environment parameter to switch was acceptable, and simpler than doing this via unittest
  2. I could redo the file test_various_exts.py (maybe renaming to test_regexp_ext.py):

    • keeping the capability to test with DPATH_ACCEPT_RE_REGEXP_IN_STRING set or not
    • keeping a 2 level format: 1 class per tested primitive, test methods for individual tests. I found this easy to run only part of the test for debugging
    • separating 2 "data" structures .d1 (easy to read) and .d2(representative (I took a docker inspect output)). I thought .d2 is useful to show fairly realistic/complex regular expressions,...
    • putting the .specs* tables in the method that uses them, and ensuring a strict format of (data, expected)rather than what is going on right now in .specs1 (data, expected1, expected2,,...)
    • factoring the common code for result comparison in some functions or class
  3. Your question about the additional workflow:

    • it is necessary for me to test since as said above your workflows have "wired-in" branch names, that do not correspond
    • I decided to test the ability to test with DPATH_ACCEPT_RE_REGEXP_IN_STRING set or not, in the latter case disabling tests that do not make sense. I think this capability is useful to avoid trouble later.
    • I do not expect this workflow to be kept on the maintainer site, however you may want to keep the feature of enabling both tests.

I hope this answers your questions, I have not yet looked at your 'polishing' modification

Please let me know if this suits you, I could try to make this available by next week end
Regards
Alain

@AlainLich
Copy link
Author

Hi @moomoohk ,

Your changes Sun Apr 16 01:50:18 - 02:05:29 2023 +0300 are all fine and integrated in my development tree.

Regards
Alain

@AlainLich
Copy link
Author

Hi @moomoohk,
I expect to push a revision of the test code (largely redone, I had not realized how bad it had become ) tomorrow (Thursday), too busy today.
Regards
Alain

@moomoohk
Copy link
Collaborator

Thanks @AlainLich, sounds great. I had meant to reply to your previous comment with some notes but my computer crashed...

After giving it some thought I've come to the conclusion that you're correct assuming that the tests should be run twice, with the option toggled. I've also thought about the best way to implement a testable feature flag and agree that an envar is the way to go. I've reduced the code in options.py to be as minimal and unintrusive as possible. You may have noticed, this included shortening and simplifying the envar name to ALLOW_REGEX.

As for setting the feature flag for the tests, I believe adding an extra step to the existing workflow will suffice. This will ensure that non-test code has to be written around the tests themselves. I will push this update shortly for your reference. Regarding what you wrote about the branch name, the existing workflow is configured to run on pushes to pull requests as well. Now that this PR is opened I would say that your workflow can be removed.

Additionally, I believe test_regex.py is an appropriate name for your tests.

Thanks again

@AlainLich
Copy link
Author

OK, give me 5 hours of stability, I will integrate my stuff and push
Have a good day

**NOTE** this is for my testing; if all goes well I will suppress extra workflow and push again
@AlainLich
Copy link
Author

Hi @moomoohk ,

while testing, got hit (again) by

test_segments.py", line 323, in test_view
ExceptionGroup: Hypothesis found 2 distinct failures. (2 sub-exceptions)
IndexError: list index out of range
Falsifying example: test_view(
walkable=([{b'[\x00]': 0}], ((<SymmetricInt 0%1>, b'[\x00]'), 0)),
self=<tests.test_segments.TestSegments testMethod=test_view>,

and (same code location)

KeyError: b'[\x00]'
Falsifying example: test_view(
walkable=({b'[\x00]': 0}, ((b'[\x00]',), 0)),
self=<tests.test_segments.TestSegments testMethod=test_view>,

My understanding is that Hypothesis generated a test case where the string was essentially started by a zero character.... in tests/test_segments.py. Unexpected result ensued. In 1 run out of 4.

Considering that this is a different issue (no re.Pattern in sight), I will proceed to removing the unneeded workflow, and let you review the code.

Full story in attachment
ExtractBug.txt
complete workflow run https://github.com/AlainLich/dpath-python/actions/runs/4743890895.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants