Add ability to select path segments using Python re regexprs; follow on to PR#146 & #177 #186

AlainLich · 2023-04-09T19:15:45Z

Following your comments, this PR substitutes to #177 and includes only required changes to implement and test the feature. I hope that this fits your expectations.

Also, the PR is based on your latest commit abb76a1 (origin/master, origin/HEAD, master, Sat Mar 25 21:07:41 2023 +0300).

Tests through workflows at https://github.com/AlainLich/dpath-python/actions/runs/4651582912

Regards

Alain

Restructured as requested, strictly required mods - flake8 compliant - nose2 tested - tox tested for Python3.11 and Pypy3.9 - improved some tests (require verification of result...) - documentation README.rst synchronized Still TBD: - check of .github/workflows Test environment: - Running on ARM-64 Linux version 5.15.49-linuxkit (root@buildkitsandbox) (gcc (Alpine 10.2.1_pre1) 10.2.1 20201203, GNU ld (GNU Binutils) 2.35.2) dpath-maintainers#1 SMP PREEMPT Tue Sep 13 07:51:32 UTC 2022 - Python 3.11.2 - pypy3 --version : Python 3.9.16 (7.3.11+dfsg-2, Feb 06 2023, 17:14:22) [PyPy 7.3.11 with GCC 12.2.0]

- new file: .github/workflows/python3Test.yml - remove this for PR

- f-strings support = for self-documenting expression introduced with Python 3.8

@moomoohk

- includes only required changes for feature, as suggested in @moomoohk comment on April 5th - test improvements - tested on python3.8,...,3.11 and pypy3.7 & 3.9

moomoohk · 2023-04-14T12:10:53Z

Thanks Alain I will begin looking over your changes. My first pass will mainly aimed at style discrepancies and then I'll dive deeper into functionality.
Have a great weekend.

dpath/__init__.py

dpath/segments.py

dpath/__init__.py

tox.ini

dpath/options.py

AlainLich · 2023-04-15T08:09:31Z

Hi @moomoohk,

Just answered your questions, I plan to push to Github later today or tomorrow

Have a good day,
Alain

- types: addition of GlobExtend where both a glob and a regex are allowed - code : __init__.py: improved _split_path - added InvalidRegex exception - test improvements (assert expected result) and additions - change default behaviour DPATH_ACCEPT_RE_REGEXP_IN_STRING=False, kept documentation current Flake8 compliant, tested Python11/Pypy9 on debian/ARM64 Also added a workflow for testing in arbitrary branch, expected to be suppressed in PR, but more testing is in order at this time!!

- made option DPATH_ACCEPT_RE_REGEXP_IN_STRING configurable from sys. environ - separated in test_various_exts tests which require this to be True, other tests may always be run - added tox-set-rex.ini which sets (whereas the default is unset) - python3Test.yml runs both tox configurations, reduced number of Python & Pypy configs

moomoohk · 2023-04-15T23:02:17Z

Hi Alain,
I've taken the liberty of polishing off the feature-related code and preparing it for release.
The final outstanding blocker is the tests you've written. I couldn't help but notice they don't conform with the style of the existing tests. Additionally, they seem to be rather messy...
They seem to be quite comprehensive, which is great, but I don't see how I will be able to maintain them in the future.

What do you think we can do about this?

P.S. I've also noticed you've added a new GitHub workflow just for testing this feature. Is it truly necessary? Again, I would like this PR to be minimal and focused. This change entails the extra envar logic you've added to options.py, which I think needs to be considered separately.

Regards

AlainLich · 2023-04-16T09:57:51Z

Hi @moomoohk,
I agree that my test files are messy, however I had not given much thought yet about this (yet!).

my latest updates:
- Once I understood that the feature DPATH_ACCEPT_RE_REGEXP_IN_STRING would likely be disabled by default, I decided that it would be necessary to be able to run the whole suite with the feature either enabled or disabled. I changed my workflow to apply this. This is selectable via parameters in the tox.ini
  - BTW: I anyway need to have my own workflow since yours have non modifiable branch names, and my PR code sits on a specific branch. I do not expect that workflow to be kept on your site.
- Also, I reorganized my test file to separate tests that require DPATH_ACCEPT_RE_REGEXP_IN_STRING from the rest
- I thought that using an environment parameter to switch was acceptable, and simpler than doing this via unittest
I could redo the file test_various_exts.py (maybe renaming to test_regexp_ext.py):
- keeping the capability to test with DPATH_ACCEPT_RE_REGEXP_IN_STRING set or not
- keeping a 2 level format: 1 class per tested primitive, test methods for individual tests. I found this easy to run only part of the test for debugging
- separating 2 "data" structures .d1 (easy to read) and .d2(representative (I took a docker inspect output)). I thought .d2 is useful to show fairly realistic/complex regular expressions,...
- putting the .specs* tables in the method that uses them, and ensuring a strict format of (data, expected)rather than what is going on right now in .specs1 (data, expected1, expected2,,...)
- factoring the common code for result comparison in some functions or class
Your question about the additional workflow:
- it is necessary for me to test since as said above your workflows have "wired-in" branch names, that do not correspond
- I decided to test the ability to test with DPATH_ACCEPT_RE_REGEXP_IN_STRING set or not, in the latter case disabling tests that do not make sense. I think this capability is useful to avoid trouble later.
- I do not expect this workflow to be kept on the maintainer site, however you may want to keep the feature of enabling both tests.

I hope this answers your questions, I have not yet looked at your 'polishing' modification

Please let me know if this suits you, I could try to make this available by next week end
Regards
Alain

AlainLich · 2023-04-16T13:03:10Z

Hi @moomoohk ,

Your changes Sun Apr 16 01:50:18 - 02:05:29 2023 +0300 are all fine and integrated in my development tree.

Regards
Alain

AlainLich · 2023-04-19T07:43:49Z

Hi @moomoohk,
I expect to push a revision of the test code (largely redone, I had not realized how bad it had become ) tomorrow (Thursday), too busy today.
Regards
Alain

moomoohk · 2023-04-19T08:20:09Z

Thanks @AlainLich, sounds great. I had meant to reply to your previous comment with some notes but my computer crashed...

After giving it some thought I've come to the conclusion that you're correct assuming that the tests should be run twice, with the option toggled. I've also thought about the best way to implement a testable feature flag and agree that an envar is the way to go. I've reduced the code in options.py to be as minimal and unintrusive as possible. You may have noticed, this included shortening and simplifying the envar name to ALLOW_REGEX.

As for setting the feature flag for the tests, I believe adding an extra step to the existing workflow will suffice. This will ensure that non-test code has to be written around the tests themselves. I will push this update shortly for your reference. Regarding what you wrote about the branch name, the existing workflow is configured to run on pushes to pull requests as well. Now that this PR is opened I would say that your workflow can be removed.

Additionally, I believe test_regex.py is an appropriate name for your tests.

Thanks again

AlainLich · 2023-04-19T10:09:09Z

OK, give me 5 hours of stability, I will integrate my stuff and push
Have a good day

**NOTE** this is for my testing; if all goes well I will suppress extra workflow and push again

AlainLich · 2023-04-19T14:10:49Z

Hi @moomoohk ,

while testing, got hit (again) by

test_segments.py", line 323, in test_view
ExceptionGroup: Hypothesis found 2 distinct failures. (2 sub-exceptions)
IndexError: list index out of range
Falsifying example: test_view(
walkable=([{b'[\x00]': 0}], ((<SymmetricInt 0%1>, b'[\x00]'), 0)),
self=<tests.test_segments.TestSegments testMethod=test_view>,

and (same code location)

KeyError: b'[\x00]'
Falsifying example: test_view(
walkable=({b'[\x00]': 0}, ((b'[\x00]',), 0)),
self=<tests.test_segments.TestSegments testMethod=test_view>,

My understanding is that Hypothesis generated a test case where the string was essentially started by a zero character.... in tests/test_segments.py. Unexpected result ensued. In 1 run out of 4.

Considering that this is a different issue (no re.Pattern in sight), I will proceed to removing the unneeded workflow, and let you review the code.

Full story in attachment
ExtractBug.txt
complete workflow run https://github.com/AlainLich/dpath-python/actions/runs/4743890895.

AlainLich added 4 commits April 8, 2023 23:00

Add a workflow which is launched manually within selectable branch

8a468c1

- new file: .github/workflows/python3Test.yml - remove this for PR

Corrected tests/test_various_exts.py pypy-3.7 incompat.

b5827d3

- f-strings support = for self-documenting expression introduced with Python 3.8

Adds support for re regular expressions in paths

bccb6d0

- includes only required changes for feature, as suggested in @moomoohk comment on April 5th - test improvements - tested on python3.8,...,3.11 and pypy3.7 & 3.9

This was referenced Apr 9, 2023

Add ability to select path segments using Python re regexprs; follow on to PR#146 #177

Closed

Add ability to select path segments with user defined match; follow on to PRs #146 & #177 #182

Open

Corrected documentation

bf42f9b

moomoohk self-assigned this Apr 14, 2023

moomoohk added the enhancement label Apr 14, 2023

moomoohk requested changes Apr 14, 2023

View reviewed changes

AlainLich and others added 6 commits April 15, 2023 15:00

Wording and minor changes

3363da2

Integration of re.Pattern into existing types

3b4eac6

Wording

d6ba501

Isolation of regex handling code

4d0324d

Remove unsupported str functions

659824f

Rename feature flag and simplify envar code

6d54bd9

Run tests with regex feature flag using workflow

4c806d1

Tests for regexp feature redone, files renamed.

1b6143d

**NOTE** this is for my testing; if all goes well I will suppress extra workflow and push again

Removed unneeded workflow files. This adds a redone test_regexp.py.

1cfa48e

Merge branch 'master' into AL-PR-restructRQT

04e4598

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add ability to select path segments using Python re regexprs; follow on to PR#146 & #177 #186

Add ability to select path segments using Python re regexprs; follow on to PR#146 & #177 #186

AlainLich commented Apr 9, 2023

moomoohk commented Apr 14, 2023

AlainLich commented Apr 15, 2023

moomoohk commented Apr 15, 2023

AlainLich commented Apr 16, 2023

AlainLich commented Apr 16, 2023

AlainLich commented Apr 19, 2023

moomoohk commented Apr 19, 2023

AlainLich commented Apr 19, 2023

AlainLich commented Apr 19, 2023

Add ability to select path segments using Python re regexprs; follow on to PR#146 & #177 #186

Are you sure you want to change the base?

Add ability to select path segments using Python re regexprs; follow on to PR#146 & #177 #186

Conversation

AlainLich commented Apr 9, 2023

moomoohk commented Apr 14, 2023

AlainLich commented Apr 15, 2023

moomoohk commented Apr 15, 2023

AlainLich commented Apr 16, 2023

AlainLich commented Apr 16, 2023

AlainLich commented Apr 19, 2023

moomoohk commented Apr 19, 2023

AlainLich commented Apr 19, 2023

AlainLich commented Apr 19, 2023