
Fix json indent #2546

Merged
merged 17 commits into zarr-developers:main
Jan 8, 2025
Conversation

will-moore
Contributor

This fixes the default json_indent added in #1952.
It was getting ignored in the V3JsonEncoder because kwargs already contains "indent": None, so kwargs.pop("indent", config.get("json_indent")) always returns None instead of falling back to the configured value.
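The failure mode can be reproduced with a plain dict (a standalone sketch, not the actual V3JsonEncoder code; the literal default stands in for config.get("json_indent")):

```python
# dict.pop only falls back to its default when the key is *absent*;
# an explicit None value wins, which is the bug described above.
kwargs = {"indent": None}  # json.dumps always passes indent, defaulting to None

configured_indent = 2  # stand-in for config.get("json_indent")
indent = kwargs.pop("indent", configured_indent)
print(indent)  # None: the configured indent is silently discarded

# One possible fix: treat an explicit None the same as a missing key.
kwargs = {"indent": None}
indent = kwargs.pop("indent", None)
if indent is None:
    indent = configured_indent
print(indent)  # 2
```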

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/tutorial.rst
  • Changes documented in docs/release.rst
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

self.indent = 2

# expected has an extra ' ' on each line compared with json.dumps(indent=2)
expected = json.dumps(json.loads(d), cls=TestIndentEncoder).encode()
Contributor

json.dumps takes an indent keyword argument, so I don't think we need the custom encoder class for this test

Contributor Author

Yeah, I tried that first!
But json.dumps(data, indent=2) gives a different output than a class derived from json.JSONEncoder: the encoder's output has an extra space at the end of each line (see my comment on line 315 above).
I don't know why this is (it would be nice to know)!

Contributor

weird, i would also like to know why this is the case. is the extra whitespace more or less correct?

Contributor Author

Just confirmed this by doing a test write of some zarr data. All the zarr.json files have whitespace at the end of rows that end with a comma.

Contributor Author

weird, i would also like to know why this is the case. is the extra whitespace more or less correct?

I would say it's not really correct to have the extra whitespace at the end of lines.
Here, there shouldn't be whitespace after "foo",

>>> json.dumps({"test": "foo", "bar": 2}, cls=TestIndentEncoder)
'{\n  "test": "foo", \n  "bar": 2\n}'
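A plausible explanation, inferred from how CPython's json module initializes separators (an assumption, not something confirmed in this thread): passing indent to json.dumps or to JSONEncoder.__init__ switches the item separator from the default ', ' to ',', but setting self.indent directly as an attribute skips that adjustment, so the trailing space from ', ' survives:

```python
import json

class TrailingSpaceEncoder(json.JSONEncoder):
    # Mirrors the test encoder above: indent is set as an attribute,
    # bypassing the __init__ logic, so item_separator stays ', '.
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.indent = 2

data = {"test": "foo", "bar": 2}

print(repr(json.dumps(data, cls=TrailingSpaceEncoder)))
# '{\n  "test": "foo", \n  "bar": 2\n}'  <- space before the newline

print(repr(json.dumps(data, indent=2)))
# '{\n  "test": "foo",\n  "bar": 2\n}'   <- separator adjusted to ','
```

Passing indent=2 to super().__init__() instead of assigning the attribute would also trigger the separator adjustment and remove the trailing space.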

@will-moore
Contributor Author

There are 4 test failures just now:

FAILED tests/test_array.py::test_nbytes_stored - assert 502 == 366
FAILED tests/test_array.py::test_nbytes_stored_async - assert 502 == 366
FAILED tests/test_array.py::TestInfo::test_info_complete - AssertionError: assert Type
FAILED tests/test_array.py::TestInfo::test_info_complete_async - AssertionError: assert Type

I wouldn't expect those to be caused by this PR? Are they passing elsewhere?

@jhamman
Member

jhamman commented Dec 12, 2024

Looking at the failing tests, I do think these are related to this PR:

    async def test_nbytes_stored_async() -> None:
        arr = await zarr.api.asynchronous.create(shape=(100,), chunks=(10,), dtype="i4")
        result = await arr.nbytes_stored()
>       assert result == 366  # the size of the metadata document. This is a fragile test.
E       assert 502 == 366

So it seems the change to the JSON encoding is impacting the size of the metadata document. I agree with the comment in the test "this is a fragile test". I think we should be open to changing this test or set the json encoding for these specific tests.

@d-v-b
Contributor

d-v-b commented Dec 12, 2024

I think the info test should not be hard-coding the metadata size but rather comparing the reported metadata size to the actual metadata size, that would make the test much less brittle.

@d-v-b
Contributor

d-v-b commented Dec 12, 2024

similarly, test_nbytes_stored should test that the actual size of the metadata document (measured directly) matches what comes out of nbytes_stored(), instead of hard-coding things. I can clean these up in a PR unless someone beats me to it.

@will-moore
Contributor Author

@d-v-b I wasn't sure how to implement your suggestion from #2546 (comment) - Maybe you could give some more pointers?
But for now I've updated the byte counts so that the tests are green (but still fragile as before).

@d-v-b
Contributor

d-v-b commented Dec 17, 2024

@d-v-b I wasn't sure how to implement your suggestion from #2546 (comment) - Maybe you could give some more pointers? But for now I've updated the byte counts so that the tests are green (but still fragile as before).

I was thinking that instead of comparing the reported size of the metadata document to a hard-coded value, we could just get the size of the metadata document directly, and ensure that the reported size matches the observed size. We would get the actual size of the metadata document by invoking store.get("zarr.json") (for v3) and measuring the length of the bytes coming out. A similar process would give you the size of the metadata document for v2. But this can be done in a separate PR.
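The shape of the suggested test might look something like this (a sketch only: the plain dict, write_metadata, and nbytes_stored names are illustrative stand-ins, not the zarr-python API):

```python
import json

# Hypothetical in-memory store mapping keys to stored bytes.
store: dict = {}

def write_metadata(doc: dict, indent) -> None:
    # Stand-in for the library writing zarr.json with a configured indent.
    store["zarr.json"] = json.dumps(doc, indent=indent).encode()

def nbytes_stored() -> int:
    # Stand-in for what the library reports: total size of stored objects.
    return sum(len(v) for v in store.values())

write_metadata({"zarr_format": 3, "shape": [100], "chunks": [10]}, indent=2)

# Compare the reported size to the directly observed size of zarr.json,
# so the test keeps passing if the JSON encoding (e.g. indent) changes.
assert nbytes_stored() == len(store["zarr.json"])
```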

@will-moore
Contributor Author

@d-v-b Anything else I need to do here?

@dstansby dstansby enabled auto-merge (squash) January 7, 2025 09:40
@dstansby
Contributor

dstansby commented Jan 8, 2025

I fixed the doctests in the latest commit - 🤞 this works, and thanks for the PR!

@dstansby dstansby merged commit eb25424 into zarr-developers:main Jan 8, 2025
28 checks passed
@will-moore
Contributor Author

Great, thanks @dstansby
