Fix --input-encoding=<encoding> regression added in PR #143 #162

ziegenberg · 2022-02-06T20:12:41Z

While adding type hints, the option to specify an input encoding got ignored. This commit fixes this regression.

This commit also fixes the tests that call ansi2html as a command. As the pytest documentation states, during test execution stdin is set to a “null” object that fails on any attempt to read from it, because it is rarely desirable to wait for interactive input when running automated tests. So the tests now also patch sys.stdin with an io.TextIOWrapper that wraps the actual input in an io.BytesIO.
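As a rough illustration of the test-side patching described above, here is a minimal sketch. `read_stdin` is a hypothetical stand-in for the command under test, not ansi2html's actual code; it only assumes that the production side reads from `sys.stdin.buffer`.

```python
import io
import sys
from unittest.mock import patch


def read_stdin(encoding: str) -> str:
    # Hypothetical stand-in for the command under test: re-wrap the
    # binary layer of sys.stdin with the requested encoding.
    reader = io.TextIOWrapper(sys.stdin.buffer, encoding, "replace")
    return reader.read()


test_data = "h\u00e9llo"

# Give the patched stdin a real bytes layer so that `.buffer` exists.
fake_stdin = io.TextIOWrapper(io.BytesIO(test_data.encode("utf-8")), "utf-8")

with patch("sys.stdin", fake_stdin):
    result = read_stdin("utf-8")
```

The point of the double wrapping is only that the replacement object exposes `.buffer`, which a bare `io.StringIO` does not.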

ziegenberg · 2022-02-06T20:18:53Z

@ssbarnea This should probably go into 1.7.1

Yeeting out specifying an input encoding was not intended and probably overlooked.

ziegenberg · 2022-02-06T20:22:35Z

Also, the code before 40efaa1 caught an io.UnsupportedOperation, because that exception made the test suite fail. Well, that's fixed now. It took me some time to figure out that pytest sets sys.stdin to a null object during test execution.

try:
    sys.stdin = io.TextIOWrapper(sys.stdin.detach(), opts.input_encoding, "replace")
except io.UnsupportedOperation:
    # This only fails in the test suite...
    pass
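To make the reason for that `except` clause concrete, the following sketch (standard library only, independent of ansi2html) contrasts a stream without a bytes layer with one that has it. The variable names are illustrative.

```python
import io

# A StringIO has no underlying binary buffer, so detach() is unsupported.
text_only = io.StringIO("no bytes layer here")
try:
    text_only.detach()
    detach_failed = False
except io.UnsupportedOperation:
    detach_failed = True

# A TextIOWrapper sits on top of a binary buffer, so detach() hands that
# buffer back and it can be re-wrapped with a different encoding.
with_bytes = io.TextIOWrapper(io.BytesIO(b"caf\xe9"), "utf-8")
raw = with_bytes.detach()
rewrapped = io.TextIOWrapper(raw, "latin-1", "replace")
text = rewrapped.read()
```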

hartwork

Hi @ziegenberg I am underinformed about this PR and don't want to stand in its way, but I have two questions about it that may help the PR, below:

hartwork · 2022-02-06T20:21:01Z

ansi2html/converter.py

@@ -786,6 +787,8 @@ def main() -> None:
        title=opts.output_title,
    )

+    reader = io.TextIOWrapper(sys.stdin.buffer, opts.input_encoding, "replace")


Can you remind me (while I understand this feature got lost by mistake) why we need this wrapper in the first place?

As far as I know, that's the Python 3 way of making sys.stdin read input in an encoding other than utf-8.
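A small sketch of what the wrapper buys us, using only the standard library: the same bytes are decoded once with the matching encoding and once with a mismatched one. `decode_stream` is a hypothetical helper for this illustration, and the `"replace"` error handler matches the one used in the diff.

```python
import io


def decode_stream(data: bytes, encoding: str) -> str:
    # Wrap a binary stream and decode it with the given encoding;
    # undecodable bytes become U+FFFD instead of raising an error.
    return io.TextIOWrapper(io.BytesIO(data), encoding, "replace").read()


payload = "Grüße".encode("latin-1")

as_latin1 = decode_stream(payload, "latin-1")  # matching encoding
as_utf8 = decode_stream(payload, "utf-8")      # mismatched encoding
```

With the matching encoding the text round-trips; with the mismatched one the non-ASCII bytes are replaced rather than causing a crash.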

I see, argument --input-encoding=ENCODING! And there is no .detach() now because you're going straight to the bytes buffer. I think there are two options here: (a) only wrap if there is a bytes layer below (à la the old approach), or (b) require a binary buffer, have the tests make it work, and always wrap.

I must say that the noise in the test suite is significantly lower for (a) with StringIO (and we would not even need changes to the test suite then), so I'd personally be in favor of the old approach:

ansi2html/ansi2html/converter.py

Lines 721 to 725 in 7515540

try:
    sys.stdin = io.TextIOWrapper(sys.stdin.detach(), opts.input_encoding, "replace")
except io.UnsupportedOperation:
    # This only fails in the test suite...
    pass

The comment could say something like "if there is no binary .buffer beneath it, e.g. with StringIO" if you'd like more detail there as well, just an idea.

I switched from sys.stdin.detach() to sys.stdin.buffer, because it originally failed with "TextIO" has no attribute "detach".

The old approach of catching an exception in production code just to make the tests work does smell a bit. And also

try: except Error: pass

is an anti-pattern that should not have been used in the first place.

I think we agree that StringIO makes for more readable tests. I agree with you that except .. pass feels like a hack, but maybe there are less hacky ways to achieve the same. E.g. this code would be super explicit and not have except .. pass workarounds:

if not isinstance(sys.stdin, StringIO):  # e.g. during tests
    sys.stdin = io.TextIOWrapper(sys.stdin.detach(), opts.input_encoding, "replace")

The use of reader further down would need a revert to sys.stdin then. What do you think?
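For what it's worth, the isinstance-based suggestion above could be exercised roughly like this. `rewrap_stdin` is a hypothetical helper that takes the encoding as a parameter instead of reading `opts.input_encoding`, so this is a sketch of the idea rather than the code that would go into converter.py.

```python
import io
import sys
from io import StringIO
from unittest.mock import patch


def rewrap_stdin(input_encoding: str) -> None:
    # Hypothetical helper: only re-wrap when stdin has a bytes layer.
    if not isinstance(sys.stdin, StringIO):  # e.g. during tests
        sys.stdin = io.TextIOWrapper(sys.stdin.detach(), input_encoding, "replace")


# Tests can keep using a plain StringIO; it is left untouched.
with patch("sys.stdin", StringIO("plain text")):
    rewrap_stdin("latin-1")
    from_string_io = sys.stdin.read()

# A stream with a bytes layer gets re-decoded with the requested encoding.
with patch("sys.stdin", io.TextIOWrapper(io.BytesIO(b"caf\xe9"), "utf-8")):
    rewrap_stdin("latin-1")
    from_bytes = sys.stdin.read()
```

`patch` restores the original `sys.stdin` on exit, so the reassignment inside the helper does not leak out of the `with` blocks.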

@ziegenberg any chance you could make the PR be about that^^ addition?

hartwork · 2022-02-06T20:51:39Z

tests/test_ansi2html.py

-        with patch("sys.stdin", new_callable=f):
+        with patch(
+            "sys.stdin",
+            TextIOWrapper(BytesIO(test_data.encode("utf-8"))),


Any reason why…

--- TextIOWrapper(BytesIO(X.encode("utf-8")))
+++ StringIO(X)

…would not work? (x4)

Update: It would not, but there's an easy fix (please see https://github.com/pycontribs/ansi2html/pull/162/files#r800234179 above)

No, that would not work. The tests would then fail in ansi2html/converter.py:790 with an AttributeError:

>     reader = io.TextIOWrapper(sys.stdin.buffer, opts.input_encoding, "replace")
E     AttributeError: '_io.StringIO' object has no attribute 'buffer'
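The difference behind that traceback can be checked in isolation with the standard library alone; the names below are illustrative and not taken from the test suite.

```python
import io

string_stdin = io.StringIO("text")
wrapped_stdin = io.TextIOWrapper(io.BytesIO(b"text"), "utf-8")

# Only the TextIOWrapper exposes the underlying binary stream as `.buffer`.
has_buffer = {
    "StringIO": hasattr(string_stdin, "buffer"),
    "TextIOWrapper": hasattr(wrapped_stdin, "buffer"),
}
```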

Yes. But if we use the old approach that got lost, StringIO works fine again, without buffer.

Using the old method, mypy will complain about it:

ansi2html/converter.py:790: error: "TextIO" has no attribute "detach"

There is no current use of TextIO in the code that I could find, so unless we add it, there is no need to support TextIO.

(I have run mypy 0.931 locally on commit 7515540 with the old approach now and I don't get an error like that (despite use of .detach).)

ssbarnea · 2022-02-16T10:49:38Z

@hartwork Does this still need modifications? If not, please approve it. I will wait for your ack on this.

hartwork

@hartwork Does this still need modifications? If not, please approve it. I will wait for your ack on this.

@ssbarnea yes. While we need to fix the regression, there is a way to do it that keeps the test suite as readable as before. The status quo is not ready to be merged, in my view. I'm hoping @ziegenberg and I can agree on my suggestion right below https://github.com/pycontribs/ansi2html/pull/162/files#r800239213 .

ziegenberg · 2022-02-16T15:26:11Z

I can make the proposed changes.

I'd love to get a proposal for an automated test that checks that specifying an input encoding works as intended, to ensure that such a regression cannot happen again in the future.
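One possible shape for such a check, sketched under loud assumptions: the child script `STAND_IN` below is a hypothetical stand-in that takes the encoding as its first argument and echoes the decoded input as UTF-8. It does not invoke ansi2html itself, and a real test would call the actual command with its own option instead.

```python
import subprocess
import sys

# Hypothetical stand-in for the command under test: decode stdin with the
# encoding given as argv[1] and write the result to stdout as UTF-8.
STAND_IN = (
    "import io, sys\n"
    "reader = io.TextIOWrapper(sys.stdin.buffer, sys.argv[1], 'replace')\n"
    "sys.stdout.buffer.write(reader.read().encode('utf-8'))\n"
)


def run_with_encoding(data: bytes, encoding: str) -> str:
    # Feed raw bytes through a real pipe, as a command-line user would.
    proc = subprocess.run(
        [sys.executable, "-c", STAND_IN, encoding],
        input=data,
        stdout=subprocess.PIPE,
        check=True,
    )
    return proc.stdout.decode("utf-8")


latin1_bytes = "Grüße".encode("latin-1")
honoured = run_with_encoding(latin1_bytes, "latin-1")
ignored = run_with_encoding(latin1_bytes, "utf-8")
```

If the encoding option were silently dropped again, the first call would no longer round-trip the text, which is the kind of failure a regression test should catch.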

hartwork · 2022-02-16T16:35:14Z

I can make the proposed changes.

@ziegenberg thank you!

I'd love to get a proposal for an automated test that checks that specifying an input encoding works as intended, to ensure that such a regression cannot happen again in the future.

That's a good point, I'll need to have a closer look for a proposal.
Personally, I would also vote for more critical PR review. I have a blog post on my view on code review if anyone's interested.

hartwork · 2022-03-08T01:19:44Z

I can make the proposed changes.

@ziegenberg any news?

hartwork · 2022-04-06T09:36:43Z

I can make the proposed changes.

@ziegenberg any news?

ziegenberg requested a review from ssbarnea as a code owner February 6, 2022 20:12

hartwork reviewed Feb 6, 2022

apply suggested changes from code review

1bd8d96

Signed-off-by: Daniel Ziegenberg <daniel@ziegenberg.at>

ziegenberg added the bug This issue/PR relates to a bug. label Feb 6, 2022

ziegenberg requested a review from hartwork February 6, 2022 20:43

hartwork reviewed Feb 6, 2022

hartwork mentioned this pull request Feb 6, 2022

tests: Simplify overly complicated "patch([..], new_callable=lambda: [..])" #163

Closed

hartwork changed the title ~~fix regression added in PR #143~~ Fix --input-encoding=<encoding> regression added in PR #143 Feb 6, 2022

ssbarnea approved these changes Feb 16, 2022

hartwork requested changes Feb 16, 2022

hartwork added this to the 1.7.1 milestone Feb 17, 2022

hartwork mentioned this pull request May 8, 2022

Fix --input-encoding=<encoding> regression added in PR #143 + related tests (alternative to PR #162) #172

Merged

hartwork removed this from the 1.7.1 milestone May 8, 2022

ssbarnea pushed a commit that referenced this pull request May 9, 2022

Fix --input-encoding=<encoding> regression added in PR #143 + related…

f7dd5ed

… tests (alternative to PR #162) (#172)

ssbarnea closed this May 9, 2022
