-
-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix --input-encoding=<encoding> regression added in PR #143 #162
Conversation
While adding type-hinting the option to specify an input encoding got ignored. This commit fixes this regression. This commit also fixes the tests which call ansi2html as a command. As the pytest documentation states, during test execution `stdin` is set to a “null” object which will fail on attempts to read from it because it is rarely desired to wait for interactive input when running automated tests. So we also patch now `sys.stdin` using a `io.TextIOWrapper` and wrapping any actual input in an `io.BytesIO`. Signed-off-by: Daniel Ziegenberg <daniel@ziegenberg.at>
@ssbarnea This should probably go into 1.7.1 Yeeting out specifying an input encoding was not intended and probably overlooked. |
Also, the former code before 40efaa1 was catching an try:
sys.stdin = io.TextIOWrapper(sys.stdin.detach(), opts.input_encoding, "replace")
except io.UnsupportedOperation:
# This only fails in the test suite...
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @ziegenberg I am underinformed about this PR and don't want to stand in its way, but I have two questions about it that may help the PR, below:
@@ -786,6 +787,8 @@ def main() -> None: | |||
title=opts.output_title, | |||
) | |||
|
|||
reader = io.TextIOWrapper(sys.stdin.buffer, opts.input_encoding, "replace") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you remind me (while I understand this feature got lost by mistake) why we need this wrapper in the first place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I know, that's the Python 3 way to making sys.stdin
read anything with a different encoding than utf-8.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, argument --input-encoding=ENCODING
! And there is no .detach()
now because you're going straight to the bytes buffer. I think there are two options here: (a) Only wrap if ther is a bytes layer below (a la old approach) or (b) require binary buffer and have the tests make it work, always wrap.
I must say that the difference in noise in the test suite for (a) with StringIO
is significantly lower (and we would not even nedd changes to the test suite then), so I'd personally be in favor of the old approach:
ansi2html/ansi2html/converter.py
Lines 721 to 725 in 7515540
try: | |
sys.stdin = io.TextIOWrapper(sys.stdin.detach(), opts.input_encoding, "replace") | |
except io.UnsupportedOperation: | |
# This only fails in the test suite... | |
pass |
The commend could say something like "if there is no binary .buffer beneath it, e.g. with StringIO" if you'd like more detail there as well, just an idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I switched from sys.stdin.detach()
to sys.stdin.buffer
, because it originally failed with "TextIO" has no attribute "detach"
.
The old approach with catching an exception in production code just to make the tests work does smell a bit. And also
try:
except Error:
pass
is an anti-pattern that should not have been used in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we agree that StringIO
makes for more readbable tests. I agree with you that except .. pass
feels like a hack, but maybe there is less hacky ways to achieve the same. E.g. this code would be super explicit and not have except .. pass
workarounds:
if not isinstance(sys.stdin, StringIO): # e.g. during tests
sys.stdin = io.TextIOWrapper(sys.stdin.detach(), opts.input_encoding, "replace")
The use of reader
further down would need a revert to sys.stdin
then. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ziegenberg any chance you could make the PR be about that^^ addition?
Signed-off-by: Daniel Ziegenberg <daniel@ziegenberg.at>
with patch("sys.stdin", new_callable=f): | ||
with patch( | ||
"sys.stdin", | ||
TextIOWrapper(BytesIO(test_data.encode("utf-8"))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason why…
--- TextIOWrapper(BytesIO(X.encode("utf-8")))
+++ StringIO(X)
…would not work? (x4)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update: It would not, but there's an easy fix (please see https://github.com/pycontribs/ansi2html/pull/162/files#r800234179 above)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, that would not work. The tests then would fail in ansi2html/converter.py:790
with an AttributeError
> reader = io.TextIOWrapper(sys.stdin.buffer, opts.input_encoding, "replace")
E AttributeError: '_io.StringIO' object has no attribute 'buffer'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. But if we use the old approach that got lost, StringIO works fine again, without buffer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the old method, mypy will complain about it:
ansi2html/converter.py:790: error: "TextIO" has no attribute "detach"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no current use of TextIO
in the code I could fine, so unless we add it, there is no need to support TextIO
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I have run mypy 0.931 locally on commit 7515540 with the old approach now and I don't get an error like that (despite use of .detach
).)
@hartwork Does this still need modifications? If not approve it. I will wait for you ack on this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hartwork Does this still need modifications? If not approve it. I will wait for you ack on this.
@ssbarnea yes, while we need to to fix the regression, there is a way that keeps the test suite as readable. Status quo is not good to be merged in my view. I'm hoping @ziegenberg and I can agree on my suggestion right below https://github.com/pycontribs/ansi2html/pull/162/files#r800239213 .
with patch("sys.stdin", new_callable=f): | ||
with patch( | ||
"sys.stdin", | ||
TextIOWrapper(BytesIO(test_data.encode("utf-8"))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no current use of TextIO
in the code I could fine, so unless we add it, there is no need to support TextIO
.
@@ -786,6 +787,8 @@ def main() -> None: | |||
title=opts.output_title, | |||
) | |||
|
|||
reader = io.TextIOWrapper(sys.stdin.buffer, opts.input_encoding, "replace") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we agree that StringIO
makes for more readbable tests. I agree with you that except .. pass
feels like a hack, but maybe there is less hacky ways to achieve the same. E.g. this code would be super explicit and not have except .. pass
workarounds:
if not isinstance(sys.stdin, StringIO): # e.g. during tests
sys.stdin = io.TextIOWrapper(sys.stdin.detach(), opts.input_encoding, "replace")
The use of reader
further down would need a revert to sys.stdin
then. What do you think?
with patch("sys.stdin", new_callable=f): | ||
with patch( | ||
"sys.stdin", | ||
TextIOWrapper(BytesIO(test_data.encode("utf-8"))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I have run mypy 0.931 locally on commit 7515540 with the old approach now and I don't get an error like that (despite use of .detach
).)
@@ -786,6 +787,8 @@ def main() -> None: | |||
title=opts.output_title, | |||
) | |||
|
|||
reader = io.TextIOWrapper(sys.stdin.buffer, opts.input_encoding, "replace") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ziegenberg any chance you could make the PR be about that^^ addition?
I can make the proposed changes. I'd love to get a proposal for an automated test that checks that specifying input encoding works as intended. To ensure that such regression cannot happen again in the future. |
@ziegenberg thank you!
That's a good point, I'll need to have a closer look for a proposal. |
@ziegenberg any news? |
1 similar comment
@ziegenberg any news? |
While adding type-hinting the option to specify an input encoding got ignored. This commit fixes this regression.
This commit also fixes the tests which call ansi2html as a command. As the pytest documentation states, during test execution
stdin
is set to a “null” object which will fail on attempts to read from it because it is rarely desired to wait for interactive input when running automated tests. So we also patch nowsys.stdin
using anio.TextIOWrapper
and wrapping any actual input in aio.BytesIO
.