Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix --input-encoding=<encoding> regression added in PR #143 #162

Closed
wants to merge 2 commits into from

Conversation

ziegenberg
Copy link
Collaborator

@ziegenberg ziegenberg commented Feb 6, 2022

While adding type-hinting the option to specify an input encoding got ignored. This commit fixes this regression.

This commit also fixes the tests which call ansi2html as a command. As the pytest documentation states, during test execution stdin is set to a “null” object which will fail on attempts to read from it because it is rarely desired to wait for interactive input when running automated tests. So we also patch now sys.stdin using an io.TextIOWrapper and wrapping any actual input in a io.BytesIO.

While adding type-hinting the option to specify an input encoding
got ignored. This commit fixes this regression.

This commit also fixes the tests which call ansi2html as a command.
As the pytest documentation states, during test execution `stdin`
is set to a “null” object which will fail on attempts to read from it
because it is rarely desired to wait for interactive input when
running automated tests. So we also patch now `sys.stdin` using a
`io.TextIOWrapper` and wrapping any actual input in an `io.BytesIO`.

Signed-off-by: Daniel Ziegenberg <daniel@ziegenberg.at>
@ziegenberg ziegenberg requested a review from ssbarnea as a code owner February 6, 2022 20:12
@ziegenberg
Copy link
Collaborator Author

@ssbarnea This should probably go into 1.7.1

Yeeting out specifying an input encoding was not intended and probably overlooked.

@ziegenberg
Copy link
Collaborator Author

ziegenberg commented Feb 6, 2022

Also, the former code before 40efaa1 was catching an io.UnsupportedOperation cause that made the test suite fail. Well, that's fixed now. Took me some time to figure out that pytest sets sys.stdin to a null object during test execution.

try:
    sys.stdin = io.TextIOWrapper(sys.stdin.detach(), opts.input_encoding, "replace")
except io.UnsupportedOperation:
    # This only fails in the test suite...
    pass

Copy link
Collaborator

@hartwork hartwork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ziegenberg I am underinformed about this PR and don't want to stand in its way, but I have two questions about it that may help the PR, below:

@@ -786,6 +787,8 @@ def main() -> None:
title=opts.output_title,
)

reader = io.TextIOWrapper(sys.stdin.buffer, opts.input_encoding, "replace")
Copy link
Collaborator

@hartwork hartwork Feb 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remind me (while I understand this feature got lost by mistake) why we need this wrapper in the first place?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I know, that's the Python 3 way to making sys.stdin read anything with a different encoding than utf-8.

Copy link
Collaborator

@hartwork hartwork Feb 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, argument --input-encoding=ENCODING! And there is no .detach() now because you're going straight to the bytes buffer. I think there are two options here: (a) Only wrap if ther is a bytes layer below (a la old approach) or (b) require binary buffer and have the tests make it work, always wrap.

I must say that the difference in noise in the test suite for (a) with StringIO is significantly lower (and we would not even nedd changes to the test suite then), so I'd personally be in favor of the old approach:

try:
sys.stdin = io.TextIOWrapper(sys.stdin.detach(), opts.input_encoding, "replace")
except io.UnsupportedOperation:
# This only fails in the test suite...
pass

The commend could say something like "if there is no binary .buffer beneath it, e.g. with StringIO" if you'd like more detail there as well, just an idea.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I switched from sys.stdin.detach() to sys.stdin.buffer, because it originally failed with "TextIO" has no attribute "detach".

The old approach with catching an exception in production code just to make the tests work does smell a bit. And also

try:

except Error: 
    pass

is an anti-pattern that should not have been used in the first place.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we agree that StringIO makes for more readbable tests. I agree with you that except .. pass feels like a hack, but maybe there is less hacky ways to achieve the same. E.g. this code would be super explicit and not have except .. pass workarounds:

if not isinstance(sys.stdin, StringIO):  # e.g. during tests
    sys.stdin = io.TextIOWrapper(sys.stdin.detach(), opts.input_encoding, "replace")

The use of reader further down would need a revert to sys.stdin then. What do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ziegenberg any chance you could make the PR be about that^^ addition?

tests/test_ansi2html.py Outdated Show resolved Hide resolved
Signed-off-by: Daniel Ziegenberg <daniel@ziegenberg.at>
@ziegenberg ziegenberg added the bug This issue/PR relates to a bug. label Feb 6, 2022
@ziegenberg ziegenberg requested a review from hartwork February 6, 2022 20:43
with patch("sys.stdin", new_callable=f):
with patch(
"sys.stdin",
TextIOWrapper(BytesIO(test_data.encode("utf-8"))),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason why…

--- TextIOWrapper(BytesIO(X.encode("utf-8")))
+++ StringIO(X)

…would not work? (x4)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: It would not, but there's an easy fix (please see https://github.com/pycontribs/ansi2html/pull/162/files#r800234179 above)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that would not work. The tests then would fail in ansi2html/converter.py:790 with an AttributeError

>       reader = io.TextIOWrapper(sys.stdin.buffer, opts.input_encoding, "replace")
E       AttributeError: '_io.StringIO' object has no attribute 'buffer'

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. But if we use the old approach that got lost, StringIO works fine again, without buffer.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the old method, mypy will complain about it:

ansi2html/converter.py:790: error: "TextIO" has no attribute "detach"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no current use of TextIO in the code I could fine, so unless we add it, there is no need to support TextIO.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I have run mypy 0.931 locally on commit 7515540 with the old approach now and I don't get an error like that (despite use of .detach).)

@hartwork hartwork changed the title fix regression added in PR #143 Fix --input-encoding=<encoding> regression added in PR #143 Feb 6, 2022
@ssbarnea
Copy link
Member

@hartwork Does this still need modifications? If not approve it. I will wait for you ack on this.

Copy link
Collaborator

@hartwork hartwork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hartwork Does this still need modifications? If not approve it. I will wait for you ack on this.

@ssbarnea yes, while we need to to fix the regression, there is a way that keeps the test suite as readable. Status quo is not good to be merged in my view. I'm hoping @ziegenberg and I can agree on my suggestion right below https://github.com/pycontribs/ansi2html/pull/162/files#r800239213 .

with patch("sys.stdin", new_callable=f):
with patch(
"sys.stdin",
TextIOWrapper(BytesIO(test_data.encode("utf-8"))),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no current use of TextIO in the code I could fine, so unless we add it, there is no need to support TextIO.

@@ -786,6 +787,8 @@ def main() -> None:
title=opts.output_title,
)

reader = io.TextIOWrapper(sys.stdin.buffer, opts.input_encoding, "replace")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we agree that StringIO makes for more readbable tests. I agree with you that except .. pass feels like a hack, but maybe there is less hacky ways to achieve the same. E.g. this code would be super explicit and not have except .. pass workarounds:

if not isinstance(sys.stdin, StringIO):  # e.g. during tests
    sys.stdin = io.TextIOWrapper(sys.stdin.detach(), opts.input_encoding, "replace")

The use of reader further down would need a revert to sys.stdin then. What do you think?

with patch("sys.stdin", new_callable=f):
with patch(
"sys.stdin",
TextIOWrapper(BytesIO(test_data.encode("utf-8"))),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I have run mypy 0.931 locally on commit 7515540 with the old approach now and I don't get an error like that (despite use of .detach).)

@@ -786,6 +787,8 @@ def main() -> None:
title=opts.output_title,
)

reader = io.TextIOWrapper(sys.stdin.buffer, opts.input_encoding, "replace")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ziegenberg any chance you could make the PR be about that^^ addition?

@ziegenberg
Copy link
Collaborator Author

I can make the proposed changes.

I'd love to get a proposal for an automated test that checks that specifying input encoding works as intended. To ensure that such regression cannot happen again in the future.

@hartwork
Copy link
Collaborator

I can make the proposed changes.

@ziegenberg thank you!

I'd love to get a proposal for an automated test that checks that specifying input encoding works as intended. To ensure that such regression cannot happen again in the future.

That's a good point, I'll need to have a closer look for a proposal.
Personally, I would also vote for more critical PR review. I have a blog post on my view on code review if anyone's interested.

@hartwork hartwork added this to the 1.7.1 milestone Feb 17, 2022
@hartwork
Copy link
Collaborator

hartwork commented Mar 8, 2022

I can make the proposed changes.

@ziegenberg any news?

1 similar comment
@hartwork
Copy link
Collaborator

hartwork commented Apr 6, 2022

I can make the proposed changes.

@ziegenberg any news?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This issue/PR relates to a bug.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants