"raw" view is not raw, adds extra backslash #5470

Closed
Prinzhorn opened this issue Jul 21, 2022 · 5 comments · Fixed by #5894

Comments

@Prinzhorn
Member

Problem Description

I just noticed this during #5469 (comment)

Steps to reproduce the behavior:

http.txt

HTTP/1.1 200 OK
Content-Type: text/plain
Content-Length: 1

\
  1. ncat -l 1337 < http.txt
  2. mitmproxy
  3. curl --proxy localhost:8080 http://127.0.0.1:1337

raw is \\:

image

hex:

image

System Information

Mitmproxy: 8.1.1 binary
Python:    3.10.5
OpenSSL:   OpenSSL 3.0.3 3 May 2022
Platform:  Linux-5.15.0-41-generic-x86_64-with-glibc2.35
@mhils
Member

mhils commented Jul 21, 2022

Raw mode currently escapes unprintable characters as well as backslashes so that the representation is unambiguous. I do agree, though, that the example you are showing is anything but optimal; maybe we should just switch to rendering unprintable characters as �.
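To make the trade-off concrete, here is a minimal sketch of the two behaviours being compared. The function names `escape_bytes` and `replace_unprintable` are hypothetical and use only the standard library; this is not mitmproxy's actual implementation, just an illustration of why a lone backslash shows up as `\\` under escaping and as `\` under replacement.

```python
def escape_bytes(data: bytes) -> str:
    """Hypothetical escaping view: unambiguous, but a literal backslash is doubled."""
    out = []
    for b in data:
        ch = chr(b)
        if ch == "\\":
            out.append("\\\\")          # backslash must be escaped to stay unambiguous
        elif 0x20 <= b < 0x7F:
            out.append(ch)              # printable ASCII is shown as-is
        else:
            out.append(f"\\x{b:02x}")   # everything else becomes a hex escape
    return "".join(out)


def replace_unprintable(data: bytes) -> str:
    """Hypothetical replacing view: text is shown verbatim, the rest becomes U+FFFD."""
    text = data.decode("utf-8", errors="replace")  # undecodable bytes -> U+FFFD
    return "".join(
        ch if ch.isprintable() or ch in "\r\n\t" else "\ufffd" for ch in text
    )


body = b"\\"  # the one-byte body from the reproduction steps
print(escape_bytes(body))         # two backslashes
print(replace_unprintable(body))  # one backslash
```

The replacing variant is lossy for binary data (several different bytes map to the same �), which is the price paid for rendering text bodies exactly.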

@Prinzhorn
Member Author

maybe we should just switch to rendering unprintable characters as �.

That sounds reasonable to me. Text bodies would then be rendered 100% correctly (which I think is very important), and binary bodies would contain � as one would expect. I don't think many people inspect binary bodies for anything other than the text they contain (we could additionally add a strings(1) view 🤔).
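For reference, a rough sketch of what a strings(1)-like view could do: pull out runs of printable ASCII of a minimum length from a binary body. The name `extract_strings` and the default minimum length of 4 (the usual strings(1) default) are assumptions for illustration only, not an existing mitmproxy content view.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of at least min_len printable ASCII bytes, like strings(1)."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [match.decode("ascii") for match in re.findall(pattern, data)]


blob = b"\x00\x01hello\xffab\x00world!"
print(extract_strings(blob))  # short runs such as "ab" are dropped
```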

@tss008

tss008 commented Oct 11, 2023

It was a breaking change. Removing "strutils.bytes_to_escaped_str(data, True)" may force one to set PYTHONUTF8.

@mhils
Member

mhils commented Oct 12, 2023 via email

Could you clarify please why/under what circumstances you need to set PYTHONUTF8? :)

On Wed, Oct 11, 2023, 17:56 tss008 wrote: It was a breaking change. Removing "strutils.bytes_to_escaped_str(data, True)" may force one to set PYTHONUTF8.

@tss008

tss008 commented Oct 12, 2023

Here you go.

The following tests were performed on two Windows 10 machines using the attached file raw2.dmp and mitmdump 10.1.1:

  1. Test 1, machine 1 (PYTHONUTF8 is not set)
mitmdump.exe -r raw2.dmp --set dumper_default_contentview=raw --set flow_detail=4 -n > stdout.log

stdout.log contains:

192.168.1.34:54835: GET http://11111111111:1111/2.json?sdd
    Host: 111111111111
    Connection: keep-alive
    Upgrade-Insecure-Requests: 1
    User-Agent: Mozilla/5.0 (Linux; Android 11; SM-S908E) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.88 Safari/537.36
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
    Accept-Encoding: gzip, deflate
    Accept-Language: ru-RU,ru;q=0.9,en-US;q=0.8,en;q=0.7

 << 200 OK 142b
    Server: nginx
    Date: Thu, 12 Oct 2023 09:09:57 GMT
    Content-Type: application/json
    Last-Modified: Thu, 12 Oct 2023 07:57:39 GMT
    Transfer-Encoding: chunked
    Connection: keep-alive
    Keep-Alive: timeout=35
    Content-Encoding: gzip

[12:13:16.828] Addon error: 'charmap' codec can't encode characters in position 84-95: character maps to <undefined>
Traceback (most recent call last):
  File "mitmproxy\addons\dumper.py", line 284, in response
  File "mitmproxy\addons\dumper.py", line 263, in echo_flow
  File "mitmproxy\addons\dumper.py", line 135, in _echo_message
  File "mitmproxy\addons\dumper.py", line 96, in echo
  File "encodings\cp1252.py", line 19, in encode
UnicodeEncodeError: 'charmap' codec can't encode characters in position 84-95: character maps to <undefined>
  2. Test 2, machine 2 (PYTHONUTF8 is not set)
    Repeat the same steps as in Test 1 on another machine. This time the content of the HTTP response is "converted" to CP-1251:
    Untitled-bad

  3. Test 3, machine 1 (PYTHONUTF8 = 1) and Test 4, machine 2 (PYTHONUTF8 = 1)
    Repeat Test 1 with PYTHONUTF8 set. Everything appears to be correct on both machines, and the content of the HTTP response looks OK:
    Untitled-good
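The failure in Test 1 can be reproduced without mitmproxy. The sketch below assumes the response body contained characters that cp1252 cannot represent (Cyrillic text and U+FFFD are used as stand-ins; the actual content of raw2.dmp is not shown in this thread). The helper names `can_encode` and `encode_lossy` are hypothetical.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if text can be encoded strictly with the given codec."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


def encode_lossy(text: str, encoding: str) -> bytes:
    """Encode without raising; unencodable characters become '?'."""
    return text.encode(encoding, errors="replace")


# Assumed sample: Cyrillic text plus the replacement character.
sample = "привет \ufffd"

for enc in ("cp1252", "cp1251", "utf-8"):
    print(enc, can_encode(sample, enc))
```

When stdout is redirected to a file on Windows, Python picks the locale code page (cp1252 on machine 1 in the traceback above) unless UTF-8 mode is enabled, for example via PYTHONUTF8=1. With the old escaping behaviour the output was pure ASCII, so the code page never mattered.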
