Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update TXT.php #28629

Closed
wants to merge 4 commits into from
Closed

Update TXT.php #28629

wants to merge 4 commits into from

Conversation

MasahikoSada
Copy link

To support files encoded by Shift-JIS(Japanese) , GB2312 and GBK(Chinese) .

To support encoded by Shift-JIS(Japanese) , GB2312 and GBK(Chinese) files
Copy link

@erhuz erhuz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to indent row 70 and remove an indentation from row 69.

@kesselb
Copy link
Contributor

kesselb commented Aug 28, 2021

Thanks 👍

I would say the issue is not the encoding but the font used for txt preview NotoSans-Regular.ttf

On the other hand imagettftext expects a utf8 encoded string as text anyway so a explicit is probably not wrong. Yet https://www.php.net/manual/en/function.mb-convert-encoding should be easier to use here as the from encoding is detected automatically.

lib/private/Preview/TXT.php Outdated Show resolved Hide resolved
lib/private/Preview/TXT.php Outdated Show resolved Hide resolved
MasahikoSada and others added 3 commits August 29, 2021 11:42
Thanks, I appreciate your suggestion.

Co-authored-by: kesselb <mail@danielkesselberg.de>
Thanks, I appereate your suggestion :)

Co-authored-by: kesselb <mail@danielkesselberg.de>
In my environment, mb_convert_encoding() doesn't work properly without 'auto' in from_encoding.
@kesselb
Copy link
Contributor

kesselb commented Aug 29, 2021

Thanks @MasahikoSada, can you provide a simple text document or a string with some characters to test the change?

@kesselb kesselb added 3. to review Waiting for reviews enhancement labels Aug 29, 2021
@MasahikoSada
Copy link
Author

MasahikoSada commented Aug 29, 2021

Thanks @MasahikoSada, can you provide a simple text document or a string with some characters to test the change?

@kesselb
Thanks for your reply.

This is a sample file endoced by SHIFT-JIS(ANSI).
https://1drv.ms/t/s!AnT4d94afUwMhYJsm8nD3x9bDGas8w?e=LDAdM7

Screen shot is here.
https://1drv.ms/u/s!AnT4d94afUwMhYJtdnIDR17FOPaCCg?e=OqgYbJ

@kesselb
Copy link
Contributor

kesselb commented Aug 29, 2021

Hi,

uploaded the test file to my dev instance to generate the preview images. Is one of the previews the expected outcome?

1. no patch

txt_master

2. version with iconv

txt_iconv

3. version with mb_convert_encoding

txt_mb

@MasahikoSada
Copy link
Author

MasahikoSada commented Aug 29, 2021

Hi,

uploaded the test file to my dev instance to generate the preview images. Is one of the previews the expected outcome?

Hi,
No one is the expected output, but I think that the version 2's encode is proper one.
The only issue of version 2 is just the font.

Can you show me the version with mb_convert_encoding and 'auto' option.
I think it will be as same as version 2.

@kesselb
Copy link
Contributor

kesselb commented Aug 29, 2021

Hey,

$content = ���肪�Ƃ��B ����͓��{�ꕶ���̃e�X�g�t�@�C���ł��B

mb_convert_encoding($content, 'UTF-8', 'auto'); => false

Blank preview image

mb_convert_encoding($content, 'UTF-8', 'SJIS-win, GB2312, GBK, UTF-8, WINDOWS-1252, ISO-8859-15, ISO-8859-1, ASCII') => ありがとう。 これは日本語文字のテストファイルです。

txt_mb_list_of_encodings

@MasahikoSada
Copy link
Author

MasahikoSada commented Aug 30, 2021

Hey,

$content = ���肪�Ƃ��B ����͓��{�ꕶ���̃e�X�g�t�@�C���ł��B

mb_convert_encoding($content, 'UTF-8', 'auto'); => false

Blank preview image

mb_convert_encoding($content, 'UTF-8', 'SJIS-win, GB2312, GBK, UTF-8, WINDOWS-1252, ISO-8859-15, ISO-8859-1, ASCII') => ありがとう。 これは日本語文字のテストファイルです。

Thanks for your comment.
When 'auto' is set, I think it depends on 'mbstring.language' in php.ini .
When I set mbstring.language=japanese, the output is expected one.

@szaimen szaimen added this to the Nextcloud 23 milestone Sep 13, 2021
@skjnldsv skjnldsv mentioned this pull request Oct 13, 2021
@skjnldsv skjnldsv modified the milestones: Nextcloud 23, Nextcloud 24 Oct 18, 2021
@skjnldsv skjnldsv mentioned this pull request Mar 24, 2022
@blizzz blizzz mentioned this pull request Mar 31, 2022
This was referenced Apr 7, 2022
@blizzz blizzz modified the milestones: Nextcloud 24, Nextcloud 25 Apr 21, 2022
This was referenced Aug 12, 2022
This was referenced Aug 24, 2022
This was referenced Sep 6, 2022
@skjnldsv skjnldsv mentioned this pull request Sep 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3. to review Waiting for reviews enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants