Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Notepad and commit message, proper saving in utf-8. #123

Closed
wants to merge 1 commit into from
Closed

Notepad and commit message, proper saving in utf-8. #123

wants to merge 1 commit into from

Conversation

dimztimz
Copy link
Contributor

Fixes the issue git-for-windows/git#818 .

unix2dos.exe -m "$1"
else
unix2dos.exe "$1"
fi

This comment was marked as off-topic.

This comment was marked as off-topic.

@dscho
Copy link
Member

dscho commented Jul 14, 2016

What a nice contribution!

I added some suggestions in a comment; please feel free to force-push after addressing them and leave a note here so I do not miss the updates.

When Notepad is set as commit message editor, this fix adds utf-8 BOM
in the file COMMIT_MESSAGE before Notepad opens the file.
This way Notepad is forced to read and save the file as UTF-8,
allowing commit messages with international characters.

Previosly Notepad opened and saved the file in some ANSI codepage,
while Git was reading the file as UTF-8 creating problems when
international charaters were written.

The fix takes into account rares cases when we manually use other
encoding than UTF-8 for commit messegas via the git config
i18n.commitencoding variable.

Now Notepad is completely usable as commit message editor without using
external tools like Gitpad.

Signed-off-by: Dimitrij Mijoski <dmjpp@hotmail.com>
@dimztimz
Copy link
Contributor Author

@dscho Ok I updated with force push (i reseted the old commit and it went away)

@rimrul notepad /W is useless as that forces UTF-16. UTF-16 is not even allowed in commit messages. dos2unix wont add a second BOM if one is already present. That handed that case fine.

@rimrul
Copy link
Member

rimrul commented Jul 14, 2016

@rimrul notepad /W is useless as that forces UTF-16.

Oh, you're right. Sorry, poorly documented flag. One questions remain, though: Do we need to handle other use cases for this wrapper like interactive rebases for example? Does the commitencoding apply to more than commit messages regarding the editor or should we introduce a guard clause similar to this one?

case "$1" in
*/COMMIT_EDITMSG|*\\COMMIT_EDITMSG)
...
*)
...
esac

@@ -15,7 +15,12 @@ die "Usage: $0 <file>"

if test -f "$1"
then
unix2dos.exe "$1"
commitencoding="$(git config i18n.commitencoding)" ||
commitencoding=""

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

@dimztimz
Copy link
Contributor Author

@rimrul the bug fix can not introduce new bugs. It may only leave the old behavior unresolved in very rare edge cases if the other subsystems that open editor do not respect the variable.

And that is highly unlikely. My guess is that in all cases the same subroutines are used.

@rimrul
Copy link
Member

rimrul commented Jul 14, 2016

It can introduce new bugs and/or pitfalls for future bugs if it uses the Commit encoding for things that expect some other encoding (wether that's configurable or not).

From git help config:

i18n.commitEncoding

Character encoding the commit messages are stored in; Git itself does not care per se, but this information is necessary e.g. when importing commits from emails or in the gitk graphical history browser (and possibly at other places in the future or in other porcelains). See e.g. git-mailinfo(1). Defaults to utf-8.

Using the Commit encoding for anything that is not a commit message is undocumented behaviourbehaviour that conflicts with the documentation.

@dimztimz
Copy link
Contributor Author

Dude you are just splitting hairs now. I never expected this much bureaucracy for a five line bug fix. I have analized enough and I did not find any possibility of new bugs.

If there are new bugs please find it, document it, show example and open an issue.

@rimrul
Copy link
Member

rimrul commented Jul 15, 2016

I'm sorry. I really didn't want to be nitpicking, but I feel like this could really cause issues in the future. I don't have a current bug to show you, but I think it's really not hard to see, that the general scenario I described above is likely to cause issues:

Using the Commit encoding for anything that is not a commit message

Your fallback that I commented on regarding wether it's needed should be working fine either way.

@@ -34,13 +34,13 @@ source=('inputrc'
'99-post-install-cleanup.post'
'astextplain'
'git-sdk.sh')
md5sums=('3fab57079f0322efe256e95fe29d516f'
'e7ad03fffc29e619e402dbd5ec9ef74c'
'bfb591886b2a28af3334521e71198b74'

This comment was marked as off-topic.

@dscho
Copy link
Member

dscho commented Jul 15, 2016

@dimztimz please realize that @rimrul only tried to help you come up with the best solution.

If you keep in mind that Git for Windows has been downloaded over a million times in the past four weeks, you will have to agree that it is a good idea to be cautious.

I can also help in this discussion by contributing my knowledge and now we have combined the best of three Open Source contributors. I find that really exciting (and not annoying at all).

In this particular case, I would like to add that @rimrul's concern about COMMIT_EDITMSG is a very, very valid concern. We introduced a different solution to this problem earlier, though, so all is good: we skip all special handling for files outside of any .git/ directory.

@dscho dscho closed this in b8c6923 Jul 15, 2016
@dscho
Copy link
Member

dscho commented Jul 15, 2016

@dimztimz thank you for your contribution! I felt that you got a bit distressed by the discussion (which is not uncommon in the Git project), so I just made the final changes myself and amended the commit.

@dimztimz
Copy link
Contributor Author

Thanks. I see now what was the hustle all about.

Git can launch the core editor for file in the project. I was initially thinking that was used only for commit message, tag message and similar.

Thanks again to everyone.

dscho added a commit that referenced this pull request Jul 16, 2016
When configuring `notepad` as commit message
editor, [UTF-8 messages are now handled
correctly](#123).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants