-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow different encodings for reading JCAMP files #101
Conversation
nmrglue/fileio/bruker.py
Outdated
@@ -2091,7 +2092,7 @@ def rm_dig_filter( | |||
|
|||
# JCAMP-DX functions | |||
|
|||
def read_jcamp(filename): | |||
def read_jcamp(filename, encodings=[locale.getpreferredencoding()]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be preferable to use a single encoding here. If users want to try multiple encoding they can create a loop over read_jcamp
with a check for a UnicodeDecodeError.
nmrglue/fileio/bruker.py
Outdated
warn("Unable to correctly parse line:" + line) | ||
for N, encoding in enumerate(encodings): | ||
try: | ||
with open(filename, 'r', encoding = encoding) as f: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Python 2 encoding
is not an argument in the buildin open
command. io.open
does support this argument.
@andreasdoll I made two suggestions on how I think this should be implemented. Let me know if you need any help with this. |
Allow to specify an encoding for reading JCAMP files. Keep the current behaviour by choosing the default encoding if no encoding is specified. Use io.open() to read a JCAMP file to guarantee backwards compatibility with Python 2.
Thanks @jjhelmus, I wasn't aware that encodings in open() is not backwards compatible. I've modified that commit according to your wishes and added the previously forgotten parameter docstring. |
LGTM, thanks @andreasdoll |
Allow to provide a list of possible encodings for reading JCAMP files,
and use the first encoding which doesn't raise an UnicodeDecodeError.
Suppress such errors, except when all provided encodings fail.
Keep the current behaviour by choosing the default encoding if no
encoding is specified.
I'm not sure if you want to pull this, but for my purposes this is easier and more futureproof than converting the JCAMP files on operating system level.