Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow different encodings for reading JCAMP files #101

Merged
merged 1 commit into from
Jul 22, 2019
Merged

Allow different encodings for reading JCAMP files #101

merged 1 commit into from
Jul 22, 2019

Conversation

andreasdoll
Copy link
Contributor

Allow to provide a list of possible encodings for reading JCAMP files,
and use the first encoding which doesn't raise an UnicodeDecodeError.
Suppress such errors, except when all provided encodings fail.
Keep the current behaviour by choosing the default encoding if no
encoding is specified.

I'm not sure if you want to pull this, but for my purposes this is easier and more futureproof than converting the JCAMP files on operating system level.

@@ -2091,7 +2092,7 @@ def rm_dig_filter(

# JCAMP-DX functions

def read_jcamp(filename):
def read_jcamp(filename, encodings=[locale.getpreferredencoding()]):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be preferable to use a single encoding here. If users want to try multiple encoding they can create a loop over read_jcamp with a check for a UnicodeDecodeError.

warn("Unable to correctly parse line:" + line)
for N, encoding in enumerate(encodings):
try:
with open(filename, 'r', encoding = encoding) as f:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Python 2 encoding is not an argument in the buildin open command. io.open does support this argument.

@jjhelmus
Copy link
Owner

@andreasdoll I made two suggestions on how I think this should be implemented. Let me know if you need any help with this.

Allow to specify an encoding for reading JCAMP files. Keep the current
behaviour by choosing the default encoding if no encoding is specified.
Use io.open() to read a JCAMP file to guarantee backwards compatibility
with Python 2.
@andreasdoll
Copy link
Contributor Author

Thanks @jjhelmus, I wasn't aware that encodings in open() is not backwards compatible. I've modified that commit according to your wishes and added the previously forgotten parameter docstring.

@jjhelmus
Copy link
Owner

LGTM, thanks @andreasdoll

@jjhelmus jjhelmus merged commit 51a655c into jjhelmus:master Jul 22, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants