Allow parsing files with UTF-8 BOM #5

Closed
jbvsmo opened this issue Nov 6, 2018 · 3 comments

jbvsmo commented Nov 6, 2018

I don't know what the GEDCOM 5.5 format says about this, but for the sake of simplicity, and because many text editors add it by default, this code should detect and ignore a UTF-8 BOM at the start of the file.

It is very hard to work out why loading failed, because the only error message is "Line 1 of document violates GEDCOM format 5.5" and nothing more. Because these bytes are meant to be ignored, editors do not display them, so you can't see the problem on line 1 unless you load the file in Python and print a representation (repr) of that line.
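As a rough sketch of that diagnostic step, the first line can be read in binary mode and passed through `repr()`, which makes the BOM bytes visible. The `first_line_repr` helper and the demo file content are hypothetical, made up for illustration only:

```python
import codecs
import os
import tempfile


def first_line_repr(path):
    """Return repr() of the first raw line so invisible bytes show up."""
    with open(path, "rb") as f:
        return repr(f.readline())


# Hypothetical demo file: a GEDCOM header line preceded by a UTF-8 BOM.
with tempfile.NamedTemporaryFile(delete=False, suffix=".ged") as tmp:
    tmp.write(codecs.BOM_UTF8 + b"0 HEAD\n")
    demo_path = tmp.name

first_line = first_line_repr(demo_path)
print(first_line)  # the leading \xef\xbb\xbf is the BOM
os.remove(demo_path)
```

If the printed line starts with `\xef\xbb\xbf`, the file carries a UTF-8 BOM.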

One option is to use the utf-8-sig codec instead.
https://docs.python.org/3/library/codecs.html#module-encodings.utf_8_sig
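A minimal sketch of what that could look like, assuming the file is otherwise valid UTF-8. This is not the parser's actual code; `read_lines_ignoring_bom` and the sample lines are hypothetical. The `utf-8-sig` codec drops a leading BOM when one is present and otherwise behaves like plain `utf-8`:

```python
import codecs
import os
import tempfile


def read_lines_ignoring_bom(path):
    """Read text lines; utf-8-sig strips a leading BOM if there is one."""
    with open(path, "r", encoding="utf-8-sig") as f:
        return f.readlines()


def write_temp(data):
    """Write raw bytes to a temporary .ged file and return its path."""
    with tempfile.NamedTemporaryFile(delete=False, suffix=".ged") as tmp:
        tmp.write(data)
        return tmp.name


# Hypothetical sample content, once with a BOM and once without.
body = b"0 HEAD\n0 TRLR\n"
path_with_bom = write_temp(codecs.BOM_UTF8 + body)
path_without_bom = write_temp(body)

lines_with_bom = read_lines_ignoring_bom(path_with_bom)
lines_without_bom = read_lines_ignoring_bom(path_without_bom)
print(lines_with_bom == lines_without_bom)

os.remove(path_with_bom)
os.remove(path_without_bom)
```

Both files yield the same lines, so a line-1 format check would see `0 HEAD` in either case.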

@joeyaurel
Owner

Hey @jbvsmo! Thank you for your issue.

The problem was resolved as part of issue #6, and a new version of the parser should be up really soon.

@joeyaurel joeyaurel self-assigned this Nov 19, 2018
@joeyaurel joeyaurel added duplicate This issue or pull request already exists enhancement New feature or request labels Nov 19, 2018
@joeyaurel joeyaurel added this to the v1.0.0 milestone Nov 19, 2018

damonbrodie commented Nov 19, 2018

I think this can be closed now; my previous commit handles the BOM.

Never mind, I see Nick commented on this already.

@joeyaurel
Owner

It sure does :) @jbvsmo, I just published a new version: https://pypi.org/project/python-gedcom/
