Exception on exporting files with accented characters in names #569

Closed
dehall opened this issue Sep 12, 2019 · 2 comments

@dehall
Contributor

dehall commented Sep 12, 2019

I observed the following exception on a CentOS system backed by a NAS (I don't know the full specs of the file system):

java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: Mar?a_Elena653_Duran646_482f31eb-b9f2-4d32-93c3-28e021c29160.json
        at sun.nio.fs.UnixPath.encode(UnixPath.java:147)
        at sun.nio.fs.UnixPath.<init>(UnixPath.java:71)
        at sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:281)
        at sun.nio.fs.AbstractPath.resolve(AbstractPath.java:53)
        at org.mitre.synthea.export.Exporter.exportRecord(Exporter.java:113)
        at org.mitre.synthea.export.Exporter.export(Exporter.java:52)
        at org.mitre.synthea.engine.Generator.generatePerson(Generator.java:406)
        at org.mitre.synthea.engine.Generator.lambda$3(Generator.java:242)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

Many of the "Spanish" names in Synthea use accented characters (à, ò, etc.). I've never seen this exception on Mac or Windows, but apparently some file systems do not support them. We should consider sanitizing the file names to pure ASCII to be safe.
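
For illustration, here is a minimal sketch of what "sanitizing to pure ASCII" could look like. It is not Synthea's code: the class `FileNameSanitizer` and the method `toAscii` are hypothetical names, and replacing leftover characters with `_` is an assumption. Note that the trace fails inside `UnixPath.encode`, i.e. while the JVM converts the path string to bytes, so an ASCII-only name should sidestep the problem whatever the underlying cause is.

```java
import java.text.Normalizer;

// Hypothetical helper, not part of Synthea.
final class FileNameSanitizer {
    private FileNameSanitizer() {}

    static String toAscii(String name) {
        // NFD splits an accented letter into its base letter plus combining marks.
        String decomposed = Normalizer.normalize(name, Normalizer.Form.NFD);
        // Drop the combining marks (Unicode category M), keeping the base letters.
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        // Assumption: anything still outside ASCII becomes an underscore.
        return stripped.replaceAll("[^\\x00-\\x7F]", "_");
    }

    public static void main(String[] args) {
        // \u00ed is "í"; written as an escape so the source file itself stays ASCII.
        System.out.println(toAscii("Mar\u00eda_Elena653_Duran646.json"));
        // prints: Maria_Elena653_Duran646.json
    }
}
```

One trade-off: two distinct names can collapse to the same ASCII string, so this only stays collision-free because the exported file name also carries the UUID.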

@shabiel
Contributor

shabiel commented Sep 12, 2019 via email

@jawalonoski
Member

I am in agreement with @shabiel on this one.

If it is really an issue for certain systems, you can use UUID filenames by changing synthea.properties:

exporter.use_uuid_filenames = true
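
As a rough sketch of the effect of that setting (not Synthea's actual implementation): the class `ExportFileNames` and its methods are hypothetical, and the two file name patterns are assumptions inferred from the name in the stack trace above. Only the property key comes from this thread.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical illustration, not part of Synthea.
final class ExportFileNames {
    private ExportFileNames() {}

    // Parse properties text in the same key = value format as synthea.properties.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Assumed patterns: "<uuid>.json" when the flag is set, "<name>_<uuid>.json" otherwise.
    static String fileName(Properties config, String personName, String uuid) {
        String flag = config.getProperty("exporter.use_uuid_filenames", "false").trim();
        return Boolean.parseBoolean(flag)
            ? uuid + ".json"
            : personName + "_" + uuid + ".json";
    }

    public static void main(String[] args) {
        Properties config = parse("exporter.use_uuid_filenames = true\n");
        System.out.println(fileName(config, "Mar\u00eda_Elena653", "482f31eb"));
        // prints: 482f31eb.json
    }
}
```

With the flag enabled, the person's name never reaches the path, so accented characters in names cannot trigger the exception.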
