-
Notifications
You must be signed in to change notification settings - Fork 646
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify support other file formats, e.g. Xslx #466
Comments
It sounds to me like you're looking at an old version of the code because we recently made some of these changes to support adding the JSON and fixed-width file readers. Can you sync your repo against the latest code from master? |
Yes, you're right! It seems like the kind of changes I suggested are in the newest version. My plan is to create a new maven module with the XslxReader and XslxReadOptions in a tech.tablesaw.io.xslx package, so it needn't be part of the core. Then integrate it into the build and create a pull request. Sounds OK? Should I create a separate issue first? |
No need to create a separate issue I'm not sure the best API for optional dependencies would be or how to best include them. We should probably follow the same pattern for the JSON, HTML, and CSV dependencies as well. For including, we have the For API, most stuff is done today via |
I think I'd probably lean towards including the POI dependency as an optional dependency like the Smile dependency and creating the reader in the main module alongside the json, csv, html readers |
I agree
…On Thu, Feb 21, 2019 at 11:02 PM Ben McCann ***@***.***> wrote:
I think I'd probably lean towards including the POI dependency as an
optional dependency like the Smile dependency and creating the reader in
the main module alongside the json, csv, html readers
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#466 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ADRXggNE6_JEAujfm4gCy_KoSDiebTTKks5vP2vsgaJpZM4bE3fp>
.
|
* Initial xlsx support, as discussed in issue #466 * Addresses some Codacy issues. * Removed imports and improved assert message. * Moved test file(s) to data folder, to see if that fixes travis build. * Add missing value, when cell value isn't appended * Improved test messages. * Improved assert messages, added test for file with missing values. Fixed problem with mapInto for missing value, shouldn't call function. * Removed comments and replaced tabs with spaces. * Replaced tabs with spaces (fixed Eclipse settings), improved code according to comments. * Removed final (and fixed my setup to not add them), and made other changes according to comments. * More improvements. * Replaced use of int[] with private static class TableRange, removed use of SHORT column type, replaced check prefix with assert for test helper methods.
I'm looking into supporting Excel files (xslx) by using the Apache POI project. I would like to reuse some of the logic from the corresponding csv code, like CsvReadOptions and its Builder class, but that's pretty difficult. First, the CsvReadOptions has a private constructor (called by its Builder), so you cannot subclass it. Second, some options don't make sense for a format that's not line-oriented, e.g. separator and lineEnding. In addition, the customParser method in ColumnType takes an CsvReadOptions instance as argument, so it needs to be subclassed.
I propose making an AbstractReadOptions class that includes the things that other formats are most likely to need, and that is easy to subclass. And update the customParser method in ColumnType to use that class as argument instead.
The text was updated successfully, but these errors were encountered: