-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Announcement: Create release 5.0 which is compatible with 3.20.0 but incompatible with 4.0 #471
Comments
With sadly no documentation on what went wrong and having this strict decision that 4.x will be dropped, I don't really feel comfortable advocating/using Kartothek anymore. Can you give any insights what are design decisions with 4.x that break the usage of it? This transparency would be useful to understand so that the same issues don't occur with 5.x again. |
Thanks for reaching out. I'll compile list of 4.0 changes and our intended way of treating them later today. |
The only reported issue I am aware of is #452 . I haven't investigated it thoroughly but from my first impression this looks like a problem which is easy enough to be fixed and doesn't warrant this dramatic change. I would appreciate more openness about these things if kartothek shall remain a shared open source library since I also do not feel comfortable with advocating anymore if this is the way the library is being developed. |
Let me get a bit more into detail about the reasoning. First it's correct that the only public reported issue is #452 and it's predecessors. That one was specially bad because the issue surface not when kartothek writes out the data but when it reads it the next time. Because of some incidents which were triggered that that we rolled back to kartothek 3.20 immediately and stopped the rollout of Kartothek 4.0. But this is only a bug. Bugs can be fixed. Concrete ChangesI want to be as open and transparent about the reasoning as possible. Quoting from an internal high level analysis (slightly changed by myself for public internet):
We then went on to examine each and every change between 3.2.0 and 4.0
We are in favor of dropping:
Everything else is to be decided, yet. For the Kartothek 4.0 dropping reasonsAs shown above, there is a proper migration path needed.
Thanks for the feedback @xhochy and @fjetter - your input is always valued and appriciated. I know this move is drastic and I would be more then happy if we would not be in this situation. But we are and need to move forward. |
tbqh I think this is terrible idea. If the migration from 3.x to 4.x is an issue, I suggest time be invested in making scripts that take care of updating the old datasets to 4.x as well as the code. |
Making a script is some effort but doable. The problem is tracking the (potential unknown) consumers who need to run it. |
tbh, I am still struggling to understand. What I do understand is
imho, these two points do not warrant any rollback but there are sane and much cheaper ways to move forward. For all other listed issue, I am struggling to see the relevance since they are dealing with rather fringe use cases. The entire intention of the 4.0 release was to get rid of legacy code in an affordable way. The only intended change in behaviour (other than dropping entire, unused features, of course) was the imho, rolling back and implementing a proper deprecation lifecycle for everything is not worth it since the work involved for the API migrations should be much cheaper than the rollback + deprecation path. On top of this, the intended changes for the 4.0 release have been in a lengthy BY-internal review process before implementation even started. Rolling back because months after there are other opinions is not the way I want to develop a library. That all said, the phrasing of this issue and its title already indicate that you are willing to pull super-admin privileges to get your way and this doesn't feel like an open discussion. I'm not willing to put more time into this issue if the outcome is predetermined, anyhow. |
Revert "Bump codecov/codecov-action from v1.4.1 to v1.5.0 (JDASoftwareGroup#466)" This reverts commit fdc9779. Revert "fix mistakes in documentation" This reverts commit 4e4b5e0. Revert "Bump pre-commit/action from v2.0.0 to v2.0.3 (JDASoftwareGroup#460)" This reverts commit d027ca2. Revert "Bump codecov/codecov-action from v1.4.0 to v1.4.1 (JDASoftwareGroup#461)" This reverts commit 97cd553. Revert "Bump codecov/codecov-action from v1.3.1 to v1.4.0 (JDASoftwareGroup#458)" This reverts commit e48d67a. Revert "Fix bug when loading few columns of a dataset with many primary indices (JDASoftwareGroup#446)" This reverts commit 90ee486. Revert "Prepare release 4.0.1" This reverts commit b278503. Revert "Fix tests for dask dataframe and delayed backends" This reverts commit 5520f74. Revert "Add end-to-end regression test" This reverts commit 8a3e6ae. Revert "Fix dataset corruption after updates (JDASoftwareGroup#445)" This reverts commit a26e840. Revert "Set release date for 4.0" This reverts commit 08a8094. Revert "Return dask scalar for store and update from ddf (JDASoftwareGroup#437)" This reverts commit 494732d. Revert "Add tests for non-default table (JDASoftwareGroup#440)" This reverts commit 3807a02. Revert "Bump codecov/codecov-action from v1.2.2 to v1.3.1 (JDASoftwareGroup#441)" This reverts commit f7615ec. Revert "Set default for dates_as_object to True (JDASoftwareGroup#436)" This reverts commit 75ffdb5. Revert "Remove inferred indices (JDASoftwareGroup#438)" This reverts commit b1e2535. Revert "fix typo: 'KTK_CUBE_UUID_SEPERATOR' -> 'KTK_CUBE_UUID_SEPARATOR' (JDASoftwareGroup#422)" This reverts commit b349cee. Revert "Remove all deprecated arguments (JDASoftwareGroup#434)" This reverts commit 74f0790. Revert "Remove multi table feature (JDASoftwareGroup#431)" This reverts commit 032856a.
I can very much second @fjetter's statement here. I'm happy to work with you on fixing the issues that came with the 4.0 release. Apart from the To make this more productive and less prone to misunderstandings, I'm happy to also have a virtual coffee chat on a 1:1 basis to clear things up. This is probably less tense than the plain text communication here. |
@xhochy I sent you a mail with proposals for a virtual coffee. Thanks for taking the time. |
Just want to give a small update on this. The next steps are (aligned with @xhochy):
Let me know your thoughts. |
Revert "Bump codecov/codecov-action from v1.4.1 to v1.5.0 (JDASoftwareGroup#466)" This reverts commit fdc9779. Revert "fix mistakes in documentation" This reverts commit 4e4b5e0. Revert "Bump pre-commit/action from v2.0.0 to v2.0.3 (JDASoftwareGroup#460)" This reverts commit d027ca2. Revert "Bump codecov/codecov-action from v1.4.0 to v1.4.1 (JDASoftwareGroup#461)" This reverts commit 97cd553. Revert "Bump codecov/codecov-action from v1.3.1 to v1.4.0 (JDASoftwareGroup#458)" This reverts commit e48d67a. Revert "Fix bug when loading few columns of a dataset with many primary indices (JDASoftwareGroup#446)" This reverts commit 90ee486. Revert "Prepare release 4.0.1" This reverts commit b278503. Revert "Fix tests for dask dataframe and delayed backends" This reverts commit 5520f74. Revert "Add end-to-end regression test" This reverts commit 8a3e6ae. Revert "Fix dataset corruption after updates (JDASoftwareGroup#445)" This reverts commit a26e840. Revert "Set release date for 4.0" This reverts commit 08a8094. Revert "Return dask scalar for store and update from ddf (JDASoftwareGroup#437)" This reverts commit 494732d. Revert "Add tests for non-default table (JDASoftwareGroup#440)" This reverts commit 3807a02. Revert "Bump codecov/codecov-action from v1.2.2 to v1.3.1 (JDASoftwareGroup#441)" This reverts commit f7615ec. Revert "Set default for dates_as_object to True (JDASoftwareGroup#436)" This reverts commit 75ffdb5. Revert "Remove inferred indices (JDASoftwareGroup#438)" This reverts commit b1e2535. Revert "fix typo: 'KTK_CUBE_UUID_SEPERATOR' -> 'KTK_CUBE_UUID_SEPARATOR' (JDASoftwareGroup#422)" This reverts commit b349cee. Revert "Remove all deprecated arguments (JDASoftwareGroup#434)" This reverts commit 74f0790. Revert "Remove multi table feature (JDASoftwareGroup#431)" This reverts commit 032856a.
Revert "Bump codecov/codecov-action from v1.4.1 to v1.5.0 (#466)" This reverts commit fdc9779. Revert "fix mistakes in documentation" This reverts commit 4e4b5e0. Revert "Bump pre-commit/action from v2.0.0 to v2.0.3 (#460)" This reverts commit d027ca2. Revert "Bump codecov/codecov-action from v1.4.0 to v1.4.1 (#461)" This reverts commit 97cd553. Revert "Bump codecov/codecov-action from v1.3.1 to v1.4.0 (#458)" This reverts commit e48d67a. Revert "Fix bug when loading few columns of a dataset with many primary indices (#446)" This reverts commit 90ee486. Revert "Prepare release 4.0.1" This reverts commit b278503. Revert "Fix tests for dask dataframe and delayed backends" This reverts commit 5520f74. Revert "Add end-to-end regression test" This reverts commit 8a3e6ae. Revert "Fix dataset corruption after updates (#445)" This reverts commit a26e840. Revert "Set release date for 4.0" This reverts commit 08a8094. Revert "Return dask scalar for store and update from ddf (#437)" This reverts commit 494732d. Revert "Add tests for non-default table (#440)" This reverts commit 3807a02. Revert "Bump codecov/codecov-action from v1.2.2 to v1.3.1 (#441)" This reverts commit f7615ec. Revert "Set default for dates_as_object to True (#436)" This reverts commit 75ffdb5. Revert "Remove inferred indices (#438)" This reverts commit b1e2535. Revert "fix typo: 'KTK_CUBE_UUID_SEPERATOR' -> 'KTK_CUBE_UUID_SEPARATOR' (#422)" This reverts commit b349cee. Revert "Remove all deprecated arguments (#434)" This reverts commit 74f0790. Revert "Remove multi table feature (#431)" This reverts commit 032856a.
We merged @ilia-zaitcev-by's PR. Next steps are releasing a 5.0.0rc1, extensive testing and then preparing the 5.0. |
Thanks @steffen-schroeder-by for the communication and constructive discussion! |
Background
Kartothek 4.0 is a great release with many extremely valuable simplifications. But during internal migrations from 3.20.0 to 4.0, it turned out that for many of those well thought changes an easy migration could not be taken with reasonable effort.
Also some of the simplifications caused customer facing incidents. Thus we are currently stuck with 3.20.0.
To give this important library the deserved attention and speed we are working with a bunch of colleagues on getting out a stable version of kartothek again, where we can simply migrate to and accept community feedback and pull requetst.
To move on, we need to bring a current stable version (3.20.0) to a version which has the simplification of 4.0.
We could have continued with development of 3.x branch and slowly converging against 4.0.
We have decided against this for the following reasons:
Planned next steps
To overcome this we decided on the following way forward.
Dict[str,List[T]]
toList[T]
) we might go a step in between. Communication on that will follow.We know that especially the incompatibility between 4.0 and 5.0 will be an issue for some consumers. We that's why encourage you to use the very stable kartothek version 3.20.0 and not version 4.x
The text was updated successfully, but these errors were encountered: