Inner join handle common columns #73

Merged
merged 6 commits into from
Oct 13, 2024
21 changes: 21 additions & 0 deletions changelog.org
@@ -1,3 +1,24 @@
* v0.4.7
- *Potentially BREAKING behavior*: ~innerJoin~ now takes an additional
parameter ~commonColumns~, which controls how columns that are common
to both dataframes are handled.
Previously, we wrongly assumed that the data in all common columns
must match exactly. In that case it would not matter which input the
column is taken from.
However, if they did *not* match, the data for the common columns
(including the one we joined by) was corrupted: starting at the first
mismatch in a common column, it was left at default initialization.
There are now 3 different ways to handle common columns:
- ~ccLeft~: Keep the data of the left input
- ~ccRename~: Rename the common columns to ~*_left~ and ~*_right~
(default)
- ~ccDrop~: Drop common columns that are not the joined one.
We chose ~ccRename~ as the default because it keeps all
information. It is a breaking change, however, because the common
columns now have different names. But imo it's better to make
people aware that this change happened than to give them wrong
data.
Feel free to contradict me on this.
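As a rough illustration of the three modes described in this entry, the sketch below joins two small dataframes that share the key column ~id~ and a second common column ~x~ with differing values. The data and variable names are made up for the example. It assumes that ~innerJoin~ accepts ~commonColumns~ as a named argument alongside ~by~, and that renamed columns follow the ~*_left~ / ~*_right~ pattern stated above; check the actual signature in the release before relying on it.

```nim
import datamancer

# Hypothetical example data: both frames contain "id" (the join key) and "x".
# The values of "x" differ between the inputs, which is the case that
# previously produced corrupted output.
let left = toDf({ "id": @[1, 2, 3], "x": @[10, 20, 30] })
let right = toDf({ "id": @[1, 2, 3], "x": @[11, 22, 33], "y": @[0.1, 0.2, 0.3] })

# Default (ccRename): per the changelog, "x" should come back as two
# columns, "x_left" and "x_right", so no information is lost.
let renamed = innerJoin(left, right, by = "id")
echo renamed

# ccLeft: keep only the left input's values for the common column "x".
let keepLeft = innerJoin(left, right, by = "id", commonColumns = ccLeft)
echo keepLeft

# ccDrop: drop common columns other than the join key, so "x" should be
# removed and only "id" and "y" remain.
let dropped = innerJoin(left, right, by = "id", commonColumns = ccDrop)
echo dropped
```

Code that accessed a common column by its original name after a join would need to either switch to the ~*_left~ / ~*_right~ names or pass ~ccLeft~ explicitly to keep a single column under the old name.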
* v0.4.6
- prepare Datamancer for ~-d:nimPreviewHashFarm~
* v0.4.5