-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compare dataframe classes #111
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Code Coverage Summary
Results for commit: c30e97f Minimum allowed coverage is ♻️ This comment has been updated with latest results |
Unit Tests Summary 1 files 11 suites 5s ⏱️ Results for commit c30e97f. ♻️ This comment has been updated with latest results. |
kieranjmartin
requested changes
Jul 19, 2024
kieranjmartin
approved these changes
Jul 25, 2024
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Closes #42
This one was a little more involved so a couple of points to bear in mind below:
I currently create a "summary" dataframe which contains the number of rows and columns as well as the class of each dataset. I wanted this summary to appear at the top of the printed output regardless of wether users wanted to check the class as its a nice high level summary. This thus requires some special case handling in the main diffdf function to adjust the corresponding issue. e.g. if the check is disabled and there aren't any other issues then remove the summary issue as it should be regarded as compared the same. Likewise the warning message needs to be different to the actual display string. (happy to talk about this more if the special case handling doesn't make sense
In general this highlights to me that our current issues setup is too rigid and needs to be overhauled to handle more dynamic content (e.g. display content that isn't strictly to be regarded as an issue). I didn't want to do this overhaul here as that would be too much for a single PR so have just added special case handling logic instead.
The printed output for this looked clunky so I played around with the newlines and spacings to try and make it look as good as I think I can get it (thus some of the print snapshots have changed slightly)
A slightly bigger change to the printed output is I took this moment to remove the "all rows are show below" as this is an utter waste of space
I also removed the "a summary is shown below" as it no longer makes sense when combined with the summary table
The unit tests for
row_print
where too fragile and broke despite only white space changes so I reworked them to be a little bit more robust by making them relative to each other rather than testing for specific row lengths (e.g. show that changing row_limit from 5 to 10 increases the number of lines by 5)