-
-
Notifications
You must be signed in to change notification settings - Fork 255
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Maintain original feature and target names after shuffling #377
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #377 +/- ##
==========================================
+ Coverage 35.42% 35.54% +0.11%
==========================================
Files 96 96
Lines 6379 6381 +2
==========================================
+ Hits 2260 2268 +8
+ Misses 4119 4113 -6 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix.
src/dataset/impl_dataset.rs
Outdated
@@ -577,6 +577,20 @@ where | |||
/// ### Returns | |||
/// | |||
/// A new shuffled version of the current Dataset | |||
/// # Example | |||
/// ``` | |||
/// use rand::thread_rng; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In an example, it would be better to have assertions with explicit values rather than println!()
which requires the user to run the example.
As the purpose of the function is rather straightforward and as checking names in an example of shuffle()
seems a bit contrived, here I would simply go with a test in mod.rs
instead of an example. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, it does seem contrived to check for names in shuffle, but since this is the purpose of this PR, do you think I should still add an assertion for feature and target names in the test in mod.rs
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, my point was to remove the example and add a test of shuffle()
in mod.rs
testing also you've fixed the bug, something like:
#[test]
fn test_dataset_shuffle() {
let mut rng = SmallRng::seed_from_u64(42);
let f_names = vec!["1", "2", "3"];
let t_names = vec!["one"];
let dataset = Dataset::new(
array![[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]],
array![0., 1., 3.],
)
.with_feature_names(f_names.clone())
.with_target_names(t_names.clone());
let shuffled = dataset.shuffle(&mut rng);
assert_abs_diff_ne!(dataset.records(), shuffled.records());
assert_abs_diff_ne!(dataset.targets(), shuffled.targets());
assert_eq!(f_names, shuffled.feature_names());
assert_eq!(t_names, shuffled.target_names());
}
Thanks! |
Fixing Issue #375