Add data split mode to DMatrix MetaInfo #8568

Merged: 27 commits, Dec 25, 2022
Changes from 1 commit
Commits
40252ee
Add data split mode to DMatrix MetaInfo
rongou Dec 7, 2022
7c35c40
Merge remote-tracking branch 'upstream/master' into data-split-param
rongou Dec 8, 2022
26ed1a9
remove dsplit training param
rongou Dec 8, 2022
d3fda24
fix dmatrix validation
rongou Dec 8, 2022
8e797f7
fix python
rongou Dec 8, 2022
e12f361
Merge remote-tracking branch 'upstream/master' into data-split-param
rongou Dec 12, 2022
8f7ac3e
fix dsplit for local mode
rongou Dec 12, 2022
fa7a670
fix java build
rongou Dec 12, 2022
afc5fa0
fix R package
rongou Dec 12, 2022
31b7112
fix demo
rongou Dec 12, 2022
32d7fcc
fix line too long
rongou Dec 12, 2022
c857cd9
fix r doc
rongou Dec 12, 2022
aa0c26c
update roxygen
rongou Dec 12, 2022
cbd1a42
Merge remote-tracking branch 'upstream/master' into data-split-param
rongou Dec 13, 2022
d7830cb
Merge remote-tracking branch 'upstream/master' into data-split-param
rongou Dec 15, 2022
c9ee1d6
Merge remote-tracking branch 'upstream/master' into data-split-param
rongou Dec 15, 2022
6782dd9
add XGDMatrixCreateFromFileV2
rongou Dec 15, 2022
86226e0
add a test for v2
rongou Dec 16, 2022
914df2a
Merge remote-tracking branch 'upstream/master' into data-split-param
rongou Dec 16, 2022
bde1e4c
add need_split to json config
rongou Dec 16, 2022
55f8aa4
Merge remote-tracking branch 'upstream/master' into data-split-param
rongou Dec 19, 2022
9002705
Merge remote-tracking branch 'upstream/master' into data-split-param
rongou Dec 20, 2022
c80a3ae
change to uri
rongou Dec 20, 2022
58ae574
remove need_split as a parameter
rongou Dec 20, 2022
f6148a3
fix python
rongou Dec 20, 2022
da7d545
fix dask test
rongou Dec 20, 2022
417dc18
Merge remote-tracking branch 'upstream/master' into data-split-param
rongou Dec 21, 2022
2 changes: 2 additions & 0 deletions include/xgboost/data.h
@@ -60,6 +60,8 @@ class MetaInfo {
   uint64_t num_nonzero_{0};  // NOLINT
   /*! \brief label of each instance */
   linalg::Tensor<float, 2> labels;
+  /*! \brief data split mode */
+  DataSplitMode data_split_mode{DataSplitMode::kNone};
   /*!
    * \brief the index of begin and end of a group
    *  needed when the learning task is ranking.
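
For readers without the full header at hand, here is a minimal standalone sketch of the pattern this hunk introduces: a metadata struct that carries a split-mode field which defaults to "not split". The `DataSplitMode` and `MetaInfo` below are simplified stand-ins, not the XGBoost definitions. Only `kNone` and `kCol` appear in this diff; the `kRow` enumerator and the `IsColumnSplit` helper are assumptions added purely for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for the library's enum. Only kNone and kCol appear in
// this diff; kRow is an assumption made for illustration.
enum class DataSplitMode : int { kNone, kRow, kCol };

// Simplified stand-in for MetaInfo, reduced to the fields relevant here.
struct MetaInfo {
  std::uint64_t num_row_{0};
  std::uint64_t num_col_{0};
  // Defaults to kNone, so code that never sets the field sees "not split".
  DataSplitMode data_split_mode{DataSplitMode::kNone};
};

// Hypothetical helper (not part of the PR): true when data is split by column.
inline bool IsColumnSplit(MetaInfo const& info) {
  return info.data_split_mode == DataSplitMode::kCol;
}
```

Giving the field a default member initializer means existing construction paths need no changes, which is consistent with the `MetaInfo` test in this PR expecting `kNone` on a freshly built matrix.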
1 change: 1 addition & 0 deletions src/data/simple_dmatrix.cc
@@ -65,6 +65,7 @@ DMatrix* SimpleDMatrix::SliceCol(std::size_t start, std::size_t size) {
     out->Info() = this->Info().Copy();
     out->Info().num_nonzero_ = h_offset.back();
   }
+  out->Info().data_split_mode = DataSplitMode::kCol;
   return out;
 }

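
The one-line change above tags every column slice as column-split after the metadata has been copied. As a rough illustration of that ordering (copy the metadata, then overwrite the split mode), here is a toy sketch. `ToySplitMode`, `ToyInfo`, and `ToyMatrix` are hypothetical types, not XGBoost classes, and the dense row-major layout and the choice to record the sliced width in `num_col` are assumptions of the sketch rather than the behaviour of `SimpleDMatrix`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-ins for illustration only; these are not the XGBoost types.
enum class ToySplitMode : int { kNone, kCol };

struct ToyInfo {
  std::size_t num_row{0};
  std::size_t num_col{0};
  ToySplitMode data_split_mode{ToySplitMode::kNone};
};

// Hypothetical dense, row-major matrix with attached metadata.
struct ToyMatrix {
  ToyInfo info;
  std::vector<float> values;  // size == info.num_row * info.num_col

  // Return a copy holding only columns [start, start + size).
  ToyMatrix SliceCol(std::size_t start, std::size_t size) const {
    assert(start + size <= info.num_col);
    ToyMatrix out;
    out.info = info;          // copy the metadata first
    out.info.num_col = size;  // toy choice: record the sliced width
    out.values.reserve(info.num_row * size);
    for (std::size_t r = 0; r < info.num_row; ++r) {
      for (std::size_t c = start; c < start + size; ++c) {
        out.values.push_back(values[r * info.num_col + c]);
      }
    }
    // Mark the result as column-split, whatever mode the source had.
    out.info.data_split_mode = ToySplitMode::kCol;
    return out;
  }
};
```

Setting the mode after the metadata copy matters: the copy would otherwise carry the source's mode (here `kNone`) over to the slice.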
2 changes: 2 additions & 0 deletions tests/cpp/data/test_simple_dmatrix.cc
@@ -22,6 +22,7 @@ TEST(SimpleDMatrix, MetaInfo) {
   EXPECT_EQ(dmat->Info().num_col_, 5);
   EXPECT_EQ(dmat->Info().num_nonzero_, 6);
   EXPECT_EQ(dmat->Info().labels.Size(), dmat->Info().num_row_);
+  EXPECT_EQ(dmat->Info().data_split_mode, DataSplitMode::kNone);

   delete dmat;
 }
@@ -360,6 +361,7 @@ TEST(SimpleDMatrix, SliceCol) {
       ASSERT_EQ(out->Info().num_col_, out->Info().num_col_);
       ASSERT_EQ(out->Info().num_row_, kRows);
       ASSERT_EQ(out->Info().num_nonzero_, kRows * kSlicCols);  // dense
+      ASSERT_EQ(out->Info().data_split_mode, DataSplitMode::kCol);
     }
   }
