Skip to content

Commit

Permalink
Rework the precision metric. (#9222)
Browse files Browse the repository at this point in the history
- Rework the precision metric for both CPU and GPU.
- Mention it in the document.
- Cleanup old support code for GPU ranking metric.
- Deterministic GPU implementation.

* Drop support for classification.

* type.

* use batch shape.

* lint.

* cpu build.

* cpu build.

* lint.

* Tests.

* Fix.

* Cleanup error message.
  • Loading branch information
trivialfis authored Jun 2, 2023
1 parent db82881 commit 9fbde21
Show file tree
Hide file tree
Showing 22 changed files with 309 additions and 499 deletions.
2 changes: 1 addition & 1 deletion R-package/R/callbacks.R
Original file line number Diff line number Diff line change
Expand Up @@ -319,7 +319,7 @@ cb.early.stop <- function(stopping_rounds, maximize = FALSE,

# maximize is usually NULL when not set in xgb.train and built-in metrics
if (is.null(maximize))
maximize <<- grepl('(_auc|_map|_ndcg)', metric_name)
maximize <<- grepl('(_auc|_map|_ndcg|_pre)', metric_name)

if (verbose && NVL(env$rank, 0) == 0)
cat("Will train until ", metric_name, " hasn't improved in ",
Expand Down
3 changes: 2 additions & 1 deletion doc/parameter.rst
Original file line number Diff line number Diff line change
Expand Up @@ -424,6 +424,7 @@ Specify the learning task and the corresponding learning objective. The objectiv

After XGBoost 1.6, both of the requirements and restrictions for using ``aucpr`` in classification problem are similar to ``auc``. For ranking task, only binary relevance label :math:`y \in [0, 1]` is supported. Different from ``map (mean average precision)``, ``aucpr`` calculates the *interpolated* area under precision recall curve using continuous interpolation.

- ``pre``: Precision at :math:`k`. Supports only learning to rank task.
- ``ndcg``: `Normalized Discounted Cumulative Gain <http://en.wikipedia.org/wiki/NDCG>`_
- ``map``: `Mean Average Precision <http://en.wikipedia.org/wiki/Mean_average_precision#Mean_average_precision>`_

Expand All @@ -435,7 +436,7 @@ Specify the learning task and the corresponding learning objective. The objectiv

where :math:`I_{(k)}` is an indicator function that equals to :math:`1` when the document at :math:`k` is relevant and :math:`0` otherwise. The :math:`P@k` is the precision at :math:`k`, and :math:`N` is the total number of relevant documents. Lastly, the `mean average precision` is defined as the weighted average across all queries.

- ``ndcg@n``, ``map@n``: :math:`n` can be assigned as an integer to cut off the top positions in the lists for evaluation.
- ``ndcg@n``, ``map@n``, ``pre@n``: :math:`n` can be assigned as an integer to cut off the top positions in the lists for evaluation.
- ``ndcg-``, ``map-``, ``ndcg@n-``, ``map@n-``: In XGBoost, the NDCG and MAP evaluate the score of a list without any positive samples as :math:`1`. By appending "-" to the evaluation metric name, we can ask XGBoost to evaluate these scores as :math:`0` to be consistent under some conditions.
- ``poisson-nloglik``: negative log-likelihood for Poisson regression
- ``gamma-nloglik``: negative log-likelihood for gamma regression
Expand Down
2 changes: 2 additions & 0 deletions python-package/xgboost/callback.py
Original file line number Diff line number Diff line change
Expand Up @@ -372,6 +372,8 @@ def minimize(new: _Score, best: _Score) -> bool:
maximize_metrics = (
"auc",
"aucpr",
"pre",
"pre@",
"map",
"ndcg",
"auc@",
Expand Down
54 changes: 53 additions & 1 deletion python-package/xgboost/testing/metrics.py
Original file line number Diff line number Diff line change
@@ -1,9 +1,61 @@
"""Tests for evaluation metrics."""
from typing import Dict
from typing import Dict, List

import numpy as np
import pytest

import xgboost as xgb
from xgboost.compat import concat
from xgboost.core import _parse_eval_str


def check_precision_score(tree_method: str) -> None:
"""Test for precision with ranking and classification."""
datasets = pytest.importorskip("sklearn.datasets")

X, y = datasets.make_classification(
n_samples=1024, n_features=4, n_classes=2, random_state=2023
)
qid = np.zeros(shape=y.shape) # same group

ltr = xgb.XGBRanker(n_estimators=2, tree_method=tree_method)
ltr.fit(X, y, qid=qid)

# re-generate so that XGBoost doesn't evaluate the result to 1.0
X, y = datasets.make_classification(
n_samples=512, n_features=4, n_classes=2, random_state=1994
)

ltr.set_params(eval_metric="pre@32")
result = _parse_eval_str(
ltr.get_booster().eval_set(evals=[(xgb.DMatrix(X, y), "Xy")])
)
score_0 = result[1][1]

X_list = []
y_list = []
n_query_groups = 3
q_list: List[np.ndarray] = []
for i in range(n_query_groups):
# same for all groups
X, y = datasets.make_classification(
n_samples=512, n_features=4, n_classes=2, random_state=1994
)
X_list.append(X)
y_list.append(y)
q = np.full(shape=y.shape, fill_value=i, dtype=np.uint64)
q_list.append(q)

qid = concat(q_list)
X = concat(X_list)
y = concat(y_list)

result = _parse_eval_str(
ltr.get_booster().eval_set(evals=[(xgb.DMatrix(X, y, qid=qid), "Xy")])
)
assert result[1][0].endswith("pre@32")
score_1 = result[1][1]
assert score_1 == score_0


def check_quantile_error(tree_method: str) -> None:
Expand Down
170 changes: 0 additions & 170 deletions src/common/device_helpers.cuh
Original file line number Diff line number Diff line change
Expand Up @@ -825,176 +825,6 @@ XGBOOST_DEVICE auto tcrend(xgboost::common::Span<T> const &span) { // NOLINT
return tcrbegin(span) + span.size();
}

// This type sorts an array which is divided into multiple groups. The sorting is influenced
// by the function object 'Comparator'
template <typename T>
class SegmentSorter {
private:
// Items sorted within the group
caching_device_vector<T> ditems_;

// Original position of the items before they are sorted descending within their groups
caching_device_vector<uint32_t> doriginal_pos_;

// Segments within the original list that delineates the different groups
caching_device_vector<uint32_t> group_segments_;

// Need this on the device as it is used in the kernels
caching_device_vector<uint32_t> dgroups_; // Group information on device

// Where did the item that was originally present at position 'x' move to after they are sorted
caching_device_vector<uint32_t> dindexable_sorted_pos_;

// Initialize everything but the segments
void Init(uint32_t num_elems) {
ditems_.resize(num_elems);

doriginal_pos_.resize(num_elems);
thrust::sequence(doriginal_pos_.begin(), doriginal_pos_.end());
}

// Initialize all with group info
void Init(const std::vector<uint32_t> &groups) {
uint32_t num_elems = groups.back();
this->Init(num_elems);
this->CreateGroupSegments(groups);
}

public:
// This needs to be public due to device lambda
void CreateGroupSegments(const std::vector<uint32_t> &groups) {
uint32_t num_elems = groups.back();
group_segments_.resize(num_elems, 0);

dgroups_ = groups;

if (GetNumGroups() == 1) return; // There are no segments; hence, no need to compute them

// Define the segments by assigning a group ID to each element
const uint32_t *dgroups = dgroups_.data().get();
uint32_t ngroups = dgroups_.size();
auto ComputeGroupIDLambda = [=] __device__(uint32_t idx) {
return thrust::upper_bound(thrust::seq, dgroups, dgroups + ngroups, idx) -
dgroups - 1;
}; // NOLINT

thrust::transform(thrust::make_counting_iterator(static_cast<uint32_t>(0)),
thrust::make_counting_iterator(num_elems),
group_segments_.begin(),
ComputeGroupIDLambda);
}

// Accessors that returns device pointer
inline uint32_t GetNumItems() const { return ditems_.size(); }
inline const xgboost::common::Span<const T> GetItemsSpan() const {
return { ditems_.data().get(), ditems_.size() };
}

inline const xgboost::common::Span<const uint32_t> GetOriginalPositionsSpan() const {
return { doriginal_pos_.data().get(), doriginal_pos_.size() };
}

inline const xgboost::common::Span<const uint32_t> GetGroupSegmentsSpan() const {
return { group_segments_.data().get(), group_segments_.size() };
}

inline uint32_t GetNumGroups() const { return dgroups_.size() - 1; }
inline const xgboost::common::Span<const uint32_t> GetGroupsSpan() const {
return { dgroups_.data().get(), dgroups_.size() };
}

inline const xgboost::common::Span<const uint32_t> GetIndexableSortedPositionsSpan() const {
return { dindexable_sorted_pos_.data().get(), dindexable_sorted_pos_.size() };
}

// Sort an array that is divided into multiple groups. The array is sorted within each group.
// This version provides the group information that is on the host.
// The array is sorted based on an adaptable binary predicate. By default a stateless predicate
// is used.
template <typename Comparator = thrust::greater<T>>
void SortItems(const T *ditems, uint32_t item_size, const std::vector<uint32_t> &groups,
const Comparator &comp = Comparator()) {
this->Init(groups);
this->SortItems(ditems, item_size, this->GetGroupSegmentsSpan(), comp);
}

// Sort an array that is divided into multiple groups. The array is sorted within each group.
// This version provides the group information that is on the device.
// The array is sorted based on an adaptable binary predicate. By default a stateless predicate
// is used.
template <typename Comparator = thrust::greater<T>>
void SortItems(const T *ditems, uint32_t item_size,
const xgboost::common::Span<const uint32_t> &group_segments,
const Comparator &comp = Comparator()) {
this->Init(item_size);

// Sort the items that are grouped. We would like to avoid using predicates to perform the sort,
// as thrust resorts to using a merge sort as opposed to a much much faster radix sort
// when comparators are used. Hence, the following algorithm is used. This is done so that
// we can grab the appropriate related values from the original list later, after the
// items are sorted.
//
// Here is the internal representation:
// dgroups_: [ 0, 3, 5, 8, 10 ]
// group_segments_: 0 0 0 | 1 1 | 2 2 2 | 3 3
// doriginal_pos_: 0 1 2 | 3 4 | 5 6 7 | 8 9
// ditems_: 1 0 1 | 2 1 | 1 3 3 | 4 4 (from original items)
//
// Sort the items first and make a note of the original positions in doriginal_pos_
// based on the sort
// ditems_: 4 4 3 3 2 1 1 1 1 0
// doriginal_pos_: 8 9 6 7 3 0 2 4 5 1
// NOTE: This consumes space, but is much faster than some of the other approaches - sorting
// in kernel, sorting using predicates etc.

ditems_.assign(thrust::device_ptr<const T>(ditems),
thrust::device_ptr<const T>(ditems) + item_size);

// Allocator to be used by sort for managing space overhead while sorting
dh::XGBCachingDeviceAllocator<char> alloc;

thrust::stable_sort_by_key(thrust::cuda::par(alloc),
ditems_.begin(), ditems_.end(),
doriginal_pos_.begin(), comp);

if (GetNumGroups() == 1) return; // The entire array is sorted, as it isn't segmented

// Next, gather the segments based on the doriginal_pos_. This is to reflect the
// holisitic item sort order on the segments
// group_segments_c_: 3 3 2 2 1 0 0 1 2 0
// doriginal_pos_: 8 9 6 7 3 0 2 4 5 1 (stays the same)
caching_device_vector<uint32_t> group_segments_c(item_size);
thrust::gather(doriginal_pos_.begin(), doriginal_pos_.end(),
dh::tcbegin(group_segments), group_segments_c.begin());

// Now, sort the group segments so that you may bring the items within the group together,
// in the process also noting the relative changes to the doriginal_pos_ while that happens
// group_segments_c_: 0 0 0 1 1 2 2 2 3 3
// doriginal_pos_: 0 2 1 3 4 6 7 5 8 9
thrust::stable_sort_by_key(thrust::cuda::par(alloc),
group_segments_c.begin(), group_segments_c.end(),
doriginal_pos_.begin(), thrust::less<uint32_t>());

// Finally, gather the original items based on doriginal_pos_ to sort the input and
// to store them in ditems_
// doriginal_pos_: 0 2 1 3 4 6 7 5 8 9 (stays the same)
// ditems_: 1 1 0 2 1 3 3 1 4 4 (from unsorted items - ditems)
thrust::gather(doriginal_pos_.begin(), doriginal_pos_.end(),
thrust::device_ptr<const T>(ditems), ditems_.begin());
}

// Determine where an item that was originally present at position 'x' has been relocated to
// after a sort. Creation of such an index has to be explicitly requested after a sort
void CreateIndexableSortedPositions() {
dindexable_sorted_pos_.resize(GetNumItems());
thrust::scatter(thrust::make_counting_iterator(static_cast<uint32_t>(0)),
thrust::make_counting_iterator(GetNumItems()), // Rearrange indices...
// ...based on this map
dh::tcbegin(GetOriginalPositionsSpan()),
dindexable_sorted_pos_.begin()); // Write results into this
}
};

// Atomic add function for gradients
template <typename OutputGradientT, typename InputGradientT>
XGBOOST_DEV_INLINE void AtomicAddGpair(OutputGradientT* dest,
Expand Down
9 changes: 4 additions & 5 deletions src/common/optional_weight.h
Original file line number Diff line number Diff line change
Expand Up @@ -8,8 +8,7 @@
#include "xgboost/host_device_vector.h" // HostDeviceVector
#include "xgboost/span.h" // Span

namespace xgboost {
namespace common {
namespace xgboost::common {
struct OptionalWeights {
Span<float const> weights;
float dft{1.0f}; // fixme: make this compile time constant
Expand All @@ -18,7 +17,8 @@ struct OptionalWeights {
explicit OptionalWeights(float w) : dft{w} {}

XGBOOST_DEVICE float operator[](size_t i) const { return weights.empty() ? dft : weights[i]; }
auto Empty() const { return weights.empty(); }
[[nodiscard]] auto Empty() const { return weights.empty(); }
[[nodiscard]] auto Size() const { return weights.size(); }
};

inline OptionalWeights MakeOptionalWeights(Context const* ctx,
Expand All @@ -28,6 +28,5 @@ inline OptionalWeights MakeOptionalWeights(Context const* ctx,
}
return OptionalWeights{ctx->IsCPU() ? weights.ConstHostSpan() : weights.ConstDeviceSpan()};
}
} // namespace common
} // namespace xgboost
} // namespace xgboost::common
#endif // XGBOOST_COMMON_OPTIONAL_WEIGHT_H_
3 changes: 3 additions & 0 deletions src/common/quantile.cc
Original file line number Diff line number Diff line change
Expand Up @@ -90,6 +90,9 @@ void HostSketchContainer::PushAdapterBatch(Batch const &batch, size_t base_rowid
MetaInfo const &info, float missing) {
auto const &h_weights =
(use_group_ind_ ? detail::UnrollGroupWeights(info) : info.weights_.HostVector());
if (!use_group_ind_ && !h_weights.empty()) {
CHECK_EQ(h_weights.size(), batch.Size()) << "Invalid size of sample weight.";
}

auto is_valid = data::IsValidFunctor{missing};
auto weights = OptionalWeights{Span<float const>{h_weights}};
Expand Down
20 changes: 12 additions & 8 deletions src/common/quantile.h
Original file line number Diff line number Diff line change
Expand Up @@ -19,12 +19,12 @@

#include "categorical.h"
#include "common.h"
#include "error_msg.h" // GroupWeight
#include "optional_weight.h" // OptionalWeights
#include "threading_utils.h"
#include "timer.h"

namespace xgboost {
namespace common {
namespace xgboost::common {
/*!
* \brief experimental wsummary
* \tparam DType type of data content
Expand Down Expand Up @@ -695,13 +695,18 @@ inline std::vector<float> UnrollGroupWeights(MetaInfo const &info) {
return group_weights;
}

size_t n_samples = info.num_row_;
auto const &group_ptr = info.group_ptr_;
std::vector<float> results(n_samples);
CHECK_GE(group_ptr.size(), 2);
CHECK_EQ(group_ptr.back(), n_samples);

auto n_groups = group_ptr.size() - 1;
CHECK_EQ(info.weights_.Size(), n_groups) << error::GroupWeight();

bst_row_t n_samples = info.num_row_;
std::vector<float> results(n_samples);
CHECK_EQ(group_ptr.back(), n_samples)
<< error::GroupSize() << " the number of rows from the data.";
size_t cur_group = 0;
for (size_t i = 0; i < n_samples; ++i) {
for (bst_row_t i = 0; i < n_samples; ++i) {
results[i] = group_weights[cur_group];
if (i == group_ptr[cur_group + 1]) {
cur_group++;
Expand Down Expand Up @@ -1010,6 +1015,5 @@ class SortedSketchContainer : public SketchContainerImpl<WXQuantileSketch<float,
*/
void PushColPage(SparsePage const &page, MetaInfo const &info, Span<float const> hessian);
};
} // namespace common
} // namespace xgboost
} // namespace xgboost::common
#endif // XGBOOST_COMMON_QUANTILE_H_
13 changes: 12 additions & 1 deletion src/common/ranking_utils.cc
Original file line number Diff line number Diff line change
Expand Up @@ -114,9 +114,20 @@ void NDCGCache::InitOnCUDA(Context const*, MetaInfo const&) { common::AssertGPUS

DMLC_REGISTER_PARAMETER(LambdaRankParam);

void PreCache::InitOnCPU(Context const*, MetaInfo const& info) {
auto const& h_label = info.labels.HostView().Slice(linalg::All(), 0);
CheckPreLabels("pre", h_label,
[](auto beg, auto end, auto op) { return std::all_of(beg, end, op); });
}

#if !defined(XGBOOST_USE_CUDA)
void PreCache::InitOnCUDA(Context const*, MetaInfo const&) { common::AssertGPUSupport(); }
#endif // !defined(XGBOOST_USE_CUDA)

void MAPCache::InitOnCPU(Context const*, MetaInfo const& info) {
auto const& h_label = info.labels.HostView().Slice(linalg::All(), 0);
CheckMapLabels(h_label, [](auto beg, auto end, auto op) { return std::all_of(beg, end, op); });
CheckPreLabels("map", h_label,
[](auto beg, auto end, auto op) { return std::all_of(beg, end, op); });
}

#if !defined(XGBOOST_USE_CUDA)
Expand Down
7 changes: 6 additions & 1 deletion src/common/ranking_utils.cu
Original file line number Diff line number Diff line change
Expand Up @@ -205,8 +205,13 @@ void NDCGCache::InitOnCUDA(Context const* ctx, MetaInfo const& info) {
[=] XGBOOST_DEVICE(std::size_t i) { d_discount[i] = CalcDCGDiscount(i); });
}

void PreCache::InitOnCUDA(Context const* ctx, MetaInfo const& info) {
auto const d_label = info.labels.View(ctx->gpu_id).Slice(linalg::All(), 0);
CheckPreLabels("pre", d_label, CheckMAPOp{ctx->CUDACtx()});
}

void MAPCache::InitOnCUDA(Context const* ctx, MetaInfo const& info) {
auto const d_label = info.labels.View(ctx->gpu_id).Slice(linalg::All(), 0);
CheckMapLabels(d_label, CheckMAPOp{ctx->CUDACtx()});
CheckPreLabels("map", d_label, CheckMAPOp{ctx->CUDACtx()});
}
} // namespace xgboost::ltr
Loading

0 comments on commit 9fbde21

Please sign in to comment.