Prediction for the Learn-to-Rank Application #3326

Closed
A4Ayou opened this issue Aug 20, 2020 · 7 comments

@A4Ayou

A4Ayou commented Aug 20, 2020

Hello,

I developed a learning-to-rank (LTR) model by setting the objective to lambdarank, and training the model using the .fit() function with an LGB Data Set where I passed (1) the features data, (2) the label data, and (3) the group of the query.

To predict the results, I am using the .predict() function. But this function only takes the array of features as input and does not accept a group argument to indicate the query.

My question is then: when using prediction for an LTR model, should we pass data for one query at a time in the .predict() function, or can we directly pass a set of different queries?

For example, say I want to predict the rankings for two distinct queries, A and B. I have the features X(A) for the documents in A and X(B) for the documents in B. Should I run .predict() separately on X(A) and X(B), or can I concatenate X(A) and X(B) into a single numpy array X and run .predict() on X?

My intuition would be to go with the first option, but it may not be the most convenient. For my application, I am dealing with thousands of queries. Does that mean I have to run the prediction thousands of times to get the results?

This is a follow-up to issue #1398, which was closed.

Please let me know if something is unclear

Thanks!

Have a nice day!

@guolinke
Collaborator

Group is not needed at prediction time. The prediction results are pointwise, and you can rank them yourself.
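To illustrate the answer above, here is a minimal sketch of ranking pointwise scores per query after a single prediction call on the concatenated features. The helper name `rank_within_queries`, the `group_sizes` argument, and the example scores are assumptions made for illustration; they are not part of LightGBM. The scores stand in for whatever `.predict()` returns on the stacked rows of all queries.

```python
from itertools import accumulate


def rank_within_queries(scores, group_sizes):
    """Return, for each query, row indices into `scores` sorted best-first.

    `scores` holds one pointwise score per document, with the documents of
    each query stored contiguously. `group_sizes` lists how many documents
    belong to each query, in the same order (hypothetical helper argument).
    """
    if sum(group_sizes) != len(scores):
        raise ValueError("group sizes must sum to the number of scores")
    # Offset of each query's first row in the concatenated score list.
    starts = [0] + list(accumulate(group_sizes))[:-1]
    rankings = []
    for start, size in zip(starts, group_sizes):
        rows = range(start, start + size)
        # Higher score means more relevant; sort only within this query.
        rankings.append(sorted(rows, key=lambda i: scores[i], reverse=True))
    return rankings


# Made-up scores standing in for one .predict() call on the rows of
# query A (3 documents) followed by query B (2 documents).
scores = [0.2, 1.5, -0.3, 0.9, 1.1]
rankings = rank_within_queries(scores, group_sizes=[3, 2])
print(rankings)  # [[1, 0, 2], [4, 3]]
```

Because each score depends only on that document's own features, scores from different queries are never compared with each other here; only the ordering inside each group is used.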

@A4Ayou
Author

A4Ayou commented Aug 20, 2020

> Group is not needed in prediction time. The prediction results are pointwise, and you can rank them by yourself.

Thank you for the quick answer!

Follow-up question (just out of curiosity): would it be more efficient to run the prediction with a pairwise approach (as is done for training)? Or is that not feasible?

@guolinke
Collaborator

I think the current solution (listwise) is more efficient than pairwise.

@A4Ayou
Author

A4Ayou commented Aug 20, 2020

But the lambdarank implemented in LightGBM's training module is pairwise and not listwise, correct?

And independently of this question, why do we use a pointwise approach instead of a pairwise/listwise one for prediction?

@guolinke
Collaborator

@A4Ayou
It is listwise, because all pairs in one list are used in training. The scores of all pairs are then aggregated into a pointwise score for each document. In short, lambdarank calculates the scores listwise and then reduces them to pointwise values, and the trees are trained on those aggregated pointwise scores. Therefore, prediction does not need to be listwise.
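The aggregation described above can be sketched as follows. This is a simplified, textbook-style version of the LambdaRank idea, not LightGBM's actual implementation (which, among other things, also computes second-order terms and applies its own scaling). The function names are hypothetical. For one query, every pair of documents with different labels contributes a term, and the terms are summed into one value per document; a tree can then be fit to these per-document values, which is why the resulting model scores documents one at a time.

```python
import math


def dcg_gain(label, rank):
    # Standard DCG term for a document with relevance `label` at 0-based `rank`.
    return (2 ** label - 1) / math.log2(rank + 2)


def lambda_updates(scores, labels):
    """Per-document update directions for ONE query, summed over all pairs.

    Sign convention used in this sketch: a positive value means the
    document's score should increase.
    """
    n = len(scores)
    # Current rank of each document under the model's scores (0 = top).
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    rank = {doc: r for r, doc in enumerate(order)}
    # DCG of the ideal ordering, used to normalize to NDCG.
    ideal = sum(dcg_gain(l, r) for r, l in enumerate(sorted(labels, reverse=True)))
    lambdas = [0.0] * n
    for i in range(n):
        for j in range(n):
            if labels[i] <= labels[j]:
                continue  # keep only pairs where i should rank above j
            # Absolute change in NDCG if i and j swapped positions.
            delta = abs(
                dcg_gain(labels[i], rank[i]) + dcg_gain(labels[j], rank[j])
                - dcg_gain(labels[i], rank[j]) - dcg_gain(labels[j], rank[i])
            ) / ideal
            # Pairwise logistic term: close to 1 when j is scored well above i.
            rho = 1.0 / (1.0 + math.exp(scores[i] - scores[j]))
            lambdas[i] += rho * delta  # pair (i, j) pushes i up
            lambdas[j] -= rho * delta  # and pushes j down
    return lambdas


# Made-up example: the most relevant document (label 2) has the lowest score.
lambdas = lambda_updates(scores=[0.1, 0.5, 0.3], labels=[2, 0, 1])
print(lambdas)
```

In this example the first document receives a positive value and the second a negative one, and the values sum to zero because each pair adds and subtracts the same amount.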

@A4Ayou
Author

A4Ayou commented Aug 24, 2020

> @A4Ayou
> it is listwise, as all pairs in one list are used in training. And then, based on the scores of all pairs, the pointwise score is aggregated. In short, the lambdarank will calculate the score listwise, then reduce it pointwise. and the tree is learned to aggregated pointwise score. Therefore, it does not need to use listwise in prediction.

Thank you!

@github-actions

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023