Add Sequential Recommendation models #543

hieuddo · 2023-11-01T19:00:13Z

Description

Most of the models currently supported in cornac are general recommenders. Sequential recommendation has recently gained more and more attention (e.g., it was the most popular topic at RecSys'23). It would be nice if cornac were extended to cover more recommendation tasks, especially sequential/session-based recommendation (next-item(s), next-basket).

Expected behavior with the suggested feature

Add a generic data pipeline (i.e., parser, eval_method, evaluation) for next-basket, next-itemS, and next-item recommendations.
Add some fundamental and popular models (e.g., RNN, GRU4Rec).
Adapt general recommender models to the sequential context, for example:
- kNN: nearest items to the current session's items
- BPR: aggregate the current session's items (e.g., avg, weighted) into a "user" representation
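
A minimal sketch of the BPR idea above, assuming item embeddings are already available from a trained general recommender. The names `session_vector` and `score_items` and the toy embeddings are hypothetical and not part of cornac.

```python
from typing import Dict, List, Optional, Sequence


def session_vector(
    session_items: Sequence[str],
    item_factors: Dict[str, List[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Aggregate the session's item vectors into one pseudo-user vector."""
    if not session_items:
        raise ValueError("session must contain at least one item")
    if weights is None:
        weights = [1.0] * len(session_items)  # plain average
    total = sum(weights)
    dim = len(item_factors[session_items[0]])
    vec = [0.0] * dim
    for item, w in zip(session_items, weights):
        for d, value in enumerate(item_factors[item]):
            vec[d] += w * value / total
    return vec


def score_items(session_items, item_factors, weights=None):
    """Rank all items by dot product with the session vector (best first)."""
    user_vec = session_vector(session_items, item_factors, weights)
    scores = {
        item: sum(u * v for u, v in zip(user_vec, factors))
        for item, factors in item_factors.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy 2-dimensional item embeddings (made-up numbers, for illustration only).
item_factors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
ranking = score_items(["a", "b"], item_factors)
```

Passing `weights` (e.g., larger values for more recent items) gives the weighted variant; a real integration would probably also filter out items already seen in the session.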


hieuddo · 2023-11-01T19:12:34Z

I, myself, will try to integrate GRU4Rec if this idea aligns with cornac's scope.

Interesting note: it's important to implement the algorithms fully and correctly. Recently, the GRU4Rec authors assessed several re-implementations and found that most (if not all) of them are partially flawed or missing some key features, even in frameworks endorsed by RecSys, like Microsoft's Recommenders and RecPack.
Reference: The Effect of Third Party Implementations on Reproducibility

tqtg · 2023-11-01T19:44:39Z

This is awesome!
We were thinking about the family of sequential models when Trong was still with us, though we didn't have enough capacity to put them inside Cornac. If you're interested in doing this, let's chat more and see how we can organize Cornac to better support the models. I believe this will be a big enough change, together with graph-based models, to release Cornac version 2.

lthoang · 2023-11-02T07:26:44Z

For the next-basket recommendation task, I found this interesting paper, "A Next Basket Recommendation Reality Check", which specifies some basic baselines as well as how to evaluate NBR models thoroughly.
Source code: https://github.com/liming-7/A-Next-Basket-Recommendation-Reality-Check

hieuddo · 2023-11-02T07:42:29Z

Some more references:

Two sequential recommendation frameworks endorsed by ACMRecSys:

Frameworks from some published papers, e.g:

https://github.com/rn5l/session-rec

Let's take some time and later discuss our pipeline for generic next-basket/item(s) tasks.

tqtg · 2023-11-02T17:18:36Z

A few questions to start with:

How to load data? We support UIRT data format in Reader to deal with timestamp. Do we need more than that?
How to implement the training loop? We have user_iter and item_iter in Dataset. Also, we can retrieve chrono_user_data/chrono_item_data for training/evaluation.
How to evaluate model performance? I suppose we still follow the standard ranking evaluation scheme? Do we have additional approaches to evaluation (maybe for next-basket)? If yes, let's think through them with the current evaluation scheme in Cornac. I guess it might be easier to start with next-item recommendation first.
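
To make the data question concrete, here is a rough sketch of turning UIRT tuples into per-user chronological item sequences. It is plain Python for illustration only and does not use cornac's Reader or Dataset; the function name `chrono_user_sequences` and the toy data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

UIRT = Tuple[str, str, float, int]  # (user, item, rating, timestamp)


def chrono_user_sequences(data: Iterable[UIRT]) -> Dict[str, List[str]]:
    """Group UIRT tuples by user and order each user's items by timestamp."""
    by_user = defaultdict(list)
    for user, item, _rating, timestamp in data:
        by_user[user].append((timestamp, item))
    return {
        user: [item for _, item in sorted(events, key=lambda e: e[0])]
        for user, events in by_user.items()
    }


data = [
    ("u1", "b", 1.0, 20),
    ("u1", "a", 1.0, 10),
    ("u2", "c", 1.0, 5),
    ("u1", "c", 1.0, 30),
]
sequences = chrono_user_sequences(data)
```

A training loop for a sequential model could then iterate over `sequences.values()`; whether that is enough, or a session identifier is needed as well, is exactly the open question.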

lthoang · 2023-11-03T15:47:29Z

Also note that the current Dataset does not support handling repeated items for next-item/basket recommendation.

tqtg · 2023-11-08T17:21:00Z

Let's have an option to keep repeated interactions between the same user-item pair if timestamps are provided.
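
One possible shape for such an option, as a sketch only: the `keep_repeats` flag and the `load_interactions` helper are hypothetical names, not existing cornac parameters.

```python
from typing import Iterable, List, Tuple

UIRT = Tuple[str, str, float, int]  # (user, item, rating, timestamp)


def load_interactions(data: Iterable[UIRT], keep_repeats: bool = False) -> List[UIRT]:
    """Return interactions, optionally keeping repeated (user, item) pairs.

    With keep_repeats=False only the first occurrence of each pair is kept.
    """
    if keep_repeats:
        return list(data)
    seen = set()
    unique = []
    for user, item, rating, timestamp in data:
        if (user, item) not in seen:
            seen.add((user, item))
            unique.append((user, item, rating, timestamp))
    return unique


rows = [("u1", "a", 1.0, 10), ("u1", "a", 1.0, 20), ("u1", "b", 1.0, 30)]
deduplicated = load_interactions(rows)
with_repeats = load_interactions(rows, keep_repeats=True)
```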

tqtg · 2023-11-18T01:17:16Z

A few things to note:

'USIT' data format for sequential recs
'UBIT' data format for basket recs
Consider using JSON to represent extras (e.g., order quantity, price) for each interaction in the basket data.
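
A small sketch of how a basket record with JSON extras might be parsed. It assumes that 'UBIT' means (user, basket, item, timestamp), that fields are tab-separated, and that the extras sit in an optional fifth column; the `BasketInteraction` type and `parse_ubit_line` function are hypothetical.

```python
import json
from typing import Any, Dict, NamedTuple


class BasketInteraction(NamedTuple):
    user: str
    basket: str
    item: str
    timestamp: int
    extras: Dict[str, Any]


def parse_ubit_line(line: str, sep: str = "\t") -> BasketInteraction:
    """Parse one 'user, basket, item, timestamp[, json extras]' record."""
    # Split at most 4 times so separators inside the JSON column are preserved.
    fields = line.rstrip("\n").split(sep, 4)
    if len(fields) < 4:
        raise ValueError(f"expected at least 4 fields, got {len(fields)}")
    user, basket, item, timestamp = fields[:4]
    extras = json.loads(fields[4]) if len(fields) == 5 and fields[4] else {}
    return BasketInteraction(user, basket, item, int(timestamp), extras)


record = parse_ubit_line('u1\tb7\ti42\t1700000000\t{"quantity": 2, "price": 3.5}')
plain = parse_ubit_line("u1\tb7\ti43\t1700000001")
```

The same layout would work for 'USIT' with a session id in place of the basket id.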

@lthoang @hieuddo

lthoang · 2023-12-01T15:30:32Z

We should consider supporting some augmentation strategies (e.g., sliding window) so that users can increase their training data.
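
As an illustration of the sliding-window idea, a sketch that expands one training sequence into several (input, target) pairs. The function name `sliding_windows` and the `window` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple


def sliding_windows(
    sequence: Sequence[str], window: int
) -> List[Tuple[List[str], str]]:
    """Return (input items, target item) pairs from one training sequence.

    Each input holds at most `window` items immediately preceding the target.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    pairs = []
    for end in range(1, len(sequence)):
        start = max(0, end - window)
        pairs.append((list(sequence[start:end]), sequence[end]))
    return pairs


pairs = sliding_windows(["a", "b", "c", "d"], window=2)
```

A sequence of length n yields n - 1 training pairs instead of one, which is the point of the augmentation.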

lthoang · 2023-12-18T15:34:46Z

We currently consider the last item in a sequence as the target test instance. For example, for a sequence a b c d, the first 3 items a b c are the input and the last item d is the output. As a result, the total number of test instances equals the total number of test sequences.

Looking at the source code of GRU4Rec https://github.com/hidasib/GRU4Rec_PyTorch_Official/, I find that they consider every next item as ground truth for evaluation. For example, for a test sequence a b c d, the ground-truth items are b, c, d. The respective inputs are a, a b, a b c or just b, c, d for GRU4Rec.

Should we also support the above scenario? @tqtg @hieuddo
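
To make the two scenarios concrete, a sketch that builds test instances either way. The mode names "last" and "next" and the function `make_test_instances` are hypothetical, chosen only for this illustration.

```python
from typing import List, Sequence, Tuple

Instance = Tuple[List[str], str]  # (input items, ground-truth next item)


def make_test_instances(sequence: Sequence[str], mode: str = "last") -> List[Instance]:
    """Build evaluation instances from one test sequence.

    mode="last": only the final item is a target (one instance per sequence).
    mode="next": every item after the first is a target.
    """
    if len(sequence) < 2:
        return []  # nothing to predict from
    if mode == "last":
        return [(list(sequence[:-1]), sequence[-1])]
    if mode == "next":
        return [(list(sequence[:i]), sequence[i]) for i in range(1, len(sequence))]
    raise ValueError(f"unknown mode: {mode!r}")


last_only = make_test_instances(["a", "b", "c", "d"], mode="last")
every_next = make_test_instances(["a", "b", "c", "d"], mode="next")
```

With mode="next" longer sequences contribute more instances, so averaged metrics are weighted toward them.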

lthoang · 2023-12-22T08:18:16Z

For user_based evaluation, take HGRU4Rec https://arxiv.org/pdf/1706.04148.pdf as an example: besides user_idx, it also needs the user's sessions (sorted chronologically) to construct the user hidden factors that are passed through sessions.

In https://github.com/mquad/hgru4rec/, although every user is initialized with a zero vector, the history sequences definitely affect the final representation of the user vector.
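
A sketch of the extra input this would require: each user's sessions in chronological order. It assumes (user, session, item, timestamp) tuples; the function name `user_sessions` is hypothetical and this is not how hgru4rec itself organizes the data.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

USIT = Tuple[str, str, str, int]  # (user, session, item, timestamp)


def user_sessions(data: Iterable[USIT]) -> Dict[str, List[List[str]]]:
    """Map each user to their sessions (item lists), ordered by session start."""
    sessions = defaultdict(list)  # (user, session) -> [(timestamp, item)]
    for user, session, item, timestamp in data:
        sessions[(user, session)].append((timestamp, item))

    per_user = defaultdict(list)  # user -> [(session start, items)]
    for (user, _session), events in sessions.items():
        events.sort(key=lambda e: e[0])
        per_user[user].append((events[0][0], [item for _, item in events]))

    return {
        user: [items for _, items in sorted(entries, key=lambda e: e[0])]
        for user, entries in per_user.items()
    }


data = [
    ("u1", "s2", "c", 30),
    ("u1", "s1", "b", 20),
    ("u1", "s1", "a", 10),
    ("u1", "s2", "d", 40),
]
history = user_sessions(data)
```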

tqtg · 2023-12-24T19:07:50Z

> We currently consider the last item in a sequence as the target test instance. For example, for a sequence a b c d, the first 3 items a b c are the input and the last item d is the output. As a result, the total number of test instances equals the total number of test sequences.
>
> Looking at the source code of GRU4Rec https://github.com/hidasib/GRU4Rec_PyTorch_Official/, I find that they consider every next item as ground truth for evaluation. For example, for a test sequence a b c d, the ground-truth items are b, c, d. The respective inputs are a, a b, a b c or just b, c, d for GRU4Rec.
>
> Should we also support the above scenario? @tqtg @hieuddo

Yes, we should definitely support this. Do we already have a solution?

tqtg · 2023-12-24T19:10:47Z

@lthoang Let's create different issues/features for your suggestions raised above. We will try to address them separately from this general feature.

lthoang · 2023-12-25T00:22:21Z

> We currently consider the last item in a sequence as the target test instance. For example, for a sequence a b c d, the first 3 items a b c are the input and the last item d is the output. As a result, the total number of test instances equals the total number of test sequences.
>
> Looking at the source code of GRU4Rec https://github.com/hidasib/GRU4Rec_PyTorch_Official/, I find that they consider every next item as ground truth for evaluation. For example, for a test sequence a b c d, the ground-truth items are b, c, d. The respective inputs are a, a b, a b c or just b, c, d for GRU4Rec.
>
> Should we also support the above scenario? @tqtg @hieuddo

> For user_based evaluation, take HGRU4Rec https://arxiv.org/pdf/1706.04148.pdf as an example: besides user_idx, it also needs the user's sessions (sorted chronologically) to construct the user hidden factors that are passed through sessions.
>
> In https://github.com/mquad/hgru4rec/, although every user is initialized with a zero vector, the history sequences definitely affect the final representation of the user vector.

Let's move these two into new features.

hieuddo assigned darrylong, tqtg and hieuddo Nov 1, 2023

hieuddo assigned lthoang Nov 2, 2023

tqtg mentioned this issue Nov 9, 2023

Add next-basket recommendation evaluation method #545

Merged

5 tasks

hieuddo mentioned this issue Dec 8, 2023

Add next-item pipeline #561

Merged

6 tasks

tqtg closed this as completed Dec 26, 2023

This was referenced Jan 4, 2024

Add Session-based Recommendations With Recurrent Neural Networks (GRU4Rec) model #574

Merged

[FEATURE] Add evaluation on every next-item in test sequence #578

Closed
