Feature/add dmrl: Add DMRL Model #597
Conversation
Hi @mabeckers, thanks for the contribution. It's great to see DMRL being added to Cornac. However, there are a few things that we might need to reconsider. First, each model in Cornac is very self-contained, and model dependencies should not be added as global requirements. This is to minimize the maintenance effort for the core functions, and also to facilitate a wider range of model implementations. With that said, there are two directions to proceed with the DMRL model:
Hope that my explanation is clear enough. Happy to chat more.
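The "self-contained model, no global requirements" idea above is commonly implemented with lazy imports. A minimal, self-contained sketch follows; `require` and `DMRLSketch` are hypothetical names for illustration, not the actual Cornac implementation:

```python
# Hedged sketch of the per-model dependency pattern described above:
# optional heavy libraries are imported only when the model is actually
# used, so the package's core requirements stay minimal.
import importlib


def require(module_name, model_name="DMRL"):
    """Import an optional dependency lazily, with a helpful error message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        raise ImportError(
            f"{model_name} requires '{module_name}'. "
            f"Install it with `pip install {module_name}`."
        ) from e


class DMRLSketch:
    """Illustrative stand-in for a recommender with optional dependencies."""

    def fit(self, train_set):
        # The dependency is resolved only at fit time, not at package import.
        torch = require("torch")
        ...
```

With this pattern, `import cornac` succeeds even when the model-specific dependency is missing; the error only surfaces when someone actually fits the model.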
Hi @tqtg, thanks for taking a look at my PR. Yeah, I noticed that the two new transformer modalities introduced more general dependencies. I can go ahead and move those modules inside recom_dmrl, so that the model receives basic text and images as input. I will make it general so that, in case one already has encoded features from somewhere (say, they come with the example), the model can take those in as well and will not run another layer of encoding on top of that feature set. Thanks,
Sounds good to me. Let's do that and see how it goes.
Made the requested changes; please let me know if there's anything else I can change for this PR. I also re-merged with the latest cornac master.
@mabeckers I made some changes to get the tests working, plus some refactoring. Please have a look and see if they make sense to you.
Everything looks great to me!
Hey @mabeckers, there is something we need to modify about the model input (text and image modalities). By design, we don't input the modalities directly to the model; we input them to an evaluation method (e.g., RatioSplit). The reason is that the modalities will then be aligned with the user/item data splitting, and user/item IDs will be mapped properly. Taking the CDL model as an example, we input the text modality to the RatioSplit eval method (here), and we can access the text modality inside the model implementation via the train_set (here). Can we work on this last change before we merge the model into Cornac?
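The data flow described above (modality goes to the evaluation method, which aligns it with the split and exposes it to the model via `train_set`) can be mimicked with a small self-contained sketch. All classes here (`ToyTextModality`, `ToyRatioSplit`, etc.) are hypothetical stand-ins, not the real Cornac API:

```python
# Toy sketch of the Cornac design discussed above: the model never receives
# the modality directly; it reads the (split-aligned) modality off train_set.

class ToyTextModality:
    def __init__(self, corpus, ids):
        self.corpus = corpus
        self.ids = ids

class ToyTrainSet:
    def __init__(self, item_text):
        self.item_text = item_text  # modality is reachable from the train set

class ToyRatioSplit:
    """Stands in for an evaluation method like cornac's RatioSplit."""
    def __init__(self, data, item_text=None):
        self.data = data
        # In real Cornac, splitting and ID mapping would happen here.
        self.train_set = ToyTrainSet(item_text)

class ToyModel:
    def fit(self, train_set):
        # The model accesses the aligned modality via train_set.
        return len(train_set.item_text.corpus)

docs = ["a paper about CF", "a paper about GNNs"]
modality = ToyTextModality(corpus=docs, ids=["i1", "i2"])
split = ToyRatioSplit(data=[("u1", "i1", 5.0)], item_text=modality)
n_docs = ToyModel().fit(split.train_set)  # model reads text via train_set
```

The point of the design is that ID mapping and splitting happen in one place (the evaluation method), so every model consuming the modality sees consistently indexed data.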
Hey @tqtg, yeah, I understand that's how the Cornac framework works with modalities, which is why, until commit 9fc96b3, I had it that way and was feeding modalities from the outside to the RatioSplit instance. I only changed it and moved them inside the model because you mentioned you didn't want to add any general dependencies (such as new TransformerModalities) to the Cornac core, but rather move that into the DMRL folder and have the model accept raw text and images. That's why I moved it out of RatioSplit. I am happy to revert to the earlier version and introduce TransformerVisionModality and TransformerTextModality as new general modality encoders. Of course, I can also just keep them in the DMRL folder and still use them as normal modalities and input them to the RatioSplit instance. Just let me know which way you would prefer.
My point is that you can reuse TextModality and ImageModality to hold the text/image corpus and input them into RatioSplit to perform data splitting. The only part we want to move inside the model implementation is where we use Transformers to encode the raw data. Does that make sense to you?
Ok I see. So if I am understanding you correctly, you would want the example file running the DMRL example to look something like this?

```python
"""Example for Disentangled Multimodal Recommendation, with only feedback and textual modality."""
import cornac

# The necessary data can be loaded as follows
docs, item_id_ordering_text = citeulike.load_text()
text_modality_input = TextModalityInput(item_id_ordering_text, docs)

# Instantiate DMRL recommender
dmrl_recommender = cornac.models.dmrl.DMRL(...)

# NEW METHOD THAT HOLDS THE TRANSFORMER ENCODING WITHIN DMRL MODEL:
# returns a generic feature modality (or even a TextModality) where
# pre-encoded text is given in the .features attribute and uses a
# Transformer internally
item_text_modality = dmrl_recommender.encode_text()

# Define an evaluation method to split feedback into train and test sets
ratio_split = RatioSplit(...)

# Use Recall@300 for evaluation
rec_300 = cornac.metrics.Recall(k=300)

# Put everything together into an experiment and run it
cornac.Experiment(eval_method=ratio_split, models=[dmrl_recommender], metrics=[prec_30, rec_300]).run()
```
@mabeckers I made some changes to illustrate my idea. Please have a look and let me know if they make sense to you. We can further refactor the code to remove some unused parts.
@tqtg Had to set preencode=True (preencode means the corpus gets pre-encoded as part of the TransformerModality init; preencoded means it's already pre-encoded from outside), but other than that it looks very good! I understand now what you meant: we use TextModality for data splitting and ID mapping on the outside and "overwrite" it on the inside with TransformerModalities. Just running some final checks, then I will commit! Thanks for showing me this way of doing it. The only downside here is that we call vectorizer.fit_transform(self.corpus) in _build_text() of the TextModality when all we want is _swap_text() ... so a little overhead, but I'm fine doing it that way :)
@mabeckers OK, I thought it should be encoded batch by batch during training, thus
For the basic text transformation overhead, I'm aware of that, and it might be an issue with a big text corpus. I was thinking of using the tokenizer as the indicator of whether we want to do any transformation or not. If the tokenizer is not provided during the initialization of the TextModality, we just bypass the
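The tokenizer-as-indicator idea above could look roughly like the following. This is a toy, self-contained sketch under stated assumptions; `SketchTextModality` is a hypothetical stand-in, not the actual Cornac TextModality:

```python
# Sketch of the proposed bypass: if no tokenizer is supplied, the modality
# acts as a plain text container and skips the bag-of-words build step
# (avoiding the vectorizer.fit_transform overhead on large corpora).

class SketchTextModality:
    def __init__(self, corpus, tokenizer=None):
        self.corpus = corpus
        self.tokenizer = tokenizer
        self.count_matrix = None

    def build(self):
        if self.tokenizer is None:
            # Bypass: leave encoding to the model (e.g., a Transformer
            # inside DMRL), no count-matrix transformation here.
            return self
        # Toy stand-in for vectorizer.fit_transform(self.corpus)
        vocab = sorted({tok for doc in self.corpus for tok in self.tokenizer(doc)})
        self.count_matrix = [
            [self.tokenizer(doc).count(word) for word in vocab]
            for doc in self.corpus
        ]
        return self

plain = SketchTextModality(["deep learning", "matrix factorization"]).build()
tokenized = SketchTextModality(
    ["deep learning", "deep models"], tokenizer=str.split
).build()
```

Here `plain.count_matrix` stays `None` (the build step was bypassed), while `tokenized` gets a count matrix over the vocabulary `["deep", "learning", "models"]`, mirroring the opt-in behavior discussed above.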
Yeah, the tokenizer is a good idea; that would make sense. The TransformerModalities work both with batch encoding and with pre-encoding. I'm just using them with pre-encoding in my examples because it makes runtime a lot faster. I just pushed my final changes and checks. Thanks!
Cool! This PR looks good to me. Let's merge it and have another one to update the TextModality. Thanks @mabeckers!
Description
I added Disentangled Multimodal Representation Learning (https://arxiv.org/pdf/2203.05406.pdf) as the DMRL model to Cornac.
In the context of this addition I had to:
Checklist:
- README.md (if you are adding a new model).
- examples/README.md (if you are adding a new example).
- datasets/README.md (if you are adding a new dataset).