
Update to latest version of transformers #31

Merged
merged 28 commits into master on Mar 10, 2023

Conversation

akashsaravanan-georgian
Contributor

Includes fixes for #9, #14, and #29. Also potentially resolves #3, #7, #27, and #28.

@jminnion

jminnion commented Mar 7, 2023

One suggestion relevant to this PR (I'd be happy to create an issue for this if it would help):

Update __version__ in multimodal_transformers/__init__.py (line 4).

Same suggestion for setup.py line 3 (another __version__ variable there).
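
For illustration, here's a minimal sketch of how setup.py could read the version straight from multimodal_transformers/__init__.py so the two __version__ values can't drift. The regex, file layout, and package metadata below are assumptions, not the project's actual setup.py:

import re
from pathlib import Path

from setuptools import find_packages, setup

# Read __version__ from the package's __init__.py so setup.py never disagrees with it.
init_text = Path("multimodal_transformers/__init__.py").read_text()
version = re.search(r'__version__\s*=\s*["\']([^"\']+)["\']', init_text).group(1)

setup(
    name="multimodal-transformers",
    version=version,
    packages=find_packages(),
)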

You folks are awesome. My team is attempting to use this branch/PR for our graduate capstone project, and we're very appreciative of your work. Thank you!

@akashsaravanan-georgian
Contributor Author

Hey @jminnion, thank you for pointing out the inconsistency there! I've fixed it.

Thank you for using our library :) We plan on merging this branch to main and publishing a new release soon.

@akashsaravanan-georgian akashsaravanan-georgian marked this pull request as ready for review March 8, 2023 18:37

@truskovskiyk left a comment


@akashsaravanan-georgian amazing work, please fix comments before merge

@@ -14,8 +14,8 @@
"num_train_epochs": 5,
"overwrite_output_dir": true,
"learning_rate": 3e-3,
"per_device_train_batch_size": 12,


why do we need this change?

Contributor Author


This was changed during a pass to standardize the training config formats across the different datasets. I changed the batch size from 12 to 16 just to conform to the conventional approach of keeping the batch size a power of 2.

@@ -1,24 +1,25 @@
 {
-  "output_dir": "./logs_petfinder/",
+  "output_dir": "./logs_petfinder/gating_on_cat_and_num_feats_then_sum_full_model",


hm, why gating_on_cat_and_num_feats_then_sum_full_model ?

Contributor Author


The existing config already used this name. It just refers to the method used to combine the different embeddings produced by the model; I only standardized the format of the output dir name across the three configs.

"tokenizer_name": "bert-base-uncased",
"per_device_train_batch_size": 12,
"use_simple_classifier": false,
"logging_dir": "./logs_clothing_review/bertbase_gating_on_cat_and_num_feats_then_sum_full_model_lr_3e-3/",


is it possible to avoid hardcoded paths like this?

Contributor Author


What would you suggest instead? These seem to be primarily intended to serve as examples for users more than anything else.
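
For example, one option (purely a sketch; the flag names and defaults below are hypothetical, not the repo's actual CLI) would be to build output_dir and logging_dir from a single configurable root at runtime instead of hardcoding the full path in each JSON config:

import argparse
from pathlib import Path

# Hypothetical sketch: derive run directories from one configurable root.
parser = argparse.ArgumentParser()
parser.add_argument("--output_root", default="./logs", help="base directory for all runs")
parser.add_argument("--experiment_name", default="clothing_review_gating_on_cat_and_num_feats_then_sum")
args = parser.parse_args()

run_dir = Path(args.output_root) / args.experiment_name
run_dir.mkdir(parents=True, exist_ok=True)

# These values would then override output_dir / logging_dir from the JSON config.
overrides = {"output_dir": str(run_dir), "logging_dir": str(run_dir / "tensorboard")}
print(overrides)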

(Further review comments on multimodal_exp_args.py, multimodal_transformers/__init__.py, and setup.py were marked as resolved.)
Comment on lines +27 to +41
CONFIGS = [
"./tests/test_airbnb.json",
"./tests/test_clothing.json",
"./tests/test_petfinder.json"
]

MODELS = [
"albert-base-v2",
"bert-base-multilingual-uncased",
"distilbert-base-uncased",
"roberta-base",
"xlm-mlm-100-1280",
"xlm-roberta-base",
"xlnet-base-cased"
]


try to use pytest fixture for this: https://docs.pytest.org/en/6.2.x/fixture.html

Contributor Author


I'm not sure a fixture is the best option here, since we only have one test function. My understanding was that fixtures are for multiple tests that require similar inputs, not for one test with several different inputs.
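
For reference, here's roughly what the alternative could look like with pytest.mark.parametrize rather than a fixture; the test name, placeholder body, and shortened model list are assumptions, not the repo's actual test code:

import itertools

import pytest

CONFIGS = [
    "./tests/test_airbnb.json",
    "./tests/test_clothing.json",
    "./tests/test_petfinder.json",
]

MODELS = [
    "albert-base-v2",
    "bert-base-multilingual-uncased",
    "distilbert-base-uncased",
]

# Each (config, model) pair becomes its own test case, so failures are reported per combination.
@pytest.mark.parametrize("config_path,model_name", itertools.product(CONFIGS, MODELS))
def test_training_runs(config_path, model_name):
    # Placeholder body: the real test would launch a short training run with this
    # config/model pair and assert that it finishes without raising.
    assert config_path.endswith(".json")
    assert model_name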


@angeliney left a comment


Thank you for updating this repo, @akashsaravanan-georgian!! I just tried running the tests on macOS and Debian. Both work well with Python 3.10 💯

akashsaravanan-georgian merged commit c341715 into master on Mar 10, 2023