Add RoPE Interpolation #3564

Merged: 12 commits into LAION-AI:main, Jul 12, 2023

Conversation

shahules786 (Collaborator)

Added support for RoPE interpolation via the SuperHOT method and its variants, as proposed in:

  • reddit
  • scaled-rope
Supported methods

  • Linear scaling
  • NTK-aware scaling
  • Dynamic NTK

Supported models

  • LLaMA
  • Falcon

This can easily be extended and experimented with by configuring two parameters, superhot and superhot_config; a minimal sketch of the scaling rules follows below.
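
For orientation, here is a minimal sketch of the three scaling rules, following the formulas from the linked reddit post and the scaled-rope repo. The function and parameter names are illustrative, not the actual identifiers in model_training/models/rope.py:

```python
# Illustrative sketch of the three RoPE scaling rules; not the code in
# model_training/models/rope.py, just the formulas from the references above.
import torch


def rope_inv_freq(dim: int, base: float = 10000.0) -> torch.Tensor:
    # Standard RoPE inverse frequencies: 1 / base^(2i / dim).
    return 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))


def linear_scaling(dim: int, scale: float, base: float = 10000.0) -> torch.Tensor:
    # Linear (SuperHOT) interpolation: positions are shrunk by `scale`,
    # which is equivalent to dividing every frequency by `scale`.
    return rope_inv_freq(dim, base) / scale


def ntk_scaling(dim: int, scale: float, base: float = 10000.0) -> torch.Tensor:
    # NTK-aware interpolation: stretch the base instead of the positions, so
    # high-frequency components barely change while low ones are interpolated.
    return rope_inv_freq(dim, base * scale ** (dim / (dim - 2)))


def dynamic_ntk_scaling(dim: int, seq_len: int, max_trained_len: int,
                        base: float = 10000.0) -> torch.Tensor:
    # Dynamic NTK: derive the scale at runtime from the current sequence
    # length, so sequences within the trained context keep the original RoPE.
    scale = max(1.0, seq_len / max_trained_len)
    return ntk_scaling(dim, scale, base)
```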

@shahules786 marked this pull request as ready for review on July 11, 2023, 18:37
@github-actions

pre-commit failed.
Please run `pre-commit run --all-files` locally and commit the changes.
Find more information in the repository's CONTRIBUTING.md.

@andreaskoepf (Collaborator) left a comment:

Thanks a lot for adding the scaled-rope implementation! I left some minor comments.

@@ -779,3 +782,32 @@ debug:
  verbose: true
  num_train_epochs: 0.2
  dtype: fp32

patching-test:
Collaborator:

Could we change the name to something like "rope_scaling_test"?
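
For illustration, a renamed entry might look like the sketch below; the superhot / superhot_config parameter names come from the PR description, but the nested keys and values are hypothetical:

```yaml
# Hypothetical sketch: only the superhot/superhot_config names are from the
# PR description; the nested keys and values are illustrative assumptions.
rope_scaling_test:
  superhot: true
  superhot_config:
    type: linear  # or an NTK-aware / dynamic NTK variant
    scale: 2
```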

model/model_training/models/rope.py (resolved)
model/model_training/trainer_sft.py (outdated, resolved)
model = get_model(training_conf, tokenizer)

from model_training.models.patching import RopePatch
Collaborator:

Is there a benefit to the late import? Otherwise I would recommend moving this up to the other imports at the beginning of the file, to make it easier to see all main dependencies.
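
Concretely, the suggestion is just to hoist the line to module scope (a sketch; the actual import header of trainer_sft.py is not shown here):

```python
# At the top of model_training/trainer_sft.py, grouped with the other imports:
from model_training.models.patching import RopePatch
```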

model/model_training/configs/config.yaml (outdated, resolved)
@andreaskoepf merged commit 018657b into LAION-AI:main on Jul 12, 2023. 1 check passed.