
build: install additional fms-acceleration plugins #350

Merged
merged 6 commits into foundation-model-stack:main on Sep 26, 2024

Conversation

Collaborator

@anhuong anhuong commented Sep 25, 2024

Description of the change

Users of the image will be able to automatically use padding free, multipack, and fast kernels via the fms-acceleration plugins.
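As a rough sketch of what installing the additional plugins could look like (the `fms-accel` extra and the plugin names `fms_acceleration_aadp` / `fms_acceleration_foak` are assumptions based on the fms-acceleration repo, not taken from this PR's diff):

```shell
# Illustrative sketch only; extra and plugin names are assumptions, not this PR's Dockerfile change.
pip install "fms-hf-tuning[fms-accel]"

# The fms-acceleration framework provides a small CLI for pulling in individual plugins:
python -m fms_acceleration.cli install fms_acceleration_aadp   # padding free + multipack
python -m fms_acceleration.cli install fms_acceleration_foak   # fused ops / fast kernels
```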

Related issue number

NA

How to verify the PR

Tested the installation and ran tuning with and without the flags. Just because the plugins are installed does not mean they are enabled; the user must still pass the necessary flags/configs.
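For reference, a minimal sketch of what running tuning with the flags might look like, assuming the `tuning/sft_trainer.py` entrypoint and the acceleration flag names (`padding_free`, `multipack`, `fast_kernels`) documented for fms-hf-tuning; paths and values are placeholders:

```shell
# Illustrative invocation; installing the plugins alone changes nothing until
# flags like these are passed to enable them.
accelerate launch --num_processes=8 tuning/sft_trainer.py \
  --model_name_or_path <base-model> \
  --training_data_path <train.jsonl> \
  --output_dir <output-dir> \
  --padding_free huggingface \
  --multipack 16 \
  --fast_kernels True True True
```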

Was the PR tested

  • I have added >=1 unit test(s) for every new method I have added.
  • I have ensured all unit tests pass

Signed-off-by: Anh Uong <anh.uong@ibm.com>
Signed-off-by: Anh Uong <anh.uong@ibm.com>

Thanks for making a pull request! 😃
One of the maintainers will review and advise on the next steps.

@github-actions github-actions bot added the build label Sep 25, 2024
Signed-off-by: Anh Uong <anh.uong@ibm.com>
@Ssukriti Ssukriti requested review from fabianlim and removed request for alex-jw-brooks September 25, 2024 21:35
@@ -671,6 +672,16 @@ Notes:
- works only for *multi-gpu*.
- currently only includes the version of *multipack* optimized for linear attention implementations like *flash-attn*.

Note: To pass the above flags via a JSON config, each flag expects a mixed-type list, so every value must be given as a list. For example:
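A minimal sketch of such a JSON config, using the acceleration flags discussed in this PR; the exact values are illustrative and the snippet is not copied from the README:

```json
{
    "fast_kernels": [true, true, true],
    "padding_free": ["huggingface"],
    "multipack": [16]
}
```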
Collaborator

In line 653, `- attention_and_distributed_packing (experimental)`: we have marked this as experimental, but we are talking about releasing it to product with OpenShift 2.14. Is it still experimental or ready for release? @fabianlim @anhuong

Collaborator Author
@anhuong anhuong Sep 25, 2024


From an earlier conversation with Fabian, I believe I can also mark these as ready (no longer experimental) in this PR. Will wait on @fabianlim to review as well.

Collaborator


Padding free is already upstreamed to HF main. InstructLab is using multipack, and this has been tested for up to about 500K samples in the dataset. Beyond that, I am not aware of the speed performance of multipack, as it runs through the lengths of each example before the start of every epoch.

Collaborator Author


Is there any issue with including these new plugins in the product if the fused-op-and-kernels plugin uses the Apache 2.0 license but contains code extracted from unsloth?

Collaborator


@anhuong yes that is a good point thanks for bringing this up.

  • unsloth is Apache 2.0, but we were disturbed by those "comments" peppered in the code.
  • we only extracted part of the unsloth code, and we did the extraction on a version that existed before those "comments" appeared (as far as we could tell)
  • all extracted portions contained the relevant License Notice headers credited to the owners of unsloth

Beyond what we have done, I am not knowledgeable enough to say what is permissible and what is not. This requires someone knowledgeable in these matters to go through it.

The peft plugin also contains a triton-only extraction of the ModelCloud fork of AutoGPTQ (https://github.com/foundation-model-stack/fms-acceleration/tree/main/plugins/accelerated-peft#gptq-loras-autogptq---current-implementation-vs-legacy-implementation). The fork is released as Apache 2.0.

@wynterl wynterl Sep 26, 2024


@anhuong Code scan should pass with no issues regarding the inclusion of the new plugins, and as noted by @fabianlim, unsloth is Apache 2.0.

@anhuong anhuong changed the title from "build: install fms-acceleration plugins to enable padding free, multipack, and fast kernels" to "build: install additional fms-acceleration plugins" Sep 25, 2024
fabianlim
fabianlim previously approved these changes Sep 25, 2024
Collaborator
@fabianlim fabianlim left a comment


LGTM. ILAB training uses multipack, so it is to some extent quite ready, but see my comment.

Signed-off-by: Anh Uong <anh.uong@ibm.com>
Signed-off-by: Anh Uong <anh.uong@ibm.com>
Signed-off-by: Anh Uong <anh.uong@ibm.com>
@anhuong anhuong merged commit 1350f8a into foundation-model-stack:main Sep 26, 2024
8 checks passed