
FIX: Transpose weight matrix based on fan_in_fan_out condition in PiSSA initialization (#2103) #2104

Merged

Conversation

suyang160 (Contributor)

Previously, the weight matrix was converted to float32 without considering the need for transposition. This update ensures that the weight matrix is transposed when the fan_in_fan_out condition is met, resolving dimension mismatch issues during GPT-2 training.
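
A minimal sketch of the fix described above, assuming a PiSSA-style init that casts the base weight to float32 before running an SVD (the helper name and attribute access here are illustrative, not the exact PEFT source):

    import torch

    def weight_for_pissa(base_layer: torch.nn.Module, fan_in_fan_out: bool) -> torch.Tensor:
        # GPT-2's Conv1D stores its weight as (in_features, out_features),
        # which PEFT flags with fan_in_fan_out=True. Transpose before the
        # float32 cast so the SVD sees the (out, in) layout of nn.Linear.
        weight = base_layer.weight
        if fan_in_fan_out:
            weight = weight.T
        return weight.to(torch.float32)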

@BenjaminBossan left a comment (Member)

Thanks for this PR. First of all, my apologies for not responding earlier. The notification somehow slipped my attention and I just wasn't aware of this PR. In the future, feel free to ping me after a couple of days when there is no response.

The fix looks good, thanks for that. Let's add some tests to ensure that this bug doesn't happen again. For this, could you please add the following tests to the existing PiSSA tests:

    @pytest.mark.parametrize("device", ["cuda", "cpu"])
    def test_gpt2_pissa_4bit(self, device, tmp_path):
        # see 2104
        self.get_errors(bits=4, device=device, model_id="gpt2", tmp_path=tmp_path)

    @pytest.mark.parametrize("device", ["cuda", "cpu"])
    def test_gpt2_pissa_8bit(self, device, tmp_path):
        # see 2104
        self.get_errors(bits=8, device=device, model_id="gpt2", tmp_path=tmp_path)
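
(Once added, the new tests can be run in isolation with something like pytest -k "gpt2_pissa"; the exact selector is illustrative.)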

For this to work, we need to make some changes to these lines though:

if isinstance(module, torch.nn.Linear) and "lm_head" not in name:

(the same check appears in a second place as well)

There, we need to change isinstance(module, torch.nn.Linear) to isinstance(module, (torch.nn.Linear, Conv1D)), where Conv1D is imported from transformers.pytorch_utils.
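
Concretely, the updated check would look like this (a sketch; the surrounding loop is elided):

    from transformers.pytorch_utils import Conv1D

    # GPT-2 replaces nn.Linear with Conv1D (same operation, transposed
    # weight layout), so the check must accept both layer types:
    if isinstance(module, (torch.nn.Linear, Conv1D)) and "lm_head" not in name:
        ...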

@suyang160 force-pushed the bugfix/issue-2103-fix-pissa-init branch from 2a513f6 to 4d77af8 on October 8, 2024, 16:10
FIX: Transpose weight matrix based on fan_in_fan_out condition in PiSSA initialization (huggingface#2103)

This update addresses an issue where the weight matrix was converted to float32 without considering the need for transposition. The weight matrix is now transposed when the fan_in_fan_out condition is met, resolving dimension mismatch issues during GPT-2 training.

To ensure this fix is robust, tests have been updated to include parameterized cases for different devices and bit configurations. Additionally, the isinstance checks have been modified to include Conv1D layers, ensuring all relevant layers are processed correctly.
@suyang160 force-pushed the bugfix/issue-2103-fix-pissa-init branch from 4d77af8 to 1bf7d7a on October 8, 2024, 16:18
suyang160 (Contributor, Author)

@BenjaminBossan Thank you for your feedback and suggestions; I've updated the PR.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@BenjaminBossan left a comment (Member)

Thanks a lot for this fix, LGTM.

@BenjaminBossan merged commit a724834 into huggingface:main on Oct 8, 2024
14 checks passed
BenjaminBossan pushed a commit to BenjaminBossan/peft that referenced this pull request Oct 22, 2024
FIX: Transpose weight matrix based on fan_in_fan_out condition in PiSSA initialization (huggingface#2104)

Transpose weight matrix based on fan_in_fan_out condition in PiSSA initialization.

Co-authored-by: Yang Su <suyang360@gmail.com>