
Vulkan build on MacOS #5311

Merged — 7 commits merged into ggerganov:master from dokterbob:macos_vulkan, Feb 19, 2024
Conversation

@dokterbob (Contributor)

Significant (~30%) performance boost on Intel Mac.

I'm getting up to ~10 tok/s with 28 layers offloaded to a 4 GB AMD Radeon Pro 5500M, up from ~7 tok/s on CPU alone.

@0cc4m 0cc4m self-requested a review February 3, 2024 21:10
ggml-vulkan.cpp — review thread (outdated, resolved)
@philiptaron (Collaborator)

Could you edit the flake (see 60ecf09) and see if you can build with Vulkan on x86_64 MacOS?

@dokterbob (Contributor, Author)

> Could you edit the flake (see 60ecf09) and see if you can build with Vulkan on x86_64 MacOS?

Does that imply removing useVulkan on this line?
60ecf09#diff-ed99c076a4accf9ff1d7ad65a79dc3802df6bc8c91c9879e12c766a1b414272bR262

@philiptaron (Collaborator)

> Could you edit the flake (see 60ecf09) and see if you can build with Vulkan on x86_64 MacOS?
>
> Does that imply removing useVulkan on this line? 60ecf09#diff-ed99c076a4accf9ff1d7ad65a79dc3802df6bc8c91c9879e12c766a1b414272bR262

Yeah, just remove all vulkan references from broken or badPlatform.

To install nix, follow https://github.com/DeterminateSystems/nix-installer. Then nix build github:dokterbob/llama.cpp/macos_vulkan. You can uninstall Nix afterwards if you don't want it around.

@slp (Collaborator)

slp commented Feb 7, 2024

This should probably have its own issue, but I've tested this PR on an Apple Silicon M1 and noticed I was getting garbage output for my prompts when batch_size != 1. Rebuilding with LLAMA_VULKAN_RUN_TESTS=1, I found that only M-sized matmuls work properly; both S-sized and L-sized fail. Here's an example:

TEST F16_F32_S m=100 n=46 k=558 batch=2 split_k=1 matmul 4.609ms avg_err=6.90319
m = 32 n = 0 b = 0
Actual result:

               0       1       2       3       4       5       6       7       8       9
     27:    4.73   12.32    1.43    6.25   -2.21    7.54    6.00    1.09   13.29    6.43
     28:   -2.47    1.54   11.37    7.52  -15.77   11.56   -1.01   -2.09   -0.30  -11.93
     29:   -7.33   -4.47   -4.05   -6.62   -3.47   -7.88    0.44   -5.68  -15.74   -0.46
     30:    6.85    3.08    4.87    5.17    4.22    6.31  -12.52   -3.55   -4.75    2.67
     31:  -16.09    3.49   -8.56   -8.89   -0.33    6.08  -13.39    5.76  -13.02   -8.90
     32:    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
     33:    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
     34:    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
     35:    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
     36:    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
Expected result:

               0       1       2       3       4       5       6       7       8       9
     27:    4.73   12.32    1.43    6.25   -2.21    7.54    6.00    1.09   13.29    6.43
     28:   -2.47    1.54   11.37    7.52  -15.77   11.56   -1.01   -2.09   -0.30  -11.93
     29:   -7.33   -4.47   -4.05   -6.62   -3.47   -7.88    0.44   -5.68  -15.74   -0.46
     30:    6.85    3.08    4.87    5.17    4.22    6.31  -12.52   -3.55   -4.75    2.67
     31:  -16.09    3.49   -8.56   -8.89   -0.33    6.08  -13.39    5.76  -13.02   -8.90
     32:   12.42    6.19   -2.81    9.91   -9.84    1.53   -5.20   -0.57   -5.79   -4.55
     33:   -7.44   -0.05   12.08   -1.73    5.63   -6.50   -9.00  -17.33   -5.36   -9.06
     34:   -3.51    0.46   -5.92   -3.95   -4.04   15.13   -1.47   -2.21    0.37   10.94
     35:    8.12    1.64  -13.92    3.49    7.47    8.52   -4.61   16.53    1.28   -4.64
     36:   -3.86   -9.55    7.20   -3.38    4.28    2.46   11.96   -4.71    0.67   16.10

@0cc4m is this something expected with integrated GPUs? Can I help debug this further somehow?

@dokterbob (Contributor, Author)

> Could you edit the flake (see 60ecf09) and see if you can build with Vulkan on x86_64 MacOS?
>
> Does that imply removing useVulkan on this line? 60ecf09#diff-ed99c076a4accf9ff1d7ad65a79dc3802df6bc8c91c9879e12c766a1b414272bR262
>
> Yeah, just remove all vulkan references from broken or badPlatform.
>
> To install nix, follow https://github.com/DeterminateSystems/nix-installer. Then nix build github:dokterbob/llama.cpp/macos_vulkan. You can uninstall Nix afterwards if you don't want it around.

I can make the requested patches today, but testing them locally would take me longer; ETA roughly a week.

Is there a way we can get this in sooner? For Intel Mac users this really is a huge improvement.

@0cc4m (Collaborator)

0cc4m commented Feb 8, 2024

> Could you edit the flake (see 60ecf09) and see if you can build with Vulkan on x86_64 MacOS?
>
> Does that imply removing useVulkan on this line? 60ecf09#diff-ed99c076a4accf9ff1d7ad65a79dc3802df6bc8c91c9879e12c766a1b414272bR262
>
> Yeah, just remove all vulkan references from broken or badPlatform.
> To install nix, follow https://github.com/DeterminateSystems/nix-installer. Then nix build github:dokterbob/llama.cpp/macos_vulkan. You can uninstall Nix afterwards if you don't want it around.
>
> I can make requested patches today but testing them locally would take me longer, ETA ~week.
>
> Is there a way we can get this in sooner? For Intel Mac users this really is a huge improvement.

I can try doing the nix tests, I think. I'll also add code to check if the extension is available. It would be good if you could test that afterwards, since I don't have a Mac system. Then we should be able to merge. Would that work?

@dokterbob (Contributor, Author)

> I can try doing the nix tests, I think. I'll also add code to check if the extension is available. It would be good if you could test that afterwards, since I don't have a Mac system. Then we should be able to merge. Would that work?

Sure, happy to test!

@0cc4m (Collaborator)

0cc4m commented Feb 10, 2024

> I can try doing the nix tests, I think. I'll also add code to check if the extension is available. It would be good if you could test that afterwards, since I don't have a Mac system. Then we should be able to merge. Would that work?
>
> Sure, happy to test!

@dokterbob I added the checks. Can you check if it works on Apple?

I tried building it with nix locally and it didn't throw an error, but I have no idea how nix works or how to use it. @philiptaron, let me know if it looks correct. The CI should fail if it doesn't work, right?

@philiptaron (Collaborator)

I think you also need this change:

diff --git a/flake.nix b/flake.nix
index ad2f9b29..dc4e503c 100644
--- a/flake.nix
+++ b/flake.nix
@@ -150,6 +150,7 @@
             packages =
               {
                 default = config.legacyPackages.llamaPackages.llama-cpp;
+                vulkan = config.packages.default.override { useVulkan = true; };
               }
               // lib.optionalAttrs pkgs.stdenv.isLinux {
                 opencl = config.packages.default.override { useOpenCL = true; };
@@ -157,7 +158,6 @@

                 mpi-cpu = config.packages.default.override { useMpi = true; };
                 mpi-cuda = config.packages.default.override { useMpi = true; };
-                vulkan = config.packages.default.override { useVulkan = true; };
               }
               // lib.optionalAttrs (system == "x86_64-linux") {
                 rocm = config.legacyPackages.llamaPackagesRocm.llama-cpp;

@0cc4m (Collaborator)

0cc4m commented Feb 12, 2024

> I think you also need this change:
>
> (flake.nix diff quoted above, moving the vulkan override out of the Linux-only attribute set)

You can also push commits to this PR. I think it's better if you do that; I have no clue what I'm doing in the nix files.

@philiptaron (Collaborator)

> You can also push commits to this PR. I think it's better if you do that, I have no clue what I'm doing in the nix files.

It might be significantly delayed then; it's a week of real-life responsibilities for me. 🙂

@dokterbob (Contributor, Author)

dokterbob commented Feb 13, 2024

> I can try doing the nix tests, I think. I'll also add code to check if the extension is available. It would be good if you could test that afterwards, since I don't have a Mac system. Then we should be able to merge. Would that work?
>
> Sure, happy to test!
>
> @dokterbob I added the checks. Can you check if it works on Apple?
>
> I tried build it with nix locally and it didn't throw an error, but I have no idea how nix works or how to use it. @philiptaron Let me know if it looks correct. The CI should fail if it doesn't work, right?

Runs like a charm; also pushed @philiptaron's patch. Didn't get into nix to test it, though.

@philiptaron (Collaborator) left a comment

I built this on Sonoma, both Intel Mac and M2 Mac. Compiles and runs. If @ggerganov doesn't object, I aim to merge this in 12 hours.

ggml-vulkan.cpp — review thread (outdated, resolved)
@philiptaron (Collaborator) left a comment

I took a look at the code; I'd like to see the refactor tractor continue to plow, @0cc4m. 😀

Comment on lines +1099 to +1141
    std::vector<const char*> layers;

    if (validation_ext) {
        layers.push_back("VK_LAYER_KHRONOS_validation");
    }
    std::vector<const char*> extensions;
    if (validation_ext) {
        extensions.push_back("VK_EXT_validation_features");
    }
    if (portability_enumeration_ext) {
        extensions.push_back("VK_KHR_portability_enumeration");
    }
    vk::InstanceCreateInfo instance_create_info(vk::InstanceCreateFlags{}, &app_info, layers, extensions);
    if (portability_enumeration_ext) {
        instance_create_info.flags |= vk::InstanceCreateFlagBits::eEnumeratePortabilityKHR;
    }

    std::vector<vk::ValidationFeatureEnableEXT> features_enable;
    vk::ValidationFeaturesEXT validation_features;

    if (validation_ext) {
        features_enable = { vk::ValidationFeatureEnableEXT::eBestPractices };
        validation_features = {
            features_enable,
            {},
        };
        validation_features.setPNext(nullptr);
        instance_create_info.setPNext(&validation_features);

        std::cerr << "ggml_vulkan: Validation layers enabled" << std::endl;
    }
@philiptaron (Collaborator)

I was hoping that these constants would have one home:

"VK_LAYER_KHRONOS_validation"
"VK_EXT_validation_features"
"VK_KHR_portability_enumeration"

Could you accomplish that by making a function that returned or mutated the two variables features_enable and validation_features?

That way, this generic code could stay unaware of the sort of features and validation that got added, and the system-specific code below would be able to add (as needed) the right features and validation.

@0cc4m (Collaborator)

I could, but let's not overextend the scope and duration of this PR. If you wanna do a refactor PR yourself or in collaboration with me, or gather suggestions in an issue, that's fine.

@philiptaron (Collaborator)

I went to try to do that refactoring with a spare hour this morning, and ran into a rebase error (on 8b38eb5) that I could not see how to resolve in a straightforward manner. Could you or @dokterbob rebase this branch on master appropriately?

@0cc4m (Collaborator)

I didn't have any issues merging upstream into the branch; not sure what your problem was.

To deal with the linker error that Vulkan currently has with cmake, you can use this workaround patch:

diff --git a/examples/gguf/CMakeLists.txt b/examples/gguf/CMakeLists.txt
index 6481f087..85950145 100644
--- a/examples/gguf/CMakeLists.txt
+++ b/examples/gguf/CMakeLists.txt
@@ -1,5 +1,5 @@
 set(TARGET gguf)
 add_executable(${TARGET} gguf.cpp)
 install(TARGETS ${TARGET} RUNTIME)
-target_link_libraries(${TARGET} PRIVATE ggml ${CMAKE_THREAD_LIBS_INIT})
+target_link_libraries(${TARGET} PRIVATE common ggml ${CMAKE_THREAD_LIBS_INIT})
 target_compile_features(${TARGET} PRIVATE cxx_std_11)

@philiptaron (Collaborator)

I intend to merge after CI completes.

@philiptaron philiptaron merged commit 633782b into ggerganov:master Feb 19, 2024
55 checks passed
@dokterbob dokterbob deleted the macos_vulkan branch February 21, 2024 11:03
@ggerganov (Owner)

Keep in mind to squash the commits when merging in the future 👍

@philiptaron (Collaborator)

> Keep in mind to squash the commits when merging in the future 👍

If that's the style preferred in this repo, I'll make sure to do that next time.

@thewh1teagle (Contributor)

Any chance we should enable it on Linux/Windows as well? I'm getting a similar error with whisper.cpp + Vulkan.

6 participants