
Vulkan build on MacOS #5311

Merged — 7 commits merged into ggerganov:master from dokterbob:macos_vulkan, Feb 19, 2024
Conversation

@dokterbob (Contributor)

Significant (~30%) performance boost on Intel Mac.

I'm getting up to ~10 tok/s with 28 layers offloaded to a 4 GB AMD Radeon Pro 5500M, up from ~7 tok/s on CPU alone.

@0cc4m 0cc4m self-requested a review February 3, 2024 21:10
ggml-vulkan.cpp — review thread (outdated, resolved)
@philiptaron (Collaborator)

Could you edit the flake (see 60ecf09) and see if you can build with Vulkan on x86_64 MacOS?

@dokterbob (Contributor, Author)

> Could you edit the flake (see 60ecf09) and see if you can build with Vulkan on x86_64 MacOS?

Does that imply removing useVulkan on this line?
60ecf09#diff-ed99c076a4accf9ff1d7ad65a79dc3802df6bc8c91c9879e12c766a1b414272bR262

@philiptaron (Collaborator)

> Could you edit the flake (see 60ecf09) and see if you can build with Vulkan on x86_64 MacOS?
>
> Does that imply removing useVulkan on this line? 60ecf09#diff-ed99c076a4accf9ff1d7ad65a79dc3802df6bc8c91c9879e12c766a1b414272bR262

Yeah, just remove all vulkan references from broken or badPlatform.

To install nix, follow https://github.com/DeterminateSystems/nix-installer. Then nix build github:dokterbob/llama.cpp/macos_vulkan. You can uninstall Nix afterwards if you don't want it around.

@slp (Collaborator)

slp commented Feb 7, 2024

This should probably have its own issue, but I've tested this PR on an Apple Silicon M1 and noticed I was getting garbage output for my prompts when batch_size != 1. Rebuilding with LLAMA_VULKAN_RUN_TESTS=1, I found that only M-sized matmuls work properly; both S-sized and L-sized fail. Here's an example:

TEST F16_F32_S m=100 n=46 k=558 batch=2 split_k=1 matmul 4.609ms avg_err=6.90319
m = 32 n = 0 b = 0
Actual result:

               0       1       2       3       4       5       6       7       8       9
     27:    4.73   12.32    1.43    6.25   -2.21    7.54    6.00    1.09   13.29    6.43
     28:   -2.47    1.54   11.37    7.52  -15.77   11.56   -1.01   -2.09   -0.30  -11.93
     29:   -7.33   -4.47   -4.05   -6.62   -3.47   -7.88    0.44   -5.68  -15.74   -0.46
     30:    6.85    3.08    4.87    5.17    4.22    6.31  -12.52   -3.55   -4.75    2.67
     31:  -16.09    3.49   -8.56   -8.89   -0.33    6.08  -13.39    5.76  -13.02   -8.90
     32:    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
     33:    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
     34:    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
     35:    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
     36:    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
Expected result:

               0       1       2       3       4       5       6       7       8       9
     27:    4.73   12.32    1.43    6.25   -2.21    7.54    6.00    1.09   13.29    6.43
     28:   -2.47    1.54   11.37    7.52  -15.77   11.56   -1.01   -2.09   -0.30  -11.93
     29:   -7.33   -4.47   -4.05   -6.62   -3.47   -7.88    0.44   -5.68  -15.74   -0.46
     30:    6.85    3.08    4.87    5.17    4.22    6.31  -12.52   -3.55   -4.75    2.67
     31:  -16.09    3.49   -8.56   -8.89   -0.33    6.08  -13.39    5.76  -13.02   -8.90
     32:   12.42    6.19   -2.81    9.91   -9.84    1.53   -5.20   -0.57   -5.79   -4.55
     33:   -7.44   -0.05   12.08   -1.73    5.63   -6.50   -9.00  -17.33   -5.36   -9.06
     34:   -3.51    0.46   -5.92   -3.95   -4.04   15.13   -1.47   -2.21    0.37   10.94
     35:    8.12    1.64  -13.92    3.49    7.47    8.52   -4.61   16.53    1.28   -4.64
     36:   -3.86   -9.55    7.20   -3.38    4.28    2.46   11.96   -4.71    0.67   16.10

@0cc4m is this something expected with integrated GPUs? Can I help debug this further somehow?

@dokterbob (Contributor, Author)

> Could you edit the flake (see 60ecf09) and see if you can build with Vulkan on x86_64 MacOS?
>
> Does that imply removing useVulkan on this line? 60ecf09#diff-ed99c076a4accf9ff1d7ad65a79dc3802df6bc8c91c9879e12c766a1b414272bR262
>
> Yeah, just remove all vulkan references from broken or badPlatform.
>
> To install nix, follow https://github.com/DeterminateSystems/nix-installer. Then nix build github:dokterbob/llama.cpp/macos_vulkan. You can uninstall Nix afterwards if you don't want it around.

I can make the requested patches today, but testing them locally would take me longer; ETA roughly a week.

Is there a way we can get this in sooner? For Intel Mac users this really is a huge improvement.

@0cc4m (Collaborator)

0cc4m commented Feb 8, 2024

> Could you edit the flake (see 60ecf09) and see if you can build with Vulkan on x86_64 MacOS?
>
> Does that imply removing useVulkan on this line? 60ecf09#diff-ed99c076a4accf9ff1d7ad65a79dc3802df6bc8c91c9879e12c766a1b414272bR262
>
> Yeah, just remove all vulkan references from broken or badPlatform.
> To install nix, follow https://github.com/DeterminateSystems/nix-installer. Then nix build github:dokterbob/llama.cpp/macos_vulkan. You can uninstall Nix afterwards if you don't want it around.
>
> I can make requested patches today but testing them locally would take me longer, ETA ~week.
>
> Is there a way we can get this in sooner? For Intel Mac users this really is a huge improvement.

I can try doing the nix tests, I think. I'll also add code to check if the extension is available. It would be good if you could test that afterwards, since I don't have a Mac system. Then we should be able to merge. Would that work?

@dokterbob (Contributor, Author)

> I can try doing the nix tests, I think. I'll also add code to check if the extension is available. It would be good if you could test that afterwards, since I don't have a Mac system. Then we should be able to merge. Would that work?

Sure, happy to test!

@0cc4m (Collaborator)

0cc4m commented Feb 10, 2024

> I can try doing the nix tests, I think. I'll also add code to check if the extension is available. It would be good if you could test that afterwards, since I don't have a Mac system. Then we should be able to merge. Would that work?
>
> Sure, happy to test!

@dokterbob I added the checks. Can you check if it works on Apple?

I tried building it with nix locally and it didn't throw an error, but I have no idea how nix works or how to use it. @philiptaron, let me know if it looks correct. The CI should fail if it doesn't work, right?

@philiptaron (Collaborator)

I think you also need this change:

diff --git a/flake.nix b/flake.nix
index ad2f9b29..dc4e503c 100644
--- a/flake.nix
+++ b/flake.nix
@@ -150,6 +150,7 @@
             packages =
               {
                 default = config.legacyPackages.llamaPackages.llama-cpp;
+                vulkan = config.packages.default.override { useVulkan = true; };
               }
               // lib.optionalAttrs pkgs.stdenv.isLinux {
                 opencl = config.packages.default.override { useOpenCL = true; };
@@ -157,7 +158,6 @@

                 mpi-cpu = config.packages.default.override { useMpi = true; };
                 mpi-cuda = config.packages.default.override { useMpi = true; };
-                vulkan = config.packages.default.override { useVulkan = true; };
               }
               // lib.optionalAttrs (system == "x86_64-linux") {
                 rocm = config.legacyPackages.llamaPackagesRocm.llama-cpp;

@0cc4m (Collaborator)

0cc4m commented Feb 12, 2024

> I think you also need this change:
>
> (flake.nix diff quoted above, moving the vulkan override out of the Linux-only attribute set)

You can also push commits to this PR. I think it's better if you do that; I have no clue what I'm doing in the nix files.

@philiptaron (Collaborator)

> You can also push commits to this PR. I think it's better if you do that, I have no clue what I'm doing in the nix files.

It might be significantly delayed then; it's a week of real-life responsibilities for me. 🙂

@dokterbob (Contributor, Author)

dokterbob commented Feb 13, 2024

> I can try doing the nix tests, I think. I'll also add code to check if the extension is available. It would be good if you could test that afterwards, since I don't have a Mac system. Then we should be able to merge. Would that work?
>
> Sure, happy to test!
>
> @dokterbob I added the checks. Can you check if it works on Apple?
>
> I tried build it with nix locally and it didn't throw an error, but I have no idea how nix works or how to use it. @philiptaron Let me know if it looks correct. The CI should fail if it doesn't work, right?

Runs like a charm; also pushed @philiptaron's patch. Didn't get into nix to test it, though.

@philiptaron (Collaborator) left a comment

I built this on Sonoma, both Intel Mac and M2 Mac. Compiles and runs. If @ggerganov doesn't object, I aim to merge this in 12 hours.

ggml-vulkan.cpp — review thread (outdated, resolved)
@philiptaron (Collaborator) left a comment

I took a look at the code; I'd like to see the refactor tractor continue to plow, @0cc4m. 😀

Comment on lines +1099 to +1141
    std::vector<const char*> layers;

    if (validation_ext) {
        layers.push_back("VK_LAYER_KHRONOS_validation");
    }
    std::vector<const char*> extensions;
    if (validation_ext) {
        extensions.push_back("VK_EXT_validation_features");
    }
    if (portability_enumeration_ext) {
        extensions.push_back("VK_KHR_portability_enumeration");
    }
    vk::InstanceCreateInfo instance_create_info(vk::InstanceCreateFlags{}, &app_info, layers, extensions);
    if (portability_enumeration_ext) {
        instance_create_info.flags |= vk::InstanceCreateFlagBits::eEnumeratePortabilityKHR;
    }

    std::vector<vk::ValidationFeatureEnableEXT> features_enable;
    vk::ValidationFeaturesEXT validation_features;

    if (validation_ext) {
        features_enable = { vk::ValidationFeatureEnableEXT::eBestPractices };
        validation_features = {
            features_enable,
            {},
        };
        validation_features.setPNext(nullptr);
        instance_create_info.setPNext(&validation_features);

        std::cerr << "ggml_vulkan: Validation layers enabled" << std::endl;
    }
@philiptaron (Collaborator)

I was hoping that these constants would have one home:

"VK_LAYER_KHRONOS_validation"
"VK_EXT_validation_features"
"VK_KHR_portability_enumeration"

Could you accomplish that by making a function that returned or mutated the two variables features_enable and validation_features?

That way, this generic code could stay unaware of the sort of features and validation that got added, and the system-specific code below would be able to add (as needed) the right features and validation.

@0cc4m (Collaborator)

I could, but let's not overextend the scope and duration of this PR. If you wanna do a refactor PR yourself or in collaboration with me, or gather suggestions in an issue, that's fine.

@philiptaron (Collaborator)

I went to try to do that refactoring with a spare hour this morning, and ran into a rebase error (on 8b38eb5) that I could not see how to resolve in a straightforward manner. Could you or @dokterbob rebase this branch on master appropriately?

@0cc4m (Collaborator)

I didn't have any issues merging upstream into the branch; not sure what your problem was.

To deal with the linker error that Vulkan currently has with cmake, you can use this workaround patch:

diff --git a/examples/gguf/CMakeLists.txt b/examples/gguf/CMakeLists.txt
index 6481f087..85950145 100644
--- a/examples/gguf/CMakeLists.txt
+++ b/examples/gguf/CMakeLists.txt
@@ -1,5 +1,5 @@
 set(TARGET gguf)
 add_executable(${TARGET} gguf.cpp)
 install(TARGETS ${TARGET} RUNTIME)
-target_link_libraries(${TARGET} PRIVATE ggml ${CMAKE_THREAD_LIBS_INIT})
+target_link_libraries(${TARGET} PRIVATE common ggml ${CMAKE_THREAD_LIBS_INIT})
 target_compile_features(${TARGET} PRIVATE cxx_std_11)

@philiptaron (Collaborator)

I intend to merge after CI completes.

@philiptaron philiptaron merged commit 633782b into ggerganov:master Feb 19, 2024
55 checks passed
@dokterbob dokterbob deleted the macos_vulkan branch February 21, 2024 11:03
@ggerganov (Owner)

Keep in mind to squash the commits when merging in the future 👍

@philiptaron (Collaborator)

> Keep in mind to squash the commits when merging in the future 👍

If that's the style preferred in this repo, I'll make sure to do that next time.

@thewh1teagle (Contributor)

Any chance we should enable it on Linux/Windows as well? I'm getting a similar error with whisper.cpp + Vulkan.

6 participants