Skip to content

Conversation

ThoreKoritzius
Copy link

This PR adds Metal backend support for POOL_1D operators.
Until now, only the POOL_2D was implemented for Metal.
With this change, models relying on 1D pooling can now run fully accelerated on Apple GPUs through Metal in an analog way.


Implementation Details

  • Added new Metal compute kernels in ggml-metal.metal:

    • kernel_pool_1d_max_f32
    • kernel_pool_1d_avg_f32
  • Implemented ggml_metal_op_pool_1d analogous to ggml_metal_op_pool_2d, including:

    • Argument struct (ggml_metal_kargs_pool_1d)x
    • Threadgroup size calculation
    • Encoding input/output buffers

@github-actions github-actions bot added ggml changes relating to the ggml tensor library for machine learning Apple Metal https://en.wikipedia.org/wiki/Metal_(API) labels Oct 5, 2025
Copy link
Member

@ggerganov ggerganov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Need to add tests
  • The implementation assumes contiguous data - either require this to be true, or extend the implementation to support non-cont data

int np;
int k0;
int s0;
int p0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The order of the arguments does not match the initialization in ggml-metal-ops.cpp.

Also, some of these arguments like np need to be 64-bit integers.

@tommarques56
Copy link

Hi @ThoreKoritzius, I can run an automated high-severity-only LLM review on this PR and post a single focused inline comment. Reply with "approve" or add a comment saying "@tommarques56 approve" to proceed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Apple Metal https://en.wikipedia.org/wiki/Metal_(API) ggml changes relating to the ggml tensor library for machine learning
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants