coreml : fix ANE optimized encoder (#1716)
philloooo authored Jan 4, 2024
1 parent ab0a859 commit ba5bcde
Showing 3 changed files with 4 additions and 17 deletions.
4 changes: 2 additions & 2 deletions coreml/whisper-encoder.mm
@@ -24,9 +24,9 @@

     // select which device to run the Core ML model on
     MLModelConfiguration *config = [[MLModelConfiguration alloc] init];
-    config.computeUnits = MLComputeUnitsCPUAndGPU;
+    // config.computeUnits = MLComputeUnitsCPUAndGPU;
     //config.computeUnits = MLComputeUnitsCPUAndNeuralEngine;
-    //config.computeUnits = MLComputeUnitsAll;
+    config.computeUnits = MLComputeUnitsAll;

     const void * data = CFBridgingRetain([[whisper_encoder_impl alloc] initWithContentsOfURL:url_model configuration:config error:nil]);

15 changes: 1 addition & 14 deletions models/convert-whisper-to-coreml.py
@@ -143,20 +143,7 @@ def forward(self, x: Tensor):
             x = block(x)

         x = self.ln_post(x)
-
-        # """
-        # TODO:
-        # I think we need to transpose the result here to make it fit whisper.cpp memory order.
-        # However, even doing this, the results are still wrong. Kind of less wrong compared to
-        # not transposing, but still wrong.
-
-        # Also, I don't know why the original OpenAI implementation does not need to transpose
-
-        # transpose to (batch_size, n_ctx, n_state)
-        # x : torch.Tensor, shape = (batch_size, n_state, 1, n_ctx)
-
-        # """
-        # x = x.transpose(1,3)
+        x = x.squeeze(2).transpose(1, 2)

         return x

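The one-line replacement in convert-whisper-to-coreml.py is the actual fix: the ANE-optimized encoder emits activations as (batch_size, n_state, 1, n_ctx), and `squeeze(2).transpose(1, 2)` reorders them into the (batch_size, n_ctx, n_state) layout whisper.cpp expects, resolving the transpose TODO that was deleted. A NumPy sketch of the shape bookkeeping (sizes are illustrative, roughly those of the tiny model):

```python
import numpy as np

# Illustrative sizes, roughly the "tiny" Whisper encoder
batch, n_state, n_ctx = 1, 384, 1500

# ANE-optimized encoder output layout: (batch, n_state, 1, n_ctx)
x = np.zeros((batch, n_state, 1, n_ctx))

# NumPy equivalent of torch's x.squeeze(2).transpose(1, 2)
y = np.squeeze(x, axis=2).transpose(0, 2, 1)

# Memory order expected by whisper.cpp: (batch, n_ctx, n_state)
assert y.shape == (batch, n_ctx, n_state)
```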
2 changes: 1 addition & 1 deletion models/generate-coreml-model.sh
@@ -23,7 +23,7 @@ if [[ $mname == "-h5" ]]; then
   echo $mpath
   python3 models/convert-h5-to-coreml.py --model-name $mname --model-path $mpath --encoder-only True
 else
-  python3 models/convert-whisper-to-coreml.py --model $mname --encoder-only True
+  python3 models/convert-whisper-to-coreml.py --model $mname --encoder-only True --optimize-ane True
 fi

 xcrun coremlc compile models/coreml-encoder-${mname}.mlpackage models/

1 comment on commit ba5bcde

@Josscii Josscii commented on ba5bcde Jan 5, 2024


2024-01-05 09:24:22.023112+0800 [4575:1504107] Error: Transpose unit is not supported.
2024-01-05 09:24:22.023235+0800 [4575:1504107] Error: Transpose unit is not supported.
2024-01-05 09:24:22.023298+0800 [4575:1504107] Error: Transpose unit is not supported.
2024-01-05 09:24:22.034869+0800 [4575:1504107] Error: Transpose unit is not supported.
2024-01-05 09:24:22.035000+0800 [4575:1504107] Error: Transpose unit is not supported.
2024-01-05 09:24:22.035063+0800 [4575:1504107] Error: Transpose unit is not supported.
2024-01-05 09:24:27.377632+0800 [4575:1504107] [espresso] [Espresso::handle_ex_plan] exception=at at /ggml-base-encoder.mlmodelc/model.mil:14:12: In 'ios16.conv' operations, tensors parameter x[0], parameter weight[0], parameter bias[0], and output at index 0 must have the same data type.
2024-01-05 09:24:27.377830+0800 [4575:1504107] [coreml] Error plan build: -1.


With this update I generated a new Core ML encoder and ran it on an iPhone XR with iOS 16; it produced the errors above.
