[API] use softmax with length, and interleaved matmul for BERT #1136
…1091)
* use softmax with length, and interleaved matmul
* push backward compatibility fix
* fix failing unittests for output_all_encodings, and valid-len=None
* fix lint
* Update bert.py
* amp patch
* Update MXNet 1.6 pre-release version tested on CI
* Update bert.py

Co-authored-by: Leonard Lausen <leonard@lausen.nl>
Codecov Report
@@            Coverage Diff             @@
##           master    #1136      +/-   ##
==========================================
- Coverage   87.76%   87.34%   -0.42%
==========================================
  Files          67       67
  Lines        6310     6386      +76
==========================================
+ Hits         5538     5578      +40
- Misses        772      808      +36
Hi @eric-haibin-lin,
Job PR-1136/1 is complete.
Hi @fhieber the main difference is in this code block:
I think the new ops assume the projection is done with interleaved weights for k/q/v. The concatenated weight should have shape
Job PR-1136/2 is complete.
Job PR-1136/3 is complete.
Job PR-1136/4 is complete.
replaced #1091
value_weight = value_weight.reshape(shape=(self._num_heads, -1, 0), reverse=True)
in_weight = F.concat(query_weight, key_weight, value_weight, dim=-2)
in_weight = in_weight.reshape(shape=(-1, 0), reverse=True)
in_bias = F.concat(query_bias, key_bias, value_bias, dim=0)
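For readers following the diff, the row ordering produced by the weight reshape and concat above can be checked with a small NumPy sketch. This is an illustration only, not the code from the PR: NumPy stands in for the MXNet `F` ops, the function name `interleave_qkv_weights` and the toy sizes are made up, and it assumes `query_weight` and `key_weight` are reshaped the same way as `value_weight` (only the `value_weight` line is visible in this excerpt). The bias line is left out.

```python
import numpy as np


def interleave_qkv_weights(query_weight, key_weight, value_weight, num_heads):
    """Interleave q/k/v projection weights head by head (hypothetical helper).

    Each input has shape (units, in_units). The result has shape
    (3 * units, in_units), with rows ordered as
    [q_head0, k_head0, v_head0, q_head1, k_head1, v_head1, ...].
    """
    in_units = query_weight.shape[-1]
    # NumPy equivalent of reshape(shape=(num_heads, -1, 0), reverse=True):
    # keep the last axis, split the first axis into (num_heads, head_dim).
    per_head = [
        w.reshape(num_heads, -1, in_units)
        for w in (query_weight, key_weight, value_weight)
    ]
    # NumPy equivalent of concat(..., dim=-2): stack q, k, v inside each head.
    stacked = np.concatenate(per_head, axis=-2)
    # NumPy equivalent of reshape(shape=(-1, 0), reverse=True): flatten heads.
    return stacked.reshape(-1, in_units)


# Toy sizes (assumed for illustration): 2 heads, head_dim 2, in_units 3.
num_heads, units, in_units = 2, 4, 3
# Label each row so the ordering is visible: q rows 0..3, k rows 10..13, v rows 20..23.
query_weight = np.repeat(np.arange(units)[:, None], in_units, axis=1)
key_weight = query_weight + 10
value_weight = query_weight + 20

in_weight = interleave_qkv_weights(query_weight, key_weight, value_weight, num_heads)
print(in_weight.shape)
print(in_weight[:, 0].tolist())
```

The first column of `in_weight` lists the row labels in their new order, which makes it easy to see that each head's query, key, and value rows end up adjacent.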
Is it possible to avoid concat for every iteration? Or at least for inference, we only need concat once, right?
For inference, yes that's true. It's similar to RNN. If we figure out a way to avoid the weight concat in RNN, we can apply that here, too. @TaoLv do you have any idea/suggestion?
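One way to picture the "concat once for inference" idea from this thread is a small cache around the interleaving step. The sketch below is a NumPy illustration under assumptions, not something proposed in the PR: the class `CachedInterleavedWeight` and its `build_count` counter are hypothetical, and it only makes sense while the weights are frozen, since a training step would invalidate the cached copy.

```python
import numpy as np


class CachedInterleavedWeight:
    """Hypothetical helper: build the interleaved q/k/v weight once, then reuse it."""

    def __init__(self, query_weight, key_weight, value_weight, num_heads):
        self._parts = (query_weight, key_weight, value_weight)
        self._num_heads = num_heads
        self._cached = None
        self.build_count = 0  # how many times the concat actually ran

    def get(self):
        # Only valid for inference: the cache is never refreshed after an update.
        if self._cached is None:
            in_units = self._parts[0].shape[-1]
            per_head = [
                w.reshape(self._num_heads, -1, in_units) for w in self._parts
            ]
            self._cached = np.concatenate(per_head, axis=-2).reshape(-1, in_units)
            self.build_count += 1
        return self._cached


# Toy weights (assumed sizes): units=4, in_units=3, 2 heads.
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((4, 3)) for _ in range(3))
cache = CachedInterleavedWeight(q, k, v, num_heads=2)

first = cache.get()   # concat runs here
second = cache.get()  # reused, no second concat
print(cache.build_count, second is first)
```

Whether this can be done inside a hybridized block, as opposed to outside the forward pass, is the open question raised in the reply above.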