create a new param entry with id `31` (uint type)
use its bits for per-layer feature masking
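
A minimal C++ sketch of how such a per-layer mask could be interpreted, before the use case below. This is not ncnn's actual implementation; the struct, field, and function names are hypothetical, and only bit 0 (disable fp16) comes from this proposal.

```cpp
// Minimal sketch, not ncnn's actual implementation: how a per-layer uint
// mask loaded from param id 31 could be interpreted. Only bit 0
// (disable fp16) comes from this proposal; all names here are hypothetical.
#include <cstdint>
#include <cstdio>

// bit 0: disable fp16 storage/arithmetic for this layer (31=1, i.e. 1<<0)
static const uint32_t FEAT_DISABLE_FP16 = 1u << 0;

struct LayerOptions
{
    bool use_fp16 = true; // global default: fp16 enabled
};

// apply the per-layer disabled bits on top of the global options
static LayerOptions apply_feature_mask(LayerOptions opt, uint32_t featmask)
{
    if (featmask & FEAT_DISABLE_FP16)
        opt.use_fp16 = false; // this layer falls back to fp32
    return opt;
}

int main()
{
    LayerOptions global_opt;
    // conv2_1 carries 31=1 in its param entry, the other layers carry no mask
    LayerOptions conv2_1_opt = apply_feature_mask(global_opt, 1u << 0);
    LayerOptions conv1_opt = apply_feature_mask(global_opt, 0);
    printf("conv2_1 fp16: %d, conv1 fp16: %d\n", conv2_1_opt.use_fp16, conv1_opt.use_fp16);
    return 0;
}
```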
Sample use case
Typically, we use fp16 computation to improve inference speed.
Because the weight values of `conv2_1` are large, fp16 accumulation may cause numerical overflow, so fp16 needs to be disabled individually for `conv2_1`, while other layers continue to use fp16 mode.
Add `31=1`, i.e. `(1<<0)`, as the disabled bit to disable fp16.

It is also possible to control `num_threads` for each layer individually, but it is not very useful, so no more precious bits are used for it.

These masks will be implemented, and more bits can be used to meet other needs in the future.
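
For illustration, this is roughly how the use case above could look in a `.param` file, with `31=1` appended to the `conv2_1` entry. The network and the convolution parameters here are made up for this sketch; only the `31=1` entry is the id proposed in this issue.

```
7767517
3 3
Input            data     0 1 data
Convolution      conv2_1  1 1 data conv2_1 0=128 1=3 5=1 6=3456 31=1
ReLU             relu2_1  1 1 conv2_1 relu2_1
```

Every other layer line stays unchanged and keeps the default fp16 path because it carries no `31=` entry.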