
bit mask support for dropout #656

Open
eric-haibin-lin opened this issue Feb 27, 2020 · 4 comments
Labels: enhancement (A feature or an optimization request)

eric-haibin-lin commented Feb 27, 2020

For dropout training, one can save the dropout mask with 1 bit per coordinate. Can we support that in DNNL? Memory is precious.
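To make the memory argument concrete, here is a minimal C++ sketch of 1-bit-per-coordinate mask storage (the helper name is invented; this is not DNNL API): n keep/drop decisions fit in ceil(n/8) bytes instead of n bytes for a u8 mask or 4n bytes for an f32 mask.

```cpp
// Hypothetical helper, not DNNL API: pack n keep/drop decisions into
// ceil(n/8) bytes, one bit per coordinate (bit set == "keep").
#include <cstddef>
#include <cstdint>
#include <vector>

std::vector<std::uint8_t> pack_mask(const std::vector<bool> &keep) {
    std::vector<std::uint8_t> bits((keep.size() + 7) / 8, 0);
    for (std::size_t i = 0; i < keep.size(); ++i)
        if (keep[i]) bits[i / 8] |= std::uint8_t(1u << (i % 8));
    return bits;
}
```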

eric-haibin-lin added the enhancement label on Feb 27, 2020
vpirogov (Member) commented

Hi @eric-haibin-lin,

Thank you for your question. Technically nothing prevents us from introducing a dropout primitive in the library, including 1-bit mask support. The main question we need to answer to make this happen is what the API and behavior should look like to make the functionality generally useful. For dropout the main source of concern is that it relies on a random number generator, which may behave differently across applications, so making a random number generator part of the implementation would be a major source of incompatibility and thread-safety issues.

A couple of follow up questions so that I can better understand what you are looking for:

  • What do you expect from the DNNL implementation (vs. implementing the functionality directly in C++)?
  • What API would make sense to you? Would a function that takes a pre-computed mask and performs dropout be viable? (A sketch of such an interface follows this list.)
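To illustrate the second question, here is a hypothetical sketch of such an interface (names invented; not an actual DNNL API): the application owns the RNG and supplies a pre-computed 1-bit mask, and the library only applies it with the usual inverted-dropout scaling.

```cpp
// Hypothetical API sketch, not DNNL: the caller generates the mask with its
// own RNG; the library only applies it.
// src, dst: n float elements; mask: ceil(n/8) bytes, bit i set == "keep";
// scale is typically 1 / (1 - p) for inverted dropout.
#include <cstddef>
#include <cstdint>

void dropout_apply_mask(const float *src, float *dst, std::size_t n,
                        const std::uint8_t *mask, float scale) {
    for (std::size_t i = 0; i < n; ++i) {
        bool keep = mask[i / 8] & (1u << (i % 8));
        dst[i] = keep ? src[i] * scale : 0.f;
    }
}
```

Such a split would sidestep the RNG compatibility and thread-safety concerns above, since the library would never generate random numbers itself.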

vpirogov self-assigned this on Feb 28, 2020
TaoLv (Contributor) commented Mar 2, 2020

I think the random number generator takes a significant share of the execution time of dropout. That's why we optimized it in MXNet with viRngBernoulli from VSL. But viRngBernoulli can no longer meet the requirement for bit mask generation. So I would expect DNNL to cover the RNG part, in which case the forward interface would look like:

  • input: source data, random seed, distribution type, mask type, p value
  • output: destination data, workspace for the mask

where the mask type can be a bit mask, boolean mask, or integer mask. I'm not sure whether the boolean or integer mask has any advantage, but they are used in frameworks.
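A rough C++ sketch of that proposed forward interface, under the assumption that the library owns the RNG and the caller supplies the seed (all names are invented, and std::bernoulli_distribution stands in for an optimized generator such as viRngBernoulli):

```cpp
// Hypothetical sketch of the forward interface proposed above, not DNNL API:
// the library owns the RNG, seeded by the caller for reproducibility.
#include <cstddef>
#include <cstdint>
#include <random>

enum class mask_kind { bit, boolean, integer };

// Writes dst and fills `workspace` with the mask; for mask_kind::bit the
// workspace must hold ceil(n/8) bytes.
void dropout_forward(const float *src, float *dst, std::size_t n, float p,
                     std::uint64_t seed, mask_kind kind,
                     std::uint8_t *workspace) {
    std::mt19937_64 rng(seed);  // stand-in for an optimized Bernoulli RNG
    std::bernoulli_distribution keep(1.0 - p);
    const float scale = 1.f / (1.f - p);
    if (kind == mask_kind::bit) {
        for (std::size_t i = 0; i < n; ++i) {
            if (i % 8 == 0) workspace[i / 8] = 0;
            bool k = keep(rng);
            workspace[i / 8] |= std::uint8_t(k) << (i % 8);
            dst[i] = k ? src[i] * scale : 0.f;
        }
    }
    // boolean and integer mask kinds omitted for brevity
}
```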

@eric-haibin-lin @apeforest Could you please share more insights about the random seed distribution in MXNet and reproducibility of the operator?

apeforest commented

The random seed should be taken from MXNet so that, if a user specifies a random seed in MXNet, reproducibility is guaranteed. A similar approach was taken for the cuDNN library: apache/mxnet#17547
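For illustration, a self-contained sketch (not MXNet or cuDNN code) of why taking the seed from the framework gives reproducibility: the mask becomes a pure function of the seed, so the same seed always reproduces the same mask.

```cpp
// Self-contained sketch, not MXNet/cuDNN code: with the seed supplied by the
// framework, re-running with the same seed reproduces the same bit mask.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

std::vector<std::uint8_t> make_mask_bits(std::uint64_t seed, std::size_t n,
                                         double p) {
    std::mt19937_64 rng(seed);
    std::bernoulli_distribution keep(1.0 - p);
    std::vector<std::uint8_t> bits((n + 7) / 8, 0);
    for (std::size_t i = 0; i < n; ++i)
        bits[i / 8] |= std::uint8_t(keep(rng)) << (i % 8);
    return bits;
}

int main() {
    std::uint64_t framework_seed = 42;  // would come from the framework's seed API
    assert(make_mask_bits(framework_seed, 1000, 0.5)
            == make_mask_bits(framework_seed, 1000, 0.5));
}
```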

TaoLv (Contributor) commented Jun 20, 2020

I noticed there is an RFC open for this request. You may want to take a look. @eric-haibin-lin @apeforest @pengzhao-intel
