FAQ for TransNets

Rose Catherine Kanjirathinkal edited this page Nov 13, 2017 · 4 revisions

1. Training TransNets: why is the loss computed between a dropped-out vector and a non-dropped-out vector?

In Algorithm 1 for training TransNets, Step 2 (line 15) computes the loss between a dropped-out version of the approximate latent representation ($\bar{z}_L$) and the non-dropped-out latent representation of the original review ($x_T$). This is because $x_T$ is the ground truth (i.e., the target) for this part of the training. Dropout is a regularizer applied only to the prediction side; corrupting the target would mean fitting to noise. As a sanity check, training TransNets by minimizing the loss against a dropped-out version of $x_T$ gave lower performance.
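The asymmetry can be sketched as below. This is a minimal NumPy illustration, not the paper's implementation: it assumes inverted dropout and a squared-L2 transform loss, and the names `dropout`, `transform_loss`, `z_l`, and `x_t` are illustrative stand-ins for $\bar{z}_L$ and $x_T$.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(v, p, rng):
    # Inverted dropout: zero each unit with probability p, rescale survivors.
    mask = rng.random(v.shape) >= p
    return v * mask / (1.0 - p)

def transform_loss(z_l, x_t, p=0.5):
    # Dropout is applied only to the approximate representation z_L;
    # the target x_T (ground truth for this step) is left untouched.
    z_bar = dropout(z_l, p, rng)
    return float(np.sum((z_bar - x_t) ** 2))  # squared L2 loss

z_l = rng.standard_normal(8)   # approximate latent representation
x_t = rng.standard_normal(8)   # latent representation of the actual review
loss = transform_loss(z_l, x_t)
```

Dropping out $x_T$ as well would regularize the target itself, so the network would be trained toward a corrupted objective, which matches the observed drop in performance.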