optimizer fix
temporarily comment out the progress bar
some changes to train_tpu
use an int mask instead of a float mask
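Presumably this refers to building the padding mask with an integer/bool dtype instead of float, which avoids extra casts before masked_fill. A minimal sketch of that pattern, with illustrative names (PAD_IDX, make_padding_mask), not fairseq's actual mask code:

```python
import torch

PAD_IDX = 1  # illustrative padding index


def make_padding_mask(tokens: torch.Tensor) -> torch.Tensor:
    # Old style: a float mask (1.0 where padded, 0.0 elsewhere) that has to be
    # cast or compared before use.
    # float_mask = tokens.eq(PAD_IDX).float()
    # New style: keep the mask in an int/bool dtype so it can feed
    # masked_fill / masked_fill_ directly.
    return tokens.eq(PAD_IDX)


tokens = torch.tensor([[5, 8, 2, PAD_IDX, PAD_IDX]])
print(make_padding_mask(tokens))  # tensor([[False, False, False, True, True]])
```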
pfpfpfpf
fix
printing device index per loop
add a breakpoint to investigate the resize_ call
attempting to initialize the buffer size to 2*dim
add a breakpoint
better print
do not drop records when computing loss
Changes that reduce graph compiles (see the sketch after this list):
* Loss function replaced with equivalent logic that doesn't resize tensors.
* CLI args changed to guarantee consistency.
* collate_tokens function in fairseq/data/data_utils.py overwritten to guarantee consistency.
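The collate change is the key part for XLA: if every batch is padded to the same fixed length, tensor shapes stay constant across steps and XLA does not recompile its graph. The sketch below only illustrates that idea under that assumption; it is not the real collate_tokens signature or behavior.

```python
import torch


def collate_tokens_fixed(sequences, pad_idx, fixed_len):
    """Pad every sequence to the same fixed length so tensor shapes stay
    static across batches (fewer XLA graph recompiles). Illustrative stand-in
    for fairseq's collate_tokens, not its real signature."""
    batch = torch.full((len(sequences), fixed_len), pad_idx, dtype=torch.long)
    for i, seq in enumerate(sequences):
        length = min(len(seq), fixed_len)
        batch[i, :length] = torch.as_tensor(seq[:length], dtype=torch.long)
    return batch


# Every batch comes out as shape (batch_size, fixed_len), regardless of content.
print(collate_tokens_fixed([[4, 5, 6], [7]], pad_idx=1, fixed_len=8).shape)
```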
undoing some changes made while debugging
progress_bar implements __len__
some irrelevant changes to train_tpu.py
new xla changes
bug fix in enable_torch_version
removing the last batch that is of different size from the iterator
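Dropping a trailing, smaller batch keeps per-step tensor shapes identical, which again avoids recompilation on TPU. A minimal sketch of the idea, with a hypothetical helper name:

```python
def drop_odd_sized_last_batch(batches):
    """Yield batches, skipping a trailing batch whose size differs from the
    rest, so per-step shapes stay constant. Illustrative only."""
    batches = list(batches)
    if len(batches) > 1 and len(batches[-1]) != len(batches[0]):
        batches = batches[:-1]
    yield from batches


print(list(drop_odd_sized_last_batch([[1, 2], [3, 4], [5]])))  # [[1, 2], [3, 4]]
```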
delete optimizer step in fairseq's trainer
Added `self.xla` flag that controls whether the Trainer includes the optimizer step
+ Tried to include more explanation of why the optimizer step is skipped this time
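A sketch of what such a flag plausibly looks like: under XLA the optimizer step is assumed to be taken outside the Trainer (for example via torch_xla's xm.optimizer_step in the outer loop), so the Trainer itself skips it. The class below is illustrative, not fairseq's Trainer.

```python
import torch


class TrainerSketch:
    """Illustrative only, not fairseq's Trainer: when `xla` is set, the
    optimizer step is assumed to happen outside this class (e.g. via
    torch_xla.core.xla_model.optimizer_step in the outer training loop)."""

    def __init__(self, model, optimizer, xla=False):
        self.model = model
        self.optimizer = optimizer
        self.xla = xla  # skip the optimizer step here when training on TPU

    def train_step(self, loss):
        loss.backward()
        if not self.xla:
            # CPU/GPU path: the trainer steps and clears gradients itself.
            self.optimizer.step()
            self.optimizer.zero_grad()
        # XLA path: defer to the outer loop, which also syncs across cores.
```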
deleted obsolete file
add norm clipping count back in (#4)
remove grad norm clip count (#5)
Change masked_fill_ input in loss in order to accommodate necessary PyTorch changes (#6)
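The PyTorch change in question is most likely the requirement that masked_fill_ receive a torch.bool mask rather than a byte mask. A small sketch of the cast, not the actual loss code:

```python
import torch

scores = torch.randn(2, 4)
pad_mask = torch.tensor([[0, 0, 1, 1],
                         [0, 1, 1, 1]], dtype=torch.uint8)

# Older code often passed a uint8 mask; newer PyTorch expects torch.bool and
# warns on (and eventually rejects) byte masks, so cast explicitly.
scores.masked_fill_(pad_mask.bool(), float("-inf"))
print(scores)
```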
Adding tpu capabilities to train.py (#8)
* Adding tpu capabilities to train.py
* flush when printing for better user experience
* separated cli_main into parse_args, maingpu and maintpu
deleted unused line in datautils.py
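A rough sketch of that cli_main split; the --tpu flag and the function bodies are assumptions, only the dispatch structure is the point:

```python
import argparse


def parse_args():
    parser = argparse.ArgumentParser()
    # The real fairseq parser has many more options; --tpu is an assumed flag
    # used here only to show the dispatch.
    parser.add_argument("--tpu", action="store_true")
    return parser.parse_known_args()[0]


def maingpu(args):
    print("GPU/CPU training path")


def maintpu(args):
    print("TPU training path")


def cli_main():
    # Parse once, then hand off to the device-specific main().
    args = parse_args()
    if args.tpu:
        maintpu(args)
    else:
        maingpu(args)


if __name__ == "__main__":
    cli_main()
```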
Enumerate the loader in training and validation (#9)
* Enumerate the loader
Add option to assert on training and/or validation loss (#10)
* Add option to assert on training and/or validation loss
* applied suggestion
A None loss should be filled with inf (#11)
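Replacing a missing (None) loss with inf keeps the logged stats tensor-shaped and comparable across steps. A hedged sketch with a hypothetical helper:

```python
import torch


def loss_or_inf(loss):
    """If a step produced no loss (None), report inf instead of dropping it,
    so downstream reductions always see a value. Illustrative helper, not
    fairseq's actual code."""
    if loss is None:
        return torch.tensor(float("inf"))
    return loss.detach()


print(loss_or_inf(None))              # tensor(inf)
print(loss_or_inf(torch.tensor(0.7)))
```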
Enabling multiprocessing for fairseq training. (#12)
* initial commit for multiprocess api
* indentation fixes and import fix
* no need to softlink, fix save/load
* Remove the hacks to only save from master ordinal as xm.save takes care of that
* fix indentation; 3 -> 4 spaces
* Moved xu.eprints after spawn and improved dropping of last batches
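A minimal sketch of the multiprocessing entry point described in #12, assuming torch_xla's xmp.spawn API; the training-loop body is a placeholder, and nprocs=8 assumes one process per core on a single TPU host:

```python
import torch
import torch_xla.core.xla_model as xm
import torch_xla.distributed.xla_multiprocessing as xmp


def _mp_fn(index):
    # One process per TPU core; each gets its own XLA device.
    device = xm.xla_device()
    model = torch.nn.Linear(16, 4).to(device)
    # ... training loop would go here ...
    # xm.save only writes from the master ordinal, so no rank check is needed.
    xm.save(model.state_dict(), "/tmp/checkpoint.pt")


if __name__ == "__main__":
    xmp.spawn(_mp_fn, args=(), nprocs=8)
```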
trainers->trainer (#13)
fix bug in assert_on_losses
Replace usage of unsqueeze with transpose + broadcasting (#15)
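The pattern being described: instead of inserting a singleton dimension with unsqueeze so a mask broadcasts, transpose the tensor so the mask already lines up with its trailing dimensions. The example below only illustrates the equivalence; it is not the actual multihead-attention line:

```python
import torch

B, T, S = 2, 3, 4
scores = torch.randn(B, T, S)
mask = torch.tensor([[0, 0, 1, 1],
                     [1, 0, 0, 1]], dtype=torch.bool)  # shape (B, S)

# unsqueeze-based version: insert a singleton dim so the mask broadcasts.
out_unsqueeze = scores.masked_fill(mask.unsqueeze(1), float("-inf"))

# transpose + broadcasting version: align the mask with the trailing dims
# of the transposed tensor instead of creating a new view with unsqueeze.
out_broadcast = scores.transpose(0, 1).masked_fill(mask, float("-inf")).transpose(0, 1)

assert torch.equal(out_unsqueeze, out_broadcast)
```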
remove attn mask + loss rewrite + save per host + format
suppress loss report
allow usage of batch_by_size in translation.
attn_weights masked fill in place
Clean up the log output suppressing a bit
Revert multihead attn's in_proj code changes
The non-rebased TPU branch is about 10% faster on TPUs than the rebased branch. The regression is inside multihead attn's in_proj mechanism; reverting the relevant changes preserves performance.
Pass correct args to the new get_valid_stats function
Send meters to device so training does not fail when resuming from a checkpoint
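A hedged sketch of what sending restored meters to the device can look like; the dict layout and helper name are assumptions, not fairseq's actual meter handling:

```python
import torch
import torch_xla.core.xla_model as xm


def meters_to_device(meters, device=None):
    """Move any tensor values stored inside restored meters onto the active
    XLA device, so stats loaded from a checkpoint don't cause device-mismatch
    errors. The dict-of-dicts layout here is an assumption."""
    device = device or xm.xla_device()
    for meter_state in meters.values():
        for key, value in meter_state.items():
            if torch.is_tensor(value):
                meter_state[key] = value.to(device)
    return meters
```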